I/O completion based operations and cancellation #1278
Comments
I'm not sure I understand the exact problem. I'm not familiar with Windows IOCP, but for Linux aio I don't see any major problems. However it must be supported by the futures' executor (so maybe this isn't an issue for the futures crate). Take …

P.S. I could be missing something here, please let me know.
Here's some docs with an example of cancelling async IO on Windows. They block and wait for the cancellation to complete before continuing. I wonder if it's easy enough for the IO handle to record that it has requested cancellation, then handle waiting for it to complete the next time an IO operation happens. It could cause issues if there were errors cancelling, but maybe you don't care about those?
If Windows functions are blocking, can't I/O pools be used? Much like …

As for the handling of errors when cancelling, we could follow the standard library when closing a file: ignore them. It's always a race when cancelling something; I wouldn't know what else could be done at that point (other than informing the caller).
It's not that the cancellation itself is blocking; the cancellation returns an async completion and, in the example, they block waiting for it to complete. An I/O pool can't be used because …

This is just cancelling a single operation though; the I/O handle is still open and can be re-used for another operation, e.g.

```rust
async {
    let mut reader: impl AsyncRead = { ... };
    let first: Either<io::Result<[u8; 32]>, Timeout>
        = await!(reader.read_exact([0; 32]).timeout(Duration::from_secs(1)));
    let second: io::Result<[u8; 32]>
        = await!(reader.read_exact([0; 32]));
}
```

If the first operation times out and is synchronously cancelled by the …
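For illustration, here is a minimal sketch of the "record the cancellation, wait on the next operation" idea from the comment above. It uses the futures 0.3 `oneshot` channel; `DeferredCancelHandle`, `cancel_in_background`, and `start_read` are hypothetical names for this sketch, not an existing API:

```rust
use futures::channel::oneshot;

/// Hypothetical handle wrapper that defers waiting for a cancellation.
struct DeferredCancelHandle {
    /// Completion signal of a previously cancelled, still-pending operation.
    pending_cancel: Option<oneshot::Receiver<()>>,
}

impl DeferredCancelHandle {
    /// Called when a timed-out operation is abandoned: request OS-level
    /// cancellation (e.g. CancelIoEx on Windows) and remember the signal
    /// that the reactor fires once the operation has really finished.
    fn cancel_in_background(&mut self, completion: oneshot::Receiver<()>) {
        // request_os_cancellation(); // assumed call into the OS / reactor
        self.pending_cancel = Some(completion);
    }

    /// Every new operation first awaits the previous cancellation, so at
    /// most one operation is ever in flight on this handle.
    async fn start_read(&mut self, _buf: &mut [u8]) -> std::io::Result<usize> {
        if let Some(done) = self.pending_cancel.take() {
            // An error here just means the reactor dropped the sender; the
            // old operation is finished either way, so we ignore it.
            let _ = done.await;
        }
        // ... submit the actual read and await its completion ...
        unimplemented!("actual I/O submission omitted in this sketch")
    }
}
```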
IOCP cancellation and Linux …

So the drop method would look somewhat like this:

```rust
impl Drop for IocpFuture {
    fn drop(&mut self) {
        CancelIo(self.handle);
        // Block here until the completion queue signaled the real completion.
        // Since the completion queue might be handled by an IOCP reactor thread,
        // we must synchronize here somewhat with it.
        self.wait_handle.wait(); // Assuming the IO reactor signals the wait_handle
        // Clean up resources.
    }
}
```

Since the cancel operation is best effort, blocking here might take a long time, and thereby might block the futures executor, which is definitely not intended.

The completion-based networking API on the higher level might be even harder to solve: we shouldn't be able to cancel while the message/packet is half-transmitted/received, since that would corrupt the state of the messaging channel. So we would need to synchronously block here until the transmitting component has fully completed the operation. Which is not the natural thing, if the component (e.g. Websocket client, MQTT client, etc.) is working asynchronously on top of futures too. With the other approach we would signal cancellation to the client, and perform the normal asynchronous wait / …
@Nemo157 @Matthias247 Good points. I no longer see any way cancellation is possible in a non-blocking synchronous way. So for memory reasons alone the buffer must be forgotten (…).

@Nemo157 I think the output of your example would be hard to define, because the first operation may or may not succeed after being cancelled, which means that in some executions the …
@cramertj, @aturon, @carllerche It would be interesting to get input from you too.

I found another example of the behavior after reading the article from @sdroege about glib/gio and futures integration (which is very cool btw). I checked the original GTK APIs (e.g. https://developer.gnome.org/gio/stable/GOutputStream.html#g-output-stream-write-async) against their Rust wrappers (http://gtk-rs.org/docs/gio/trait.OutputStreamExt.html#tymethod.write_bytes_async_future). It seems like, in order to fulfill the futures contract, the transmitted buffers require some extra boxing (and potentially copying). And the cancellation might not be fully synchronous (the operation might continue to run for a while after the future was dropped, until the underlying GTK callback was delivered), which could lead to unexpected side effects when the user wants to directly start another async operation after dropping the last one.

While the wrappers are already very cool, I think they are not yet as zero-cost as Rust futures potentially could be, since they force additional heap allocation of some arguments as well as of the asynchronous operation state.
@Matthias247 Thanks for bringing this up. In the context of GIO async operations there is not that much that could be improved here, but generally for completion-based/callback-based async APIs there is some opportunity, and thanks for working on finding some solutions to improve the situation :)

In the case of GLib/GIO: …
Thanks for your comment!
My summary of the situation is: In order to decrease requirements on boxing parameters to future-based operations, and to better support asynchronous cancellation, it might be helpful to enforce that some (or all) kinds of futures are driven to completion. This is already possible with the current APIs. However the current ecosystem relies on the fact that dropping futures at any point in time is possible. If a future type which relies on being driven to completion gets prematurely dropped, it can cause memory issues, e.g. if the pinned future acted as the storage location for the underlying IO resource while executing the requested action. So I think this would only be a safe feature if premature dropping can be prevented through either the type system or lints (one possible runtime-guard sketch follows below). I don't yet have a concrete proposal on how this could be expressed, but I'm pretty sure others would have ideas (if there is interest in it).

If some futures must be driven to completion, some of the currently existing combinators would obviously no longer work, since they partly rely on being able to cancel some operations synchronously. That could be either fixed by standardizing another kind of cancellation mechanism, or by supporting them only for the safely droppable futures. But this kind of discussion could come after the more general one (whether there should be run-to-completion futures at all).

I'm also a bit torn back and forth on whether these ideas should be supported in the futures ecosystem: …
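As one possible shape of such a lint, here is a minimal sketch of a runtime "drop bomb" wrapper (my illustration, not a proposal from this thread; the name `MustComplete` is made up). It cannot give a type-system guarantee; it only turns a premature drop into a loud failure:

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

/// A future wrapped in `MustComplete` panics if it is dropped before it
/// has resolved.
struct MustComplete<F> {
    inner: Option<F>, // `None` once the future has completed ("disarmed")
}

impl<F: Future + Unpin> Future for MustComplete<F> {
    type Output = F::Output;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let this = self.get_mut();
        let fut = this.inner.as_mut().expect("polled after completion");
        match Pin::new(fut).poll(cx) {
            Poll::Ready(output) => {
                // Completion reached: disarm, dropping is fine from now on.
                this.inner = None;
                Poll::Ready(output)
            }
            Poll::Pending => Poll::Pending,
        }
    }
}

impl<F> Drop for MustComplete<F> {
    fn drop(&mut self) {
        if self.inner.is_some() && !std::thread::panicking() {
            panic!("future was dropped before being driven to completion");
        }
    }
}
```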
> The problem here is that you don't statically know any shorter lifetime than …
That's generally true but depends on the actual implementation. E.g. a socket allows a concurrent read while a write is running. But good point on this: it's a problem, as you have no way of knowing when the actual operation is cancelled with the futures-based API. The non-futures API will call your callback with an error …

Anyway, we're getting off-topic here: https://github.com/gtk-rs/gio/issues/156 :)
Cancellation has some interesting traits, and based on that I'm going to propose that operations which need asynchronous cancellation implement safety by owning all required objects (buffers, handles) in the futures themselves. First, on cancellation: …

Therefore, implementing cancellation in a way similar to garbage collection sounds good to me. If anything needs to drop asynchronously, it moves the needed objects out of the struct and hands them to the executor so they complete at some later time (a sketch of this idea follows below).
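A minimal sketch of this garbage-collection-style cancellation, assuming a hypothetical `spawn_detached` for handing work to the executor (the names `IocpWrite` and `await_os_completion` are also made up for illustration):

```rust
use std::future::Future;

/// Hypothetical completion-based write operation that owns its buffer.
struct IocpWrite {
    buffer: Option<Vec<u8>>,
    // handle: RawHandle, // the in-flight OS operation (assumed)
}

impl Drop for IocpWrite {
    fn drop(&mut self) {
        if let Some(buffer) = self.buffer.take() {
            // Instead of blocking until the OS confirms the cancellation,
            // move the owned buffer into a detached task that waits for the
            // completion event and only then releases the resources.
            spawn_detached(async move {
                // await_os_completion().await; // assumed completion signal
                drop(buffer); // buffer stayed alive for the whole OS operation
            });
        }
    }
}

/// Stand-in for handing a future to the executor; a real implementation
/// would use e.g. `tokio::spawn` or an executor-provided spawner.
fn spawn_detached(fut: impl Future<Output = ()> + Send + 'static) {
    let _ = fut; // the sketch only shows the ownership transfer
}
```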
> I don't think …
I partially agree. From a program-correctness point of view one probably doesn't care whether the resource is still in use - the program potentially wants to do something else after the cancellation anyway. However there is a downside, which you already hinted at with the term "Garbage Collection": I think it's also important for implementations to avoid undefined behavior after a cancellation - e.g. by making sure that the original IO resource always gets closed (or at least is marked as closed). Otherwise the following piece of code will lead to serious undefined behavior:

```rust
trait FutureBasedAsyncWrite {
    type Result<'a>: Future<Output = Result<usize, Error>> + 'a
    where
        Self: 'a;
    fn write<'a>(&'a mut self, bytes: &'a [u8]) -> Self::Result<'a>;
}

async fn write_cancel_write<IO: FutureBasedAsyncWrite>(mut io: IO) {
    let write_fut1 = io.write(buffer);
    let timer = start_timer(...);
    select! {
        _ = write_fut1 => { /* Not interesting here */ return; },
        _ = timer => {},
    }
    // The operation is cancelled and the future is dropped. However the IO
    // resource might still be in use.
    // If we now start a new operation, then from the view of the underlying
    // primitive and the OS two async operations might be in progress on the
    // same resource, which is often not supported.
    io.write(buffer2).await;
}
```

Maybe we can try to solve this issue with exhaustive documentation on how to model those …

Another obvious downside is that we can expose those operations only via "owned" buffers (e.g. …).
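For illustration, a minimal sketch of such an owned-buffer API (the trait name, the use of `Vec<u8>`, and the idea of returning the buffer together with the result are assumptions for this sketch, not something specified in the thread):

```rust
use std::future::Future;
use std::io;

/// Hypothetical completion-safe write API: the operation takes ownership of
/// the buffer, so the bytes are guaranteed to outlive the OS-level operation
/// even if the returned future is dropped by a combinator. The buffer is
/// handed back together with the result so the caller can reuse it.
trait OwnedBufWrite {
    type WriteFuture: Future<Output = (io::Result<usize>, Vec<u8>)>;

    fn write_owned(&mut self, buf: Vec<u8>) -> Self::WriteFuture;
}
```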
I don't have plans to add any aio-specific features to futures-rs specifically, so I'm going to close this for now. That said, if anyone is working to develop Rust AIO systems that could benefit from futures-rs changes, let me know and I'll be happy to discuss them!
TLDR: Get away from synchronous cancellation through `Drop`. Require `Future`s to be polled to completion, and add cancellation signalization via another channel than `Drop`.

Longer version:
I spent some time during the last days thinking about whether and how I/O completion based operations can be efficiently modeled with futures and async/await.

With I/O completion based operations, I'm e.g. referring to APIs like: …

What these operations have in common: …

My current thought is that the support of futures-rs for these kinds of operations might not be as good as it could be: …
With approaches as described in #1272 it should e.g. be possible to store I/O operation buffers inside `Future` implementations for the whole duration of the async operation, without passing the ownership of those buffers to the sink. However this isn't possible if the underlying API does not support synchronous cancellation. The operation (e.g. IOCP) might still refer to the data after the `Future` is dropped, up until the point where we get the confirmation through the completion queue. However, blocking on a completion queue inside a destructor is not advisable for an asynchronous system.

The workaround is to hand over ownership of all the buffers to the implementor of the future - most likely in a refcounted fashion, so that the data can still be safely referred to after the `Future` is destructed. This approach can e.g. be seen in traits like `Sink`. However, the drawback here is that on cancellation (`drop`) the transferred item is completely lost and can no longer be referred to.

In the current state of the ecosystem, patterns like passing owned refcounted byte buffers have been established, and there is a preference for readiness-based APIs instead of completion-based ones (e.g. in the `Sink` trait). However, for more resource-constrained systems (embedded) and for some kinds of API types these things might not be preferable. Let's e.g. pretend I want to build a Websocket client with minimal resource usage, where `client.send(message)` returns a `Future` that represents the transfer. Since the websocket framing must not be corrupted, we can not stop sending the message at arbitrary points (e.g. when only half of its bytes have been written). So we would need to provide the client a copy of the full message, and not only provide it a reference to it. We also might not be able to go further and e.g. use the send future's state as an intrusive operation which is queued inside the Websocket client, as thought about in #1272.

I would be interested to learn whether others have also already thought about the problem, and whether there are other good ways to model these kinds of operations in a zero-overhead fashion.
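To make the refcounted workaround concrete, here is a minimal runnable sketch (my illustration; `submit_write` stands in for whatever a completion-based backend does with the buffer): the backend keeps its own `Arc` clone, so the bytes stay alive even if the caller drops its handle early.

```rust
use std::sync::Arc;

/// Hypothetical submission into a completion-based backend (e.g. IOCP).
/// The backend keeps its own `Arc` clone, so the buffer outlives both the
/// future and the caller's handle until the OS completion arrives.
fn submit_write(backend_queue: &mut Vec<Arc<[u8]>>, data: Arc<[u8]>) {
    backend_queue.push(data);
}

fn main() {
    let message: Arc<[u8]> = Arc::from(&b"frame-payload"[..]);
    let mut in_flight = Vec::new();

    // The caller keeps one reference, the backend another.
    submit_write(&mut in_flight, message.clone());

    // Even if the caller's reference (and the future owning it) is dropped,
    // the backend's clone keeps the allocation alive until completion.
    drop(message);
    assert_eq!(&*in_flight[0], b"frame-payload");
}
```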
I came up with a modification to `Future` semantics which should allow us to operate on borrowed data:

- `Future`s would always need to be polled until completion, where completion is when the I/O subsystem has signalled that the resources of the future are clearly no longer needed.
- Cancellation would be signalled through an extra parameter to `Future`-creating methods, like a `CancellationToken`, as pioneered in other languages. The parameter could obviously be either part of a standard library, or non-standardized (like there is no standard cancellation defined for Javascript Promises, but library authors could still add extra facilities).

There is definitely a lot of precedent for APIs which differentiate between signalling cancellation and waiting for it to complete, e.g. Go's `Context` APIs, .NET `Task` cancellation facilities, and Kotlin coroutines. It also matches well to the described C APIs. So this seems to make sense; a sketch of what such an API could look like follows below.

Always waiting for things to complete also seems to match better with the "structured concurrency" methodology, which currently seems to get a lot of positive tailwind.
However, a change in semantics like this is obviously very invasive. It would not only require leaf `Future`s to be modified, but would also require all combinators to be adapted (they can't drop any non-completed path anymore). So this would need to be carefully evaluated. If a model like this is chosen, it would be great if `Future`s in the non-completed state can't even be dropped by users, to avoid possible usage errors and leaks/corruptions - maybe through something like `#[must_use]` applied to the non-completed state.