Completing the Nim async/await implementation #304
Can you speak to the obstacles preventing Nim from adopting Chronos, and the advantages of two parties maintaining separate and unequal implementations?

---
The only obstacles I foresee are related to breaking backwards compatibility. I'm not sure to what extent the two have diverged; if it's as simple as replacing Nim's current async/await with Chronos then that would be nice, but I'm not sure that is the case. Looking at https://github.com/status-im/nim-chronos/wiki/AsyncDispatch-comparison, the following seems to be the biggest divergence:
At the end of the day I would defer the question to Status; they are the ones that hard-forked async/await, after all, and they know how the two have diverged since then.

---
Looking more into CPS and async, it seems to me that CPS is actually lower-level than async/await. In particular, it replaces closure iterators (and potentially inline iterators). This means that a deprecation path would be to reimplement closure iterators in terms of CPS, so that current async/await applications can still work as-is while we figure out the path forward. Then the questions become:
Regarding the issues, as a user I see:
---
My main criticism of async/await in Nim (I don't know whether it's a documentation issue, lack of understanding on my part, or something technical): if I have some blocking proc, there is usually no async version of it that I can await. That's why I never use async/await, but rather simply create a second thread that I ask via a channel whether it's done, simply because for the things I needed async/await for there isn't an async equivalent. I might have more constructive criticism if my async usage went beyond exactly two (almost identical) use cases.

---
@Vindaar That is something I've also found to be a pain point. A classic example is even featured in my book: reading from stdin. I think that the proposal I've written about above would solve this pretty well: simply spawn the blocking proc in a new thread and await it. Job done :) @rayman22201 even made a great start on it in nim-lang/Nim#12232 and nim-lang/Nim#12372. We could get pretty far by resurrecting this work.
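A sketch of how this could look with today's stdlib (compile with `--threads:on`): run the blocking call on the threadpool and poll the FlowVar, yielding to the event loop in between. The polling is exactly what the FD-based wakeup in those PRs would remove; `asyncReadLine` and `readLineBlocking` are just illustrative names.

```nim
import std/[asyncdispatch, threadpool]

proc readLineBlocking(): string =
  stdin.readLine()

proc asyncReadLine(): Future[string] {.async.} =
  let fv = spawn readLineBlocking()  # blocking read runs on a pool thread
  while not fv.isReady:
    await sleepAsync(50)             # yield so the event loop stays responsive
  result = ^fv                       # value is already there, read won't block

echo waitFor asyncReadLine()
```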
That really depends on what you're doing. If you're writing a native Nim version of something then that should be fairly trivial. If you're wrapping a C library and want to use it in an async way then it gets trickier, and it would be nice to set up a tutorial series for this.

---
Both AsyncDispatch and Chronos pay a hefty price for some fundamental inefficiencies in the current async approach. If backwards compatibility is to be broken, Nim can still try to aim higher than the minimal upgrades suggested above. Otherwise, I happen to know that @alehander92 has implemented cancellation for AsyncDispatch in a private branch. His approach differs from Chronos; the inspiration was taken from the way C# implements cancellation. Very briefly, the code relies on a "cancellation token" which is passed forward as a hidden argument in the async calls. This avoids the "backwards" links that Chronos must maintain from nested futures to the parent that spawned them. The backwards links are somewhat more problematic because they must be stored in a dynamically allocated container, and they introduce cycles that will create more difficulties in an ORC/ARC scheme.
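I don't have access to that branch, but a minimal sketch of the token idea could look like this (the names are illustrative, not the branch's actual API):

```nim
import std/asyncdispatch

type
  CancellationToken = ref object
    cancelled: bool

proc cancel(tok: CancellationToken) =
  tok.cancelled = true

# In the real design the async macro would thread the token forward as a
# hidden argument; it is passed explicitly here for illustration.
proc worker(tok: CancellationToken) {.async.} =
  while not tok.cancelled:
    await sleepAsync(100)  # every await is a chance to observe cancellation
  echo "worker observed cancellation, cleaning up"

proc main() {.async.} =
  let tok = CancellationToken()
  asyncCheck worker(tok)
  await sleepAsync(300)
  tok.cancel()             # no backwards links from child futures required
  await sleepAsync(200)    # let the worker observe the token

waitFor main()
```

---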
This was discussed with Araq, and one solution I proposed is a borrow-checker-based optimization. Basically you can have a return value be either heap-allocated:

```nim
proc foo(a, b, c: Parameters): Future[RV] =
  result.obj = create(RV) # Allocation on heap
```

or stack-allocated:

```nim
template foo(a, b, c: Parameters): lent Future[RV] =
  # template needed for alloca
  # do we want a typetrait `when escapeCallerScope(result)` instead?
  result.obj = alloca(RV) # Allocation on stack
```

Paper: Herlihy et al., Well-Structured Futures and Cache Locality, https://arxiv.org/pdf/1309.5301.pdf

The Rust approach with readiness-based/poll-based futures is another way, with other tradeoffs, to ensure that futures can be allocated on the stack. I think forcing futures to be evaluated in the caller or one of its children might be too restrictive, hence I prefer a borrow-checker-based optimization.

---
I think that for now there is no need to completely switch to something which drops compatibility with the current version.
---
Exactly. This alone pretty much kills the appeal of async, even without considering the other problems. Async programming is OK in JavaScript (I used it for many years in UI and server code). The concept itself is questionable: it forces humans to do machine work (writing state-machine-like code to handle control flow, sugared with async/await). But adopting the async concept in a language where you can choose other approaches is not a good idea.

---
I've been thinking about how to pass from the sync to the async world and vice versa, and also how to allow two event loops to cooperate. That includes asyncdispatch or Chronos, but also event loops that we cannot ever hope to make compatible because a common primitive like CPS (#295) cannot be built (for example Nim and JS/libuv). At first I thought we could use AsyncChannel, see:
But instead I think we should build async condition variables or semaphores. Here's why:

Overview of the issues

Scenario 1. The async event loop is about to do a blocking operation, or hands off heavy computation to another thread. It delegates work to a threadpool. It needs to subscribe to some "channel" that will hand the result back. In the sync world, this is a Flowvar. For async, that subscription should hand control over to the event loop until the result is ready.

Scenario 2. Three services are cooperating: a UI thread, a networking thread and a worker thread. The UI thread cannot be blocked at any cost (see the recent HN all-star post Speed is the Killer Feature), the networking thread can tolerate delays of a couple of seconds, but beyond that timeouts kick in unless you send keep-alives, and the worker thread has no latency constraints. They establish one-to-one channels for each kind of processing request and one-to-one channels for the results.
Analysis

Making Flowvar and Channels awaitable

This adds too much baggage to Flowvar and Channels.
Forcing the use of the same type for both will very likely bite us down the line.
Adding async synchronization primitives

Instead, we need the same synchronization primitives in async as in the threaded world: async locks and async condition variables (or async semaphores). Then wrapper types can be:

```nim
type
  AsyncFlowVar[T] = object
    fv: FlowVar[T]
    acv: AsyncConditionVariable
    alock: AsyncLock

  AsyncChannel[T] = object
    chan: Channel[T]
    acv: AsyncConditionVariable
    alock: AsyncLock

  AsyncRendezVousChannel[T] = object
    ## Unbuffered channel, both producer and consumer must wait
    ## for each other. Needed for CSP (Communicating Sequential Processes).
    chan: T # Notice that we don't need a Channel[T] (assuming ARC/ORC or Boehm GC)
    acv_eventLoop0: AsyncConditionVariable
    alock_eventLoop0: AsyncLock
    acv_eventLoop1: AsyncConditionVariable
    alock_eventLoop1: AsyncLock
```

When the value is ready, the async condition variable of the consumer is signaled. This can be used with any channel and allows communication between any event loops, and also any threadpool or multithreading runtime. This also does not pollute carefully tuned multithreaded primitives with kernel syscalls, except at the interface between async and threads.

Implementation of async condition variable/event notifier

In all the use cases where we need an async condition variable, there can be only one waiter, the event loop, but there can be multiple producers/signalers. This is unlike classic condition variables, which can have multiple waiters/consumers. I happen to have a high-performance MPSC thread-safe event notifier in Weave for putting waiting threads to sleep that is formally verified:

```nim
proc park*(en: var EventNotifier) {.inline.} =
  ## Wait until we are signaled of an event
  ## Thread is parked and does not consume CPU resources
  ## This may wakeup spuriously.
  if not en.signaled.load(moRelaxed): # or `while not ...` if spurious wakeups are a problem
    if en.ticket == en.phase.load(moRelaxed):
      when supportsFutex:
        en.futex.wait(0)
      else:
        en.cond.wait(en.lock) # Spurious wakeups are not a problem
  en.signaled.store(false, moRelaxed)
  when supportsFutex:
    en.futex.initialize()

proc notify*(en: var EventNotifier) {.inline.} =
  ## Signal a thread that it can be unparked
  if en.signaled.load(moRelaxed):
    # Another producer is signaling
    return
  en.signaled.store(true, moRelease)
  discard en.phase.fetchXor(1, moRelaxed)
  when supportsFutex:
    en.futex.store(1, moRelease)
    en.futex.wake()
  else:
    en.cond.signal()
```

You just need to replace the blocking futex/condition-variable wait with an async-aware suspension. Note that in Weave this is also used in association with my internal channels.
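As a rough illustration (a hypothetical API for a single event loop; cross-thread wakeup not shown), the async-flavored counterpart would hand back a Future instead of blocking:

```nim
import std/asyncdispatch

type
  AsyncEventNotifier = ref object
    signaled: bool
    waiter: Future[void]  # single waiter: the event loop

proc park(en: AsyncEventNotifier): Future[void] =
  ## Await until notified; completes immediately if already signaled.
  result = newFuture[void]("AsyncEventNotifier.park")
  if en.signaled:
    en.signaled = false
    result.complete()
  else:
    en.waiter = result

proc notify(en: AsyncEventNotifier) =
  ## Wake the parked waiter, or record the signal for the next park.
  ## A producer on another thread would have to route this call through
  ## the event loop's wakeup mechanism (the FD discussed below).
  if en.waiter != nil and not en.waiter.finished:
    let w = en.waiter
    en.waiter = nil
    w.complete()
  else:
    en.signaled = true
```

---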
Sounds like a novel idea to me to align the async synchronization primitives API with the concurrency primitives API in a 1:1 fashion. Do you think it's doable to make the APIs not just aligned, but actually syntactically identical? I mean, why have both Future[T] and FlowVar[T]?

---
I think even though they are very similar at the API level (not to say that they are the same), concurrent programming and parallel programming solve different problems. Because the underlying implementations are different, the problem domains are different, and even the performance expectations (latency vs throughput) are different, it's best to color them differently in the language primitives. Also, from a maintenance and development perspective, the skill sets for concurrent programming and parallel programming are very different. It might also be that a new paradigm in either of those domains requires evolving the Future or Flowvar, and we don't want to be shackled by the other domain. A library on top can then easily hide the differences for convenience, or it can be lifted to an abstraction covering both.

---
@mratsim A few things. I stopped working on Virtual Async Event for several reasons, partially because I thought it was a dead end for some of the reasons you describe. I like your idea of adding semaphores. They do have some of the same disadvantages as threaded semaphores though: you can get deadlocks and livelocks. Even though it's not truly parallel if no threads are involved (no true race conditions), you can still get non-obvious orderings of async callbacks and run into trouble. I still like the idea, but it's just something to be aware of.

---
Those are formally verified to have no deadlock and no livelock if you follow the check-then-park protocol sketched below: if the consumer has the result before entering "park", the suspension is cancelled; if it gets the result after, the consumer will be woken up. I agree with the rest of your points; I do think it's important, though, for any of the Nim event loops, so that they can cooperate with any other event loop/runtime.
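In pseudocode (not runnable as-is; `intendToSleep` is a hypothetical name for taking the ticket/phase snapshot from the code above, and the FlowVar/EventNotifier glue is elided):

```nim
proc blockUntilReady[T](fv: FlowVar[T], en: var EventNotifier): T =
  while not fv.isReady:
    en.intendToSleep()  # hypothetical: snapshot en.ticket from en.phase
    if fv.isReady:
      break             # result arrived before parking: suspension cancelled
    en.park()           # otherwise sleep; a late result triggers notify()
  result = ^fv
```

---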
I'm likely missing something, but from my perspective you have no way of getting around this. The event loop needs to be notified somehow when a channel's/lock's/whatever's state changes. This needs to be done with a file descriptor on POSIX, as that is what epoll/kqueue check for readiness. Am I missing something here?

As a more general point, getting @rayman22201's work into Nim shouldn't be too difficult and I believe it will work for the vast majority of cases. Call it a hack if you want, but sometimes hacks can go a long way when it comes to impact.

Also, please don't overload this issue with discussions around various thread/channel/semaphore concepts that are a reflection of Nim's channel/lock/threading implementation. I'm specifically referring to the many different channel variants you've listed. I think their evolution should be discussed separately.
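For reference, the FD-based notification described in the first paragraph is essentially what `std/asyncdispatch`'s `AsyncEvent` already provides (an eventfd/pipe under the hood on POSIX): any thread triggers it, and the event loop's `epoll`/`kqueue` wait wakes up. A minimal sketch:

```nim
import std/asyncdispatch

var done = false
let ev = newAsyncEvent()

addEvent(ev) do (fd: AsyncFD) -> bool:
  done = true
  true  # returning true unregisters the callback

ev.trigger()  # in real code this would be called from another thread
while not done:
  poll()
ev.close()
```

---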
The event loop doesn't need to be notified of channels or flowvars changes that are scheduled within another executor, be it Chronos, a threadpool, Weave, libuv, ... It's only at the interface between both executors that there is extra FD bookkeeping needed.
I didn't criticize VirtualAsyncEvent, I criticized the end goal, which is to put them in Nim channels and Nim Flowvars. There are multiple problems with that:
They are not a reflection of Nim's implementation; they show that there is no one-size-fits-all, and that any solution for making asyncdispatch work with the sync/multithreaded world shouldn't assume anything about the channel and flowvar implementations. I'm proposing a composable building block that makes no assumptions about Flowvar or Channels: you just need (SomeSyncObject + AsyncConditionVariable) and you have a way to make asyncdispatch work well with SomeSyncObject.

---
Reading this I think you must be misunderstanding the proposed implementation. The FD only needs to be allocated whenever async/await (asyncdispatch) wants to be notified of its completion (i.e. the "interface between both executors"); this won't happen unless asyncdispatch is actually used. You will be able to spawn as many threads via `spawn` as you like without allocating any FDs. It's not as if the file descriptor gets created whenever the FlowVar is created. You most certainly will not be forced into a situation where multithreading code suddenly needs to open 2^30 FDs. You really think Araq would allow such a change to be merged? :) The same applies to channels, of course.
I see both 1 and 3 as the same point. These are fair concerns, but FlowVar only needs to store an FD (which is a fundamental building block of any event loop). VirtualAsyncEvent as it stands might be problematic, admittedly, but I don't think there would be a problem in making it reusable by other event loops.
I mean... that is the whole purpose behind VirtualAsyncEvent. The reason VirtualAsyncEvent was created is to make it possible to await channels/flowvars.
I'm just responding to your claims around what I and @rayman22201 proposed, as they seem to be incorrect. I haven't had a chance to look at your proposal in detail yet, but it's possible that it works just as well. It doesn't inspire confidence to see what seem to be incorrect claims about our proposals, though. Btw, regarding:
What if you have an event loop per thread across a couple of threads? You could get into a situation where there are multiple waiters.

---
Let's get this straight:
It is perfectly fine to ask for clarifications, but asking for feedback and then dismissing it without reading it is a slap in the face. And I perfectly understood what your proposal entails: polluting every single threading primitive with an asyncdispatch-related FD field. Your proposal goes against everything that Nim contributors are working towards:
To sum up:
It even goes against your own initially stated goal.
If it wasn't clear, I am strongly against any proposal that requires adding a file descriptor to channels, threadpools or any multithreading runtime, and then having to support them for an indefinite amount of time due to backwards-compatibility concerns.
I was indeed incorrect about signaling an event at every channel send, but even then, an FD has no business in the core primitives of multithreading. Asyncdispatch should provide higher-level primitives like AsyncLock and AsyncConditionVariable.
Weave has an event loop, supports events and has no need for a file descriptor, so FDs are not fundamental. FDs are fundamental for IO, not for tasks, executors, parallelism or an event loop. FDs should be kept in IO-related libraries.
You can use VirtualAsyncEvent to create composable async synchronization primitives like locks or condition variables which are used everywhere in the multithreading world.
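For example, within a single event loop an async lock doesn't even need an event or an FD. A Future-based sketch (hypothetical, but close in spirit to Chronos's AsyncLock):

```nim
import std/[asyncdispatch, deques]

type
  AsyncLock = ref object
    locked: bool
    waiters: Deque[Future[void]]

proc newAsyncLock(): AsyncLock =
  AsyncLock(waiters: initDeque[Future[void]]())

proc acquire(lock: AsyncLock) {.async.} =
  if not lock.locked:
    lock.locked = true
  else:
    let fut = newFuture[void]("AsyncLock.acquire")
    lock.waiters.addLast(fut)
    await fut  # completion hands ownership directly to us

proc release(lock: AsyncLock) =
  if lock.waiters.len > 0:
    lock.waiters.popFirst().complete()  # hand over; the lock stays held
  else:
    lock.locked = false
```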
Either you make your synchronization primitive handle multiple consumers or you create one per consumer. Weave has one event loop per thread, described by the following state machine:

```nim
type
  WorkerState = enum
    WEL_CheckTermination
    WEL_CheckTask
    WEL_OutOfTasks
    WEL_SuccessfulTheft

  WorkerEvent = enum
    EV_SignaledTerminate
    EV_FoundTask # Found task in our task deque
    EV_StoleTask # Stole at least a task
```

These synchronize with single-consumer channels, without file descriptors, just fine. Many other concurrent models of computation assume a single waiter on synchronization primitives (and a single producer!), so it's not a problem, whether in theory, in practice or even in performance.

---
It wasn't my intention to make you feel this way and for that I am sorry. It also wasn't my intention to dismiss your proposal. Please don't take anything I say here personally.

I hope you can understand my perspective: your proposal starts by establishing problems with @rayman22201's proposal. Since I believed these problems were incorrect I decided to challenge them, and since I assumed your proposal was meant to resolve those problems I wanted to clarify things before reading further.

Honestly though, I think these are implementation details. Reading what you've written again, you don't seem to have a problem with the high-level proposal of this RFC, which is "making FlowVars awaitable"; you just don't want to do it in the way that @rayman22201 suggested.

So let me be clear, I agree with your points. Yes, sticking an FD into the FlowVar isn't ideal. But it's not because this will create "2^30 or 2^40 file descriptors" (a point from your original comment), rather it's because it goes against what is an elegant solution in Nim (or as you've said in the comment above this: "against everything that Nim contributors are working towards"). Maybe I'm being too pedantic, but these kinds of misunderstandings usually cause problems further down the line, so I just want to get them straightened out.

So my understanding of your proposal (and I feel like I only understand it partially, though I did read it this time, to be clear) is that you want to offer wrapper types for Channel and FlowVar. Sure, that sounds like a good way to avoid modifying FlowVar/Channel and making them depend on async. I've got nothing against this and agree that it's a better solution. I also see that you've included additional mechanisms (condition variable/lock), although I'm not quite sure in which circumstances they are necessary. We can discuss this, but I don't think it's that important; it's an implementation detail that can easily be discussed in the PR implementing this proposal.

---
I've documented a way to do explicit Future cancellation here, the Golang way.

---
The Nim async/await implementation has been successfully used in production for many years now, both in real-world use cases (NimForum, among others) and in benchmarks (the TechEmpower benchmarks via Jester/HttpBeast). It has also been adapted into the Chronos implementation used by Status for their Ethereum client implementation.
All of these projects have established it as a reliable and easy way to write highly concurrent applications in Nim.
As far as I know, there are three things missing, some more important than others. This RFC aims to document what those are and hopefully drum up some support for their implementation. I strongly believe that the effort needed to implement these is generally low while the impact is high.
But I would also like to open this RFC up to others: if it's not already covered by the list below, what do you feel is missing? Can we implement a solution within the existing async/await implementation?
Concurrency + Parallelism
A big pain point with Nim's concurrency is that it does not integrate well with Nim's parallelism. I think that a simple fix here would be to make it possible for spawned threads (FlowVars) and channels to be awaitable.
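To illustrate, here is how that could look to users. The `await` on a FlowVar below is the hypothetical part; `spawn` and async procs are today's APIs:

```nim
import std/[asyncdispatch, threadpool]

proc fib(n: int): int =
  if n < 2: n else: fib(n-1) + fib(n-2)

proc handler() {.async.} =
  let fv = spawn fib(35)  # heavy work runs on a pool thread
  let x = await fv        # hypothetical: suspends the handler while the event loop keeps running
  echo "fib(35) = ", x
```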
I believe there isn't a need to head towards Go's goroutines, which mix concurrency and parallelism. For a systems programming language, having these as separate systems is wise and ultimately gives more control. This is particularly important for something that is in the standard library.
Software such as httpbeast has proven that scaling async/await across threads can be done and that it works well. For other, non-server use cases an awaitable-threads implementation will be a great stand-in, until we can figure out the disadvantages of this approach (if any).
Future Cancellation
Right now, the only way to cancel a pending operation is to close the FD it is working on. This works in certain circumstances, but is somewhat hacky and not flexible enough. A formal way to cancel futures would be ideal.
Chronos already implements this so it can be used as a source of inspiration for how this should be implemented.
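To give a feel for that source of inspiration, here is a sketch against the Chronos API (not asyncdispatch; exact APIs may have shifted since this was written):

```nim
import chronos

proc slowOp() {.async.} =
  try:
    await sleepAsync(10.seconds)
  except CancelledError:
    echo "cleaned up after cancellation"
    raise

proc main() {.async.} =
  let fut = slowOp()
  await sleepAsync(100.milliseconds)
  await fut.cancelAndWait()  # cancel, then wait for cleanup to finish

waitFor main()
```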
Future Streams
The existing implementation has its issues (just search around the Nim repo for issues relating to future streams) and needs a rework. This is an important component to enable streaming large pieces of data to clients.
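For context, here is what the current `FutureStream` API in `std/asyncdispatch` looks like in use; a rework would presumably keep this shape while fixing the underlying issues:

```nim
import std/asyncdispatch

proc producer(fs: FutureStream[string]) {.async.} =
  for chunk in ["hello", " ", "world"]:
    await fs.write(chunk)   # completes once the value is queued
  fs.complete()             # signal end-of-stream

proc consumer(fs: FutureStream[string]) {.async.} =
  while true:
    let (hasValue, chunk) = await fs.read()
    if not hasValue: break  # stream completed and drained
    stdout.write(chunk)

proc main() {.async.} =
  let fs = newFutureStream[string]("demo")
  asyncCheck producer(fs)
  await consumer(fs)

waitFor main()
```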
A nice practical solution here could be to rely on `sendfile`.

Useful Resources