Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Virtual async events #12232

Closed
wants to merge 10 commits into from
Closed

Conversation

rayman22201
Copy link
Contributor

AsyncEvents currently create a new physical file descriptor for every new event. This can quickly exhaust the file descriptors on your system if you create too many events.

This is a big rewrite of AsyncEvent to use "virtual file descriptors" that are multiplexed over a single global physical file descriptor. As a side effect, also makes AsyncEvent threadsafe.

This should be completely compatible with the old asyncEvents. I have tried to maintain complete backwards compatibility of the behavior.

The only thing an end user should notice is that they can have a lot more AsyncEvents than they used to.

This PR will finally allow #11724 to work properly.

@zevv
Copy link
Contributor

zevv commented Sep 30, 2019

@dom96 and or other stdlib people, could you give this some feedback?

I also heard people are concerned about performance because of the locking, but I'm not sure how this is a valid point at this time. There is no alternative implementation to compare performance with, so there is nothing to gain from benchmarking at this time.

If this implementation proves correct, please lets merge it so we can have this feature. then people can complain about how terrible the implementation is and improve on that - while keeping the implementation correct.

@Araq
Copy link
Member

Araq commented Oct 1, 2019

@zevv I'm actually waiting on the followup PRs.

@rayman22201
Copy link
Contributor Author

AFAIK, PR #11724 should work with this PR with no changes.

eventLock: RLock
p: PDispatcher
vFD: VirtualFD
callbacks: seq[Callback]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not threadsafe. seq are thread-local.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The callback seq should only ever be accessed by a single thread: the thread that owns the async event loop.

It would be very odd for a thread to register a callback on a different threads event loop. Even if this were made this thread safe, the callback itself might access memory from a different thread.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of options:

  • I could factor the seq out and create a global seq in the asyncDispatcher singleton.
  • Or I could make it a single callback. I made this a seq to make it more extensible, but it's not required. (In fact, the old system only allows a single callback to be tied to an AsyncEvent).

Copy link
Member

@cheatfate cheatfate Oct 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

old AsyncEvent is able to synchronize different threads, but proposed AsyncEvent is unable to be used in equal scenarios.

Copy link
Contributor Author

@rayman22201 rayman22201 Oct 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cheatfate, can you give an example?

Old async events is not threadsafe. The old AsyncEvent allowed cross thread access to the asyncEvent memory with no synchronization or race protection at all. This system explicitly synchronizes threads through locking.

Copy link
Member

@cheatfate cheatfate Oct 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Old async events is not threadsafe. The old AsyncEvent allowed cross thread access to the asyncEvent memory with no synchronization or race protection at all. This system explicitly synchronizes threads through locking.

Windows:
Sorry but not only locks are used to synchronize threads: https://github.com/nim-lang/Nim/blob/devel/lib/pure/asyncdispatch.nim#L810 . Windows OS has events (which in Posix often called conditional variables).
Linux:
Linux version uses eventfd https://github.com/nim-lang/Nim/blob/devel/lib/pure/ioselects/ioselectors_epoll.nim#L112-L118.
So lets check about thread safety of eventfd() via man eventfd:

ATTRIBUTES
       For an explanation of the terms used in this section, see attributes(7).

       ┌──────────┬───────────────┬─────────┐
       │Interface │ Attribute     │ Value   │
       ├──────────┼───────────────┼─────────┤
       │eventfd() │ Thread safety │ MT-Safe │
       └──────────┴───────────────┴─────────┘

BSD/MacOS and all other Posix systems:
All this systems are using OS pipes:
https://github.com/nim-lang/Nim/blob/devel/lib/pure/ioselects/ioselectors_kqueue.nim#L136
https://github.com/nim-lang/Nim/blob/devel/lib/pure/ioselects/ioselectors_poll.nim#L191

And some information about pipes form man pipe:

pipe()  creates a pipe, a unidirectional data channel that can be used for interprocess communication.

So i think if its created for interprocess communication it will be fine to use it for interthread communication.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is example of use:
https://gist.github.com/cheatfate/87b672034be5b71ab6a975266558ed8c
But if you encounter any problems its some internal asyncdispatch bugs, i'm not involved in asyncdispatch development anymore so i'm not following it. But this sample is exactly shows how it must work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cheatfate, your test actually fails on current devel Nim as follows:

/home/ray/Nim/Nim-devel/cheatfateAsync.nim(38) threadProc2
/home/ray/.choosenim/toolchains/nim-#devel/lib/pure/asyncdispatch.nim(1879) waitFor
/home/ray/.choosenim/toolchains/nim-#devel/lib/pure/asyncdispatch.nim(1569) poll
/home/ray/.choosenim/toolchains/nim-#devel/lib/pure/asyncdispatch.nim(1335) runOnce
/home/ray/.choosenim/toolchains/nim-#devel/lib/pure/asyncdispatch.nim(210) processPendingCallbacks
/home/ray/.choosenim/toolchains/nim-#devel/lib/pure/asyncmacro.nim(34) asyncProc2NimAsyncContinue
/home/ray/Nim/Nim-devel/cheatfateAsync.nim(28) asyncProc2Iter
/home/ray/Nim/Nim-devel/cheatfateAsync.nim(17) wait
/home/ray/.choosenim/toolchains/nim-#devel/lib/pure/asyncdispatch.nim(1556) addEvent
/home/ray/.choosenim/toolchains/nim-#devel/lib/pure/ioselects/ioselectors_epoll.nim(368) registerEvent
/home/ray/.choosenim/toolchains/nim-#devel/lib/pure/selectors.nim(279) raiseIOSelectorsError
[[reraised from:
/home/ray/Nim/Nim-devel/cheatfateAsync.nim(38) threadProc2
/home/ray/.choosenim/toolchains/nim-#devel/lib/pure/asyncdispatch.nim(1881) waitFor
/home/ray/.choosenim/toolchains/nim-#devel/lib/pure/asyncfutures.nim(383) read
]]
Error: unhandled exception: File exists (code: 17)
Async traceback:
  /home/ray/Nim/Nim-devel/cheatfateAsync.nim(38)                                           threadProc2
  /home/ray/.choosenim/toolchains/nim-#devel/lib/pure/asyncdispatch.nim(1879)              waitFor
  /home/ray/.choosenim/toolchains/nim-#devel/lib/pure/asyncdispatch.nim(1569)              poll
  /home/ray/.choosenim/toolchains/nim-#devel/lib/pure/asyncdispatch.nim(1335)              runOnce
  /home/ray/.choosenim/toolchains/nim-#devel/lib/pure/asyncdispatch.nim(210)               processPendingCallbacks
  /home/ray/.choosenim/toolchains/nim-#devel/lib/pure/asyncmacro.nim(34)                   asyncProc2NimAsyncContinue
  /home/ray/Nim/Nim-devel/cheatfateAsync.nim(28)                                           asyncProc2Iter
  /home/ray/Nim/Nim-devel/cheatfateAsync.nim(17)                                           wait
  /home/ray/.choosenim/toolchains/nim-#devel/lib/pure/asyncdispatch.nim(1556)              addEvent
  /home/ray/.choosenim/toolchains/nim-#devel/lib/pure/ioselects/ioselectors_epoll.nim(368) registerEvent
  /home/ray/.choosenim/toolchains/nim-#devel/lib/pure/selectors.nim(279)                   raiseIOSelectorsError
Exception message: File exists (code: 17)
Exception type: [IOSelectorsException]
Error: execution of an external program failed: '/home/ray/Nim/Nim-devel/cheatfateAsync '

On the VirtualAsyncPR it does not have this issue but hits a race condition between the addEvent and trigger procs.
line 17 and lines 22/28 respectively.

It's possible to trigger the event in one thread, before the other thread has "re-registered" the event. I'm not sure if AsyncEvent was meant to be constantly unregistered and re-registered this way.

Copy link
Contributor Author

@rayman22201 rayman22201 Oct 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following change fixes the race condition I observe with my PR: https://gist.github.com/rayman22201/e6a9dcde22e009dbd397b8755eae93d7#file-asyncevent-nim-L22
That guarantees the event is registered before it is awaited on either thread.

Note: the test still fails with the File exists (code: 17) Exception on Devel.

Copy link
Contributor

@dom96 dom96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't delved deep into the details of how you've implemented this, but did leave a bunch of feedback about how the code is organised.

As an overall question, I wonder whether we could keep the current AsyncEvent as-is and introduce a VirtualAsyncEvent on top of it? This would allow the best of both worlds for those that don't want this virtual logic and should make the code easier to deal with. What do you think?


var data = newAsyncData()
data.readList.add(virtualMuxCB)
result.selector.registerEvent(result.virtualMuxHandle, data)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sort of code would do well to be put in a separate procedure called initVirtualAsyncEvents or the like.


ev.callbacks.add(cb)
ev.vFD = p.nextVirtualHandle
p.nextVirtualHandle = VirtualFD( int(p.nextVirtualHandle) + 1 )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: no space after ( or before it.

cast[pointer](p.ovl))
{.pop.}

template registerWaitableHandle(p, hEvent, flags, pcd, timeout,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be a proc instead?

Copy link
Contributor Author

@rayman22201 rayman22201 Oct 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDK. I suppose so? It was this way when I found it. I saw no reason to change it. I just moved the template up because I needed it earlier in the code and Nim doesn't allow forward declarations.

I didn't want to change more things than I already did. This PR is enough of a bulldozer already...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh, I didn't realise you moved it up. It really needs to be refactored IMO...

virtualHandles: Table[VirtualFD, AsyncEvent] # pseudo handles for custom AsyncEvents.
nextVirtualHandle: VirtualFD
virtualMuxHandle: Handle # all the virtual handles get multiplexed through a single real handle.
virtualPCD: PostCallbackDataPtr # needed for the selector callback plumbing.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps all these fields could be put in a separate object?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I left it all together for simplicity (and I was organically adding fields as I needed them), but I can see the organizational advantages of making a separate object.

@rayman22201
Copy link
Contributor Author

@dom96

I haven't delved deep into the details of how you've implemented this, but did leave a bunch of feedback about how the code is organised.

As an overall question, I wonder whether we could keep the current AsyncEvent as-is and introduce a VirtualAsyncEvent on top of it? This would allow the best of both worlds for those that don't want this virtual logic and should make the code easier to deal with. What do you think?

First, thanks for the feedback <3

Even without multi-threading, the old AsyncEvent is broken. The old AsyncEvent makes it way too easy to exhaust the file descriptors on your OS. It's dangerous and a waste of system resources.

I see this PR as essentially an elaborate bug fix that happens to benefit the multi-threaded use case as a side effect...

Threads and FlowVar is what exposed the bug, but there may be other cases where you might need many AsyncEvents, even in a single threaded environment.

@cheatfate
Copy link
Member

cheatfate commented Oct 3, 2019

Even without multi-threading, the old AsyncEvent is broken. The old AsyncEvent makes it way too easy to exhaust the file descriptors on your OS. It's dangerous and a waste of system resources.

I see this PR as essentially an elaborate bug fix that happens to benefit the multi-threaded use case as a side effect...

Threads and FlowVar is what exposed the bug, but there may be other cases where you might need many AsyncEvents, even in a single threaded environment.

AsyncEvent is not supposed to be used in single-threaded environment, its a tool to sync multiple thread's dispatchers, so its exactly not broken, its purpose is different from what you expected (I think documentation can be improved). To work in single-threaded environment you can use more simple mechanism like this https://github.com/cheatfate/asynctools/blob/master/asynctools/asyncsync.nim and you actually do not need any patches to existing asyncdispatch.nim codebase to start using it.

@rayman22201
Copy link
Contributor Author

@cheatfate Let me abundantly clear here:

its a tool to sync multiple thread's dispatchers, so its exactly not broken, its purpose is different from what you expected

I know exactly what it's meant to be used for. Araq, Dom96, and I used it for it's intended purpose, live on a video stream. And we found this bug.

Let me reiterate the bug for you:
"AsyncEvents currently create a new physical file descriptor for every new event. This can quickly exhaust the file descriptors on your system if you create too many events."

I know that pipes (events on windows) synchronize. There is still a pipe used in this new implementation. It provides the same level of synchronization. The difference is one pipe / event is shared to save resources.

I have added extra locking because the synchronization from the pipe is not guaranteed to protect the memory accessed in the trigger proc.

I just ran your test with this new implementation and it worked exactly as expected.
Thank you for the test, I will add it to the test suite. It's a good one to have.

@rayman22201
Copy link
Contributor Author

@dom96

I wonder whether we could keep the current AsyncEvent as-is and introduce a VirtualAsyncEvent on top of it?

To be clear, I don't think there is a need for two different solutions, because this should work exactly like the old AsyncEvent.

AFAICT, this should be a transparent change from the API point of view, only with more efficient use of system FD resources.

eventLock: RLock
p: PDispatcher
vFD: VirtualFD
cb: Callback
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Callback is closure so it still from thread-local heap, and PDispatcher is also from thread-local heap.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only field that is accessed from multiple threads is the triggered bool, and it is protected by eventLock.
The other fields are never accessed from a different thread.

rayman22201 and others added 8 commits October 4, 2019 15:38
AsyncEvent used to create a new File Descriptor for every new event.
This will quickly exhaust all the file handles available on the machine.

This rewrite multiplexes all the AsyncEvents with virtual File
Descriptors over a single physical File Descriptor.
- implemented a withLockIfThreads template to reduce when
compileOptions("threads") boilerplate. Suggested by Zevv.

- Added an InvalidVirtualFD constant for semantic clarity

- Bug Fix: Triggering an AsyncEvent that is not registered would
crash with a segv. It now simply acts as a noop and returns.
AsyncEvent used to create a new File Descriptor for every new event.
This will quickly exhaust all the file handles available on the machine.

This rewrite multiplexes all the AsyncEvents with virtual File
Descriptors over a single physical File Descriptor.
@stale stale bot added the stale Staled PR/issues; remove the label after fixing them label Oct 4, 2020
@stale stale bot closed this Nov 3, 2020
@Araq Araq reopened this Nov 3, 2020
@stale stale bot removed the stale Staled PR/issues; remove the label after fixing them label Nov 3, 2020
@stale stale bot added the stale Staled PR/issues; remove the label after fixing them label Nov 3, 2021
@stale stale bot closed this Dec 3, 2021
@nim-lang nim-lang deleted a comment from stale bot Dec 3, 2021
@nim-lang nim-lang deleted a comment from stale bot Dec 3, 2021
@ringabout ringabout reopened this Dec 3, 2021
@stale stale bot removed the stale Staled PR/issues; remove the label after fixing them label Dec 3, 2021
@stale
Copy link

stale bot commented Dec 4, 2022

This pull request has been automatically marked as stale because it has not had recent activity. If you think it is still a valid PR, please rebase it on the latest devel; otherwise it will be closed. Thank you for your contributions.

@stale stale bot added the stale Staled PR/issues; remove the label after fixing them label Dec 4, 2022
@stale stale bot closed this Jan 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale Staled PR/issues; remove the label after fixing them
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants