Request: timeout in ChunkRecordingExecutor (ProcessPoolExecutor) #813
Comments
Hi @miketrumpis Thanks for the report. Can you point to the changes that you made? Cheers |
I see now you added a timeout!! Do you have any idea why the parallel process is failing? |
Pretty sure I've only seen it on Ubuntu 20.04.4. Correct me if I'm wrong, I believe the 0.94 version only uses spawning for multiprocessing. The stalling scenario is either using a basic I've wondered whether it's my extensions that are abusing the process executor, but the logic is largely inherited, and I'm only setting the sparsity matrix to narrow the output. The stalling can definitely happen when multiple independent processes are each spawning the process executors. I believe it can happen in a single process, but less certain. Not sure it's relevant, but curious if anyone else sees resource warnings, as reported in this Python bug? python/cpython#90549 (Note that I see this both in MacOS and Linux, since the previous SpikeInterface release is preferring to spawn.) |
Actually it uses
Can I ask you which extensions? If you have something cool in mind, I suggest to open an issue or open a draft PR and we can definitely provide support :)
Thanks for the diffs! |
I will try to run under the git main soon and play with the multiprocessing context. I have a list of jobs that have failed, but not 100% sure the failure mode is deterministic. Unfortunately the extensions are on a private repo for my organization 😬 |
@alejoe91: we do not use loky. It was too buggy. The ProcessPoolExecutor is in the Python core. @miketrumpis: I am not sure that this timeout trick will be sustainable. It is very hard to predict the computation time. |
I was taking another look at this too, so I presume that uses fork on Linux and spawn on Mac. Another reason to think spawn might change behavior.
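To compare fork and spawn directly rather than relying on the platform default, the start method can be passed to the executor explicitly. This is a minimal sketch using only the standard library. The helper names `resolve_context` and `make_executor` are hypothetical and are not part of SpikeInterface.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor


def resolve_context(start_method=None):
    """Return a multiprocessing context.

    start_method may be "fork", "spawn", "forkserver", or None to keep
    the platform default. (Hypothetical helper, for illustration only.)
    """
    return multiprocessing.get_context(start_method)


def make_executor(n_jobs, start_method=None):
    """Build a ProcessPoolExecutor whose workers use the chosen start method."""
    ctx = resolve_context(start_method)
    return ProcessPoolExecutor(max_workers=n_jobs, mp_context=ctx)
```

Running the same failing job once with `make_executor(n, "fork")` and once with `make_executor(n, "spawn")` would show whether the stall depends on the start method. Note that "fork" is not available on Windows.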
That's a fair point -- the way I wrote it does not allow for a "None" default (current behavior). Still, it would be nice to have the option for a timeout if requested specifically, e.g. in the parameters to I will try your |
This is 2 years old. Should we close this? |
I've been having some stalls in the ProcessPoolExecutor when creating some WaveformExtractor objects. Unfortunately, there aren't any factors that occur to me for debugging this. However, I made a quick fork from the v0.94.0 to include a timeout for the map call in the ProcessPoolExecutor, which at least raises an exception instead of hanging forever. Very trivial changes. Happy to rebase this and PR
https://github.com/miketrumpis/spikeinterface/blob/multiproc/spikeinterface/core/job_tools.py
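For readers who have not opened the linked fork, the general idea of a timeout on the map call can be sketched with the standard library alone. This is not the code from the fork. The helper name `map_with_timeout` is hypothetical, and the sketch assumes that a `None` default should preserve the current behavior of waiting indefinitely.

```python
from concurrent.futures import TimeoutError as FuturesTimeoutError


def map_with_timeout(executor, func, items, timeout=None):
    """Collect executor.map results, raising instead of blocking forever.

    timeout is in seconds, counted from the call to executor.map.
    None (the default) waits indefinitely, as a plain map call would.
    (Hypothetical helper, for illustration only.)
    """
    results = []
    # Executor.map accepts a timeout; the error surfaces while iterating.
    iterator = executor.map(func, items, timeout=timeout)
    try:
        for value in iterator:
            results.append(value)
    except FuturesTimeoutError as exc:
        raise RuntimeError(
            f"chunk processing did not finish within {timeout} s"
        ) from exc
    return results
```

The helper accepts any `concurrent.futures` executor, so it can be called as `map_with_timeout(ProcessPoolExecutor(...), func, chunks, timeout=600)`. A timeout only stops the caller from waiting. A worker process that is truly stuck keeps running until the pool is torn down.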