-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bugfix: Do not lock EventLoop::mutex after EventLoop is done #159
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Currently, there are two cases where code may attempt to lock `EventLoop::mutex` after the `EventLoop::loop()` method has finished running. This is not a problem by itself, but is a problem if there is external code which deletes the `EventLoop` object immediately after the method returns, which is the case in Bitcoin Core. Both cases of locking after the loop method returns happen due to uses of the `Unlock()` utility function which unlocks a mutex temporarily, runs a callback function and relocks it. The first case happens in `EventLoop::removeClient` when the `EventLoop::done()` condition is reached and it calls `Unlock()` in order to be able to call `write(post_fd, ...)` without blocking and wake the EventLoop thread. In this case, since `EventLoop` execution is done there is not really any point to using `Unlock()` and relocking, so the code is changed to use a simple `lock.unlock()` call and not try to relock. The second case happens in `EventLoop::post` where `Unlock()` is used in a similar way, and depending on thread scheduling (if the thread is delayed for a long time before relocking) can result in failing to relock `EventLoop::m_mutex` after calling `write()`. In this case, since relocking the mutex is actually necessary for the code that follows, the fix is different: new `addClient`/`removeClient` calls are added to this code, so the `EventLoop::loop()` function will never exit while the `post()` function is waiting. These two changes are being labeled as a bugfix even though there is not technically a bug here in the libmultiprocess code or API. The `EventLoop` object itself was safe before these changes as long as outside code waited for `EventLoop` methods to finish executing before deleting the `EventLoop` object. Originally the multiprocess code added in bitcoin/bitcoin#10102 did this, but later as more features were added for binding and connecting to unix sockets, and unit tests were added which would immediately delete the `EventLoop` object after `EventLoop::loop()` returned, it meant this code could now start failing to relock after unlocking. A previous attempt was made to fix this bug in bitcoin/bitcoin#31815 outside of libmultiprocess code. But it only addressed the first case of a failing `Unlock()` in `EventLoop::removeClient()`, not the second case in `EventLoop::post()`. This first case described above was not observed but is theoretically possible. The second case happens intermittently on macos and was reported bitcoin-core#154
Merged
ryanofsky
added a commit
to ryanofsky/bitcoin
that referenced
this pull request
Feb 10, 2025
Bring in a few cmake changes to improve cmake support and fix compile/lint/tidy warnings. bitcoin-core/libmultiprocess#140 build: don't clobber user/superproject c++ version bitcoin-core/libmultiprocess#142 build: add option for external mpgen binary bitcoin-core/libmultiprocess#143 cleanup: initialize vars in the EventLoop constructor in the correct order bitcoin-core/libmultiprocess#146 cmake: Suppress compiler warnings from capnproto headers bitcoin-core/libmultiprocess#147 cmake: EXTERNAL_MPGEN cleanups bitcoin-core/libmultiprocess#148 util: fix -Wpessimizing-move warning bitcoin-core/libmultiprocess#145 CTest: Module must be included at the top level bitcoin-core/libmultiprocess#149 Avoid `-Wundef` compiler warnings bitcoin-core/libmultiprocess#152 refactor: Fix compiler and clang-tidy warnings bitcoin-core/libmultiprocess#155 scripted-diff: s/Libmultiprocess_EXTERNAL_MPGEN/MPGEN_EXECUTABLE/g bitcoin-core/libmultiprocess#156 refactor: Remove locale-dependent function calls bitcoin-core/libmultiprocess#157 refactor: Avoid using std::format bitcoin-core/libmultiprocess#159 bugfix: Do not lock EventLoop::mutex after EventLoop is done
This was referenced Feb 10, 2025
#160 contains some followups to try to make this code a little safer and easier to maintain. |
Sjors
pushed a commit
to Sjors/bitcoin
that referenced
this pull request
Feb 10, 2025
Bring in a few cmake changes to improve cmake support and fix compile/lint/tidy warnings. bitcoin-core/libmultiprocess#140 build: don't clobber user/superproject c++ version bitcoin-core/libmultiprocess#142 build: add option for external mpgen binary bitcoin-core/libmultiprocess#143 cleanup: initialize vars in the EventLoop constructor in the correct order bitcoin-core/libmultiprocess#146 cmake: Suppress compiler warnings from capnproto headers bitcoin-core/libmultiprocess#147 cmake: EXTERNAL_MPGEN cleanups bitcoin-core/libmultiprocess#148 util: fix -Wpessimizing-move warning bitcoin-core/libmultiprocess#145 CTest: Module must be included at the top level bitcoin-core/libmultiprocess#149 Avoid `-Wundef` compiler warnings bitcoin-core/libmultiprocess#152 refactor: Fix compiler and clang-tidy warnings bitcoin-core/libmultiprocess#155 scripted-diff: s/Libmultiprocess_EXTERNAL_MPGEN/MPGEN_EXECUTABLE/g bitcoin-core/libmultiprocess#156 refactor: Remove locale-dependent function calls bitcoin-core/libmultiprocess#157 refactor: Avoid using std::format bitcoin-core/libmultiprocess#159 bugfix: Do not lock EventLoop::mutex after EventLoop is done bitcoin-core/libmultiprocess#161 cmake: Avoid including CTest if not top level project
Sjors
pushed a commit
to Sjors/bitcoin
that referenced
this pull request
Feb 13, 2025
Bring in a few cmake changes to improve cmake support and fix compile/lint/tidy warnings. bitcoin-core/libmultiprocess#140 build: don't clobber user/superproject c++ version bitcoin-core/libmultiprocess#142 build: add option for external mpgen binary bitcoin-core/libmultiprocess#143 cleanup: initialize vars in the EventLoop constructor in the correct order bitcoin-core/libmultiprocess#146 cmake: Suppress compiler warnings from capnproto headers bitcoin-core/libmultiprocess#147 cmake: EXTERNAL_MPGEN cleanups bitcoin-core/libmultiprocess#148 util: fix -Wpessimizing-move warning bitcoin-core/libmultiprocess#145 CTest: Module must be included at the top level bitcoin-core/libmultiprocess#149 Avoid `-Wundef` compiler warnings bitcoin-core/libmultiprocess#152 refactor: Fix compiler and clang-tidy warnings bitcoin-core/libmultiprocess#155 scripted-diff: s/Libmultiprocess_EXTERNAL_MPGEN/MPGEN_EXECUTABLE/g bitcoin-core/libmultiprocess#156 refactor: Remove locale-dependent function calls bitcoin-core/libmultiprocess#157 refactor: Avoid using std::format bitcoin-core/libmultiprocess#159 bugfix: Do not lock EventLoop::mutex after EventLoop is done bitcoin-core/libmultiprocess#161 cmake: Avoid including CTest if not top level project
ryanofsky
added a commit
to ryanofsky/bitcoin
that referenced
this pull request
Feb 24, 2025
Bump libmultiprocess library to include bugfix bitcoin-core/libmultiprocess#159 which should fix intermittent CI failure reported bitcoin#31921 This change is bumping the libmultiprocess version instead of cherry picking the bugfix. It could cherry-pick the bugfix instead, but there are reasons to prefer just bumping the version: - Bugfix might interact with earlier PRs, and the latest version is better tested with testing done in many CI configurations in bitcoin#30975 and bitcoin#31802 - Even though we are in feature freeze for a release, the MULTIPROCESS=1 option is currently not enabled for release, so this PR only affect CI builds and local builds, not the release build. bitcoin-core/libmultiprocess#140 build: don't clobber user/superproject c++ version bitcoin-core/libmultiprocess#142 build: add option for external mpgen binary bitcoin-core/libmultiprocess#143 cleanup: initialize vars in the EventLoop constructor in the correct order bitcoin-core/libmultiprocess#146 cmake: Suppress compiler warnings from capnproto headers bitcoin-core/libmultiprocess#147 cmake: EXTERNAL_MPGEN cleanups bitcoin-core/libmultiprocess#148 util: fix -Wpessimizing-move warning bitcoin-core/libmultiprocess#145 CTest: Module must be included at the top level bitcoin-core/libmultiprocess#149 Avoid `-Wundef` compiler warnings bitcoin-core/libmultiprocess#152 refactor: Fix compiler and clang-tidy warnings bitcoin-core/libmultiprocess#155 scripted-diff: s/Libmultiprocess_EXTERNAL_MPGEN/MPGEN_EXECUTABLE/g bitcoin-core/libmultiprocess#156 refactor: Remove locale-dependent function calls bitcoin-core/libmultiprocess#157 refactor: Avoid using std::format bitcoin-core/libmultiprocess#159 bugfix: Do not lock EventLoop::mutex after EventLoop is done bitcoin-core/libmultiprocess#161 cmake: Avoid including CTest if not top level project bitcoin-core/libmultiprocess#164 Bump minimum required cmake to 3.12
ryanofsky
added a commit
to ryanofsky/bitcoin
that referenced
this pull request
Feb 24, 2025
Bump libmultiprocess library to include bugfix bitcoin-core/libmultiprocess#159 which should fix intermittent CI failures reported in bitcoin#31921 This change is bumping the libmultiprocess version instead of cherry picking the bugfix. It could cherry-pick the bugfix instead, but there are reasons to prefer bumping the version: - Bugfix might interact with earlier PRs, and the latest version is better tested with testing done in many CI configurations in bitcoin#30975 and bitcoin#31802 - Even though we are in feature freeze for a release, the MULTIPROCESS=1 option is currently not enabled for release, so this PR only affect CI builds and local builds, not the release build. This update brings in the following changes: bitcoin-core/libmultiprocess#140 build: don't clobber user/superproject c++ version bitcoin-core/libmultiprocess#142 build: add option for external mpgen binary bitcoin-core/libmultiprocess#143 cleanup: initialize vars in the EventLoop constructor in the correct order bitcoin-core/libmultiprocess#146 cmake: Suppress compiler warnings from capnproto headers bitcoin-core/libmultiprocess#147 cmake: EXTERNAL_MPGEN cleanups bitcoin-core/libmultiprocess#148 util: fix -Wpessimizing-move warning bitcoin-core/libmultiprocess#145 CTest: Module must be included at the top level bitcoin-core/libmultiprocess#149 Avoid `-Wundef` compiler warnings bitcoin-core/libmultiprocess#152 refactor: Fix compiler and clang-tidy warnings bitcoin-core/libmultiprocess#155 scripted-diff: s/Libmultiprocess_EXTERNAL_MPGEN/MPGEN_EXECUTABLE/g bitcoin-core/libmultiprocess#156 refactor: Remove locale-dependent function calls bitcoin-core/libmultiprocess#157 refactor: Avoid using std::format bitcoin-core/libmultiprocess#159 bugfix: Do not lock EventLoop::mutex after EventLoop is done bitcoin-core/libmultiprocess#161 cmake: Avoid including CTest if not top level project bitcoin-core/libmultiprocess#164 Bump minimum required cmake to 3.12
ryanofsky
added a commit
to ryanofsky/bitcoin
that referenced
this pull request
Feb 24, 2025
Bump libmultiprocess library to include bugfix bitcoin-core/libmultiprocess#159 which should fix intermittent CI failures reported in bitcoin#31921 This change is bumping the libmultiprocess version instead of cherry picking the bugfix. It could cherry-pick the bugfix instead, but there are reasons to prefer bumping the version: - Bugfix might interact with earlier PRs, and the latest version is better tested with testing done in many CI configurations in bitcoin#30975 and bitcoin#31802 - Even though we are in feature freeze for a release, the MULTIPROCESS=1 option is currently not enabled for release, so this PR only affect CI builds and local builds, not the release build. This update brings in the following changes: bitcoin-core/libmultiprocess#140 build: don't clobber user/superproject c++ version bitcoin-core/libmultiprocess#142 build: add option for external mpgen binary bitcoin-core/libmultiprocess#143 cleanup: initialize vars in the EventLoop constructor in the correct order bitcoin-core/libmultiprocess#146 cmake: Suppress compiler warnings from capnproto headers bitcoin-core/libmultiprocess#147 cmake: EXTERNAL_MPGEN cleanups bitcoin-core/libmultiprocess#148 util: fix -Wpessimizing-move warning bitcoin-core/libmultiprocess#145 CTest: Module must be included at the top level bitcoin-core/libmultiprocess#149 Avoid `-Wundef` compiler warnings bitcoin-core/libmultiprocess#152 refactor: Fix compiler and clang-tidy warnings bitcoin-core/libmultiprocess#155 scripted-diff: s/Libmultiprocess_EXTERNAL_MPGEN/MPGEN_EXECUTABLE/g bitcoin-core/libmultiprocess#156 refactor: Remove locale-dependent function calls bitcoin-core/libmultiprocess#157 refactor: Avoid using std::format bitcoin-core/libmultiprocess#159 bugfix: Do not lock EventLoop::mutex after EventLoop is done bitcoin-core/libmultiprocess#161 cmake: Avoid including CTest if not top level project bitcoin-core/libmultiprocess#164 Bump minimum required cmake to 3.12
fanquake
pushed a commit
to bitcoin/bitcoin
that referenced
this pull request
Feb 25, 2025
Bump libmultiprocess library to include bugfix bitcoin-core/libmultiprocess#159 which should fix intermittent CI failures reported in #31921 This change is bumping the libmultiprocess version instead of cherry picking the bugfix. It could cherry-pick the bugfix instead, but there are reasons to prefer bumping the version: - Bugfix might interact with earlier PRs, and the latest version is better tested with testing done in many CI configurations in #30975 and #31802 - Even though we are in feature freeze for a release, the MULTIPROCESS=1 option is currently not enabled for release, so this PR only affect CI builds and local builds, not the release build. This update brings in the following changes: bitcoin-core/libmultiprocess#140 build: don't clobber user/superproject c++ version bitcoin-core/libmultiprocess#142 build: add option for external mpgen binary bitcoin-core/libmultiprocess#143 cleanup: initialize vars in the EventLoop constructor in the correct order bitcoin-core/libmultiprocess#146 cmake: Suppress compiler warnings from capnproto headers bitcoin-core/libmultiprocess#147 cmake: EXTERNAL_MPGEN cleanups bitcoin-core/libmultiprocess#148 util: fix -Wpessimizing-move warning bitcoin-core/libmultiprocess#145 CTest: Module must be included at the top level bitcoin-core/libmultiprocess#149 Avoid `-Wundef` compiler warnings bitcoin-core/libmultiprocess#152 refactor: Fix compiler and clang-tidy warnings bitcoin-core/libmultiprocess#155 scripted-diff: s/Libmultiprocess_EXTERNAL_MPGEN/MPGEN_EXECUTABLE/g bitcoin-core/libmultiprocess#156 refactor: Remove locale-dependent function calls bitcoin-core/libmultiprocess#157 refactor: Avoid using std::format bitcoin-core/libmultiprocess#159 bugfix: Do not lock EventLoop::mutex after EventLoop is done bitcoin-core/libmultiprocess#161 cmake: Avoid including CTest if not top level project bitcoin-core/libmultiprocess#164 Bump minimum required cmake to 3.12
fanquake
added a commit
to bitcoin/bitcoin
that referenced
this pull request
Feb 25, 2025
01f7715 depends: Update libmultiprocess library to fix CI failure (Ryan Ofsky) Pull request description: Bump libmultiprocess library to include bugfix bitcoin-core/libmultiprocess#159 which should fix intermittent CI failures reported in #31921 This change is bumping the libmultiprocess version instead of cherry picking the bugfix. It could cherry-pick the bugfix instead, but there are reasons to prefer bumping the version: - Bugfix might interact with earlier PRs, and the latest version is better tested with testing done in many CI configurations in [#30975](#30975) and [#31802](#31802) - Even though we are in feature freeze for a release, the MULTIPROCESS=1 option is currently not enabled for release, so this PR only affect CI builds and local builds, not the release build. This update brings in the following changes: bitcoin-core/libmultiprocess#140 build: don't clobber user/superproject c++ version bitcoin-core/libmultiprocess#142 build: add option for external mpgen binary bitcoin-core/libmultiprocess#143 cleanup: initialize vars in the EventLoop constructor in the correct order bitcoin-core/libmultiprocess#146 cmake: Suppress compiler warnings from capnproto headers bitcoin-core/libmultiprocess#147 cmake: EXTERNAL_MPGEN cleanups bitcoin-core/libmultiprocess#148 util: fix -Wpessimizing-move warning bitcoin-core/libmultiprocess#145 CTest: Module must be included at the top level bitcoin-core/libmultiprocess#149 Avoid `-Wundef` compiler warnings bitcoin-core/libmultiprocess#152 refactor: Fix compiler and clang-tidy warnings bitcoin-core/libmultiprocess#155 scripted-diff: s/Libmultiprocess_EXTERNAL_MPGEN/MPGEN_EXECUTABLE/g bitcoin-core/libmultiprocess#156 refactor: Remove locale-dependent function calls bitcoin-core/libmultiprocess#157 refactor: Avoid using std::format bitcoin-core/libmultiprocess#159 bugfix: Do not lock EventLoop::mutex after EventLoop is done bitcoin-core/libmultiprocess#161 cmake: Avoid including CTest if not top level project bitcoin-core/libmultiprocess#164 Bump minimum required cmake to 3.12 --- This PR is part of the [process separation project](#28722). ACKs for top commit: fanquake: ACK 01f7715 Tree-SHA512: a6a795e4d4e13e9d35c9346f3c83d5da817f1452bdc4a9412aeadc4c652ad6e5047f4c77756570594a70ec9095cc786772a0469e306dc19bb5a508fd515c37f4
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Currently, there are two cases where code may attempt to lock
EventLoop::mutex
after theEventLoop::loop()
method has finished running. This is not a problem by itself, but is a problem if there is external code which deletes theEventLoop
object immediately after the method returns, which is the case in Bitcoin Core. Both cases of locking after the loop method returns happen due to uses of theUnlock()
utility function which unlocks a mutex temporarily, runs a callback function and relocks it.The first case happens in
EventLoop::removeClient
when theEventLoop::done()
condition is reached and it callsUnlock()
in order to be able to callwrite(post_fd, ...)
without blocking and wake the EventLoop thread. In this case, sinceEventLoop
execution is done there is not really any point to usingUnlock()
and relocking, so the code is changed to use a simplelock.unlock()
call and not try to relock.The second case happens in
EventLoop::post
whereUnlock()
is used in a similar way, and depending on thread scheduling (if the thread is delayed for a long time before relocking) can result in failing to relockEventLoop::m_mutex
after callingwrite()
. In this case, since relocking the mutex is actually necessary for the code that follows, the fix is different: newaddClient
/removeClient
calls are added to this code, so theEventLoop::loop()
function will never exit while thepost()
function is waiting.These two changes are being labeled as a bugfix even though there is not technically a bug here in the libmultiprocess code or API. The
EventLoop
object itself was safe before these changes as long as outside code waited forEventLoop
methods to finish executing before deleting theEventLoop
object. Originally the multiprocess code added inbitcoin/bitcoin#10102 did this, but later as more features were added for binding and connecting to unix sockets, and unit tests were added which would immediately delete the
EventLoop
object afterEventLoop::loop()
returned, it meant this code could now start failing to relock after unlocking.A previous attempt was made to fix this bug in
bitcoin/bitcoin#31815 outside of libmultiprocess code. But it only addressed the first case of a failing
Unlock()
inEventLoop::removeClient()
, not the second case inEventLoop::post()
.This first case described above was not observed but is theoretically possible. The second case happens intermittently on macos and was reported #154