Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make FFI callbacks thread safe #12823

Draft
wants to merge 21 commits into
base: master
Choose a base branch
from
Draft

Conversation

smx-smx
Copy link

@smx-smx smx-smx commented Nov 29, 2023

Makes FFI callbacks work, regardless of the thread they are invoked from
Fixes #9214

$REVIEW, before merging: is there any additional handling that needs to be done for ZTS?

@bwoebi
Copy link
Member

bwoebi commented Nov 29, 2023

I'd like to see a simple test for this, passing the callback to pthread_create() (called via FFI) and having code executed on that other thread, possibly with a small sleep in the callback, showing that it's indeed blocking the main thread.

@smx-smx
Copy link
Author

smx-smx commented Nov 29, 2023

Hmm an issue i just noticed is that, if i want to use platform-independent mutexes, i cannot use tsrm_mutex_lock (because it's available only #if ZTS).
Is there an alternative?

@dstogov
Copy link
Member

dstogov commented Nov 29, 2023

Few existing FFI tests are failed.

@smx-smx
Copy link
Author

smx-smx commented Nov 29, 2023

I'd like to see a simple test for this, passing the callback to pthread_create() (called via FFI) and having code executed on that other thread, possibly with a small sleep in the callback, showing that it's indeed blocking the main thread.

The way I implemented this, it's the other way around.
The callback thread posts an interrupt request to the main thread and then goes to sleep, waiting for the callback to be serviced within the interrupt handler (within the main thread).
This means that putting a long sleep after the threads creation, such as sleep(1), will delay the execution of the callback (until the sleep ends).

Is it preferrable to have the callback serviced by the requesting thread? I guess we could do it by having the interrupt handler stall, signal the FFI thread, and wait until it finishes... but is there an advantage in doing this?

Copy link
Member

@dstogov dstogov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should be merged.

Transferring execution of all callbacks to main thread can't be safe and will definitely lead to more problems, deadlocks, Thread Local Storage mess, global context pollution, etc...

Often this just doesn't make any sense by design. E.g. callback_threads.phpt starts threads, but executes their code in context of main thread.

It's better to just disable calls to FFI callbacks in context of non-main threads.

@smx-smx
Copy link
Author

smx-smx commented Nov 30, 2023

The callback_threads.phpt test case is a non typical example, just to simulate the scenario.
In real world cases, you might have a callback that is triggered by numerous (native) worker threads.
This is comparable to a Windows Forms application, where a thread wants to interact with the GUI and needs to do so from the UI Thread.
Here, instead of having a UI thread, we have the PHP interpreter thread and other threads that trigger the callback.
Sometimes you just cannot avoid this due to the design of the system where PHP is used. You might get concurrent events, triggering the issue.
The idea is that, instead of crashing, we can just queue the invocation events and execute them one at a time.
This PR doesn't implement such FIFO scheduling yet. It manages concurrent invocations by stalling the other worker threads until the current callback has been executed.

The alternative would be to manage this in the native code, but it would require wrapping all PHP callbacks in a native trampoline to manage the mutex locking/unlocking. This requires an extra (native) layers for the PHP code that wants to declare a callback and would make the callers more complicated.

In essence, I am not aiming to enable multiple PHP threads.
The goal is to acquire the existing PHP thread/context from other threads and synchronize concurrent accesses.

What I could do is reverse the flow, to have the callback thread run the callback instead.

@smx-smx
Copy link
Author

smx-smx commented Nov 30, 2023

I modified the logic so that the interrupt handler now notifies the thread that invoked the callback, and goes to sleep.
the thread executes the callback and then unlocks the interrupt handler

EDIT: i forgot to mention that i had to disable ZEND_CHECK_STACK_LIMIT, due to the stack tracking code assuming it's always within the same thread. I'll look into fixing this too
It's now fixed

Comment on lines 20 to 21
#include <ffi.h>
#include <pthread.h>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Windows build is broken. There are no <pthread.h> there.

ext/ffi/ffi.c Outdated
Comment on lines 1019 to 1020
static void zend_ffi_interrupt_function(zend_execute_data *execute_data){ /* {{{ */
pthread_mutex_lock(&FFI_G(vm_request_lock));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should check if this is FFI related interrupt. This may be a POSIX signal or something else...

Comment on lines +1071 to +1074
static void zend_ffi_callback_trampoline(ffi_cif* cif, void* ret, void** args, void* data) /* {{{ */
{
// wait for a previously initiated request to complete
zend_ffi_wait_request_barrier(false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may a regular in-main-thread callback, that doesn't require any locks.
You should check FFI_G(callback_tid) == FFI_G(main_tid) first.

Ideally, we should make distinct between regular and "thread" callback

$libc->pthread_create(
		FFI::addr($tid), NULL,
		FFI::thread_callback($thread_func), FFI::addr($arg)
	);

Copy link
Member

@dstogov dstogov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a cooperative scheduler implemented on top of VM interrupts + pthread synchronisation for serialization of execution of callbacks in a single thread.

Is it possible to implement these ideas separately from FFI, provide some API, and extend FFI with new functionality using that API? This way we should achieve a better designed solution. I the current state, this looks like a hack that misses many things (e.g. exceptions and errors).

May be you should wrap "serialized" callbacks with fibers and then schedule fibers?

@smx-smx smx-smx marked this pull request as draft December 3, 2023 17:48
@smx-smx smx-smx force-pushed the ffi-ts-master branch 3 times, most recently from ca48818 to 40a6537 Compare December 3, 2023 20:19
@jcupitt
Copy link

jcupitt commented Feb 15, 2024

Hello, I think we've just been bitten by this (spurious stack overflow exceptions off the main thread): libvips/php-vips#237 . As a workaround, it looks like we're going to have to ask users to disable all stack overflow checks.

How about just disabling this check for execution off the main thread? I imagine it's a rare case, so it would only cause a small drop in the usefulness of this test.

There are probably complications I'm unaware of, of course!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FFI] Segmentation fault when calling callback function
4 participants