Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected release of a managed pointer in wrapper.call #20095

Open
mike-lischke opened this issue Aug 22, 2023 · 7 comments
Open

Unexpected release of a managed pointer in wrapper.call #20095

mike-lischke opened this issue Aug 22, 2023 · 7 comments

Comments

@mike-lischke
Copy link

mike-lischke commented Aug 22, 2023

Please include the following in your bug report:

Version of emscripten/emsdk:

emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.44-git
clang version 17.0.0 (https://github.com/llvm/llvm-project.git a8cbd27d1f238e104a5d5ca345d93bc1f4d4ab1f)
Target: wasm32-unknown-emscripten
Thread model: posix
InstalledDir: /opt/homebrew/Cellar/emscripten/3.1.44/libexec/llvm/bin

The Problem

In my C++ library there's a class that manages a list (vector) of std::unique_ptr and hence manages the life time of the objects (in this case instances of a Token class). When any of the tokens is needed the raw pointer is passed around, but of course must not be managed elsewhere.

One of the consumers of a token pointer is a listener class, which is bound with embind and its methods can be overriden in JS. The wrapper is defined as:

class ANTLRErrorListenerWrapper : public wrapper<ANTLRErrorListenerHelper> {
public:
  EMSCRIPTEN_WRAPPER(ANTLRErrorListenerWrapper);

  virtual ~ANTLRErrorListenerWrapper() noexcept override {
  }

  virtual void syntaxError(Recognizer *recognizer, Token *offendingSymbol, size_t line, size_t charPositionInLine,
                           const std::string &msg, const RecognitionException &e) override {
    call<void>("syntaxError", recognizer, offendingSymbol, line, charPositionInLine, msg, e);
  }

This listener is overridden in JS:

import {
    ANTLRErrorListener as AEL
} from "../../src/antlr4-runtime.js";

const ANTLRErrorListener = AEL.extend<AEL>("AEL", {});
type ANTLRErrorListener = InstanceType<typeof ANTLRErrorListener>;

export class MySQLErrorListener extends ANTLRErrorListener {
    // eslint-disable-next-line @typescript-eslint/no-explicit-any
    public syntaxError<T extends Token | number>(recognizer: Recognizer<any>, offendingSymbol: T | null,
        line: number, charPositionInLine: number, msg: string, e: RecognitionException | null): void {

    }

 
}

When the syntaxError method is called (from the wrapper code) it does everything nicely, but when it returns, the token value is deleted, even though it is managed by a unique_ptr. Is that a bug, a gotcha or an error on my side and how can I prevent the premature release?

Work Around

I just found a work around to avoid the premature release, by calling offendingSymbol.clone() in the syntaxError method body, though that looks a bit hackish to me...

@sbc100
Copy link
Collaborator

sbc100 commented Aug 22, 2023

You are saying that calling syntaxError is somehow causing call to delete offendingSymbol ? I don't see how or why that might be. Do you have a backtrace to see exactly where this delete call is coming from? Give that `offendingSymbol is simply a raw pointer that seems rather odd.

@brendandahl might have more insight.

@mike-lischke
Copy link
Author

Here's the code from the generated method caller (which calls the syntaxError method in JS):

(function anonymous(retType,argType0,argType1,argType2,argType3,argType4,argType5
) {
return function methodCaller_void_$Recognizer$Internal$_Token$_unsigned$long_unsigned$long_std$$string_RecognitionException$(handle, name, destructors, args) {
    var arg0 = argType0.readValueFromPointer(args);
    var arg1 = argType1.readValueFromPointer(args+8);
    var arg2 = argType2.readValueFromPointer(args+16);
    var arg3 = argType3.readValueFromPointer(args+24);
    var arg4 = argType4.readValueFromPointer(args+32);
    var arg5 = argType5.readValueFromPointer(args+40);
    var rv = handle[name](arg0, arg1, arg2, arg3, arg4, arg5);
    argType0.deleteObject(arg0);
    argType1.deleteObject(arg1);
    argType5.deleteObject(arg5);
};

})

and checking that shows there're even more bad delete calls. The first parameter is the parser instance which is currently running, and the last parameter (arg5) is just the address to a C++ ref, which cannot be deleted. I wonder what's going on here.

@mike-lischke
Copy link
Author

@sbc100 Do you think this is a bug in emscripten? It doesn't look like it's intended, but maybe there's an extra aspect to consider, which is not (prominently) mentioned in the docs?

@sbc100
Copy link
Collaborator

sbc100 commented Aug 28, 2023

I imagine its some nuance about how smart pointers work under embind, and I would guess its working as indended, but I don't know enough about embind and smart pointers to know for sure.

@brendandahl is OOO all this week but I would imagine he will know for sure.

@brendandahl
Copy link
Collaborator

I'd say this is a bug. Usually when using val::call with raw pointers you'll get the error Implicitly binding raw pointers is illegal. It only allows copies of objects to be sent across so the behaviour of deleting the object makes sense. I'm not quite sure yet why that error doesn't also trigger when using wrapper::call since that uses val::call.

@mike-lischke
Copy link
Author

Would be good if I could say embind to not touch a raw pointer at all. No deletion, no copy, no management and so on. Usually, when a C++ library uses a raw pointer it has good idea how to manage that anyway. Things get complicated when you create a C++ instance in JS, though. Tricky ...

@brendandahl
Copy link
Collaborator

Small correction to my comment above... The problem is not unique to wrapper::call it also affects val::call e.g.:

    val window = val::global("window");
    Foo foo;
    window.call<void>("invoke", &foo);  // Fails to compile
    Foo* fooPtr = &foo;
    window.call<void>("invoke", fooPtr);  // Compiles fine

I'm not sure of the original intention, but I'm guessing that both of those lines should fail to compile since they both are using raw pointers. val::call takes a universal reference which trips up the static_assert that checks if a pointer is sent.

We should probably fix the assert and also make it so call supports policies like allow_raw_pointers().

brendandahl added a commit to brendandahl/emscripten that referenced this issue Jan 5, 2024
Before PR emscripten-core#20383 val's `call` and `operator()` had different behaviors where
one would automatically destroy arguments after the call and the other
didn't. After that PR, they both called destroy. The automatic destruction
wasn't really documented anywhere and and led to some unexpected behavior.

This changes it so neither automatically destroys the arguments which I
think is more inline with the rest of embind where it's up to the user
to handle object lifetimes.

Fixes: emscripten-core#21016 and emscripten-core#20095
brendandahl added a commit to brendandahl/emscripten that referenced this issue Jan 8, 2024
Before PR emscripten-core#20383 val's `call` and `operator()` had different behaviors where
one would automatically destroy arguments after the call and the other
didn't. After that PR, they both called destroy. The automatic destruction
wasn't really documented anywhere and and led to some unexpected behavior.

This changes it so neither automatically destroys the arguments which I
think is more inline with the rest of embind where it's up to the user
to handle object lifetimes.

Fixes: emscripten-core#21016 and emscripten-core#20095
brendandahl added a commit to brendandahl/emscripten that referenced this issue Jan 11, 2024
Before PR emscripten-core#20383 val's `call` and `operator()` had different behaviors where
one would automatically destroy arguments after the call and the other
didn't. After that PR, they both called destroy. The automatic destruction
wasn't really documented anywhere and and led to some unexpected behavior.

This changes it so neither automatically destroys the arguments which I
think is more inline with the rest of embind where it's up to the user
to handle object lifetimes.

Fixes: emscripten-core#21016 and emscripten-core#20095
brendandahl added a commit that referenced this issue Jan 11, 2024
…21022)

Before PR #20383 val's `call` and `operator()` had different behaviors where
one would automatically destroy arguments after the call and the other
didn't. After that PR, they both called destroy. The automatic destruction
wasn't really documented anywhere and and led to some unexpected behavior.

This changes it so neither automatically destroys the arguments which I
think is more inline with the rest of embind where it's up to the user
to handle object lifetimes.

Fixes: #21016 and #20095
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants