Strange error when running specific Node.js application #940
Comments
Hi Miha, you can get a backtrace with less junk and more information (function arguments, and so on) using gdb; see the explanation in https://github.com/cloudius-systems/osv/wiki/Debugging-OSv#debugging-osv-with-gdb. It seems the ICU library code (ICU is the library which deals with Unicode characters and the like) is crashing, but since I'm not at all familiar with it, I really have no idea if it's an ICU bug or some sort of OSv bug that for some reason manifests itself like this. One thing this could be is a bug in OSv's locale code, which I'm guessing icu_58::Collator::createInstance is using somehow. I thought I fixed all these bugs long ago (see for example #715, #314) but it's possible some remain (?). |
@miha-plesko how can I reproduce this problem? I have zero knowledge on how to run nodejs code. |
Nadav,
You should be able to use the node-express example (https://github.com/cloudius-systems/osv-apps/tree/master/node-express-example). Just build and run it, and it should respond on HTTP port 3000.
However, I suspect this has to do with some specific Node app (@miha-plesko, can you send us a more specific example?), as I have run Node apps, even pretty complicated ones, and have not seen any issues, especially not this one.
Waldek
PS. I have an even simpler node hello app that I could add to the apps repo.
|
Thanks for the quick response. We're trying to run the application with Capstan using the node 6.10.2 package. However, since I realize that it's difficult to debug via Capstan, I tried to prepare an OSv app example for you guys. I'm having trouble uploading symlink files from
on my machine, but I assume you know how to overcome this so I'm pasting the app anyway. All you need to do is to include this directory http://x.k00.fr/cyuqg | password: 451848 in your $ scripts/build -j6 image=node-wio-example Finally, run the unikernel and OSv should crash as described in this issue. BTW, the original issue that was reported by a Capstan user can be seen here: mikelangelo-project/capstan-packages#16 |
@miha-plesko I tried as you suggested above, and strangely, on my machine (Fedora 27 with gcc 7.2.1) I couldn't even get the "node" package to compile so I didn't get to see the symlink problem or the original bug:
Could it be that this older node.js cannot build with newer compilers? |
Interesting. I can confirm that I have following gcc installed:
so the problem may be in your newer version of gcc. I assume it wouldn't help you at all if I shared my compiled stuff with you, right? |
The compilation error with gcc 7 is indeed a known bug in node.js - see nodejs/node#13574 and the patch |
@miha-plesko after fixing node's compilation, I can now
But this is just because the command line in module.py was wrong - I had to change it to "/libnode.so /wio/fuse.web.js". Then it "works" and crashes exactly like you reported originally. Good (or rather, bad ;-)). |
The gdb stack trace is not very different:
But one interesting thing is that the fault is at address 35184374185984, i.e., 0x200000200000. This is an mmapped address, and gdb's "osv mmap" shows what it is:
As you can see, 0x0000200000200000 is in a single page deliberately mapped without permissions, to cause this fault, so this is why we got it when this address was touched. It looks like a stack guard. I am guessing the 12 KB stack which follows it is what overflowed. This actually means a 16 KB stack was requested, and a 4 KB guard page was taken off from it (this subtractive behavior is what both Linux and OSv do, despite POSIX specifying the guard page should be added to the stack size). If my analysis is correct, the question becomes why node.js uses a 16 KB stack for this v8 initialization (?) code. Is this even possible? And I also don't know why this problem happens in OSv but not Linux, or why it didn't happen in other node.js applications. |
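The arithmetic in this comment can be checked with a few lines of C++. This is only a sketch: the helper names `hits_guard_page` and `usable_stack` are made up for illustration and are not OSv or node.js functions, and the 4 KB page size is an assumption (x86-64).

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 4 KB pages, as on x86-64.
constexpr std::uint64_t kPageSize = 4096;

// Hypothetical helper: does addr fall inside the one-page guard starting at guard_base?
constexpr bool hits_guard_page(std::uint64_t addr, std::uint64_t guard_base) {
    return addr >= guard_base && addr < guard_base + kPageSize;
}

// Hypothetical helper: usable stack bytes when the guard is carved out of the
// requested size (the subtractive behavior described in the comment above).
constexpr std::uint64_t usable_stack(std::uint64_t requested, std::uint64_t guard) {
    return requested - guard;
}

// The decimal fault address from the trace is the hex address shown by "osv mmap".
static_assert(35184374185984ULL == 0x200000200000ULL, "fault address mismatch");
// The fault address sits exactly on a page boundary.
static_assert(0x200000200000ULL % kPageSize == 0, "not page aligned");
// A 16 KB request minus a 4 KB guard leaves the 12 KB mapping seen in gdb.
static_assert(usable_stack(16 * 1024, kPageSize) == 12 * 1024, "unexpected stack size");
```

If the static assertions compile, the numbers quoted in the analysis are at least self-consistent; it says nothing about why the stack overflowed.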
I think I know where the problem is and how to fix this. In src/node.cc there is:
// Don't shrink the thread's stack on FreeBSD. Said platform decided to
// follow the pthreads specification to the letter rather than in spirit:
// https://lists.freebsd.org/pipermail/freebsd-current/2014-March/048885.html
#ifndef __FreeBSD__
CHECK_EQ(0, pthread_attr_setstacksize(&attr, PTHREAD_STACK_MIN));
#endif  // __FreeBSD__
This I now get a different error, probably just a configuration error in the files you gave me (?):
Anyway, I'm not sure how we'd want to fix this. We can easily add yet another patch to node.js to get it to work on OSv, but this is not pretty (it will not help other people who are using different applications). I think a better solution is to modify pthread_attr_setstacksize to change a request to set a stack size lower than some OSv-defined minimum, e.g., 64 KB, to this minimum. Opinions? |
This problem may be more complex than I thought... I changed libc/pthread.cc as follows:
int pthread_attr_setstacksize(pthread_attr_t *attr, size_t stacksize)
{
    // Linux considers the minimum reasonable stack, PTHREAD_STACK_MIN,
    // to be 16K. However, in OSv such a tiny stack would probably not
    // be enough for proper function, considering that OSv's functions
    // use the application's stack as well, unlike Linux's system calls.
    stacksize = std::max(stacksize, (size_t)256*1024);
    from_libc(attr)->stack_size = stacksize;
    return 0;
}
This 256K should be more than enough (much bigger than I would have liked), and yet the code sometimes crashes (as before) and sometimes works, about 50%/50%. I checked with "osv mmap" that indeed the mapping after the crashing address grew to 256K-4K. So we may have another problem, not (or not just) a real stack overflow. @miha-plesko do you have any idea what the problematic RegisterDebugSignalHandler() code does? I wonder if the stuff it does somehow disagrees with OSv. |
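For reference, the "256K-4K" mapping size follows from the clamp plus the subtractive guard page. The sketch below is standalone and only mirrors the `std::max` logic of the patch; `clamped_stack_size` and `mapping_after_guard` are hypothetical names, and the 4 KB guard is assumed from the earlier "osv mmap" observation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Assumption: one 4 KB guard page is taken out of the requested stack.
constexpr std::size_t kGuardBytes = 4 * 1024;
// Floor used in the patch above.
constexpr std::size_t kFloorBytes = 256 * 1024;

// Hypothetical helper: raise any smaller request to the floor, leave larger ones alone.
inline std::size_t clamped_stack_size(std::size_t requested,
                                      std::size_t floor_bytes = kFloorBytes) {
    return std::max(requested, floor_bytes);
}

// Hypothetical helper: size of the stack mapping left after the guard page.
inline std::size_t mapping_after_guard(std::size_t requested) {
    return clamped_stack_size(requested) - kGuardBytes;
}
```

With node's 16 KB request, `mapping_after_guard(16 * 1024)` comes out as 258048 bytes (252 KB), which matches the mapping seen after the change; a 1 MB request would pass through the clamp untouched.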
Hi @nyh, I've played with this a little further and it seems the app uses a bundling framework called fuse-box (put very simply, it uglifies the code for running in production), and this is what is causing problems. The 50/50 behavior you've observed probably comes from the multithreaded nature of this framework. The 50% that crashes with another error (saying it cannot find some files to copy) can be resolved by setting
Here are some links to Node.js source code: |
I'm running a Node.js 6.10.2 application and the unikernel crashes like this:
I have no idea what the stacktrace is telling me, @nyh would you be willing to enlighten me? 😄 It feels like there should be something small, but I can't tell what...