
[ERROR] Futex facility returned an unexpected error code #1443

Closed
Tameem-10xE opened this issue Aug 23, 2023 · 8 comments

Comments

@Tameem-10xE

Hi,

I am getting the error "The futex facility returned an unexpected error code" while trying to run the llama.cpp executable on spike pk. I used the latest toolchain to cross-compile it for RISC-V, and it runs without any issue on qemu-riscv64, but when I try to run it with spike pk it gives me this error. I wanted to use it to estimate the performance difference.

[Build-Spike]:

  $  ../configure --prefix=$DIR   --with-isa=RV64GCV --enable-commitlog
  $  make

[Build-Proxy Kernel]:

  $  ../configure --prefix=$DIR --host=riscv64-unknown-linux-gnu --with-arch=rv64gcv
  $  make

[Command]:
$ ./spike pk ./llama2.cpp/main -h
If anyone has encountered a similar issue or has insights into how to resolve it, I would greatly appreciate it.

Thank you.

@scottj97
Contributor

Do you have any idea what this "futex facility" is? I'm not familiar with pk but I'm sure there's no such thing in Spike.

What version (commit hash) of Spike are you using? --enable-commitlog has been removed since #1189 (December 2022).

@Tameem-10xE
Author

Thank you for your reply,

My guess is that it is related to multithreading, since futex is a locking mechanism. I posted this issue to confirm whether it is related to Spike or to the code I am using, although I did manage to run a specific chunk of the project that does not use multithreading.

I am using the latest version (commit: d1680b7); I think it ignored the flag, since I was not familiar with that change.

@chuanyuL

Hi, I also hit the same problem. A friend suggested replacing riscv64-unknown-linux-gnu with riscv64-unknown-elf-g++, and with that it runs.
I also want to estimate the performance difference in Spike, but gprof does not run under Spike the way it does on Linux. What should I do?
Thank you!

@scottj97
Contributor

> Hi, I also hit the same problem. A friend suggested replacing riscv64-unknown-linux-gnu with riscv64-unknown-elf-g++, and with that it runs. I also want to estimate the performance difference in Spike, but gprof does not run under Spike the way it does on Linux. What should I do? Thank you!

I have no idea what you're saying, but it sounds like you should open a new issue for this. Describe what you're trying to do and what isn't working.

@NazerkeT

NazerkeT commented Feb 7, 2024

Hi @Tameem-10xE,

By the way, did you eventually get this problem solved, or did you end up running llama for RISC-V on qemu only?

Thanks,
Best wishes

@Tameem-10xE
Author

Tameem-10xE commented Feb 8, 2024

Hi,
Unfortunately, I was not able to use Spike for it due to the same futex error, but with QEMU I was able to run it using these instructions, which you can also check at ggml-org/llama.cpp#3453

[Cross Compiling Environment]
Ubuntu: 22.10
riscv-toolchain: 2023.07.05 riscv64 linux glibc

[Scalar Version with QEMU]

make main CC="riscv64-unknown-linux-gnu-gcc -march=rv64gc -mabi=lp64d" CXX="riscv64-unknown-linux-gnu-g++ -march=rv64gc -mabi=lp64d"
qemu-riscv64 -L /path/to/sysroot/  -cpu rv64 ./main -m ./path/to/model.gguf -p "Anything" -n 100

@NazerkeT

Hi @Tameem-10xE ,

Got it, thanks for your response. I can confirm that I can run llama.cpp in the qemu user-space emulator as well, and thanks for your PR for that!

However, it is painfully slow, at 1-2 minutes per token :( . Do you by chance have any suggestions, from your past experience, on how I can improve that?

Thanks,
Best wishes

@Tameem-10xE
Author

Hi, really sorry for the late reply.
No, I don't have any ideas for improving performance with QEMU itself, but changing the weights can be an option. For some reason, the scalar build on QEMU is faster than the vector build, so you can recompile it as scalar. The reason wasn't clear to me, but I suspect the RISC-V vector instructions are not being mapped to the host's actual vector hardware, which may be why the vector build runs slower under QEMU on my machine with an Intel (AVX2) processor.
Another option is to use smaller weights (although Q2 has much lower accuracy, and after recent updates its output is gibberish for some reason, so I think Q4_K or Q3_K is best). If your concern is only to run a model, you can also train your own small model (100-200 MB max) (link: Train-Model-from-Scratch).
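The smaller-weights suggestion maps to llama.cpp's `quantize` tool. A hedged sketch (the filenames are placeholders, and the exact quant type names depend on your llama.cpp revision, so check `./quantize --help` on your checkout):

```shell
# Quantize an existing F16 GGUF down to a smaller K-quant variant.
./quantize ./models/model-f16.gguf ./models/model-q4_k.gguf Q4_K_M

# Then run the smaller model the same way as before:
qemu-riscv64 -L /path/to/sysroot/ -cpu rv64 ./main \
    -m ./models/model-q4_k.gguf -p "Anything" -n 100
```

A smaller quantized file also means less data for qemu-user to page through per token, which is where part of the speedup comes from.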

Thank you
