
[GGML] Added RISC-V Vector Intrinsics Support #2929

Merged: 2 commits into ggml-org:master on Sep 1, 2023

Conversation

Tameem-10xE (Contributor) commented Aug 31, 2023

Hi,

In this PR, we have added RISC-V vector (RVV) intrinsics for the following vector dot-product functions (a minimal sketch of the vectorization pattern follows the list):

     ggml_vec_dot_q4_0_q8_0
     ggml_vec_dot_q4_1_q8_1
     ggml_vec_dot_q5_0_q8_0
     ggml_vec_dot_q5_1_q8_1
     ggml_vec_dot_q8_0_q8_0

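For comparison with the AVX and NEON paths, here is a minimal sketch of the strip-mining pattern these RVV kernels follow, written as a plain f32 dot product. It is illustrative only: the function name is ours, and the real ggml_vec_dot_* kernels also decode the 4/5/8-bit quantized blocks before multiplying.

    #include <riscv_vector.h>
    #include <stddef.h>

    // Strip-mined f32 dot product: vsetvl picks how many elements fit in a
    // vector register each pass, vfmacc accumulates, vfredusum reduces.
    static float dot_f32_rvv(const float *x, const float *y, size_t n) {
        size_t vlmax = __riscv_vsetvlmax_e32m1();
        vfloat32m1_t acc = __riscv_vfmv_v_f_f32m1(0.0f, vlmax);
        for (size_t i = 0; i < n; ) {
            size_t vl = __riscv_vsetvl_e32m1(n - i);   // elements this pass
            vfloat32m1_t vx = __riscv_vle32_v_f32m1(x + i, vl);
            vfloat32m1_t vy = __riscv_vle32_v_f32m1(y + i, vl);
            // tail-undisturbed FMA keeps untouched accumulator lanes intact
            // on a short final pass (requires the v0.12 policy intrinsics)
            acc = __riscv_vfmacc_vv_f32m1_tu(acc, vx, vy, vl);
            i += vl;
        }
        vfloat32m1_t zero = __riscv_vfmv_v_f_f32m1(0.0f, vlmax);
        vfloat32m1_t sum  = __riscv_vfredusum_vs_f32m1_f32m1(acc, zero, vlmax);
        return __riscv_vfmv_f_s_f32m1_f32(sum);
    }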

In the future, this will enable GGML to run efficiently on RISC-V hardware with vector support, and it opens the way to comparing its performance against other vector/SIMD implementations such as Intel AVX and Arm NEON. It should also bring speedups for applications using GGML on RISC-V hardware with a vector unit.

The output has been tested and verified for each of the legacy GGML 7B quantized models (q4_0, q4_1, q5_0, q5_1, and q8_0) using the qemu-riscv64 emulator.

Edit: llama.cpp stopped using the GGML format after August 22, 2023, and switched to the new GGUF format, so these functions will no longer be exercised by llama.cpp. We will soon submit a new PR with GGUF support.


[Cross Compiling Environment]
Ubuntu: 22.10
riscv-toolchain: 2023.07.05 riscv64 linux glibc

On actual hardware there is no issue, but running under QEMU requires slightly modifying the Makefile: we just overwrite the CC and CXX variables with the toolchain compilers and also pass the architecture flags to make.

CC  := riscv64-unknown-linux-gnu-gcc
CXX := riscv64-unknown-linux-gnu-g++

Then run make:

make   RISCV_CROSS_COMPILE=1  RISCV=1  
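Since make lets command-line assignments override Makefile variables, the same build can also be done without editing the Makefile at all (assuming the toolchain binaries are on PATH):

make CC=riscv64-unknown-linux-gnu-gcc CXX=riscv64-unknown-linux-gnu-g++ RISCV_CROSS_COMPILE=1 RISCV=1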


[QEMU]

$   qemu-riscv64 -L /path/to/sysroot/  -cpu rv64,v=true,vlen=256,elen=64,vext_spec=v1.0 ./main -m ./path/to/model.gguf -p "Anything" -n 9
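Here -L points qemu-riscv64 at the sysroot containing the RISC-V dynamic linker and libraries, v=true enables the vector extension, vlen=256 and elen=64 set the vector register length and maximum element width, and vext_spec=v1.0 selects the ratified RVV 1.0 specification.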


[Output]
(screenshot of the model output under qemu-riscv64 omitted)


If you'd like to test these changes, we've set up a Cloud-V pipeline on our fork repository (main branch), which you can use to run and verify the code on the QEMU RISC-V emulator.

Any feedback is welcome; if you have suggestions or improvements, please share them.

Added RVV intrinsics for the following:
   ggml_vec_dot_q4_0_q8_0
   ggml_vec_dot_q4_1_q8_1
   ggml_vec_dot_q5_0_q8_0
   ggml_vec_dot_q5_1_q8_1
   ggml_vec_dot_q8_0_q8_0

Co-authored-by: Sharafat <[email protected]>
Signed-off-by: Ahmad Tameem <[email protected]>
@ggerganov ggerganov merged commit 5aec2cf into ggml-org:master Sep 1, 2023
camel-cdr (Contributor) commented Sep 3, 2023

The code structure of ggml.c doesn't work very well with a scalable vector architecture. If this course is continued, I'd try to detect the optimal vtype at startup and use that instead of hoping that LMUL=1 will work.

Also, temp_1 and temp_2 can be easily synthesised with viota, no need for loads.
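A hedged sketch of that suggestion (the helper name is ours, not from the PR): a single-bit mask table such as {1, 2, 4, 8, ...} can be materialized in registers instead of loaded from memory. vid yields the lane indices {0, 1, 2, ...} (viota over an all-ones mask produces the same sequence), and shifting 1 left by those indices gives the bit masks:

    #include <riscv_vector.h>
    #include <stddef.h>

    // Build {1, 2, 4, 8, ...} in registers: lane index i -> (1u << i).
    static inline vuint32m1_t bit_select_vector(size_t vl) {
        vuint32m1_t idx  = __riscv_vid_v_u32m1(vl);       // {0, 1, 2, ...}
        vuint32m1_t ones = __riscv_vmv_v_x_u32m1(1, vl);  // splat 1
        return __riscv_vsll_vv_u32m1(ones, idx, vl);      // {1, 2, 4, ...}
    }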

Tameem-10xE (Contributor, Author) commented Sep 3, 2023

Thank you for your feedback @camel-cdr.
Did you test it with the old GGML weights (e.g. ggml_q4_0.bin) or the new GGUF weights (e.g. ggml_q4_k.gguf)?
Due to the recent format change I misjudged this: the optimization will not affect performance for the new GGUF-type weights.
I am currently working on this and writing new functions that also cover those weights.
Thanks again!

SiriEmb commented Feb 25, 2024

Hi, does the RISC-V port support Q2?

Tameem-10xE (Contributor, Author) commented Mar 4, 2024

Hi, for the legacy weights I don't think there were any Q2 weights (the only ones were Q4, Q5, and Q8, if I remember correctly).
The newer GGUF weights do support Q2 (you can check in this PR: #3453).

Thank you
