
Commit 2a26398

Committed Jul 29, 2023
Merge remote-tracking branch 'upstream/concedo'
2 parents: cee2e9d + 31486eb

File tree: 1 file changed, README.md (+13, -9 lines)

Diff for README.md:
```diff
@@ -72,9 +72,13 @@ For more information, be sure to run the program with the `--help` flag.
 ## Android (Termux) Alternative method
 - See https://github.com/ggerganov/llama.cpp/pull/1828/files
 
-## CuBLAS?
+## Using CuBLAS
 - If you're on Windows with an Nvidia GPU you can get CUDA support out of the box using the `--usecublas` flag, make sure you select the correct .exe with CUDA support.
-- You can attempt a CuBLAS build with `LLAMA_CUBLAS=1` or using the provided CMake file (best for visual studio users). If you use the CMake file to build, copy the `koboldcpp_cublas.dll` generated into the same directory as the `koboldcpp.py` file. If you are bundling executables, you may need to include CUDA dynamic libraries (such as `cublasLt64_11.dll` and `cublas64_11.dll`) in order for the executable to work correctly on a different PC. Note that support for CuBLAS is limited.
+- You can attempt a CuBLAS build with `LLAMA_CUBLAS=1` or using the provided CMake file (best for visual studio users). If you use the CMake file to build, copy the `koboldcpp_cublas.dll` generated into the same directory as the `koboldcpp.py` file. If you are bundling executables, you may need to include CUDA dynamic libraries (such as `cublasLt64_11.dll` and `cublas64_11.dll`) in order for the executable to work correctly on a different PC.
+
+## Questions and Help
+- **First, please check out [The KoboldCpp FAQ and Knowledgebase](https://github.com/LostRuins/koboldcpp/wiki/The-KoboldCpp-FAQ-and-Knowledgebase) which may already have answers to your questions! Also please search through past issues and discussions.**
+- If you cannot find an answer, open an issue on this github, or find us on the [KoboldAI Discord](https://koboldai.org/discord).
 
 ## Considerations
 - For Windows: No installation, single file executable, (It Just Works)
```
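For readers trying the renamed "Using CuBLAS" section, the `--usecublas` flag from the changed text is all that is needed at launch. A minimal sketch, assuming a hypothetical model file `model.ggml.bin` and an arbitrary offload count (the positional model argument and the layer count are illustrative, not taken from this commit):

```sh
# Launch with CuBLAS acceleration and offload some layers to the GPU.
# model.ggml.bin and the layer count 32 are placeholder assumptions.
python koboldcpp.py --usecublas --gpulayers 32 model.ggml.bin
```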
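The build path in the changed bullet can likewise be sketched as shell steps. The `LLAMA_CUBLAS=1` make variable, the `koboldcpp_cublas.dll` name, and the copy-next-to-`koboldcpp.py` step come from the diff itself; the exact CMake invocation below is an assumption, not confirmed by the commit:

```sh
# Option 1: makefile build with CuBLAS enabled (variable named in the README).
make LLAMA_CUBLAS=1

# Option 2: CMake build (the README suggests this route for Visual Studio users).
# The -DLLAMA_CUBLAS=ON flag is assumed by analogy with the make variable.
mkdir build && cd build
cmake .. -DLLAMA_CUBLAS=ON
cmake --build . --config Release

# Per the README change, the generated DLL must sit next to koboldcpp.py.
cp koboldcpp_cublas.dll ../   # adjust the source path to your build output
```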
```diff
@@ -91,11 +95,11 @@ For more information, be sure to run the program with the `--help` flag.
 ## Notes
 - Generation delay scales linearly with original prompt length. If OpenBLAS is enabled then prompt ingestion becomes about 2-3x faster. This is automatic on windows, but will require linking on OSX and Linux. CLBlast speeds this up even further, and `--gpulayers` + `--useclblast` more so.
 - I have heard of someone claiming a false AV positive report. The exe is a simple pyinstaller bundle that includes the necessary python scripts and dlls to run. If this still concerns you, you might wish to rebuild everything from source code using the makefile, and you can rebuild the exe yourself with pyinstaller by using `make_pyinstaller.bat`
-- Supported GGML models:
-- LLAMA (All versions including ggml, ggmf, ggjt v1,v2,v3, openllama, gpt4all). Supports CLBlast and OpenBLAS acceleration for all versions.
-- GPT-2 (All versions, including legacy f16, newer format + quanitzed, cerebras, starcoder) Supports CLBlast and OpenBLAS acceleration for newer formats, no GPU layer offload.
-- GPT-J (All versions including legacy f16, newer format + quantized, pyg.cpp, new pygmalion, janeway etc.) Supports CLBlast and OpenBLAS acceleration for newer formats, no GPU layer offload.
-- RWKV (all formats except Q4_1_O).
+- Supported GGML models (Includes backward compatibility for older versions/legacy GGML models, though some newer features might be unavailable):
+- LLAMA and LLAMA2 (LLaMA / Alpaca / GPT4All / Vicuna / Koala / Pygmalion 7B / Metharme 7B / WizardLM and many more)
+- GPT-2 / Cerebras
+- GPT-J
+- RWKV
 - GPT-NeoX / Pythia / StableLM / Dolly / RedPajama
-- MPT models (ggjt v3)
-- Basically every single current and historical GGML format that has ever existed should be supported, except for bloomz.cpp due to lack of demand.
+- MPT models
+
```
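The Notes bullet about CLBlast and `--gpulayers` + `--useclblast` translates to a launch line like the one below. A hedged sketch: `--useclblast` is assumed to take an OpenCL platform/device index pair, and the model filename is a placeholder:

```sh
# CLBlast-accelerated launch: OpenCL platform 0, device 0 (assumed indices),
# with 32 layers offloaded; model.ggml.bin is a placeholder name.
python koboldcpp.py --useclblast 0 0 --gpulayers 32 model.ggml.bin
```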
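Likewise, the rebuild-from-source advice in the antivirus note maps to two steps. Only `make_pyinstaller.bat` is named in the text; running plain `make` first is an assumption about the makefile's default target:

```sh
# Rebuild the binaries from source, then repackage the Windows exe with
# pyinstaller via the batch script named in the README.
make
make_pyinstaller.bat
```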