LLaMA.cpp returns just some weirdo texts with any model size #291
Might be the same issue as this: #280
Also, do not use `\n` escape sequences in the prompt. Either pass the prompt from a file, or do it like this:
make -j && ./main -m models/7B/ggml-model-q4_0.bin -t 8 -n 1024 -s 2 -p "SQL code to create a table, that will keep CD albums data, such as album name and track:
\begin{code}
"
I llama.cpp build info:
I UNAME_S: Darwin
I UNAME_P: arm
I UNAME_M: arm64
I CFLAGS: -I. -O3 -DNDEBUG -std=c11 -fPIC -pthread -DGGML_USE_ACCELERATE
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -pthread
I LDFLAGS: -framework Accelerate
I CC: Apple clang version 14.0.0 (clang-1400.0.29.202)
I CXX: Apple clang version 14.0.0 (clang-1400.0.29.202)
make: Nothing to be done for `default'.
main: seed = 2
llama_model_load: loading model from 'models/7B/ggml-model-q4_0.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx = 512
llama_model_load: n_embd = 4096
llama_model_load: n_mult = 256
llama_model_load: n_head = 32
llama_model_load: n_layer = 32
llama_model_load: n_rot = 128
llama_model_load: f16 = 2
llama_model_load: n_ff = 11008
llama_model_load: n_parts = 1
llama_model_load: ggml ctx size = 4529.34 MB
llama_model_load: memory_size = 512.00 MB, n_mem = 16384
llama_model_load: loading model part 1/1 from 'models/7B/ggml-model-q4_0.bin'
llama_model_load: .................................... done
llama_model_load: model size = 4017.27 MB / num tensors = 291
system_info: n_threads = 8 / 10 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 |
main: prompt: ' SQL code to create a table, that will keep CD albums data, such as album name and track:
\begin{code}
'
main: number of tokens in prompt = 29
1 -> ''
3758 -> ' SQL'
775 -> ' code'
304 -> ' to'
1653 -> ' create'
263 -> ' a'
1591 -> ' table'
29892 -> ','
393 -> ' that'
674 -> ' will'
3013 -> ' keep'
7307 -> ' CD'
20618 -> ' albums'
848 -> ' data'
29892 -> ','
1316 -> ' such'
408 -> ' as'
3769 -> ' album'
1024 -> ' name'
322 -> ' and'
5702 -> ' track'
29901 -> ':'
13 -> '
'
29905 -> '\'
463 -> 'begin'
29912 -> '{'
401 -> 'code'
29913 -> '}'
13 -> '
'
sampling parameters: temp = 0.800000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000
SQL code to create a table, that will keep CD albums data, such as album name and track:
\begin{code}
CREATE TABLE AlBums (album_id INT NOT NULL PRIMARY KEY AUTOINCREMENT ,artist VARCHAR(250) ) ;
INSERT INTO Albums VALUES('13', 'AC/DC'), ('486','Joe Cocker'); /* etc */;// The data goes here. As you can see it's quite simple, and you don't need to specify the columns of course - that is done by designers during development phase.
\end{code} [end of text]
main: mem per token = 14434244 bytes
main: load time = 945.16 ms
main: sample time = 75.87 ms
main: predict time = 6337.46 ms / 48.38 ms per token
main: total time = 7742.50 ms

Please reopen if the issue persists.
I'm experimenting with LLaMA.cpp on an M1 laptop with 32 GB RAM. Somehow the inference is broken for me.
I'm expecting something reasonable for a simple prompt I took from the original LLaMA examples:
SQL code to create a table, that will keep CD albums data, such as album name and track\n\\begin{code}\n
And LLaMA.cpp returns just some weirdo texts with any model size (7B, 13B, 30B, quantised down to 4-bit).
What's the reason here?
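The likely culprit can be demonstrated in isolation: in POSIX shells, `\n` inside double quotes is not a newline but the two literal characters `\` and `n`, so the model receives a mangled prompt. A small sketch contrasting the two quoting styles (the prompt fragment is abbreviated for illustration):

```shell
# Double quotes keep the backslash: this prints the 4 characters a \ n b.
printf '%s' "a\nb"; echo

# bash/zsh ANSI-C quoting $'...' expands \n to a real newline:
# this prints 'a', a newline, then 'b' (3 characters total).
printf '%s' $'a\nb'; echo
```

So a prompt written as `...track\n\\begin{code}\n` on the command line is tokenized with stray `\` and `n` characters rather than line breaks, which is enough to derail generation.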