LLaMA.cpp returns just some weirdo texts with any model size #291

Closed
gotzmann opened this issue Mar 19, 2023 · 3 comments
Labels
generation quality (Quality of model output), need more info (The OP should provide more details about the issue)

Comments

@gotzmann

I'm grokking LLaMA.cpp on an M1 laptop with 32GB of RAM. Somehow inference is broken for me.

I'm expecting something reasonable for a simple prompt that I took from the original LLaMA examples:

SQL code to create a table, that will keep CD albums data, such as album name and track\n\\begin{code}\n

But LLaMA.cpp returns just some weird text with any model size (7B, 13B, 30B, all quantised down to 4-bit).

What's the reason here?

@Green-Sky
Collaborator

Things you can do:

  1. Check your model files. See issue #238, "Document check sums of models so that we can confirm issues are not caused by bad downloads or conversion".
  2. Always share your exact command line parameters.
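
The first suggestion can be sketched in shell as below. This is only an illustration: the helper names `sha256_of` and `verify_model` are made up for this example, and the expected digest has to come from a source you trust (none is given in this thread).

```shell
# Print the SHA-256 digest of a file using whichever tool is installed
# (sha256sum on most Linux systems, shasum on macOS).
sha256_of() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | awk '{print $1}'
    else
        shasum -a 256 "$1" | awk '{print $1}'
    fi
}

# Compare a file against an expected digest; returns non-zero on mismatch.
verify_model() {
    file="$1"
    expected="$2"
    actual="$(sha256_of "$file")"
    if [ "$actual" = "$expected" ]; then
        echo "OK       $file"
    else
        echo "MISMATCH $file (got $actual)"
        return 1
    fi
}

# Usage (the digest below is a placeholder, not a real checksum):
#   verify_model models/7B/ggml-model-q4_0.bin "<expected sha256>"
```

A mismatch on the original or converted files would point at a bad download or conversion rather than at the inference code.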

@gjmulder added the "need more info" and "generation quality" labels on Mar 19, 2023
@gjmulder changed the title from "Something is broken" to "LLaMA.cpp returns just some weirdo texts with any model size" on Mar 19, 2023
@ukiyocode

This might be the same issue as #280.
The best thing to do right now is to download this version, or an earlier one: https://github.com/ggerganov/llama.cpp/tree/4f546091102a418ffdc6230f872ac56e5cedb835

@ggerganov
Member

Also, do not use \n in the prompt on the command line. These sequences are not converted to newlines; they are parsed as normal text.

Either pass the prompt from a file, or do it like this:

make -j && ./main -m models/7B/ggml-model-q4_0.bin -t 8 -n 1024 -s 2 -p "SQL code to create a table, that will keep CD albums data, such as album name and track:
\begin{code}
"
I llama.cpp build info: 
I UNAME_S:  Darwin
I UNAME_P:  arm
I UNAME_M:  arm64
I CFLAGS:   -I.              -O3 -DNDEBUG -std=c11   -fPIC -pthread -DGGML_USE_ACCELERATE
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -pthread
I LDFLAGS:   -framework Accelerate
I CC:       Apple clang version 14.0.0 (clang-1400.0.29.202)
I CXX:      Apple clang version 14.0.0 (clang-1400.0.29.202)

make: Nothing to be done for `default'.
main: seed = 2
llama_model_load: loading model from 'models/7B/ggml-model-q4_0.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx   = 512
llama_model_load: n_embd  = 4096
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 32
llama_model_load: n_layer = 32
llama_model_load: n_rot   = 128
llama_model_load: f16     = 2
llama_model_load: n_ff    = 11008
llama_model_load: n_parts = 1
llama_model_load: ggml ctx size = 4529.34 MB
llama_model_load: memory_size =   512.00 MB, n_mem = 16384
llama_model_load: loading model part 1/1 from 'models/7B/ggml-model-q4_0.bin'
llama_model_load: .................................... done
llama_model_load: model size =  4017.27 MB / num tensors = 291

system_info: n_threads = 8 / 10 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 | 

main: prompt: ' SQL code to create a table, that will keep CD albums data, such as album name and track:
\begin{code}
'
main: number of tokens in prompt = 29
     1 -> ''
  3758 -> ' SQL'
   775 -> ' code'
   304 -> ' to'
  1653 -> ' create'
   263 -> ' a'
  1591 -> ' table'
 29892 -> ','
   393 -> ' that'
   674 -> ' will'
  3013 -> ' keep'
  7307 -> ' CD'
 20618 -> ' albums'
   848 -> ' data'
 29892 -> ','
  1316 -> ' such'
   408 -> ' as'
  3769 -> ' album'
  1024 -> ' name'
   322 -> ' and'
  5702 -> ' track'
 29901 -> ':'
    13 -> '
'
 29905 -> '\'
   463 -> 'begin'
 29912 -> '{'
   401 -> 'code'
 29913 -> '}'
    13 -> '
'

sampling parameters: temp = 0.800000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000


 SQL code to create a table, that will keep CD albums data, such as album name and track:
\begin{code}
CREATE TABLE AlBums (album_id INT NOT NULL PRIMARY KEY AUTOINCREMENT ,artist VARCHAR(250) ) ;
INSERT INTO Albums VALUES('13', 'AC/DC'), ('486','Joe Cocker'); /* etc */;// The data goes here. As you can see it's quite simple, and you don't need to specify the columns of course - that is done by designers during development phase.
\end{code} [end of text]


main: mem per token = 14434244 bytes
main:     load time =   945.16 ms
main:   sample time =    75.87 ms
main:  predict time =  6337.46 ms / 48.38 ms per token
main:    total time =  7742.50 ms
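
For the other option mentioned above, passing the prompt from a file, a minimal sketch follows. The file name `prompt.txt` is hypothetical, and the commented-out `-f` invocation assumes your build of `./main` accepts a prompt file that way (check `./main --help`).

```shell
# Write the prompt with real newlines. printf expands \n in its format string,
# and \\ yields a single backslash, so line 2 of the file is "\begin{code}".
prompt_file="prompt.txt"   # hypothetical file name
printf 'SQL code to create a table, that will keep CD albums data, such as album name and track:\n\\begin{code}\n' > "$prompt_file"

# Load the file into a variable. Command substitution strips trailing newlines,
# so append a sentinel character and remove it afterwards to keep the last one.
prompt="$(cat "$prompt_file"; printf x)"
prompt="${prompt%x}"

# Inspect the bytes the model would actually receive (look for \n in the dump).
printf '%s' "$prompt" | od -c | head -n 3

# Either of these should then pass real newlines through (not run here):
#   ./main -m models/7B/ggml-model-q4_0.bin -p "$prompt"
#   ./main -m models/7B/ggml-model-q4_0.bin -f "$prompt_file"   # assumes -f is supported
```

With real newlines in the prompt, the token dump should show token 13 at each line break, as in the log above.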

Please reopen if the issue persists.
