Commit 1c28116

don't add space when using special tokens
1 parent 5974d61 commit 1c28116

1 file changed: +1 -1 lines changed

llama.cpp

@@ -6727,7 +6727,7 @@ static std::vector<llama_vocab::id> llama_tokenize_internal(const llama_vocab &
 // by modifying llm_tokenizer_x to operate with string offsets like pre-tokenizer
 // and passing 'add space prefix' as bool argument
 //
-auto raw_text = " " + fragment.raw_text.substr(fragment.offset, fragment.length);
+auto raw_text = (special?"":" ") + fragment.raw_text.substr(fragment.offset, fragment.length);
 
 #ifdef PRETOKENIZERDEBUG
 fprintf(stderr,"TT: (%ld %ld %ld) '%s'\n", raw_text.length(), fragment.offset, fragment.length, raw_text.c_str());
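
The effect of the changed line can be sketched in isolation. The snippet below is a minimal illustration under stated assumptions, not code from llama.cpp: `fragment_sketch` and `build_raw_text` are hypothetical names, only the three fragment fields that appear in the diff are modeled, and `special` is assumed to be the boolean flag already in scope inside `llama_tokenize_internal`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the fragment type used in the diff;
// it models only raw_text, offset, and length.
struct fragment_sketch {
    std::string raw_text;
    std::size_t offset;
    std::size_t length;
};

// Mirrors the changed line: a leading space is prepended to the
// fragment's text unless `special` is true.
std::string build_raw_text(const fragment_sketch & fragment, bool special) {
    // substr throws std::out_of_range if offset exceeds the string size,
    // so the precondition is checked explicitly in this sketch.
    assert(fragment.offset <= fragment.raw_text.size());
    return (special ? "" : " ") + fragment.raw_text.substr(fragment.offset, fragment.length);
}
```

With a fragment covering `Hello` inside `<s>Hello` (offset 3, length 5), `build_raw_text(f, false)` yields the text with the space prefix and `build_raw_text(f, true)` yields it unchanged. Both branches of the conditional decay to `const char *`, so the expression relies on `operator+(const char *, std::string)`.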
