batch inference #347
Unanswered
liuxiaohao-xn asked this question in Q&A
-
Same question here; it seems llama.cpp already supports batched inference upstream. @abetlen, this use case seems to be very popular, what do you think? (A client-side workaround is sketched after this thread.)
-
What kind of work would need to be done, either here or upstream in llama.cpp, to get batch inference fully working? Is there a roadmap anywhere already? I have some code from work that needs to run against Vicuna, and the only remaining obstacle is getting batch processing working.
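In the meantime, one way to get several prompts in flight without waiting for true batched decoding is to run the bundled OpenAI-compatible server (`python -m llama_cpp.server --model <path>`) and submit the prompts concurrently from the client. This is only a sketch under that assumption: the model still decodes the requests one at a time, so it does not give the throughput of real batch inference, and the host, port, and sampling parameters below are placeholders.

```python
# Sketch: fan out several prompts to a locally running llama-cpp-python server.
# Assumes the server was started separately, e.g.:
#   python -m llama_cpp.server --model ./models/7B/ggml-model-q4_0.bin
# The endpoint and payload follow the OpenAI completions format; this is NOT
# true batched decoding -- the server still processes requests sequentially.
from concurrent.futures import ThreadPoolExecutor

import requests

SERVER_URL = "http://localhost:8000/v1/completions"  # default port assumed

def complete(prompt: str) -> str:
    response = requests.post(
        SERVER_URL,
        json={"prompt": prompt, "max_tokens": 64, "temperature": 0.7},
        timeout=300,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["text"]

prompts = [
    "Summarize the plot of Hamlet in one sentence.",
    "Translate 'good morning' into French.",
    "List three uses for a paperclip.",
]

# Submit all prompts at once; results come back as each request finishes.
with ThreadPoolExecutor(max_workers=len(prompts)) as pool:
    for prompt, text in zip(prompts, pool.map(complete, prompts)):
        print(f"{prompt!r} -> {text.strip()!r}")
```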
-
I have multiple prompts and I want to feed them all to the model at once to generate outputs. Can you tell me how to achieve this?
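Not an official answer, but until batched decoding is exposed through the high-level API, the simplest approach I know of is to loop over the prompts with a single `Llama` instance. A minimal sketch follows; the model path, context size, and stop tokens are placeholders for your own setup, and each prompt is still generated one after another rather than in a single batch.

```python
from llama_cpp import Llama

# Placeholder model path and parameters; adjust to your setup.
llm = Llama(model_path="./models/7B/ggml-model-q4_0.bin", n_ctx=2048)

prompts = [
    "Q: Name the planets in the solar system. A:",
    "Q: What is the capital of France? A:",
    "Q: Who wrote 'Pride and Prejudice'? A:",
]

outputs = []
for prompt in prompts:
    # Each call is an independent, non-batched generation for one prompt.
    completion = llm(prompt, max_tokens=64, stop=["Q:", "\n"], echo=False)
    outputs.append(completion["choices"][0]["text"].strip())

for prompt, text in zip(prompts, outputs):
    print(f"{prompt} {text}")
```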