
Commit 2d27a9f

feat(llm): Add openailike llm mode (#1447)
This mode behaves the same as the `openai` mode, except that it allows setting custom models not supported by OpenAI. It can be used with any tool that serves models from an OpenAI-compatible API. Implements #1424.
1 parent fee9f08 commit 2d27a9f

File tree

4 files changed: +54 −3 lines changed


fern/docs/pages/manual/llms.mdx

+20 −1

````diff
@@ -37,6 +37,7 @@ llm:
   mode: openai

 openai:
+  api_base: <openai-api-base-url> # Defaults to https://api.openai.com/v1
   api_key: <your_openai_api_key> # You could skip this configuration and use the OPENAI_API_KEY env var instead
   model: <openai_model_to_use> # Optional model to use. Default is "gpt-3.5-turbo"
   # Note: Open AI Models are listed here: https://platform.openai.com/docs/models
@@ -55,6 +56,24 @@ Navigate to http://localhost:8001 to use the Gradio UI or to http://localhost:80
 You'll notice the speed and quality of response is higher, given you are using OpenAI's servers for the heavy
 computations.

+### Using OpenAI compatible API
+
+Many tools, including [LocalAI](https://localai.io/) and [vLLM](https://docs.vllm.ai/en/latest/),
+support serving local models with an OpenAI compatible API. Even when overriding the `api_base`,
+using the `openai` mode doesn't allow you to use custom models. Instead, you should use the `openailike` mode:
+
+```yaml
+llm:
+  mode: openailike
+```
+
+This mode uses the same settings as the `openai` mode.
+
+As an example, you can follow the [vLLM quickstart guide](https://docs.vllm.ai/en/latest/getting_started/quickstart.html#openai-compatible-server)
+to run an OpenAI compatible server. Then, you can run PrivateGPT using the `settings-vllm.yaml` profile:
+
+`PGPT_PROFILES=vllm make run`
+
 ### Using AWS Sagemaker

 For a fully private & performant setup, you can choose to have both your LLM and Embeddings model deployed using Sagemaker.
@@ -82,4 +101,4 @@ or
 `PGPT_PROFILES=sagemaker poetry run python -m private_gpt`

 When the server is started it will print a log *Application startup complete*.
-Navigate to http://localhost:8001 to use the Gradio UI or to http://localhost:8001/docs (API section) to try the API.
+Navigate to http://localhost:8001 to use the Gradio UI or to http://localhost:8001/docs (API section) to try the API.
````
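The reason overriding `api_base` is sufficient for compatible servers is that every OpenAI-style endpoint hangs off the same base path, so only the host portion of the URL changes. A tiny sketch of that idea (the helper function is hypothetical, not part of PrivateGPT or the OpenAI client):

```python
def chat_completions_url(api_base: str) -> str:
    """Join an OpenAI-compatible base URL with the standard chat-completions path."""
    return api_base.rstrip("/") + "/chat/completions"


# The same client code can talk to OpenAI or a local vLLM server;
# only the configured api_base differs.
print(chat_completions_url("https://api.openai.com/v1"))
print(chat_completions_url("http://localhost:8000/v1"))
```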

private_gpt/components/llm/llm_component.py

+15 −1

```diff
@@ -62,7 +62,21 @@ def __init__(self, settings: Settings) -> None:

                 openai_settings = settings.openai
                 self.llm = OpenAI(
-                    api_key=openai_settings.api_key, model=openai_settings.model
+                    api_base=openai_settings.api_base,
+                    api_key=openai_settings.api_key,
+                    model=openai_settings.model,
+                )
+            case "openailike":
+                from llama_index.llms import OpenAILike
+
+                openai_settings = settings.openai
+                self.llm = OpenAILike(
+                    api_base=openai_settings.api_base,
+                    api_key=openai_settings.api_key,
+                    model=openai_settings.model,
+                    is_chat_model=True,
+                    max_tokens=None,
+                    api_version="",
                 )
             case "mock":
                 self.llm = MockLLM()
```

private_gpt/settings/settings.py

+5 −1

```diff
@@ -81,7 +81,7 @@ class DataSettings(BaseModel):


 class LLMSettings(BaseModel):
-    mode: Literal["local", "openai", "sagemaker", "mock"]
+    mode: Literal["local", "openai", "openailike", "sagemaker", "mock"]
     max_new_tokens: int = Field(
         256,
         description="The maximum number of token that the LLM is authorized to generate in one completion.",
@@ -156,6 +156,10 @@ class SagemakerSettings(BaseModel):


 class OpenAISettings(BaseModel):
+    api_base: str = Field(
+        None,
+        description="Base URL of OpenAI API. Example: 'https://api.openai.com/v1'.",
+    )
     api_key: str
     model: str = Field(
         "gpt-3.5-turbo",
```
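Extending the `Literal` means the settings layer rejects any unrecognized mode at startup rather than at dispatch time. The same check can be sketched with the standard library alone (the function name here is ours, not PrivateGPT's; pydantic does this automatically for `Literal` fields):

```python
from typing import Literal, get_args

LLMMode = Literal["local", "openai", "openailike", "sagemaker", "mock"]


def validate_mode(mode: str) -> str:
    """Mimic pydantic's Literal validation: accept only the declared modes."""
    if mode not in get_args(LLMMode):
        raise ValueError(f"mode must be one of {get_args(LLMMode)}, got {mode!r}")
    return mode


print(validate_mode("openailike"))  # → openailike
```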

settings-vllm.yaml

+14

```diff
@@ -0,0 +1,14 @@
+llm:
+  mode: openailike
+
+embedding:
+  mode: local
+  ingest_mode: simple
+
+local:
+  embedding_hf_model_name: BAAI/bge-small-en-v1.5
+
+openai:
+  api_base: http://localhost:8000/v1
+  api_key: EMPTY
+  model: facebook/opt-125m
```
