
Commit 8ec7cf4

Authored Dec 17, 2023
feat(settings): Update default model to TheBloke/Mistral-7B-Instruct-v0.2-GGUF (#1415)
* Update LlamaCPP dependency
* Default to TheBloke/Mistral-7B-Instruct-v0.2-GGUF
* Fix API docs
1 parent c71ae7c · commit 8ec7cf4

File tree: 5 files changed, +1433 -1233 lines
 
```diff
@@ -1 +1,14 @@
 # API Reference
+
+The API is divided in two logical blocks:
+
+1. High-level API, abstracting all the complexity of a RAG (Retrieval Augmented Generation) pipeline implementation:
+   - Ingestion of documents: internally managing document parsing, splitting, metadata extraction,
+     embedding generation and storage.
+   - Chat & Completions using context from ingested documents: abstracting the retrieval of context, the prompt
+     engineering and the response generation.
+
+2. Low-level API, allowing advanced users to implement their own complex pipelines:
+   - Embeddings generation: based on a piece of text.
+   - Contextual chunks retrieval: given a query, returns the most relevant chunks of text from the ingested
+     documents.
```
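
For orientation, here is a minimal sketch of exercising both blocks against a locally running PrivateGPT server. The port, the exact endpoint paths and the `use_context` flag are assumptions not taken from this diff; check the generated API reference for the real routes.

```python
# Sketch only: the paths below (/v1/ingest, /v1/chat/completions,
# /v1/embeddings, /v1/chunks) and the port are assumptions, not taken
# from this diff; verify them against the generated API reference.
import requests

BASE = "http://localhost:8001"  # assumed default local server address

# High-level block: ingest a document, then chat using its context.
with open("report.pdf", "rb") as f:
    requests.post(f"{BASE}/v1/ingest", files={"file": f})

chat = requests.post(
    f"{BASE}/v1/chat/completions",
    json={
        "messages": [{"role": "user", "content": "Summarize the report."}],
        "use_context": True,  # assumed flag: answer from ingested documents
    },
)
print(chat.json())

# Low-level block: raw embeddings and contextual chunk retrieval.
emb = requests.post(f"{BASE}/v1/embeddings", json={"input": "a piece of text"})
chunks = requests.post(f"{BASE}/v1/chunks", json={"text": "What does the report conclude?"})
print(emb.json(), chunks.json())
```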

fern/docs/pages/overview/welcome.mdx (-15)

```diff
@@ -32,21 +32,6 @@ The installation guide will help you in the [Installation section](/installation
   />
 </Cards>
 
-## API Organization
-
-The API is divided in two logical blocks:
-
-1. High-level API, abstracting all the complexity of a RAG (Retrieval Augmented Generation) pipeline implementation:
-   - Ingestion of documents: internally managing document parsing, splitting, metadata extraction,
-     embedding generation and storage.
-   - Chat & Completions using context from ingested documents: abstracting the retrieval of context, the prompt
-     engineering and the response generation.
-
-2. Low-level API, allowing advanced users to implement their own complex pipelines:
-   - Embeddings generation: based on a piece of text.
-   - Contextual chunks retrieval: given a query, returns the most relevant chunks of text from the ingested
-     documents.
-
 <Callout intent = "info">
   A working **Gradio UI client** is provided to test the API, together with a set of useful tools such as bulk
   model download script, ingestion script, documents folder watch, etc.
```

poetry.lock (+1,417 -1,215)

Some generated files are not rendered by default.

pyproject.toml (+1 -1)

```diff
@@ -36,7 +36,7 @@ gradio = "^4.4.1"
 [tool.poetry.group.local]
 optional = true
 [tool.poetry.group.local.dependencies]
-llama-cpp-python = "^0.2.11"
+llama-cpp-python = "^0.2.23"
 numpy = "1.26.0"
 sentence-transformers = "^2.2.2"
 # https://stackoverflow.com/questions/76327419/valueerror-libcublas-so-0-9-not-found-in-the-system-path
```
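
The version bump presumably picks up llama.cpp builds that load the v0.2 GGUF files cleanly. As a quick smoke test, a minimal sketch of loading the new default model directly with llama-cpp-python; the model path is an assumption, adjust it to wherever the GGUF actually lives.

```python
# Sketch, not PrivateGPT's own loading code: load the new default model with
# llama-cpp-python >= 0.2.23 and run a single completion.
from llama_cpp import Llama

llm = Llama(
    model_path="models/mistral-7b-instruct-v0.2.Q4_K_M.gguf",  # assumed location
    n_ctx=4096,      # requested context window
    verbose=False,
)
# Mistral Instruct uses llama2-style [INST] tags, which matches the
# prompt_style: "llama2" entry in settings.yaml below.
out = llm("[INST] Say hello in one sentence. [/INST]", max_tokens=64)
print(out["choices"][0]["text"])
```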

settings.yaml (+2 -2)

```diff
@@ -48,8 +48,8 @@ qdrant:
 
 local:
   prompt_style: "llama2"
-  llm_hf_repo_id: TheBloke/Mistral-7B-Instruct-v0.1-GGUF
-  llm_hf_model_file: mistral-7b-instruct-v0.1.Q4_K_M.gguf
+  llm_hf_repo_id: TheBloke/Mistral-7B-Instruct-v0.2-GGUF
+  llm_hf_model_file: mistral-7b-instruct-v0.2.Q4_K_M.gguf
   embedding_hf_model_name: BAAI/bge-small-en-v1.5
 
 sagemaker:
```
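
Existing installs need the new GGUF on disk before the updated settings will work. A sketch of fetching it with huggingface_hub, where repo_id and filename come straight from the diff above (PrivateGPT also ships a bulk model download script for this):

```python
# Download the model named in settings.yaml into huggingface_hub's local
# cache (pass local_dir= to choose an explicit destination instead).
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="TheBloke/Mistral-7B-Instruct-v0.2-GGUF",
    filename="mistral-7b-instruct-v0.2.Q4_K_M.gguf",
)
print(path)  # absolute path to the cached .gguf file
```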
