Commit 2612928

Proger666, denisovval and jaluma authored
feat(vectorstore): Add ClickHouse support as vector store (#1883)

* Added ClickHouse vector store support
* port fix
* updated lock file
* fix: mypy
* fix: mypy

---------

Co-authored-by: Valery Denisov <[email protected]>
Co-authored-by: Javier Martinez <[email protected]>
1 parent fc13368 commit 2612928

6 files changed: +399 -5 lines changed

fern/docs/pages/manual/vectordb.mdx

+68-2
@@ -1,7 +1,7 @@
## Vectorstores
-PrivateGPT supports [Qdrant](https://qdrant.tech/), [Chroma](https://www.trychroma.com/) and [PGVector](https://github.com/pgvector/pgvector) as vectorstore providers. Qdrant being the default.
+PrivateGPT supports [Qdrant](https://qdrant.tech/), [Chroma](https://www.trychroma.com/), [PGVector](https://github.com/pgvector/pgvector) and [ClickHouse](https://github.com/ClickHouse/ClickHouse) as vectorstore providers. Qdrant being the default.

-In order to select one or the other, set the `vectorstore.database` property in the `settings.yaml` file to `qdrant`, `chroma` or `postgres`.
+In order to select one or the other, set the `vectorstore.database` property in the `settings.yaml` file to `qdrant`, `chroma`, `postgres` or `clickhouse`.

```yaml
vectorstore:
@@ -101,3 +101,69 @@ Indexes:
postgres=#
```
The dimensions of the embeddings columns will be set based on the `embedding.embed_dim` value. If the embedding model changes this table may need to be dropped and recreated to avoid a dimension mismatch.

### ClickHouse

To use ClickHouse as the vector store, you need access to a running [ClickHouse](https://github.com/ClickHouse/ClickHouse) database.

To enable ClickHouse, set the `vectorstore.database` property in the `settings.yaml` file to `clickhouse` and install the `vector-stores-clickhouse` extra.

```bash
poetry install --extras vector-stores-clickhouse
```
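
After installing the extra, a quick import check can confirm that the ClickHouse packages are visible in the active environment. The sketch below assumes the extra provides the `clickhouse_connect` and `llama_index.vector_stores.clickhouse` modules; those module names are assumptions rather than something stated on this page, and `missing_modules` is a hypothetical helper.

```python
import importlib.util

# Assumed module names; adjust them to whatever the extra actually installs.
ASSUMED_MODULES = ["clickhouse_connect", "llama_index.vector_stores.clickhouse"]


def missing_modules(names):
    """Return the subset of `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised, for example, when a parent package of a dotted name is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_modules(ASSUMED_MODULES)
    print("missing:", absent if absent else "none")
```

An empty result only shows that the modules can be found; it does not verify that a ClickHouse server is reachable.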

ClickHouse settings are configured under the `clickhouse` property in the `settings.yaml` file.

The available configuration options are:
| Field | Description |
|--------------------------|-----------------------------------------------------------------------------|
| **host** | The server hosting the ClickHouse database. Default is `localhost` |
| **port** | The port on which the ClickHouse database is accessible. Default is `8123` |
| **username** | The username for database access. Default is `default` |
| **password** | The password for database access. (Optional) |
| **database** | The specific database to connect to. Default is `__default__` |
| **secure** | Use HTTPS/TLS for a secure connection to the server. Default is `false` |
| **interface** | The protocol used for the connection, either `http` or `https`. (Optional) |
| **settings** | Specific ClickHouse server settings to be used with the session. (Optional) |
| **connect_timeout** | Timeout in seconds for establishing a connection. (Optional) |
| **send_receive_timeout** | Read timeout in seconds for the HTTP connection. (Optional) |
| **verify** | Verify the server certificate in secure/HTTPS mode. (Optional) |
| **ca_cert** | Path to the Certificate Authority root certificate (.pem format). (Optional) |
| **client_cert** | Path to the TLS client certificate (.pem format). (Optional) |
| **client_cert_key** | Path to the private key for the TLS client certificate. (Optional) |
| **http_proxy** | HTTP proxy address. (Optional) |
| **https_proxy** | HTTPS proxy address. (Optional) |
| **server_host_name** | Server host name to be checked against the TLS certificate. (Optional) |

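
The defaults listed in the table can be captured in a small typed model, which makes it easier to see what a partial `clickhouse` block resolves to. This is an illustrative sketch only: the `ClickHouseSettings` class below is a hypothetical stand-in and not necessarily how PrivateGPT defines its settings, the host name `ch.internal` is made up, and only a few of the optional fields are included.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ClickHouseSettings:
    # Defaults copied from the options table.
    host: str = "localhost"
    port: int = 8123
    username: str = "default"
    password: Optional[str] = None
    database: str = "__default__"
    secure: bool = False
    # A few of the optional fields; None means "not set".
    interface: Optional[str] = None
    connect_timeout: Optional[int] = None
    send_receive_timeout: Optional[int] = None
    verify: Optional[bool] = None

    def __post_init__(self):
        # The table allows only these two protocols for `interface`.
        if self.interface not in (None, "http", "https"):
            raise ValueError("interface must be 'http' or 'https'")


# Values that might come from the `clickhouse` block in settings.yaml (hypothetical).
overrides = {"host": "ch.internal", "database": "embeddings"}
settings = ClickHouseSettings(**overrides)
print(asdict(settings))
```

Fields that are not overridden keep the documented defaults, so the example still resolves `port` to `8123` and `username` to `default`.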
For example:
```yaml
vectorstore:
  database: clickhouse

clickhouse:
  host: localhost
  port: 8443
  username: admin
  password: <PASSWORD>
  database: embeddings
  secure: false
```
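
To sanity-check values like these outside PrivateGPT, one option is to open a connection with the `clickhouse-connect` package, whose `get_client` function accepts `host`, `port`, `username`, `password`, `database` and `secure` keyword arguments. The sketch assumes that package is installed and a server is reachable; `client_kwargs` and `connect` are hypothetical helpers, and whether PrivateGPT builds its client this way is not confirmed here.

```python
def client_kwargs(settings: dict) -> dict:
    """Drop unset (None) entries so the client library applies its own defaults."""
    return {key: value for key, value in settings.items() if value is not None}


def connect(settings: dict):
    # Imported lazily so client_kwargs works without the package installed.
    import clickhouse_connect

    return clickhouse_connect.get_client(**client_kwargs(settings))


# Mirrors the example `clickhouse` block; the password is left unset here
# and must be supplied for a real connection.
example = {
    "host": "localhost",
    "port": 8443,
    "username": "admin",
    "password": None,
    "database": "embeddings",
    "secure": False,
}

if __name__ == "__main__":
    print(client_kwargs(example))
    # Uncomment when a ClickHouse server is reachable:
    # client = connect(example)
    # print(client.command("SELECT 1"))
```

ClickHouse conventionally serves plain HTTP on port 8123 and HTTPS on port 8443, so depending on how the server is configured, a connection to 8443 with `secure: false` may be rejected.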

The following table will be created in the database:
```
clickhouse-client
:) \d embeddings.llama_index
Table "llama_index"
 № | name      | type                                                | default_type | default_expression | comment | codec_expression | ttl_expression
---|-----------|-----------------------------------------------------|--------------|--------------------|---------|------------------|---------------
 1 | id        | String                                              |              |                    |         |                  |
 2 | doc_id    | String                                              |              |                    |         |                  |
 3 | text      | String                                              |              |                    |         |                  |
 4 | vector    | Array(Float32)                                      |              |                    |         |                  |
 5 | node_info | Tuple(start Nullable(UInt64), end Nullable(UInt64)) |              |                    |         |                  |
 6 | metadata  | String                                              |              |                    |         |                  |

clickhouse-client
```

The dimensions of the embeddings columns will be set based on the `embedding.embed_dim` value. If the embedding model changes, this table may need to be dropped and recreated to avoid a dimension mismatch.
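
One way to spot a mismatch early is to compare the length of a stored vector with the configured `embedding.embed_dim`. The sketch below is an illustration under assumptions: it accepts any client object exposing `query(sql).result_rows` and `command(sql)` (as `clickhouse-connect` clients do), it uses the `embeddings.llama_index` table from the example, and the helper names and the `FakeClient` stand-in are hypothetical.

```python
def stored_dimension(client, table="embeddings.llama_index"):
    """Return the length of one stored vector, or None if the table is empty."""
    rows = client.query(f"SELECT length(vector) FROM {table} LIMIT 1").result_rows
    return rows[0][0] if rows else None


def drop_if_mismatched(client, embed_dim, table="embeddings.llama_index"):
    """Drop the table when stored vectors do not match embed_dim."""
    current = stored_dimension(client, table)
    if current is not None and current != embed_dim:
        client.command(f"DROP TABLE IF EXISTS {table}")
        return True
    return False


class FakeResult:
    """Minimal stand-in for a query result."""

    def __init__(self, rows):
        self.result_rows = rows


class FakeClient:
    """Stand-in client so the sketch runs without a server."""

    def __init__(self, dim):
        self.dim = dim
        self.commands = []

    def query(self, sql):
        return FakeResult([(self.dim,)] if self.dim is not None else [])

    def command(self, sql):
        self.commands.append(sql)


client = FakeClient(dim=384)
print(drop_if_mismatched(client, embed_dim=768))
print(client.commands)
```

Dropping the table deletes every stored embedding, so documents need to be ingested again afterwards.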
