|
1 | 1 | ## Vectorstores
|
2 |
| -PrivateGPT supports [Qdrant](https://qdrant.tech/), [Chroma](https://www.trychroma.com/) and [PGVector](https://github.com/pgvector/pgvector) as vectorstore providers. Qdrant being the default. |
| 2 | +PrivateGPT supports [Qdrant](https://qdrant.tech/), [Chroma](https://www.trychroma.com/), [PGVector](https://github.com/pgvector/pgvector) and [ClickHouse](https://github.com/ClickHouse/ClickHouse) as vectorstore providers. Qdrant is the default. |
3 | 3 |
|
4 |
| -In order to select one or the other, set the `vectorstore.database` property in the `settings.yaml` file to `qdrant`, `chroma` or `postgres`. |
| 4 | +To select a provider, set the `vectorstore.database` property in the `settings.yaml` file to `qdrant`, `chroma`, `postgres` or `clickhouse`. |
5 | 5 |
|
6 | 6 | ```yaml
|
7 | 7 | vectorstore:
|
@@ -101,3 +101,69 @@ Indexes:
|
101 | 101 | postgres=#
|
102 | 102 | ```
|
103 | 103 | The dimensions of the embeddings columns will be set based on the `embedding.embed_dim` value. If the embedding model changes this table may need to be dropped and recreated to avoid a dimension mismatch.
|
| 104 | +
|
| 105 | +### ClickHouse |
| 106 | +
|
| 107 | +To use ClickHouse as the vector store, you need a [ClickHouse](https://github.com/ClickHouse/ClickHouse) database. |
| 108 | +
|
| 109 | +To enable ClickHouse, set the `vectorstore.database` property in the `settings.yaml` file to `clickhouse` and install the `vector-stores-clickhouse` extra. |
| 110 | +
|
| 111 | +```bash |
| 112 | +poetry install --extras vector-stores-clickhouse |
| 113 | +``` |
| 114 | + |
| 115 | +ClickHouse settings can be configured by setting values under the `clickhouse` property in the `settings.yaml` file. |
| 116 | + |
| 117 | +The available configuration options are: |
| 118 | +| Field | Description | |
| 119 | +|----------------------|----------------------------------------------------------------| |
| 120 | +| **host** | The server hosting the ClickHouse database. Default is `localhost` | |
| 121 | +| **port** | The port on which the ClickHouse database is accessible. Default is `8123` | |
| 122 | +| **username** | The username for database access. Default is `default` | |
| 123 | +| **password** | The password for database access. (Optional) | |
| 124 | +| **database** | The specific database to connect to. Default is `__default__` | |
| 125 | +| **secure** | Use https/TLS for secure connection to the server. Default is `false` | |
| 126 | +| **interface** | The protocol used for the connection, either 'http' or 'https'. (Optional) | |
| 127 | +| **settings** | Specific ClickHouse server settings to be used with the session. (Optional) | |
| 128 | +| **connect_timeout** | Timeout in seconds for establishing a connection. (Optional) | |
| 129 | +| **send_receive_timeout** | Read timeout in seconds for http connection. (Optional) | |
| 130 | +| **verify** | Verify the server certificate in secure/https mode. (Optional) | |
| 131 | +| **ca_cert** | Path to Certificate Authority root certificate (.pem format). (Optional) | |
| 132 | +| **client_cert** | Path to TLS Client certificate (.pem format). (Optional) | |
| 133 | +| **client_cert_key** | Path to the private key for the TLS Client certificate. (Optional) | |
| 134 | +| **http_proxy** | HTTP proxy address. (Optional) | |
| 135 | +| **https_proxy** | HTTPS proxy address. (Optional) | |
| 136 | +| **server_host_name** | Server host name to be checked against the TLS certificate. (Optional) | |
| 137 | + |
| 138 | +For example: |
| 139 | +```yaml |
| 140 | +vectorstore: |
| 141 | + database: clickhouse |
| 142 | + |
| 143 | +clickhouse: |
| 144 | + host: localhost |
| 145 | + port: 8443 |
| 146 | + username: admin |
| 147 | + password: <PASSWORD> |
| 148 | + database: embeddings |
| 149 | + secure: false |
| 150 | +``` |
| 151 | +
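The way the documented defaults combine with an override block like the one above can be sketched in Python. This is only an illustration: `ClickHouseSettings` and `to_connection_kwargs` are hypothetical names, not PrivateGPT's actual classes. The defaults are copied from the options table, the password value is a placeholder, and nothing here opens a connection.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class ClickHouseSettings:
    """Hypothetical mirror of the `clickhouse` block in settings.yaml."""

    # Fields with documented defaults
    host: str = "localhost"
    port: int = 8123
    username: str = "default"
    database: str = "__default__"
    secure: bool = False
    # Optional fields: None means "not set"
    password: Optional[str] = None
    interface: Optional[str] = None
    settings: Optional[Dict[str, Any]] = None
    connect_timeout: Optional[int] = None
    send_receive_timeout: Optional[int] = None
    verify: Optional[bool] = None
    ca_cert: Optional[str] = None
    client_cert: Optional[str] = None
    client_cert_key: Optional[str] = None
    http_proxy: Optional[str] = None
    https_proxy: Optional[str] = None
    server_host_name: Optional[str] = None


def to_connection_kwargs(cfg: ClickHouseSettings) -> Dict[str, Any]:
    """Keep only the options that are set; unset optional fields are dropped."""
    return {key: value for key, value in asdict(cfg).items() if value is not None}


# Same overrides as the YAML example; "secret" is a placeholder password.
example = ClickHouseSettings(
    port=8443, username="admin", password="secret", database="embeddings"
)
kwargs = to_connection_kwargs(example)
```

Dropping unset fields rather than passing `None` leaves the choice of fallback value to whatever client consumes these options, which is an assumption about how such a client would behave.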
|
| 152 | +The following table will be created in the database: |
| 153 | +``` |
| 154 | +clickhouse-client |
| 155 | +:) \d embeddings.llama_index |
| 156 | + Table "llama_index" |
| 157 | + № | name | type | default_type | default_expression | comment | codec_expression | ttl_expression |
| 158 | +----|-----------|----------------------------------------------|--------------|--------------------|---------|------------------|--------------- |
| 159 | + 1 | id | String | | | | | |
| 160 | + 2 | doc_id | String | | | | | |
| 161 | + 3 | text | String | | | | | |
| 162 | + 4 | vector | Array(Float32) | | | | | |
| 163 | + 5 | node_info | Tuple(start Nullable(UInt64), end Nullable(UInt64)) | | | | | |
| 164 | + 6 | metadata | String | | | | | |
| 165 | + |
| 166 | +clickhouse-client |
| 167 | +``` |
| 168 | + |
| 169 | +The dimensions of the embeddings columns will be set based on the `embedding.embed_dim` value. If the embedding model changes, this table may need to be dropped and recreated to avoid a dimension mismatch. |
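A dimension mismatch of this kind could be caught before insertion with a small guard. The sketch below is an assumption-laden illustration: `check_embedding_dim` is a hypothetical helper, not part of PrivateGPT, and `embed_dim` stands in for the configured `embedding.embed_dim` value.

```python
from typing import Sequence


def check_embedding_dim(vector: Sequence[float], embed_dim: int) -> None:
    """Raise ValueError if the vector length differs from the configured dimension."""
    if len(vector) != embed_dim:
        raise ValueError(
            f"embedding has {len(vector)} dimensions, expected {embed_dim}; "
            "the table may need to be dropped and recreated"
        )


# A vector of the expected length passes without raising.
check_embedding_dim([0.1, 0.2, 0.3], embed_dim=3)
```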