Commit d25d37a

ldoguin authored and sobychacko committed
GH-938: Add Couchbase vector store support
Fixes: #938
Issue link: #938

This commit integrates Couchbase as a vector store option in Spring AI, providing:

- CouchbaseSearchVectorStore implementation with vector similarity search capabilities
- Support for metadata filtering with SQL++ expression conversion
- Spring Boot auto-configuration and starter module for easy integration
- Comprehensive documentation covering setup, configuration, and usage examples
- Integration tests using TestContainers with Couchbase 7.6

The implementation supports configuring dimensions, similarity functions (dot_product/l2_norm), and optimization strategies (recall/latency). Schema initialization is now opt-in via the initializeSchema property.

Documentation includes both auto-configuration and manual configuration instructions, along with property configuration details.

Signed-off-by: Abhiraj <[email protected]>
co-authored-by: Laurent Doguin <[email protected]>
1 parent 6b25b62 commit d25d37a

File tree

22 files changed

+1777
-12
lines changed

Diff for: pom.xml

+3
Original file line numberDiff line numberDiff line change
@@ -56,6 +56,7 @@
5656
<module>vector-stores/spring-ai-cassandra-store</module>
5757
<module>vector-stores/spring-ai-chroma-store</module>
5858
<module>vector-stores/spring-ai-coherence-store</module>
59+
<module>vector-stores/spring-ai-couchbase-store</module>
5960
<module>vector-stores/spring-ai-elasticsearch-store</module>
6061
<module>vector-stores/spring-ai-gemfire-store</module>
6162
<module>vector-stores/spring-ai-hanadb-store</module>
@@ -78,6 +79,7 @@
7879
<module>spring-ai-spring-boot-starters/spring-ai-starter-cassandra-store</module>
7980
<module>spring-ai-spring-boot-starters/spring-ai-starter-chroma-store</module>
8081
<module>spring-ai-spring-boot-starters/spring-ai-starter-coherence-store</module>
82+
<module>spring-ai-spring-boot-starters/spring-ai-starter-couchbase-store</module>
8183
<module>spring-ai-spring-boot-starters/spring-ai-starter-elasticsearch-store</module>
8284
<module>spring-ai-spring-boot-starters/spring-ai-starter-gemfire-store</module>
8385
<module>spring-ai-spring-boot-starters/spring-ai-starter-hanadb-store</module>
@@ -235,6 +237,7 @@
235237
<mariadb.version>3.5.1</mariadb.version>
236238
<commonmark.version>0.22.0</commonmark.version>
237239

240+
<couchbase.version>3.7.8</couchbase.version>
238241

239242
<!-- testing dependencies -->
240243
<okhttp3.version>4.12.0</okhttp3.version>

Diff for: spring-ai-bom/pom.xml

+22-10
Original file line numberDiff line numberDiff line change
@@ -291,11 +291,17 @@
291291
<version>${project.version}</version>
292292
</dependency>
293293

294-
<dependency>
295-
<groupId>org.springframework.ai</groupId>
296-
<artifactId>spring-ai-opensearch-store</artifactId>
297-
<version>${project.version}</version>
298-
</dependency>
294+
<dependency>
295+
<groupId>org.springframework.ai</groupId>
296+
<artifactId>spring-ai-opensearch-store</artifactId>
297+
<version>${project.version}</version>
298+
</dependency>
299+
300+
<dependency>
301+
<groupId>org.springframework.ai</groupId>
302+
<artifactId>spring-ai-couchbase-store</artifactId>
303+
<version>${project.version}</version>
304+
</dependency>
299305

300306
<dependency>
301307
<groupId>org.springframework.ai</groupId>
@@ -599,11 +605,17 @@
599605
<version>${project.version}</version>
600606
</dependency>
601607

602-
<dependency>
603-
<groupId>org.springframework.ai</groupId>
604-
<artifactId>spring-ai-qianfan-spring-boot-starter</artifactId>
605-
<version>${project.version}</version>
606-
</dependency>
608+
<dependency>
609+
<groupId>org.springframework.ai</groupId>
610+
<artifactId>spring-ai-qianfan-spring-boot-starter</artifactId>
611+
<version>${project.version}</version>
612+
</dependency>
613+
614+
<dependency>
615+
<groupId>org.springframework.ai</groupId>
616+
<artifactId>spring-ai-couchbase-store-spring-boot-starter</artifactId>
617+
<version>${project.version}</version>
618+
</dependency>
607619

608620
<dependency>
609621
<groupId>org.springframework.ai</groupId>

Diff for: spring-ai-core/src/main/java/org/springframework/ai/observation/conventions/VectorStoreProvider.java

+4-1
Original file line numberDiff line numberDiff line change
@@ -51,7 +51,10 @@ public enum VectorStoreProvider {
5151
* Vector store provided by CosmosDB.
5252
*/
5353
COSMOSDB("cosmosdb"),
54-
54+
/**
55+
* Vector store provided by Couchbase.
56+
*/
57+
COUCHBASE("couchbase"),
5558
/**
5659
* Vector store provided by Elasticsearch.
5760
*/

Diff for: spring-ai-docs/src/main/antora/modules/ROOT/nav.adoc

+1
Original file line numberDiff line numberDiff line change
@@ -73,6 +73,7 @@
7373
** xref:api/vectordbs/azure-cosmos-db.adoc[]
7474
** xref:api/vectordbs/apache-cassandra.adoc[]
7575
** xref:api/vectordbs/chroma.adoc[]
76+
** xref:api/vectordbs/couchbase.adoc[]
7677
** xref:api/vectordbs/elasticsearch.adoc[]
7778
** xref:api/vectordbs/gemfire.adoc[GemFire]
7879
** xref:api/vectordbs/mariadb.adoc[]
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,249 @@
1+
= Couchbase
2+
3+
This section will walk you through setting up the `CouchbaseSearchVectorStore` to store document embeddings and perform similarity searches using Couchbase.
4+
5+
link:https://docs.couchbase.com/server/current/vector-search/vector-search.html[Couchbase] is a distributed, JSON document database, with all the desired capabilities of a relational DBMS. Among other features, it allows users to query information using vector-based storage and retrieval.
6+
7+
== Prerequisites
8+
9+
10+
A running Couchbase instance. The following options are available:
11+
12+
* link:https://hub.docker.com/_/couchbase/[Docker]
13+
* link:https://cloud.couchbase.com/[Capella - Couchbase as a Service]
14+
* link:https://www.couchbase.com/downloads/?family=couchbase-server[Install Couchbase locally]
15+
* link:https://www.couchbase.com/downloads/?family=open-source-kubernetes[Couchbase Kubernetes Operator]
16+
17+
== Auto-configuration
18+
19+
Spring AI provides Spring Boot auto-configuration for the Couchbase Vector Store.
20+
To enable it, add the following dependency to your project's Maven `pom.xml` file:
21+
22+
[source,xml]
23+
----
24+
<dependency>
25+
<groupId>org.springframework.ai</groupId>
26+
<artifactId>spring-ai-couchbase-store-spring-boot-starter</artifactId>
27+
</dependency>
28+
----
29+
30+
or to your Gradle `build.gradle` build file.
31+
32+
[source,groovy]
33+
----
34+
dependencies {
35+
implementation 'org.springframework.ai:spring-ai-couchbase-store-spring-boot-starter'
36+
}
37+
----
38+
NOTE: Couchbase vector search is only available starting with Couchbase Server 7.6 and Java SDK 3.6.0.
39+
40+
41+
TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build file.
42+
43+
TIP: Refer to the xref:getting-started.adoc#repositories[Repositories] section to add Milestone and/or Snapshot Repositories to your build file.
44+
45+
The vector store implementation can initialize the configured bucket, scope, collection and search index for you, with default options, but you must opt-in by specifying the `initializeSchema` boolean in the appropriate constructor.
46+
47+
NOTE: This is a breaking change! In earlier versions of Spring AI, this schema initialization happened by default.
48+
49+
Please have a look at the list of <<couchbasevector-properties,configuration parameters>> for the vector store to learn about the default values and configuration options.
50+
51+
Additionally, you will need a configured `EmbeddingModel` bean. Refer to the xref:api/embeddings.adoc#available-implementations[EmbeddingModel] section for more information.
52+
53+
54+
Now you can auto-wire the `CouchbaseSearchVectorStore` as a vector store in your application.
55+
56+
[source,java]
57+
----
58+
@Autowired VectorStore vectorStore;
59+
60+
// ...
61+
62+
List<Document> documents = List.of(
63+
new Document("Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!! Spring AI rocks!!", Map.of("meta1", "meta1")),
64+
new Document("The World is Big and Salvation Lurks Around the Corner"),
65+
new Document("You walk forward facing the past and you turn back toward the future.", Map.of("meta2", "meta2")));
66+
67+
// Add the documents to Couchbase
68+
vectorStore.add(documents);
69+
70+
// Retrieve documents similar to a query
71+
List<Document> results = vectorStore.similaritySearch(SearchRequest.query("Spring").withTopK(5));
72+
----
73+
74+
[[couchbasevector-properties]]
75+
=== Configuration Properties
76+
77+
To connect to Couchbase and use the `CouchbaseSearchVectorStore`, you need to provide access details for your instance.
78+
A simple configuration can either be provided via Spring Boot's `application.properties`,
79+
80+
[source,properties]
81+
----
82+
spring.ai.openai.api-key=<key>
83+
spring.couchbase.connection-string=<conn_string>
84+
spring.couchbase.username=<username>
85+
spring.couchbase.password=<password>
86+
----
87+
88+
environment variables,
89+
90+
[source,bash]
91+
----
92+
export SPRING_COUCHBASE_CONNECTION_STRING=<couchbase connection string like couchbase://localhost>
93+
export SPRING_COUCHBASE_USERNAME=<couchbase username>
94+
export SPRING_COUCHBASE_PASSWORD=<couchbase password>
95+
# API key if needed, e.g. OpenAI
96+
export SPRING_AI_OPENAI_API_KEY=<api-key>
97+
----
98+
99+
or can be a mix of those.
100+
For example, you can store your password as an environment variable but keep the rest in a plain `application.yml` file.
101+
102+
NOTE: If you choose to create a shell script for ease in future work, be sure to run it prior to starting your application by "sourcing" the file, i.e. `source <your_script_name>.sh`.
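The environment variable names shown above appear to follow a simple pattern: take the property name, upper-case it, and replace dots and dashes with underscores. The sketch below only reproduces that observed pattern; `EnvNameSketch` and `toEnvName` are hypothetical names used for illustration and are not part of Spring Boot. Spring Boot's documentation describes a canonical form that removes dashes rather than replacing them, so check the relaxed-binding rules for your Spring Boot version before relying on a particular spelling.

```java
import java.util.Locale;

class EnvNameSketch {

    // Upper-case the property name and turn '.' and '-' into '_'.
    // This mirrors the variable names used in this document (an observed
    // pattern, not an official Spring Boot API).
    static String toEnvName(String property) {
        return property.replace('.', '_').replace('-', '_').toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(toEnvName("spring.couchbase.password"));  // SPRING_COUCHBASE_PASSWORD
        System.out.println(toEnvName("spring.ai.openai.api-key"));   // SPRING_AI_OPENAI_API_KEY
    }
}
```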
103+
104+
Spring Boot's auto-configuration feature for the Couchbase Cluster will create a bean instance that will be used by the `CouchbaseSearchVectorStore`.
105+
106+
The Spring Boot properties starting with `spring.couchbase.*` are used to configure the Couchbase cluster instance:
107+
108+
|===
109+
|Property | Description | Default Value
110+
111+
| `spring.couchbase.connection-string` | A Couchbase connection string. | `couchbase://localhost`
112+
| `spring.couchbase.password` | Password for authentication with Couchbase. | -
113+
| `spring.couchbase.username` | Username for authentication with Couchbase.| -
114+
| `spring.couchbase.env.io.min-endpoints` | Minimum number of sockets per node.| 1
115+
| `spring.couchbase.env.io.max-endpoints` | Maximum number of sockets per node.| 12
116+
| `spring.couchbase.env.io.idle-http-connection-timeout` | Length of time an HTTP connection may remain idle before it is closed and removed from the pool.| 1s
117+
| `spring.couchbase.env.ssl.enabled` | Whether to enable SSL support. Enabled automatically if a "bundle" is provided unless specified otherwise.| -
118+
| `spring.couchbase.env.ssl.bundle` | SSL bundle name.| -
119+
| `spring.couchbase.env.timeouts.connect` | Bucket connect timeout.| 10s
120+
| `spring.couchbase.env.timeouts.disconnect` | Bucket disconnect timeout.| 10s
121+
| `spring.couchbase.env.timeouts.key-value` | Timeout for operations on a specific key-value.| 2500ms
123+
| `spring.couchbase.env.timeouts.key-value-durable` | Timeout for operations on a specific key-value with a durability level.| 10s
124+
| `spring.couchbase.env.timeouts.query` | SQL++ query operations timeout.| 75s
125+
| `spring.couchbase.env.timeouts.view` | Regular and geospatial view operations timeout.| 75s
126+
| `spring.couchbase.env.timeouts.search` | Timeout for the search service.| 75s
127+
| `spring.couchbase.env.timeouts.analytics` | Timeout for the analytics service.| 75s
128+
| `spring.couchbase.env.timeouts.management` | Timeout for the management operations.| 75s
129+
|===
130+
131+
Properties starting with the `spring.ai.vectorstore.couchbase.*` prefix are used to configure `CouchbaseSearchVectorStore`.
132+
133+
|===
134+
|Property | Description | Default Value
135+
136+
|`spring.ai.vectorstore.couchbase.index-name` | The name of the index to store the vectors. | spring-ai-document-index
137+
|`spring.ai.vectorstore.couchbase.bucket-name` | The name of the Couchbase Bucket, parent of the scope. | default
138+
|`spring.ai.vectorstore.couchbase.scope-name` |The name of the Couchbase scope, parent of the collection. Search queries will be executed in the scope context.| `_default`
139+
|`spring.ai.vectorstore.couchbase.collection-name` | The name of the Couchbase collection to store the Documents. | `_default`
140+
|`spring.ai.vectorstore.couchbase.dimensions` | The number of dimensions in the vector. | 1536
141+
|`spring.ai.vectorstore.couchbase.similarity` | The similarity function to use. | `dot_product`
142+
|`spring.ai.vectorstore.couchbase.optimization` | The index optimization strategy to use. | `recall`
143+
|`spring.ai.vectorstore.couchbase.initialize-schema`| Whether to initialize the required schema. | `false`
144+
|===
145+
146+
The following similarity functions are available:
147+
148+
* l2_norm
149+
* dot_product
150+
151+
The following index optimizations are available:
152+
153+
* recall
154+
* latency
155+
156+
See the https://docs.couchbase.com/server/current/search/child-field-options-reference.html[Couchbase Documentation] on vector searches for more details about each option.
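To make the two similarity options concrete, here is a minimal plain-Java sketch of the underlying math. It assumes `l2_norm` refers to the Euclidean distance between two vectors and `dot_product` to their inner product; how Couchbase scores and ranks results internally is not shown in this commit. `SimilaritySketch` and its methods are hypothetical names for illustration only.

```java
class SimilaritySketch {

    // Inner product of two vectors: larger values indicate more similar vectors.
    static double dotProduct(double[] a, double[] b) {
        requireSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Euclidean (L2) distance: smaller values indicate more similar vectors.
    static double l2Distance(double[] a, double[] b) {
        requireSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    private static void requireSameLength(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same number of dimensions");
        }
    }

    public static void main(String[] args) {
        double[] query = {1.0, 0.0};
        double[] doc = {3.0, 4.0};
        System.out.println(dotProduct(query, doc));                    // 3.0
        System.out.println(l2Distance(new double[] {0.0, 0.0}, doc));  // 5.0
    }
}
```

The length check reflects why the `dimensions` property must match the embedding model's output size: both functions are only defined for vectors of equal length.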
157+
158+
== Metadata Filtering
159+
160+
You can leverage the generic, portable link:https://docs.spring.io/spring-ai/reference/api/vectordbs.html#_metadata_filters[metadata filters] with the Couchbase store.
161+
162+
For example, you can use either the text expression language:
163+
164+
[source,java]
165+
----
166+
vectorStore.similaritySearch(
167+
SearchRequest.defaults()
168+
.query("The World")
169+
.topK(TOP_K)
170+
.filterExpression("author in ['john', 'jill'] && article_type == 'blog'"));
171+
----
172+
173+
or programmatically using the `Filter.Expression` DSL:
174+
175+
[source,java]
176+
----
177+
FilterExpressionBuilder b = new FilterExpressionBuilder();
178+
179+
vectorStore.similaritySearch(SearchRequest.defaults()
180+
.query("The World")
181+
.topK(TOP_K)
182+
.filterExpression(b.and(
183+
b.in("author","john", "jill"),
184+
b.eq("article_type", "blog")).build()));
185+
----
186+
187+
NOTE: These filter expressions are converted into the equivalent Couchbase SQL++ filters.
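As a rough illustration of what such a conversion involves, the sketch below walks a tiny expression tree and renders a SQL++-style predicate. It is not the converter added in this commit: the `Expr`, `Eq`, `In`, and `And` types, the `toSql` method, and the assumption that metadata lives under a `metadata` field are all hypothetical and chosen only for this example.

```java
import java.util.List;
import java.util.stream.Collectors;

class FilterToSqlSketch {

    // Hypothetical, minimal expression tree (not Spring AI's Filter.Expression types).
    interface Expr {}
    record Eq(String key, String value) implements Expr {}
    record In(String key, List<String> values) implements Expr {}
    record And(Expr left, Expr right) implements Expr {}

    // Recursively render the tree as a SQL++-style predicate string.
    static String toSql(Expr expr) {
        if (expr instanceof Eq eq) {
            return field(eq.key()) + " = " + quote(eq.value());
        }
        if (expr instanceof In in) {
            String items = in.values().stream()
                    .map(FilterToSqlSketch::quote)
                    .collect(Collectors.joining(", "));
            return field(in.key()) + " IN [" + items + "]";
        }
        if (expr instanceof And and) {
            return "(" + toSql(and.left()) + " AND " + toSql(and.right()) + ")";
        }
        throw new IllegalArgumentException("Unsupported expression: " + expr);
    }

    // Assumption: metadata keys are stored under a "metadata" field.
    static String field(String key) {
        return "metadata.`" + key + "`";
    }

    // Quote a string literal, doubling embedded single quotes.
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    public static void main(String[] args) {
        Expr filter = new And(
                new In("author", List.of("john", "jill")),
                new Eq("article_type", "blog"));
        // (metadata.`author` IN ['john', 'jill'] AND metadata.`article_type` = 'blog')
        System.out.println(toSql(filter));
    }
}
```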
188+
189+
190+
== Manual Configuration
191+
192+
Instead of using the Spring Boot auto-configuration, you can manually configure the Couchbase vector store. For this you need to add the `spring-ai-couchbase-store` to your project:
193+
194+
[source,xml]
195+
----
196+
<dependency>
197+
<groupId>org.springframework.ai</groupId>
198+
<artifactId>spring-ai-couchbase-store</artifactId>
199+
</dependency>
200+
----
201+
202+
or to your Gradle `build.gradle` build file.
203+
204+
[source,groovy]
205+
----
206+
dependencies {
207+
implementation 'org.springframework.ai:spring-ai-couchbase-store'
208+
}
209+
----
210+
211+
Create a Couchbase `Cluster` bean.
212+
Read the link:https://docs.couchbase.com/java-sdk/current/hello-world/start-using-sdk.html[Couchbase Documentation] for more in-depth information about the configuration of a custom Cluster instance.
213+
214+
[source,java]
215+
----
216+
@Bean
217+
public Cluster cluster() {
218+
return Cluster.connect("couchbase://localhost",
219+
"username", "password");
220+
}
221+
----
222+
223+
and then create the `CouchbaseSearchVectorStore` bean using the builder pattern:
224+
225+
[source,java]
226+
----
227+
@Bean
228+
public VectorStore couchbaseSearchVectorStore(Cluster cluster,
229+
EmbeddingModel embeddingModel,
230+
Boolean initializeSchema) {
231+
return CouchbaseSearchVectorStore
232+
.builder(cluster, embeddingModel)
233+
.bucketName("test")
234+
.scopeName("test")
235+
.collectionName("test")
236+
.initializeSchema(initializeSchema)
237+
.build();
238+
}
239+
240+
// This can be any EmbeddingModel implementation.
241+
@Bean
242+
public EmbeddingModel embeddingModel() {
243+
return new OpenAiEmbeddingModel(OpenAiApi.builder().apiKey(this.openaiKey).build());
244+
}
245+
----
246+
247+
== Limitations
248+
249+
NOTE: It is mandatory to have the following Couchbase services activated: Data, Query, Index, Search. While Data and Search could be enough, Query and Index are necessary to support the complete metadata filtering mechanism.

Diff for: spring-ai-spring-boot-autoconfigure/pom.xml

+14-1
Original file line numberDiff line numberDiff line change
@@ -427,6 +427,13 @@
427427
<version>${project.parent.version}</version>
428428
<optional>true</optional>
429429
</dependency>
430+
<!-- Couchbase Vector Search Store -->
431+
<dependency>
432+
<groupId>org.springframework.ai</groupId>
433+
<artifactId>spring-ai-couchbase-store</artifactId>
434+
<version>${project.parent.version}</version>
435+
<optional>true</optional>
436+
</dependency>
430437

431438
<!-- test dependencies -->
432439

@@ -606,6 +613,12 @@
606613
<scope>test</scope>
607614
</dependency>
608615

609-
</dependencies>
616+
<dependency>
617+
<groupId>org.testcontainers</groupId>
618+
<artifactId>couchbase</artifactId>
619+
<scope>test</scope>
620+
</dependency>
621+
622+
</dependencies>
610623

611624
</project>
