- Installation
- Getting Started
- Advanced Usage
- Shell Completion
- Hardware Acceleration
- For Developers
- LSP Mode
- MCP Server
After 0.1.3, the CLI supports Python 3.11~3.13. The recommended way of
installation is through `pipx`, which creates a virtual environment for the
package itself so that it doesn't interfere with your system Python or
project-local virtual environments.

After installing `pipx`, run the following in your shell:

```sh
pipx install vectorcode
```

To specify a particular version of Python, use the `--python` flag, for
example `pipx install vectorcode --python python3.11`. For hardware-accelerated
embedding, refer to the relevant section.
Note

The command only installs VectorCode and `SentenceTransformer`, the default
embedding engine. To use a different embedding function supported by Chromadb,
you may need to use `pipx inject` to install extra dependencies into the
virtual environment that `pipx` creates for VectorCode. This may include
OpenAI, Ollama and other self/cloud-hosted embedding model providers.
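For example, to make Chromadb's OpenAI embedding function usable, you could inject the `openai` client library (shown here as an illustration; check Chromadb's documentation for the exact dependency your chosen embedding function needs):

```sh
pipx inject vectorcode openai
```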
To install from source, either `git clone` this repository and run
`pipx install <path_to_vectorcode_repo>`, or use the git URL:

```sh
pipx install git+https://github.com/Davidyz/VectorCode
```
Chromadb is the vector database used by VectorCode to store and retrieve the code embeddings. Although it is already bundled with VectorCode and works out of the box, it is recommended to set up a standalone local server (the Chromadb documentation provides detailed instructions for Docker and systemd), because this will significantly reduce the IO overhead.

At the moment, Windows users need to install a standalone Chromadb server because I haven't figured out a way to reliably manage the bundled Chromadb instance on Windows. This should be straightforward if you have Docker installed. See the Chromadb documentation for details.
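For instance, one way to start a standalone server (a sketch, assuming the official `chromadb/chroma` Docker image and Chromadb's default port; see their documentation for persistence and systemd setups):

```sh
docker run -d -p 8000:8000 chromadb/chroma
```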
If your environment doesn't support `numpy` version 2.0+, the default
unconstrained `numpy` picked by `pipx` may not work for you. In this case, you
can try installing the package with `pipx install vectorcode[legacy]`, which
enforces `numpy` v1.x. If this doesn't help, please open an issue with your OS,
CPU architecture, Python version and the vectorcode virtual environment
(`pipx runpip vectorcode freeze`).
`cd` into your project root, and run:

```sh
vectorcode init
```

This will initialise the project for VectorCode and create a `.vectorcode`
directory in your project root. This is where you keep your configuration file
for VectorCode, if any.

After that, you can start vectorising files for the project:

```sh
vectorcode vectorise src/**/*.py
```
VectorCode doesn't track file changes, so you need to re-vectorise edited files. You may automate this with a git pre-commit hook or similar, as shown below.
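Here's a minimal sketch of such a hook (assuming every staged path is a file you want indexed; save it as `.git/hooks/pre-commit` and make it executable):

```sh
#!/bin/sh
# Re-vectorise the files staged for this commit.
git diff --cached --name-only --diff-filter=ACM | xargs vectorcode vectorise
```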
And now, you're ready to make queries that will retrieve the relevant documents:

```sh
vectorcode query reranker -n 3
```

This will try to find the 3 most relevant documents in the embedding database
that are related to the query `reranker`. You can pass multiple query words:

```sh
vectorcode query embedding reranking -n 3
```

or, if you want to query a sentence, wrap it in quotation marks:

```sh
vectorcode query "How to configure reranker model"
```

If things are going right, you'll see some paths being printed, followed by their content. These are the selected documents that are relevant to the query.
If you want to wipe the embeddings for the repository (to use a new embedding
function or after an upgrade with breaking changes), use:

```sh
vectorcode drop
```

To see a full list of CLI options and tricks to optimise the retrieval, keep
reading or use the `--help` flag.
Please try the following and see if any of these fix your issue:

- `drop` the collection and re-index it, because there may be changes in the
  way embeddings are stored in the database;
- upgrade/re-install the CLI (via `pipx` or however you installed VectorCode).
For each project, VectorCode creates a collection (similar to tables in
traditional databases) and puts the code embeddings in the corresponding
collection. In the root directory of a project, you may run `vectorcode init`.
This will initialise the repository with a subdirectory
`project_root/.vectorcode/` and mark this directory as a project root, a
concept that is later used to construct the collection. You may put a
`config.json` file in `project_root/.vectorcode`. This file may be used to
store project-specific settings such as embedding functions and the database
entry point (more on this later). If you already have a global configuration
file at `~/.config/vectorcode/config.json`, it will be copied to
`project_root/.vectorcode/config.json` when you run `vectorcode init`. When a
project-local config file is present, the global configuration file is ignored
to avoid confusion.

If you skip `vectorcode init`, VectorCode will look for a directory that
contains a `.git/` subdirectory and use it as the project root. In this case,
the default global configuration will be used. If `.git/` does not exist,
VectorCode falls back to using the current working directory as the project
root.
The JSON configuration file may hold the following values:

- `embedding_function`: string, one of the embedding functions supported by
  Chromadb (find more here and here). For example, Chromadb supports Ollama as
  `chromadb.utils.embedding_functions.OllamaEmbeddingFunction`, and the
  corresponding value for `embedding_function` would be
  `OllamaEmbeddingFunction`. Default: `SentenceTransformerEmbeddingFunction`;
- `embedding_params`: dictionary, stores whatever initialisation parameters
  your embedding function takes. For `OllamaEmbeddingFunction`, if you set
  `embedding_params` to:

  ```json
  {
    "url": "http://127.0.0.1:11434/api/embeddings",
    "model_name": "nomic-embed-text"
  }
  ```

  then the embedding function object will be initialised as
  `OllamaEmbeddingFunction(url="http://127.0.0.1:11434/api/embeddings", model_name="nomic-embed-text")`.
  Default: `{}`;
- `host` and `port`: string and integer, the Chromadb server host and port.
  VectorCode will start an HTTP server for Chromadb at a randomly picked free
  port on `localhost` if your configured `host:port` is not accessible. This
  allows the use of `AsyncHttpClient`. Default: `127.0.0.1:8000`;
- `db_path`: string, path to the local persistent database. This is where the
  files for your database will be stored. Default:
  `~/.local/share/vectorcode/chromadb/`;
- `chunk_size`: integer, the maximum number of characters per chunk. A larger
  value reduces the number of items in the database and hence accelerates the
  search, but at the cost of potentially truncated data and lost information.
  Default: `-1` (no chunking), but it's highly recommended to set it to a
  positive integer that works for your model when working with large documents;
- `overlap_ratio`: float between 0 and 1, the ratio of overlapping/shared
  content between 2 adjacent chunks. A larger ratio improves the coherence of
  chunks, but at the cost of an increasing number of entries in the database
  and hence a slower search. Default: `0.2`. Starting from 0.4.11, VectorCode
  will use treesitter to parse languages that it can automatically detect. It
  uses pygments to guess the language from the filename, and
  tree-sitter-language-pack to fetch the correct parser. `overlap_ratio` has no
  effect when treesitter works. If VectorCode fails to find an appropriate
  parser, it'll fall back to the legacy naive parser, in which case
  `overlap_ratio` works exactly as before;
- `query_multplier`: integer, when you use the `query` command to retrieve `n`
  documents, VectorCode will check `n * query_multplier` chunks and return at
  most `n` documents. A larger value of `query_multplier` guarantees the return
  of `n` documents, but with the risk of including too many less-relevant
  chunks that may affect the document selection. Default: `-1` (any negative
  value means selecting documents based on all indexed chunks);
- `reranker`: string, a reranking model supported by `CrossEncoder`. A list of
  available models is available in their documentation. The default is not to
  use a reranker, as it increases the time needed for each query;
- `reranker_params`: dictionary, similar to `embedding_params`: the options
  passed to the `CrossEncoder` class constructor;
- `db_settings`: dictionary, works in a similar way to `embedding_params`, but
  for Chromadb client settings, so that you can configure authentication for a
  remote Chromadb.
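Putting some of these together, a project-local `.vectorcode/config.json` might look like the following sketch (values other than the documented defaults, such as the `chunk_size` here, are purely illustrative):

```json
{
  "embedding_function": "OllamaEmbeddingFunction",
  "embedding_params": {
    "url": "http://127.0.0.1:11434/api/embeddings",
    "model_name": "nomic-embed-text"
  },
  "host": "127.0.0.1",
  "port": 8000,
  "chunk_size": 2500,
  "overlap_ratio": 0.2
}
```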
Run `vectorcode vectorise <path_to_your_file>` or
`vectorcode vectorise <directory> -r`. There are a few extra tweaks you may use:

- chunk size: embedding APIs may truncate long documents so that the documents
  can be handled by the embedding models. To mitigate this, VectorCode
  implements basic chunking, which splits the documents into smaller segments
  so that the embeddings are more representative of the code content. To adjust
  the chunk size when vectorising, you may either set the `chunk_size` option
  in the JSON configuration file, or use the `--chunk_size`/`-c` parameter of
  the `vectorise` command to specify the maximum number of characters per
  chunk;
- overlapping ratio: when the chunk size is set to $c$ and the overlapping
  ratio is set to $o$, the maximum amount of repeated content between 2
  adjacent chunks will be $c \times o$ characters. This prevents loss of
  information due to key content being split across 2 chunks. To configure
  this, you may either set `overlap_ratio` in the JSON configuration file or
  use the `--overlap`/`-o` parameter. An example combining both flags follows
  this list.
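For example, to recursively vectorise a directory with both tweaks set on the command line (the specific numbers are illustrative, not recommendations):

```sh
vectorcode vectorise src/ -r --chunk_size 2000 --overlap 0.2
```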
Note that the documents being vectorised are not limited to source code. You can even try documentation/README files, or files that are on the filesystem but not in the project directory (yes, I'm talking about Neovim Lua runtimes).
This command also respects `.gitignore`: by default, it skips files matched by
`.gitignore`. To override this, run the `vectorise` command with the
`-f`/`--force` flag.
As a shorthand, you can create a file at
`project_root/.vectorcode/vectorcode.include`. This file should follow the
same syntax as a `gitignore` file. Files matched by these specs will be
vectorised when you run `vectorcode vectorise` without specifying files. This
file has lower priority than `.gitignore`, but you can override that with the
`-f` flag. It also doesn't assume `--recursive`, so if you want to add a whole
directory to this file, use `dir/**`, which matches all content of `dir/`
recursively.
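An illustrative `vectorcode.include` (hypothetical paths, following the gitignore syntax described above):

```gitignore
src/**
docs/**/*.md
README.md
```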
There's also an `update` subcommand, which updates the embeddings for all the
indexed files and removes the embeddings for files that no longer exist.
To retrieve a list of documents from the database, you can use the following command:

```sh
vectorcode query "your query message"
```

The command can take an arbitrary number of query words, but make sure that full sentences are enclosed in quotation marks. Otherwise, they may be interpreted as separate words and the embeddings may be inaccurate. The returned results are sorted by their similarity to the query message.
You may also specify how many documents should be retrieved with the
`-n`/`--number` parameter (default is 1). This is the maximum number of
documents that may be returned. Depending on a number of factors, fewer
documents may actually be returned, but at least 1 document will be returned.
You may also set a multiplier for the queries. When VectorCode sends queries to
the database, it receives chunks, not documents. It then uses scoring
algorithms to determine which documents are the best fit. The multiplier, set
by the command-line flag `--multiplier` or `-m`, defines how many chunks
VectorCode will request from the database. The default is `-1`, which means
retrieving all chunks. A larger multiplier guarantees the return of `n`
documents, but with the risk of including too many less-relevant chunks that
may affect the document selection.
The `query` subcommand also supports customising the chunk size and overlapping
ratio, because when the query message is too long it might be necessary to
chunk it. The parameters follow the same syntax as in the `vectorise` command.
The CLI defaults to returning the paths of the documents relative to the
project root. To use absolute paths, add the `--absolute` flag.

If you wish to limit the output to "path only" or "document (content) only",
you can achieve this with the `--include` flag:

```sh
vectorcode query foo bar --include path
```

This will only include the `path` in the output. This works for both normal
CLI usage and `--pipe` mode.
You can use the `vectorcode ls` command to list all collections in your
Chromadb. This is useful if you want to check whether the collection has been
created for the current project or not. The output will be a table with 4
columns:

- Project Root: path to the directory that VectorCode vectorised;
- Collection Size: number of chunks in the database;
- Number of Files: number of files that have been indexed;
- Embedding Function: name of the embedding function used for this collection.
You can use the `vectorcode drop` command to remove a collection from Chromadb.
This is useful if you want to clean up your Chromadb database, or if the
project has been deleted and you don't need its embeddings any more.
You may run the `vectorcode check` command to check whether VectorCode is
properly installed and configured for your project. This currently supports
only 1 check:

- `config`: checks whether a project-local configuration directory exists. Prints the project root if successful, otherwise returns a non-zero exit code.

Running `vectorcode check config` is faster than running
`vectorcode query some_message` and then getting an empty result.
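This makes it handy as a guard in scripts, for example (a hypothetical one-liner; the query text is arbitrary):

```sh
vectorcode check config && vectorcode query "how is auth handled" -n 3
```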
VectorCode supports shell completion for bash/zsh/tcsh. You can use
`vectorcode -s {bash,zsh,tcsh}` or `vectorcode --print-completion {bash,zsh,tcsh}`
to print the completion script for your shell of choice.
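For example, one way to enable it for bash (assuming you load completions from your `~/.bashrc`):

```sh
# Add this line to ~/.bashrc:
eval "$(vectorcode --print-completion bash)"
```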
This section covers hardware acceleration when using sentence transformer as the embedding backend.

For Nvidia users, this should work out of the box. If not, try setting the following options in the JSON config file:

```json
{
  "embedding_params": {
    "backend": "torch",
    "device": "cuda"
  }
}
```
For Intel users, sentence transformer supports the `OpenVINO` backend on
supported GPUs. Run `pipx install vectorcode[intel]`, which will bundle the
relevant libraries when you install VectorCode. After that, you will need to
configure `SentenceTransformer` to use the `openvino` backend. In your
`config.json`, set the `backend` key in `embedding_params` to `"openvino"`:

```json
{
  "embedding_params": {
    "backend": "openvino"
  }
}
```

This will run the embedding model on your GPU. This is supported even on some integrated GPUs.
When using the default embedding function, any options inside
`embedding_params` will go to the class constructor of `SentenceTransformer`,
so you can always take a look at their documentation for detailed information,
regardless of your platform.
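As an illustration (a sketch; `device` and `trust_remote_code` are genuine `SentenceTransformer` constructor arguments, but whether you need them depends on your setup):

```json
{
  "embedding_params": {
    "device": "cpu",
    "trust_remote_code": false
  }
}
```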
To develop a tool that makes use of VectorCode, you may find the `--pipe`/`-p`
flag helpful. It formats the output as JSON and suppresses other output, so
that you can grab whatever's in `STDOUT` and parse it as a JSON document. In
fact, this is exactly what I did when I wrote the Neovim plugin.

For the query command, here's the format printed in `pipe` mode:

```json
[
  {
    "path": "path_to_your_code.py",
    "document": "import something"
  },
  {
    "path": "path_to_another_file.py",
    "document": "print('hello world')"
  }
]
```

Basically an array of dictionaries with 2 keys: `"path"` for the path to the
document, and `"document"` for the content of the document.
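This pairs naturally with standard JSON tooling. For instance, to extract just the paths (assuming `jq` is installed; the query text is arbitrary):

```sh
vectorcode query "database connection" -n 5 --pipe | jq -r '.[].path'
```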
For the `vectorise` command, the output is a JSON dictionary with the following fields:

- `"add"`: number of added documents;
- `"update"`: number of updated documents;
- `"removed"`: number of removed documents.
For the `ls` command, a JSON array of collection information in the following format will be printed:

```
{
  "project_root": str,
  "user": str,
  "hostname": str,
  "collection_name": str,
  "size": int,
  "num_files": int,
  "embedding_function": str
}
```

- `"project_root"`: the path to the project root;
- `"user"`: your *nix username, which is automatically added when vectorising to avoid collision;
- `"hostname"`: your *nix hostname. The purpose of this field is the same as that of `"user"`;
- `"collection_name"`: the unique identifier for the project used in the database;
- `"size"`: number of chunks stored in the database;
- `"num_files"`: number of files that have been vectorised in the project.
There's an experimental LSP implementation of the VectorCode CLI, which accepts
`workspace/executeCommand` requests over `STDIO`. This allows the CLI to keep
the embedding model loaded in memory/VRAM, and therefore speeds up queries by
avoiding the IO overhead of loading the models.
The experimental language server can be installed via the `lsp` dependency
group:

```sh
pipx install vectorcode[lsp]
# or if you have an existing `vectorcode` install:
pipx inject vectorcode vectorcode[lsp] --force
```
The LSP request for `workspace/executeCommand` is defined as follows:

```
{
  command: str,
  arguments: list[Any]
}
```

For the `vectorcode-server`, the only valid value for the `command` key is
`"vectorcode"`, and `arguments` holds the remaining components of a valid CLI
command. For example, to execute `vectorcode query -n 10 reranker`, the request
would be:

```
{
  command: "vectorcode",
  arguments: ["query", "-n", "10", "reranker"]
}
```
The `vectorcode-server` optionally accepts a `--project_root` parameter, which
specifies the default project root for this process. If not specified, it
will:

1. try to find a project root by root anchors (`.vectorcode` or `.git`),
   starting from the current working directory;
2. if 1 fails, but the first request contains a `--project_root` parameter,
   use that as the default project root for this process;
3. if 2 fails too, throw an error.
Note that:

- for easier parsing, `--pipe` is assumed to be enabled in LSP mode;
- at the moment, this only works with VectorCode setups that use a standalone
  ChromaDB server, which is not difficult to set up using Docker;
- at the moment, this only works with the `query` subcommand. I will consider
  adding support for other subcommands, but first I need to figure out how to
  properly manage `project_root` across different requests if they change.
Model Context Protocol (MCP) is an open protocol that standardises how
applications provide context to LLMs. VectorCode provides an experimental
implementation with the following features:

- list local collections;
- query from a given collection.

This hasn't been thoroughly tested, and the interfaces are subject to change.
To try it out, install the `vectorcode[mcp]` dependency group, and the MCP
server will be available in the shell as `vectorcode-mcp-server`. To play
around with it, make sure:

- you're using a standalone Chromadb server configured in the JSON via `host`
  and `port`;
- you start the MCP server from a VectorCode project root (with the
  `.vectorcode` directory and some files already vectorised).
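For instance, mirroring the LSP install pattern above (a sketch; adjust to however you installed VectorCode):

```sh
pipx install vectorcode[mcp]
# or inject into an existing install:
pipx inject vectorcode vectorcode[mcp] --force
vectorcode-mcp-server
```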