Raise diskann maximum dimension from 2K to 16K (#181)
This PR fixes #100 and raises the dimension limit for pgvectorscale's
diskann index from 2000 to 16000, which is the maximum supported by the
underlying pgvector `vector` type.
The previous limit of 2000 was needed to ensure that all data structures
could be serialized onto single 8K pages. When going beyond 2000
dimensions, so long as SBQ is used for storage, quantized vectors,
neighbor lists, and other data structures will still fit on a single
page; the only thing that grows too large is `SbqMeans`. (The raw
vectors used for reranking remain in the source relation, where standard
Postgres TOAST machinery is used to read/write them.) If plain storage
is used, the old limit of 2000 remains in place.
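
The size argument above can be checked with some rough arithmetic. The sketch below is illustrative only: it assumes 4-byte floats for plain vectors, one bit per dimension for SBQ-quantized vectors, and one 4-byte float per dimension for `SbqMeans`, and it ignores page headers and per-item overhead. All function names are hypothetical and are not taken from the pgvectorscale source.

```rust
/// Postgres default block size in bytes (the "8K page" referred to above).
const PAGE_SIZE: usize = 8192;

/// Bytes for a full-precision vector, assuming 4-byte floats (assumption).
fn plain_vector_bytes(dims: usize) -> usize {
    dims * 4
}

/// Bytes for a quantized vector at `bits_per_dim` bits per dimension,
/// rounded up to whole bytes. The bit width is an assumption.
fn sbq_vector_bytes(dims: usize, bits_per_dim: usize) -> usize {
    (dims * bits_per_dim + 7) / 8
}

/// Bytes for per-dimension means, assuming one 4-byte float per dimension
/// (assumption; the real structure may carry more state).
fn sbq_means_bytes(dims: usize) -> usize {
    dims * 4
}

/// Whether a payload fits on one page, ignoring headers and item overhead.
fn fits_on_page(bytes: usize) -> bool {
    bytes <= PAGE_SIZE
}
```

Under these assumptions, `plain_vector_bytes(2000)` is 8000 bytes and just fits, `sbq_vector_bytes(16000, 1)` is 2000 bytes and fits comfortably, while `sbq_means_bytes(16000)` is 64000 bytes and needs several pages.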
To deal with `SbqMeans`, we introduce a `ChainTape` data structure that
is similar to `Tape` but supports reads/writes of large buffers across
pages. The chained representation is considered a property of the
`PageType`, and we introduce a new `PageType` for `SbqMeans` along with
upgrade machinery from the old version. As with the versioned
`MetaPage`, there are no unit tests for this, but I did ad-hoc testing
to confirm that the upgrade path works.
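
To illustrate the general idea of reading and writing a large buffer across chained pages, here is a minimal in-memory sketch. It is not the `ChainTape` implementation from this PR: `ChainStore`, `Page`, and `PAGE_PAYLOAD` are hypothetical names, pages live in a `Vec` rather than in Postgres buffers, and the per-page payload size is an assumed value.

```rust
/// Hypothetical usable bytes per page after headers (assumed value).
const PAGE_PAYLOAD: usize = 8000;

/// One page: a chunk of the buffer plus a link to the next page, if any.
struct Page {
    data: Vec<u8>,
    next: Option<usize>,
}

/// In-memory stand-in for a relation made of fixed-size pages.
struct ChainStore {
    pages: Vec<Page>,
}

impl ChainStore {
    fn new() -> Self {
        ChainStore { pages: Vec::new() }
    }

    /// Writes `buf` across as many pages as needed and returns the index
    /// of the first page in the chain.
    fn write(&mut self, buf: &[u8]) -> usize {
        let first = self.pages.len();
        // An empty buffer still gets one empty page so `first` is valid.
        let chunks: Vec<&[u8]> = if buf.is_empty() {
            vec![buf]
        } else {
            buf.chunks(PAGE_PAYLOAD).collect()
        };
        let n = chunks.len();
        for (i, chunk) in chunks.into_iter().enumerate() {
            // Every page except the last links to the page written after it.
            let next = if i + 1 < n { Some(first + i + 1) } else { None };
            self.pages.push(Page { data: chunk.to_vec(), next });
        }
        first
    }

    /// Follows the chain starting at `first` and reassembles the buffer.
    fn read(&self, first: usize) -> Vec<u8> {
        let mut out = Vec::new();
        let mut cur = Some(first);
        while let Some(idx) = cur {
            let page = &self.pages[idx];
            out.extend_from_slice(&page.data);
            cur = page.next;
        }
        out
    }
}
```

With these assumed sizes, writing a 64000-byte buffer (roughly 16000 four-byte means) occupies eight chained pages, and `read` on the returned first-page index yields the original bytes. Only the first page's location has to be remembered by the caller.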