Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error: could not convert string to float when executing SELECT or ADD query with cosine_distance #114

Closed
SharkSyl opened this issue Jan 23, 2025 · 2 comments

Comments

@SharkSyl
Copy link

When running a SELECT or ADD query with cosine_distance on the embedding column, I encountered the following error:

(sqlalchemy.dialects.postgresql.asyncpg.Error) <class 'asyncpg.exceptions.DataError'>: invalid input for query argument $2: '[1.0,3.0,4.0]' (could not convert string to float: '[1.0,3.0,4.0]')
[SQL: INSERT INTO orm_embedding_3 (id, embedding) VALUES ($1::BIGINT, $2) RETURNING orm_embedding_3.created, orm_embedding_3.modified]
[parameters: (99, '[1.0,3.0,4.0]')]
(Background on this error at: https://sqlalche.me/e/20/dbapi)

The complete information is as follows:

2025-01-23 11:37:24,348 INFO sqlalchemy.engine.Engine select pg_catalog.version()
2025-01-23 11:37:24,348 INFO sqlalchemy.engine.Engine [raw sql] ()
2025-01-23 11:37:24,351 INFO sqlalchemy.engine.Engine select current_schema()
2025-01-23 11:37:24,351 INFO sqlalchemy.engine.Engine [raw sql] ()
2025-01-23 11:37:24,368 INFO sqlalchemy.engine.Engine show standard_conforming_strings
2025-01-23 11:37:24,368 INFO sqlalchemy.engine.Engine [raw sql] ()
Init PGVector
2025-01-23 11:37:24,447 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-01-23 11:37:24,449 INFO sqlalchemy.engine.Engine SELECT pg_catalog.pg_class.relname 
FROM pg_catalog.pg_class JOIN pg_catalog.pg_namespace ON pg_catalog.pg_namespace.oid = pg_catalog.pg_class.relnamespace 
WHERE pg_catalog.pg_class.relname = $1::VARCHAR AND pg_catalog.pg_class.relkind = ANY (ARRAY[$2::VARCHAR, $3::VARCHAR, $4::VARCHAR, $5::VARCHAR, $6::VARCHAR]) AND pg_catalog.pg_table_is_visible(pg_catalog.pg_class.oid) AND pg_catalog.pg_namespace.nspname != $7::VARCHAR
2025-01-23 11:37:24,449 INFO sqlalchemy.engine.Engine [generated in 0.00020s] ('orm_embedding_3', 'r', 'p', 'f', 'v', 'm', 'pg_catalog')
2025-01-23 11:37:24,453 INFO sqlalchemy.engine.Engine COMMIT
2025-01-23 11:37:24,455 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-01-23 11:37:24,455 INFO sqlalchemy.engine.Engine SELECT pg_catalog.pg_class.relname 
FROM pg_catalog.pg_class JOIN pg_catalog.pg_namespace ON pg_catalog.pg_namespace.oid = pg_catalog.pg_class.relnamespace 
WHERE pg_catalog.pg_class.relname = $1::VARCHAR AND pg_catalog.pg_class.relkind = ANY (ARRAY[$2::VARCHAR, $3::VARCHAR, $4::VARCHAR, $5::VARCHAR, $6::VARCHAR]) AND pg_catalog.pg_table_is_visible(pg_catalog.pg_class.oid) AND pg_catalog.pg_namespace.nspname != $7::VARCHAR
2025-01-23 11:37:24,455 INFO sqlalchemy.engine.Engine [cached since 0.006365s ago] ('orm_embedding_3', 'r', 'p', 'f', 'v', 'm', 'pg_catalog')
2025-01-23 11:37:24,456 INFO sqlalchemy.engine.Engine COMMIT
2025-01-23 11:37:24,458 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-01-23 11:37:24,459 INFO sqlalchemy.engine.Engine INSERT INTO orm_embedding_3 (id, embedding) VALUES ($1::BIGINT, $2) RETURNING orm_embedding_3.created, orm_embedding_3.modified
2025-01-23 11:37:24,459 INFO sqlalchemy.engine.Engine [generated in 0.00012s] (99, '[1.0,3.0,4.0]')
2025-01-23 11:37:24,470 INFO sqlalchemy.engine.Engine ROLLBACK
orm add (sqlalchemy.dialects.postgresql.asyncpg.Error) <class 'asyncpg.exceptions.DataError'>: invalid input for query argument $2: '[1.0,3.0,4.0]' (could not convert string to float: '[1.0,3.0,4.0]')
[SQL: INSERT INTO orm_embedding_3 (id, embedding) VALUES ($1::BIGINT, $2) RETURNING orm_embedding_3.created, orm_embedding_3.modified]
[parameters: (99, '[1.0,3.0,4.0]')]
(Background on this error at: https://sqlalche.me/e/20/dbapi)
2025-01-23 11:37:24,472 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-01-23 11:37:24,473 INFO sqlalchemy.engine.Engine SELECT orm_embedding_3.id, orm_embedding_3.embedding <=> $1 AS score 
FROM orm_embedding_3 ORDER BY score
2025-01-23 11:37:24,473 INFO sqlalchemy.engine.Engine [generated in 0.00011s] ('[1.0,3.0,4.0]',)
2025-01-23 11:37:24,474 INFO sqlalchemy.engine.Engine ROLLBACK
orm insert (sqlalchemy.dialects.postgresql.asyncpg.Error) <class 'asyncpg.exceptions.DataError'>: invalid input for query argument $1: '[1.0,3.0,4.0]' (could not convert string to float: '[1.0,3.0,4.0]')
[SQL: SELECT orm_embedding_3.id, orm_embedding_3.embedding <=> $1 AS score 
FROM orm_embedding_3 ORDER BY score]
[parameters: ('[1.0,3.0,4.0]',)]
(Background on this error at: https://sqlalche.me/e/20/dbapi)

My entire code is as follows:

# -*- coding:utf-8 -*-
from sqlalchemy import URL, PrimaryKeyConstraint, Index, BigInteger, TIMESTAMP, func, select, insert, text, event
from sqlalchemy.orm import Mapped, mapped_column
from sqlalchemy.ext.asyncio import create_async_engine, async_sessionmaker
from sqlalchemy.orm import declarative_base
from pgvector.sqlalchemy.vector import VECTOR

url = URL.create(
    drivername='postgresql+asyncpg',
    username='postgres',
    password='vector123',
    host='localhost',
    port=5433,
    database='vector_demo',
)

async_engine_project = create_async_engine(url, echo=True, future=True, pool_pre_ping=True)
async_session_target = async_sessionmaker(bind=async_engine_project, autoflush=False, expire_on_commit=False)


@event.listens_for(async_engine_project.sync_engine, "connect")
def aconnect(dbapi_connection, connection_record):
    from pgvector.asyncpg import register_vector
    print('Init PGVector')
    dbapi_connection.run_async(register_vector)


Base = declarative_base()


class ORMEmbeddingTest(Base):
    table_name = "orm_embedding_3"
    __tablename__ = table_name
    __table_args__ = (
        PrimaryKeyConstraint('id', name=f'{table_name}_pkey'),
        Index(
            f'{table_name}_embedding_index',
            'embedding',
            postgresql_using='hnsw',
            postgresql_with={'m': 16, 'ef_construction': 64},
            postgresql_ops={'embedding': 'vector_cosine_ops'}
        )
    )
    id: Mapped[int] = mapped_column(BigInteger, primary_key=True, autoincrement=True)
    embedding: Mapped[list] = mapped_column(VECTOR(3), nullable=False)
    created: Mapped[str] = mapped_column(TIMESTAMP, server_default=func.now(), nullable=False)
    modified: Mapped[str] = mapped_column(TIMESTAMP, server_default=func.now(), onupdate=func.now(), nullable=False)


async def create_table():
    """创建数据库表"""
    async with async_engine_project.begin() as coon:
        await coon.run_sync(
            Base.metadata.create_all,
            tables=[Base.metadata.tables[ORMEmbeddingTest.__tablename__]]
        )


async def drop_table():
    """创建数据库表"""
    async with async_engine_project.begin() as coon:
        await coon.run_sync(
            Base.metadata.create_all,
            tables=[Base.metadata.tables[ORMEmbeddingTest.__tablename__]]
        )


async def run_test():
    await drop_table()
    await create_table()

    query_embedding = [1, 3, 4]
    db_obj = ORMEmbeddingTest(
        id=99,
        embedding=query_embedding
    )

    try:
        async with async_session_target() as db:
            db.add(db_obj)
            await db.commit()
    except Exception as e:
        print('orm add', e)

    try:
        async with async_session_target() as db:
            select_test = await db.scalars(
                select(
                    ORMEmbeddingTest.id,
                    ORMEmbeddingTest.embedding.cosine_distance(query_embedding).label('score')
                ).order_by(text('score'))
            )
            print(select_test)

    except Exception as e:
        print('orm insert', e)


if __name__ == '__main__':
    import asyncio

    asyncio.run(run_test())

The traceback:

2025-01-23 11:48:01,195 INFO sqlalchemy.engine.Engine select pg_catalog.version()
2025-01-23 11:48:01,195 INFO sqlalchemy.engine.Engine [raw sql] ()
2025-01-23 11:48:01,198 INFO sqlalchemy.engine.Engine select current_schema()
2025-01-23 11:48:01,198 INFO sqlalchemy.engine.Engine [raw sql] ()
2025-01-23 11:48:01,200 INFO sqlalchemy.engine.Engine show standard_conforming_strings
2025-01-23 11:48:01,200 INFO sqlalchemy.engine.Engine [raw sql] ()
Init PGVector
2025-01-23 11:48:01,456 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-01-23 11:48:01,458 INFO sqlalchemy.engine.Engine SELECT pg_catalog.pg_class.relname 
FROM pg_catalog.pg_class JOIN pg_catalog.pg_namespace ON pg_catalog.pg_namespace.oid = pg_catalog.pg_class.relnamespace 
WHERE pg_catalog.pg_class.relname = $1::VARCHAR AND pg_catalog.pg_class.relkind = ANY (ARRAY[$2::VARCHAR, $3::VARCHAR, $4::VARCHAR, $5::VARCHAR, $6::VARCHAR]) AND pg_catalog.pg_table_is_visible(pg_catalog.pg_class.oid) AND pg_catalog.pg_namespace.nspname != $7::VARCHAR
2025-01-23 11:48:01,458 INFO sqlalchemy.engine.Engine [generated in 0.00016s] ('orm_embedding_3', 'r', 'p', 'f', 'v', 'm', 'pg_catalog')
2025-01-23 11:48:01,461 INFO sqlalchemy.engine.Engine COMMIT
2025-01-23 11:48:01,464 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-01-23 11:48:01,464 INFO sqlalchemy.engine.Engine SELECT pg_catalog.pg_class.relname 
FROM pg_catalog.pg_class JOIN pg_catalog.pg_namespace ON pg_catalog.pg_namespace.oid = pg_catalog.pg_class.relnamespace 
WHERE pg_catalog.pg_class.relname = $1::VARCHAR AND pg_catalog.pg_class.relkind = ANY (ARRAY[$2::VARCHAR, $3::VARCHAR, $4::VARCHAR, $5::VARCHAR, $6::VARCHAR]) AND pg_catalog.pg_table_is_visible(pg_catalog.pg_class.oid) AND pg_catalog.pg_namespace.nspname != $7::VARCHAR
2025-01-23 11:48:01,464 INFO sqlalchemy.engine.Engine [cached since 0.006155s ago] ('orm_embedding_3', 'r', 'p', 'f', 'v', 'm', 'pg_catalog')
2025-01-23 11:48:01,465 INFO sqlalchemy.engine.Engine COMMIT
2025-01-23 11:48:01,467 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2025-01-23 11:48:01,468 INFO sqlalchemy.engine.Engine INSERT INTO orm_embedding_3 (id, embedding) VALUES ($1::BIGINT, $2) RETURNING orm_embedding_3.created, orm_embedding_3.modified
2025-01-23 11:48:01,468 INFO sqlalchemy.engine.Engine [generated in 0.00013s] (99, '[1.0,3.0,4.0]')
2025-01-23 11:48:01,477 INFO sqlalchemy.engine.Engine ROLLBACK
Traceback (most recent call last):
  File "asyncpg\\protocol\\prepared_stmt.pyx", line 175, in asyncpg.protocol.protocol.PreparedStatementState._encode_bind_msg
  File "asyncpg\\protocol\\codecs/base.pyx", line 227, in asyncpg.protocol.protocol.Codec.encode
  File "asyncpg\\protocol\\codecs/base.pyx", line 207, in asyncpg.protocol.protocol.Codec.encode_in_python
  File "D:\01-work\pg_test\.venv\Lib\site-packages\pgvector\utils\vector.py", line 62, in _to_db_binary
    value = cls(value)
            ^^^^^^^^^^
  File "D:\01-work\pg_test\.venv\Lib\site-packages\pgvector\utils\vector.py", line 9, in __init__
    value = np.asarray(value, dtype='>f4')
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: could not convert string to float: '[1.0,3.0,4.0]'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\dialects\postgresql\asyncpg.py", line 545, in _prepare_and_execute
    self._rows = deque(await prepared_stmt.fetch(*parameters))
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\01-work\pg_test\.venv\Lib\site-packages\asyncpg\prepared_stmt.py", line 176, in fetch
    data = await self.__bind_execute(args, 0, timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\01-work\pg_test\.venv\Lib\site-packages\asyncpg\prepared_stmt.py", line 267, in __bind_execute
    data, status, _ = await self.__do_execute(
                      ^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\01-work\pg_test\.venv\Lib\site-packages\asyncpg\prepared_stmt.py", line 256, in __do_execute
    return await executor(protocol)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "asyncpg\\protocol\\protocol.pyx", line 185, in bind_execute
  File "asyncpg\\protocol\\prepared_stmt.pyx", line 204, in asyncpg.protocol.protocol.PreparedStatementState._encode_bind_msg
asyncpg.exceptions.DataError: invalid input for query argument $2: '[1.0,3.0,4.0]' (could not convert string to float: '[1.0,3.0,4.0]')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\engine\base.py", line 1964, in _exec_single_context
    self.dialect.do_execute(
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\engine\default.py", line 942, in do_execute
    cursor.execute(statement, parameters)
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\dialects\postgresql\asyncpg.py", line 580, in execute
    self._adapt_connection.await_(
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\util\_concurrency_py3k.py", line 132, in await_only
    return current.parent.switch(awaitable)  # type: ignore[no-any-return,attr-defined] # noqa: E501
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\util\_concurrency_py3k.py", line 196, in greenlet_spawn
    value = await result
            ^^^^^^^^^^^^
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\dialects\postgresql\asyncpg.py", line 558, in _prepare_and_execute
    self._handle_exception(error)
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\dialects\postgresql\asyncpg.py", line 508, in _handle_exception
    self._adapt_connection._handle_exception(error)
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\dialects\postgresql\asyncpg.py", line 792, in _handle_exception
    raise translated_error from error
sqlalchemy.dialects.postgresql.asyncpg.AsyncAdapt_asyncpg_dbapi.Error: <class 'asyncpg.exceptions.DataError'>: invalid input for query argument $2: '[1.0,3.0,4.0]' (could not convert string to float: '[1.0,3.0,4.0]')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\01-work\pg_test\laboratory\lab_pg_vector\examples\openai\orm.py", line 99, in <module>
    asyncio.run(run_test())
  File "D:\01-work\02-CodeEngine\initial\Python311-9\Lib\asyncio\runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "D:\01-work\02-CodeEngine\initial\Python311-9\Lib\asyncio\runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\01-work\02-CodeEngine\initial\Python311-9\Lib\asyncio\base_events.py", line 654, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "D:\01-work\pg_test\laboratory\lab_pg_vector\examples\openai\orm.py", line 80, in run_test
    await db.commit()
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\ext\asyncio\session.py", line 1011, in commit
    await greenlet_spawn(self.sync_session.commit)
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\util\_concurrency_py3k.py", line 203, in greenlet_spawn
    result = context.switch(value)
             ^^^^^^^^^^^^^^^^^^^^^
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\orm\session.py", line 2032, in commit
    trans.commit(_to_root=True)
  File "<string>", line 2, in commit
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\orm\state_changes.py", line 139, in _go
    ret_value = fn(self, *arg, **kw)
                ^^^^^^^^^^^^^^^^^^^^
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\orm\session.py", line 1313, in commit
    self._prepare_impl()
  File "<string>", line 2, in _prepare_impl
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\orm\state_changes.py", line 139, in _go
    ret_value = fn(self, *arg, **kw)
                ^^^^^^^^^^^^^^^^^^^^
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\orm\session.py", line 1288, in _prepare_impl
    self.session.flush()
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\orm\session.py", line 4353, in flush
    self._flush(objects)
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\orm\session.py", line 4488, in _flush
    with util.safe_reraise():
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\util\langhelpers.py", line 146, in __exit__
    raise exc_value.with_traceback(exc_tb)
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\orm\session.py", line 4449, in _flush
    flush_context.execute()
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\orm\unitofwork.py", line 466, in execute
    rec.execute(self)
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\orm\unitofwork.py", line 642, in execute
    util.preloaded.orm_persistence.save_obj(
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\orm\persistence.py", line 93, in save_obj
    _emit_insert_statements(
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\orm\persistence.py", line 1233, in _emit_insert_statements
    result = connection.execute(
             ^^^^^^^^^^^^^^^^^^^
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\engine\base.py", line 1416, in execute
    return meth(
           ^^^^^
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\sql\elements.py", line 515, in _execute_on_connection
    return connection._execute_clauseelement(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\engine\base.py", line 1638, in _execute_clauseelement
    ret = self._execute_context(
          ^^^^^^^^^^^^^^^^^^^^^^
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\engine\base.py", line 1843, in _execute_context
    return self._exec_single_context(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\engine\base.py", line 1983, in _exec_single_context
    self._handle_dbapi_exception(
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\engine\base.py", line 2352, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\engine\base.py", line 1964, in _exec_single_context
    self.dialect.do_execute(
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\engine\default.py", line 942, in do_execute
    cursor.execute(statement, parameters)
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\dialects\postgresql\asyncpg.py", line 580, in execute
    self._adapt_connection.await_(
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\util\_concurrency_py3k.py", line 132, in await_only
    return current.parent.switch(awaitable)  # type: ignore[no-any-return,attr-defined] # noqa: E501
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\util\_concurrency_py3k.py", line 196, in greenlet_spawn
    value = await result
            ^^^^^^^^^^^^
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\dialects\postgresql\asyncpg.py", line 558, in _prepare_and_execute
    self._handle_exception(error)
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\dialects\postgresql\asyncpg.py", line 508, in _handle_exception
    self._adapt_connection._handle_exception(error)
  File "D:\01-work\pg_test\.venv\Lib\site-packages\sqlalchemy\dialects\postgresql\asyncpg.py", line 792, in _handle_exception
    raise translated_error from error
sqlalchemy.exc.DBAPIError: (sqlalchemy.dialects.postgresql.asyncpg.Error) <class 'asyncpg.exceptions.DataError'>: invalid input for query argument $2: '[1.0,3.0,4.0]' (could not convert string to float: '[1.0,3.0,4.0]')
[SQL: INSERT INTO orm_embedding_3 (id, embedding) VALUES ($1::BIGINT, $2) RETURNING orm_embedding_3.created, orm_embedding_3.modified]
[parameters: (99, '[1.0,3.0,4.0]')]
(Background on this error at: https://sqlalche.me/e/20/dbapi)
@SharkSyl SharkSyl changed the title Error: could not convert string to float when executing SELECT query with cosine_distance Error: could not convert string to float when executing SELECT or ADD query with cosine_distance Jan 23, 2025
@SharkSyl
Copy link
Author

I changed my code from asynchronous to synchronous, and it works normally. Why is this? Does it temporarily not support this kind of asynchronous operation?
I tried to integrate using asynchronous methods in FastAPI.

Dependency versions:

 name         : sqlalchemy                   
 version      : 2.0.37                       
 description  : Database Abstraction Library 

dependencies
 - greenlet !=0.4.17
 - typing-extensions >=4.6.0

required by
 - alembic requires >=1.3.0


 name         : asyncpg
 version      : 0.30.0
 description  : An asyncio PostgreSQL driver 

The entire code is as follows:

# -*- coding:utf-8 -*-
from sqlalchemy import URL, PrimaryKeyConstraint, Index, BigInteger, TIMESTAMP, func, select, insert, text, event
from sqlalchemy.orm import Mapped, mapped_column
from sqlalchemy import create_engine
from sqlalchemy.orm import declarative_base, sessionmaker
from pgvector.sqlalchemy.vector import VECTOR
from pgvector.utils.vector import Vector as VectorUtils

url = URL.create(
    drivername='postgresql',
    username='postgres',
    password='vector123',
    host='localhost',
    port=5433,
    database='vector_demo',
)

engine_project = create_engine(url, echo=True, future=True, pool_pre_ping=True)

session_target = sessionmaker(bind=engine_project, autoflush=False, expire_on_commit=False)


@event.listens_for(engine_project, "connect")
def connect(dbapi_connection, connection_record):
    from pgvector.psycopg2 import register_vector
    print('Init PGVector')
    register_vector(dbapi_connection, globally=False, arrays=True)


Base = declarative_base()


class ORMEmbeddingTest(Base):
    table_name = "orm_embedding_3"
    __tablename__ = table_name
    __table_args__ = (
        PrimaryKeyConstraint('id', name=f'{table_name}_pkey'),
        Index(
            f'{table_name}_embedding_index',
            'embedding',
            postgresql_using='hnsw',
            postgresql_with={'m': 16, 'ef_construction': 64},
            postgresql_ops={'embedding': 'vector_cosine_ops'}
        )
    )
    id: Mapped[int] = mapped_column(BigInteger, primary_key=True, autoincrement=True)
    embedding: Mapped[list] = mapped_column(VECTOR(3), nullable=False)
    created: Mapped[str] = mapped_column(TIMESTAMP, server_default=func.now(), nullable=False)
    modified: Mapped[str] = mapped_column(TIMESTAMP, server_default=func.now(), onupdate=func.now(), nullable=False)


def create_table():
    """创建数据库表"""
    Base.metadata.create_all(
        engine_project,
        tables=[Base.metadata.tables[ORMEmbeddingTest.__tablename__]]
    )


def drop_table():
    """创建数据库表"""
    Base.metadata.drop_all(
        engine_project,
        tables=[Base.metadata.tables[ORMEmbeddingTest.__tablename__]]
    )


def run_test():
    drop_table()
    create_table()

    query_embedding = [1, 3, 4]
    db_obj = ORMEmbeddingTest(
        id=99,
        embedding=query_embedding
    )

    with session_target() as db:
        db.add(db_obj)
        db.commit()

    try:
        with session_target() as db:
            select_test = db.execute(
                select(
                    ORMEmbeddingTest.id,
                    ORMEmbeddingTest.embedding.cosine_distance(query_embedding).label('score')
                ).order_by(text('score'))
            )
            records = select_test.fetchall()
            for record in records:
                print(record.id, record.score)

    except Exception as e:
        print(f'orm insert:{e}')


if __name__ == '__main__':
    run_test()

@ankane
Copy link
Member

ankane commented Feb 9, 2025

Hi @SharkSyl, it'll work if you remove the register_vector call, which isn't needed with SQLAlchemy. Will see if there's a good way to make it not error in this situation in 0.4.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants