WL#16081 - Native Vector Embeddings Support In HeatWave
BUG#36165262 - WL#16081: Table is allowed to be partitioned on vector column
BUG#36167088 - WL#16081: Generated columns allowed on vector columns
BUG#36168511 - WL#16081: Issues with Vector column constraints
BUG#36195637 - Wl16081: Alter table not giving error when new dimension less than existing data
BUG#36194832 - WL#16081: STRING_TO_VECTOR function requires a vector type column
BUG#36168535 - [Wl16081] Select Hangs when VECTOR_TO_STRING called on vector column
BUG#36206068 - [Wl16081] : Read-Replica broken with Vector data type
BUG#36214076 - WL#16081: Rpdserver crash - sig11 at from_string_to_vector
BUG#36225693 - WL#16081: Functions SHA1, MD5, SHA2 return error for vector data loaded to rapid
BUG#36241312 - WL#16081: Error All plans were rejected by HeatWave secondary engine
BUG#36255628 - WL#16081:setting secondary_engine of table having vector column taking long time
BUG#36265079 - WL#16081: wrong result when IS NULL applied to distance() function output
BUG#36272178 - WL#16081:virtual bool Item_func_get_user_var::propagate_type(THD*, const Type_properties&): Assertion `false' failed.
BUG#36239717 - to_base64() on vector column of miracl dataset crashes rapid
BUG#36255777 - WL#16081: Mysqld crash - Assertion `!std::isnan(nr)' failed.
BUG#36281463 - STRING_TO_VECTOR() returning error "Data cannot be converted"
BUG#36285521 - WL#16081: Mysqld crash - Assertion `!thd->is_error()' failed
BUG#36287504 - WL#16081: mysqld crash at Item_func_to_vector::val_str for ASAN
BUG#36267410 - WL#16081: mysqld crash at ParseBlob () in change_prop/rpd_binlog_parser.cc
This worklog will implement vector support in MySQL HeatWave.
- VECTOR column type: This will be an addition to the CREATE TABLE
statement as a new column data_type. Under the hood, it is mainly a
syntactic change; the VECTOR columns will be a wrapper around BLOB.
All the limitations that apply to BLOB apply for VECTOR.
More restrictions for VECTOR will also be in place, discussed below.
- DISTANCE function: This will be implemented as a component/UDF.
The function is responsible for computing the distance between two
VECTOR entries, of exactly the same dimension. Exact semantics will be
discussed as part of functional requirements.
- UTILITY functions:
-- VECTOR_DIM: Returns the dimensionality of each vector entry.
-- VECTOR_TO_STRING/FROM_VECTOR: Converts vector to human readable format.
-- STRING_TO_VECTOR/TO_VECTOR: Converts human readable format to vector.
Detailed Change Description
===========================
** VECTOR type support **
- sql/lex.h: The support for VECTOR keyword, along with VECTOR_SYM.
- sql/sql_yacc.yy: Change in directing VECTOR_SYM to PT_vector_type.
- VECTOR(N) -> N is **optional** field_length.
- If N is not provided, it defaults to 2048.
- sql/parse_tree_column_attrs.h: PT_vector_type, inherits PT_char_type.
- It has additional vector_length to store given length.
- The vector_length is computed by multiplying the given length
value N by sizeof(float), since the entry precision is always
a single-precision floating-point value.
- It overrides is_vector to return true.
- sql/field.h:
- Field_vector inherits from Field_blob
- As its real_type, it will return MYSQL_TYPE_VECTOR: a newly
introduced field type.
- sql/dd/types/column.h:
- New virtual is_vector() function.
- sql/dd/impl/types/column_impl.h:
- Column_impl overrides is_vector(): it returns true if its
m_column_type_utf8 begins with the string "vector".
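
The length computation and the type-string check described above can be sketched in standalone C++. This is an illustration only: the names vector_byte_length, is_vector_type, and kDefaultVectorDim are hypothetical and are not taken from the MySQL sources. Only the default of 2048 and the multiplication by sizeof(float) come from the description above.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Default dimension when VECTOR is declared without (N), per the text above.
// The constant name is hypothetical.
constexpr std::uint32_t kDefaultVectorDim = 2048;

// Hypothetical helper: byte length of the BLOB that backs a VECTOR(N) column.
// Each entry is a single-precision float, so the length is N * sizeof(float).
constexpr std::size_t vector_byte_length(
    std::optional<std::uint32_t> n = std::nullopt) {
  return static_cast<std::size_t>(n.value_or(kDefaultVectorDim)) *
         sizeof(float);
}

// Hypothetical helper mirroring the is_vector() idea: the column type string
// starts with "vector". Assumption: the comparison is case-sensitive.
inline bool is_vector_type(std::string_view column_type_utf8) {
  return column_type_utf8.rfind("vector", 0) == 0;
}
```

With these helpers, a column declared as VECTOR(3) would occupy 3 * sizeof(float) bytes, and a column declared as plain VECTOR would be sized for 2048 entries.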
** Restrictions on VECTOR type **
- share/messages_to_clients.txt: Introduced following errors:
- ER_VECTOR_USED_AS_KEY
- ER_UNABLE_TO_BUILD_HISTOGRAM_VECTOR
- sql/sql_table.cc:
- At prepare_key_column, if a VECTOR typed column is found,
throw ER_VECTOR_USED_AS_KEY. This enforces the restriction on using
the VECTOR type as PRIMARY KEY, FOREIGN KEY, UNIQUE, etc.
- sql/histograms/histogram.cc: If a histogram is being built on a VECTOR
typed column, throw ER_UNABLE_TO_BUILD_HISTOGRAM_VECTOR.
** DISTANCE function support **
- new component: vector
- vector.cc:
- Implements component_vector
- Implements DISTANCE(arg_0, arg_1, <optional>distance_metric)
- It will calculate the distance for each row of arg_0 and arg_1
- If the lengths of the two arguments do not match (evaluated
for each row), the output for that row will be NULL.
- The output precision of DISTANCE is double-precision (8 Bytes)
- vector.hpp: Standalone vectorized implementation of vector functions.
** Utility functions support **
- VECTOR_DIM
- VECTOR_TO_STRING
- STRING_TO_VECTOR
Functional requirements as described in the WL page.
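
As a rough illustration of what the three utility functions do, the sketch below works on a packed array of single-precision floats stored in a byte string. It is not the server implementation. The function names are lower-case stand-ins, the "[a,b,c]" text format is an assumption, and returning std::nullopt stands in for whatever NULL or error behavior the WL page specifies.

```cpp
#include <cerrno>
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <optional>
#include <sstream>
#include <string>

// VECTOR_DIM stand-in: number of floats in a packed byte string.
std::optional<std::size_t> vector_dim(const std::string &bytes) {
  if (bytes.size() % sizeof(float) != 0) return std::nullopt;
  return bytes.size() / sizeof(float);
}

// STRING_TO_VECTOR stand-in: parse "[a,b,c]" (assumed format) into bytes.
std::optional<std::string> string_to_vector(const std::string &text) {
  // Zero-length and unbracketed input is rejected up front.
  if (text.size() < 2 || text.front() != '[' || text.back() != ']')
    return std::nullopt;
  std::string out;
  const char *p = text.c_str() + 1;
  const char *end = text.c_str() + text.size() - 1;  // points at ']'
  while (true) {
    char *next = nullptr;
    errno = 0;  // clear stale errors before each conversion
    float v = std::strtof(p, &next);
    if (next == p || errno == ERANGE || !std::isfinite(v))
      return std::nullopt;
    out.append(reinterpret_cast<const char *>(&v), sizeof v);
    p = next;
    if (p == end) break;                // reached the closing bracket
    if (*p != ',') return std::nullopt; // entries must be comma-separated
    ++p;
  }
  return out;
}

// VECTOR_TO_STRING stand-in: print packed floats as "[a,b,c]".
std::optional<std::string> vector_to_string(const std::string &bytes) {
  auto dim = vector_dim(bytes);
  if (!dim) return std::nullopt;
  std::ostringstream os;
  os << '[';
  for (std::size_t i = 0; i < *dim; ++i) {
    float v;
    std::memcpy(&v, bytes.data() + i * sizeof(float), sizeof v);
    if (i != 0) os << ',';
    os << v;
  }
  os << ']';
  return os.str();
}
```

The number formatting produced by this sketch is whatever the default stream formatting yields and may differ from the server's output format.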
** Restrictions on DISTANCE function **
- Expects 2 or 3 arguments.
- All arguments must be of type STRING_RESULT.
- If there is a 3rd argument, it must be one of these:
- "DOT": dot/inner product
- "COSINE": cosine distance
- "EUCLIDIAN": Euclidean distance
- If there is no 3rd argument, "DOT" will be used by default.
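
The rules above (two vectors of equal dimension, an optional metric defaulting to "DOT", a double-precision result, and NULL on a dimension mismatch) can be sketched as follows. This is not the code in vector.cc or vector.hpp. The function signature is hypothetical, and the exact formulas are assumptions: "DOT" is taken to return the plain inner product and "COSINE" to return one minus the cosine similarity. std::nullopt stands in for SQL NULL.

```cpp
#include <cmath>
#include <cstddef>
#include <optional>
#include <string_view>
#include <vector>

// Hypothetical stand-in for DISTANCE(arg_0, arg_1 [, metric]).
std::optional<double> distance(const std::vector<float> &a,
                               const std::vector<float> &b,
                               std::string_view metric = "DOT") {
  // Dimension mismatch yields NULL for that row.
  if (a.size() != b.size()) return std::nullopt;

  // Accumulate in double, matching the double-precision output.
  double dot = 0.0, norm_a = 0.0, norm_b = 0.0, sq_diff = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    const double x = a[i];
    const double y = b[i];
    dot += x * y;
    norm_a += x * x;
    norm_b += y * y;
    sq_diff += (x - y) * (x - y);
  }

  if (metric == "DOT") return dot;
  if (metric == "EUCLIDIAN") return std::sqrt(sq_diff);
  if (metric == "COSINE") {
    // Assumption: cosine distance is undefined for a zero vector, so
    // return NULL instead of dividing by zero and producing NaN.
    if (norm_a == 0.0 || norm_b == 0.0) return std::nullopt;
    return 1.0 - dot / (std::sqrt(norm_a) * std::sqrt(norm_b));
  }
  // Assumption: an unknown metric name is reported as NULL in this sketch;
  // the real function may raise an error instead.
  return std::nullopt;
}
```

Calling distance(a, b) without a third argument uses "DOT", mirroring the default described above.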
** HeatWave side support **
- VECTOR type:
- For HeatWave, a vector("N") typed column is regarded as a
BLOB("sizeof(float)*N") column.
- One difference: For VECTOR typed columns, compression at load will
be disabled by default.
- DISTANCE functions:
- These will be interpreted like other Item_funcs. The same restrictions
as on the MySQL side apply.
- Additionally, we can check if there is at least one vector typed
column in the Item_func arguments.
- The new primitive introduced uses the same distance function
implementations as in vector.hpp
** 36165262 **
Vector column as partitioning key is blocked.
** 36167088 **
Vector column as part of generated column expression is blocked.
** 36168511 **
Addressed HeatWave load constraints: DICT encoding, ZONEMAP
** 36168535 **
This was a QCOMP bug at rapid: Mismatch between vbsize setting
and the consideration of the nullbv for chunkv size setting.
No need to consider nullbv for VARLEN columns.
** 36206068 **
In rpl_utility, enable conversion between BLOB and VECTOR
** 36214076 **
Handle zero length strings at STRING_TO_VECTOR
** 36225693 **
This was an issue on the trunk (charset handling in QKRN)
** 36241312 **
QKRN stats being set conservatively for conversion functions.
** 36255628 **
Field_vector was using is_equal of its parent Field_blob, which
led to unnecessary "not equal" results. It is now overridden.
** 36265079 **
Issue was in rapid primitive: Instead of comparing dims, we
were comparing varlen max lengths.
** 36272178 **
Handle session var type propagation.
** 36239717 **
This is a bug on trunk, related to the QKRN stats setting for
TO_BASE64.
** 36255777 **
Handle NaN values in distance calculation
** 36281463 **
Reset errno before starting the conversion.
** 36285521 **
VECTOR type check at Item_func_equal::resolve_type
** 36287504 **
Limit the number of chars printed at ER_TO_VECTOR_CONVERSION
** 36267410 **
Vector fields are marked as memcpy-able, which is not correct.
This was causing a binlog corruption.
Change-Id: I68d562193acd0cb00823908b0398f8215345521f