Is your feature request related to a problem? Please describe.
I would like to be able to use the KEEP_FIRST option for distinct. At the moment it is hard-coded to KEEP_ANY, with no way to change it. It would be ideal to be able to choose the keep option. This is necessary for correctly implementing array_distinct in spark-rapids.
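To make the difference between the two options concrete, here is a minimal host-side sketch. It is not libcudf code: the `keep_demo` namespace, the `distinct_indices` helper, and the local enum are hypothetical stand-ins, and the KEEP_ANY branch simply keeps the last occurrence as one of many valid outcomes. It only shows which row indices survive under each option.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

namespace keep_demo {

// Hypothetical stand-in for the keep option discussed in this issue;
// only the two values mentioned here are modelled.
enum class duplicate_keep_option { KEEP_ANY, KEEP_FIRST };

// Returns the (sorted) indices of the elements kept by a distinct operation.
// Host-side illustration only; it says nothing about how a GPU kernel works.
inline std::vector<std::size_t> distinct_indices(std::vector<int> const& keys,
                                                 duplicate_keep_option keep)
{
  std::unordered_map<int, std::size_t> chosen;  // key -> index of the kept element
  for (std::size_t i = 0; i < keys.size(); ++i) {
    if (keep == duplicate_keep_option::KEEP_FIRST) {
      chosen.emplace(keys[i], i);  // emplace never overwrites, so the first index wins
    } else {
      chosen[keys[i]] = i;  // KEEP_ANY: any index is valid; this sketch keeps the last
    }
  }
  std::vector<std::size_t> out;
  out.reserve(chosen.size());
  for (auto const& kv : chosen) { out.push_back(kv.second); }
  std::sort(out.begin(), out.end());
  return out;
}

// Small self-check: with KEEP_FIRST the kept indices are fully determined.
inline void check_keep_first()
{
  auto const kept = distinct_indices({3, 1, 3, 2, 1}, duplicate_keep_option::KEEP_FIRST);
  assert((kept == std::vector<std::size_t>{0, 1, 3}));
}

}  // namespace keep_demo
```

For the input 3, 1, 3, 2, 1, KEEP_FIRST always keeps indices 0, 1, 3, so gathering in index order yields 3, 1, 2. Under KEEP_ANY a caller cannot rely on which duplicate is kept. One plausible reason this matters for array_distinct, not stated in the issue itself, is that the result is expected to follow first-occurrence order.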
Describe the solution you'd like
I would like this parameter exposed in ColumnView.java, and therefore also in whatever intermediate APIs need it along the way.
Describe alternatives you've considered
I considered duplicating the kernel in spark-rapids-jni, but that would hurt maintainability; it makes more sense to expose the parameter in cudf.
First, add this parameter to the detail:: API. Then add a copy of the current public API with a new parameter duplicate_keep_option before the null_equality and nan_equality. Then deprecate the existing public API, and make it call the new detail:: API with KEEP_ANY. The deprecation in the API docs should say @deprecated Deprecated in 25.04, to be removed in 25.06. Example of how to deprecate an API: #17221
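The shape of that change could look roughly like the sketch below. Everything in it is a hypothetical stand-in rather than the actual cudf headers: the `api_sketch` namespace, the `column` alias over `std::vector<int>`, the local enums, and the trivial host implementation are assumptions made for illustration. Default arguments are left out because the comment does not specify them. Only the overload layout and the deprecation wording come from the comment above.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

namespace api_sketch {

// Hypothetical stand-ins for the real option types.
enum class duplicate_keep_option { KEEP_ANY, KEEP_FIRST };
enum class null_equality { EQUAL, UNEQUAL };
enum class nan_equality { ALL_EQUAL, UNEQUAL };

using column = std::vector<int>;  // stand-in for the real input/output type

namespace detail {
// Step 1: the detail:: API takes the keep option.
// Host placeholder: keeps first occurrences in order, which is a valid
// result for both KEEP_FIRST and KEEP_ANY. The equality options are unused.
inline column distinct(column const& input,
                       duplicate_keep_option /*keep*/,
                       null_equality /*nulls_equal*/,
                       nan_equality /*nans_equal*/)
{
  std::unordered_set<int> seen;
  column out;
  for (int v : input) {
    if (seen.insert(v).second) { out.push_back(v); }  // true only on first sighting
  }
  return out;
}
}  // namespace detail

// Step 2: new public overload, with the keep option placed before
// null_equality and nan_equality.
inline column distinct(column const& input,
                       duplicate_keep_option keep,
                       null_equality nulls_equal,
                       nan_equality nans_equal)
{
  return detail::distinct(input, keep, nulls_equal, nans_equal);
}

/// Step 3: the existing public overload forwards to detail:: with KEEP_ANY.
/// @deprecated Deprecated in 25.04, to be removed in 25.06.
[[deprecated]] inline column distinct(column const& input,
                                      null_equality nulls_equal,
                                      nan_equality nans_equal)
{
  return detail::distinct(input, duplicate_keep_option::KEEP_ANY, nulls_equal, nans_equal);
}

}  // namespace api_sketch
```

Because the three option types are distinct scoped enums, the old and new overloads cannot be confused at a call site: existing callers keep compiling against the deprecated overload (with a warning), while new callers opt in by passing a `duplicate_keep_option` explicitly.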