[FEA] Use more performant dtype introspection utility in select_dtypes #18219

Open
mroeschke opened this issue Mar 10, 2025 · 0 comments
Labels
feature request (New feature or request), Python (Affects Python cuDF API.)

Comments

@mroeschke
Contributor

xref #18141 (comment)

select_dtypes has an internal dtype introspection function, cudf_dtype_from_pydata_dtype, that "normalizes" the include= and exclude= dtypes:

def cudf_dtype_from_pydata_dtype(dtype):

This function is somewhat more expensive than other dtype-checking functions in cudf, and its performance differs drastically depending on whether it normalizes string-specified types or dtype objects. At minimum, some caching should be implemented so that the two paths no longer differ so much in cost.
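As a rough illustration of the caching idea, here is a minimal sketch using `functools.lru_cache`. It does not use cudf; `_normalize_uncached`, `normalize_dtype`, and the `_ALIASES` table are hypothetical stand-ins for an expensive normalizer like cudf_dtype_from_pydata_dtype, and the approach assumes every dtype specifier passed in is hashable.

```python
from functools import lru_cache

# Hypothetical alias table; the real normalization logic in cudf is not
# reproduced here.
_ALIASES = {"int": "int64", "float": "float64", "bool": "bool", "str": "object"}

call_count = 0  # counts how often the expensive (uncached) path actually runs


def _normalize_uncached(dtype_spec):
    """Stand-in for an expensive dtype normalizer (assumption, not cudf code)."""
    global call_count
    call_count += 1
    if isinstance(dtype_spec, str):
        key = dtype_spec
    else:
        # For Python types such as `float`, fall back to the type's name.
        key = getattr(dtype_spec, "__name__", str(dtype_spec))
    return _ALIASES.get(key, key)


@lru_cache(maxsize=128)
def normalize_dtype(dtype_spec):
    """Cached wrapper; requires dtype_spec to be hashable."""
    return _normalize_uncached(dtype_spec)


# Repeated lookups of the same specifiers hit the cache after the first call.
for _ in range(1000):
    normalize_dtype("int")
    normalize_dtype(float)

print(call_count)  # 2
```

With a cache like this, string-specified types and dtype objects both cost a single dictionary lookup after their first normalization, so the gap between the two paths only shows up once per distinct specifier.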

@mroeschke mroeschke added feature request New feature or request Python Affects Python cuDF API. labels Mar 10, 2025
@mroeschke mroeschke changed the title [FEA] Use more performant dtype in select_dtypes [FEA] Use more performant dtype introspection utility in select_dtypes Mar 10, 2025
Projects
Status: Todo