Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ESQL: Encapsulate Categorize instanceof checks inside aggs & rules #124608

Open
costin opened this issue Mar 12, 2025 · 2 comments
Open

ESQL: Encapsulate Categorize instanceof checks inside aggs & rules #124608

costin opened this issue Mar 12, 2025 · 2 comments
Assignees
Labels
:Analytics/ES|QL AKA ESQL >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)

Comments

@costin
Copy link
Member

costin commented Mar 12, 2025

Description

Code dedicated to Categorize has made its way into Aggregate & replaced rules such as (ReplaceAggregateAggExpressionWithEval or ReplaceAggregateNestedExpressionWithEval). While some of it is marked with TODO, this needs to extracted into a central place (to better manage it) or even better, make it work with the existing rules either by breaking down categorize into existing primitives (e.g. Aware interfaces) or extracting these if missing.

From the logical plan perspective, Categorize is just another aggregate function - special handling is typically an indicator of wrong abstraction (which causes the rules to 'misfire').

@costin costin added the >bug label Mar 12, 2025
@costin costin added the :Analytics/ES|QL AKA ESQL label Mar 12, 2025
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Mar 12, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@ivancea
Copy link
Contributor

ivancea commented Mar 12, 2025

Categorize is a special GroupingFunction because it's "stateful". We could add some kind of isStateful() to GroupingFunction and override it there, or a new StatefulGroupingFunction subclass.

Also, the only other GroupingFunction subclass is Bucket, which technically works as a normal function. We could make Bucket inherit from Function instead, to simplify the meaning of that subclass, and queries bucket things in EVALs too. This would be a functional change, but not a breaking change. We could keep it as a "grouping function" in the docs.
This is something I would discuss before working on this issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)
Projects
None yet
Development

No branches or pull requests

5 participants