Skip to content

GroupBy list of ranges #30425

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
yulObraz opened this issue Mar 7, 2023 · 3 comments
Closed

GroupBy list of ranges #30425

yulObraz opened this issue Mar 7, 2023 · 3 comments

Comments

@yulObraz
Copy link

yulObraz commented Mar 7, 2023

File a bug

I want make request to group values in ranges by local list. Something like this:

                var vals = new int[] { 1, 5 };
                context.Set<Product>().Select(p => new { p = p, r = vals.Where(x => x > p.DaysToManufacture).First() })
                    .GroupBy(x => x.r, (key, list) => new { r = key, count = list.Count() }).ToList();

But I receive error

System.InvalidOperationException : The LINQ expression 'x => (int?)x > EntityShaperExpression: 
    ....Product
    ValueBufferExpression: 
        ProjectionBindingExpression: EmptyProjectionMember
    IsNullable: False
.DaysToManufacture' could not be translated. ...
Stack Trace: 
RelationalSqlTranslatingExpressionVisitor.VisitLambda[T](Expression`1 lambdaExpression)
Expression`1.Accept(ExpressionVisitor visitor)
ExpressionVisitor.Visit(Expression node)
RelationalSqlTranslatingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression)
NpgsqlSqlTranslatingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCall)
MethodCallExpression.Accept(ExpressionVisitor visitor)
ExpressionVisitor.Visit(Expression node)
RelationalSqlTranslatingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCallExpression)
NpgsqlSqlTranslatingExpressionVisitor.VisitMethodCall(MethodCallExpression methodCall)
MethodCallExpression.Accept(ExpressionVisitor visitor)
<12 more frames...>
QueryCompiler.CompileQueryCore[TResult](IDatabase database, Expression query, IModel model, Boolean async)
<>c__DisplayClass9_0`1.<Execute>b__0()
CompiledQueryCache.GetOrAddQuery[TResult](Object cacheKey, Func`1 compiler)
QueryCompiler.Execute[TResult](Expression query)
EntityQueryProvider.Execute[TResult](Expression expression)
EntityQueryable`1.GetEnumerator()
List`1.ctor(IEnumerable`1 collection)
Enumerable.ToList[TSource](IEnumerable`1 source) 

Additional testing shown that local list support is rather unpredictable. I don't know why requests work or don't work

                var ids = new Guid[] { Guid.NewGuid(), Guid.NewGuid() };
                context.Set<Product>().Where(p => ids.Any(x => x == p.rowguid)).Count();
                context.Set<Product>().Select(p => ids.Any(x => x == p.rowguid)).ToList();
                //context.Set<Product>().Select(p => ids.Where(x => x == p.rowguid).Any()).ToList();//1
                //context.Set<Product>().Select(p => ids.Where(x => x == p.rowguid).Count()).ToList();//2
                //context.Set<Product>().Select(p => ids.Count(x => x == p.rowguid)).ToList();//3
                var idsQ = ids.AsQueryable();
                //context.Set<Product>().Where(p => idsQ.Any(x => x == p.rowguid)).Count();//4
                context.Set<Product>().Where(p => idsQ.AsEnumerable().Any(x => x == p.rowguid)).Count();
                context.Set<Product>().Select(p => idsQ.AsEnumerable().Any(x => x == p.rowguid)).Count();
                //context.Set<Product>().Select(p => idsQ.AsEnumerable().Where(x => x == p.rowguid).Any()).ToList();//5
                //context.Set<Product>().Select(p => idsQ.AsEnumerable().Where(x => x == p.rowguid).Count()).ToList();//6
                //context.Set<Product>().Select(p => idsQ.AsEnumerable().Count(x => x == p.rowguid)).ToList();//7
                var vals = new int[] { 1, 5 };
                context.Set<Product>().Select(p => vals.Any(x => x > p.DaysToManufacture)).ToList();
                //context.Set<Product>().Select(p => vals.Where(x => x > p.DaysToManufacture).Count()).ToList();//8
                //context.Set<Product>().Select(p => vals.Count(x => x > p.DaysToManufacture)).ToList();//9
                //context.Set<Product>().Where(p => vals.Where(x => x > p.DaysToManufacture).Any()).Count();//10
                //context.Set<Product>().Where(p => vals.AsEnumerable().Where(x => x > p.DaysToManufacture).Any()).Count();//11
                var valsQ = vals.AsQueryable();
                //context.Set<Product>().Select(p => valsQ.AsEnumerable().Where(x => x > p.DaysToManufacture).Count()).ToList();//12
                var valsStr = string.Join(",", vals);
                var valsQuery = context.Database.SqlQuery<int>($"select unnest(string_to_array({valsStr},',')::int[]) as value");
                //var query = context.Set<Product>().Where(p => valsQuery.Where(x => x > p.DaysToManufacture).Any()).ToList();//13
                // Npgsql.PostgresException : 42703:
                // something like "column t.Value not exists"

                //context.Set<Product>().Select(p => new { p, vals }).SelectMany(x => x.vals.Select(v => new { x.p, v }))
                //    .Where(c => c.v > c.p.DaysToManufacture).Count();//14
                object value = context.Set<Product>().SelectMany(t1 => context.Set<Product>(), (t1, t2) => new { Id1 = t1.rowguid, Id2 = t2.rowguid }).ToList();
                //context.Set<Product>().SelectMany(te => valQ.AsEnumerable(), (t1, t2) => new { Id = t1.rowguid, r = t2 }).ToList();//15
                //context.Set<Product>().SelectMany(te => valQ, (t1, t2) => new { Id = t1.rowguid, r = t2 }).ToList();//16
                //context.Set<Product>().SelectMany(te => vals, (t1, t2) => new { Id = t1.rowguid, r = t2 }).ToList();//17
                //context.Set<Product>().GroupJoin(vals, x => true, x => true, (p, ranges) =>
                //    new
                //    {
                //        p = p,
                //        r = ranges.Where(rr => rr > node.DaysToManufacture).First()
                //    }
                //).ToList(); //18
                context.Set<Product>().Select(p => vals.Where(x => x > p.DaysToManufacture).First()).Count();
                //context.Set<Product>().Select(p => vals.Where(x => x > p.DaysToManufacture).First()).ToList();//19
                //context.Set<Product>().Select(p => new { p = p, r = vals.Where(x => x > p.DaysToManufacture).First() })
                //    .GroupBy(x => x.r, (key, list) => new { r = key, count = list.Count() }).ToList();//20

Requests that don't work are commented.

Include provider and version information

EF Core version: 7.0.3
Database provider: Npgsql.EntityFrameworkCore.PostgreSQL
Target framework: .NET 6.0
Operating system: Windows 10
IDE: (e.g. Visual Studio 2022 17.4)

@roji
Copy link
Member

roji commented Mar 7, 2023

In general, you're trying to perform operations involving an array parameter in your queries; while some specific patterns have good corresponding SQL constructs and can be translated, some do not. For example, Where(p => ids.Any(x => x == p.rowguid)) can be nicely translated to WHERE rowguid = ANY (@__ids_0); but something like context.Set<Product>().Select(p => new { p = p, r = vals.Where(x => x > p.DaysToManufacture).First() }) cannot.

I've opened npgsql/efcore.pg#2677 for a generalized approach, which would translate your array to a pseudo-table via the PostgreSQL unnest function; that should allow arbitrary queries which involve array parameters.

Note that this is all highly Npgsql-specific (and so belongs in https://github.com/npgsql/efcore.pg: the Npgsql provider adds some specific functionality around PostgreSQL's support of native array types in the database.

SQL Server (and other databases) do support VALUES, which can allow us to construct a table in SQL given the array parameters' values. However, that would imply embedding the values as constants in the SQL (leading to different SQLs and therefore cache query plan pollution, see #13617 for a similar case), or at least parameterizing each array element separately (leading to different SQLs for arrays of different sizes - much better but still problematic). PostgreSQL definitely has an edge here with the native array support.

@roji
Copy link
Member

roji commented Mar 7, 2023

On second thought, opened #30426 to track doing this via OPENJSON for SQL Server.

@ajcvickers
Copy link
Contributor

Note from triage: closing in favor of #30426.

@ajcvickers ajcvickers closed this as not planned Won't fix, can't repro, duplicate, stale Mar 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants