Skip to content

Have a better story for EFCore.PG and multihost #3495

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
roji opened this issue Mar 10, 2025 · 7 comments
Open

Have a better story for EFCore.PG and multihost #3495

roji opened this issue Mar 10, 2025 · 7 comments
Assignees
Labels
enhancement New feature or request
Milestone

Comments

@roji
Copy link
Member

roji commented Mar 10, 2025

Try to have Target Session Attributes in the connection string fails the moment EFCore.PG needs to create a data source internally (full repro below):

.UseNpgsql(
	"Host=x,y;Database=test;Username=test;Password=test;Load Balance Hosts=true;Target Session Attributes=prefer-standby",
	config =>
	{
		config.ConfigureDataSource(o => {});
	})

This throws "When creating a multi-host data source, TargetSessionAttributes cannot be specified".

The Npgsql story for target sessions is to call BuildMultiHost() instead of Build(), and then to call an overload of OpenConnectionAsync() that passes TargetSessionAttributes, or to extract wrapper data sources via WithTargetSession() (docs).

EFCore.PG doesn't know whether the connection string contains multiple hosts and/or Target Session Attribute, and never calls BuildMultiHost(). It seems like the thing to do would be to add an EF-level context option for the target session, and when EF creates the NpgsqlConnection, to call OpenConnectionAsync(TargetSessionAttributes).

Note that we also have legacy mode (without NpgsqlDataSource), where we do allow Target Session Attribute in the connection string; we internally create a wrapper for the given target session and use that. We could do the same when NpgsqlDataSourceBuilder.Build() is called with a Target Session Attribute; but that would mean that the underlying NpgsqlMultiHostDataSource (which owns the actual connection pools) isn't accessible, and it's impossible to have multiple Target Session Attribute values referencing the same physical connections. In legacy mode the NpgsqlMultiHostDataSource is global, so this isn't a problem.

In the meantime, the workaround is simply to construct a multi-host data source outside EF, get a wrapper via WithTargetSession() and pass that to EFCore.PG's UseNpgsql().

@NinoFloris @vonzshik does this all make sense?

Originally raised by @fmendez89 in npgsql/doc#263 (comment)

Full repro
await using var context = new BlogContext();
await context.Database.EnsureDeletedAsync();
await context.Database.EnsureCreatedAsync();

public class BlogContext : DbContext
{
    public DbSet<Blog> Blogs { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder
            .UseNpgsql(
                "Host=localhost,localhost:6432;Database=test;Username=test;Password=test;Load Balance Hosts=true;Target Session Attributes=prefer-standby",
                config =>
                {
                    // The moment we call ConfigureDataSource(), EF attempts to build a data source internally (with NpgsqlDataSourceBuilder.Build()),
                    // causing an exception because Target Session Attributes is present in the connection string. 
                    config.ConfigureDataSource(o => {});
                })
            .LogTo(Console.WriteLine, LogLevel.Information)
            .EnableSensitiveDataLogging();
}

public class Blog
{
    public int Id { get; set; }
    public string Name { get; set; }
}
@vonzshik
Copy link
Contributor

Note that we also have legacy mode (without NpgsqlDataSource), where we do allow Target Session Attribute in the connection string; we internally create a wrapper for the given target session and use that. We could do the same when NpgsqlDataSourceBuilder.Build() is called with a Target Session Attribute; but that would mean that the underlying NpgsqlMultiHostDataSource (which owns the actual connection pools) isn't accessible, and it's impossible to have multiple Target Session Attribute values referencing the same physical connections. In legacy mode the NpgsqlMultiHostDataSource is global, so this isn't a problem.

This should already (somewhat) be working (ignoring exception for a sec) as you can call NpgsqlMultiHostDataSource.OpenConnection() (note the lacking of targetSessionAttributes in args), which should open a new connection on the original NpgsqlMultiHostDataSource (and not the wrapper), at which point we'll return a connection to a host according to settings (this should be Any).

So all in all, just removing the check should be enough (I think).

@vonzshik
Copy link
Contributor

vonzshik commented Mar 11, 2025

Note that we also have legacy mode (without NpgsqlDataSource), where we do allow Target Session Attribute in the connection string; we internally create a wrapper for the given target session and use that.

BTW, are you sure it's true? Looking at the code, it seems like currently we'll throw the exact same error (barring Any).

Edit: it's kinda even worse, we do not throw for Any as long as there is only a single host, otherwise we do.

Edit2: OK, I missed that we indeed make a wrapper, though removing that shouldn't be a problem.

@vonzshik
Copy link
Contributor

Submitted npgsql/npgsql#6046

@roji
Copy link
Member Author

roji commented Mar 11, 2025

So all in all, just removing the check should be enough (I think).

I'm not sure - how would an EF user using ConfigureDataSource(), be able to use both prefer-standby (for read-only loads) and primary (for write loads), with the same underlying connection pools (i.e. same instance of NpgsqlMultiHostDataSource)? If you have two UseNpgsql() configurations in your application with ConfigureDataSource(), where the only difference between them is the value of the Target Session Attributes in the connection string, then you'd get two completely separate instances of NpgsqlMultiHostDataSource, and not share connections between them... In fact, it's maybe better to keep the exception as-is, to help users avoid falling into this pit of failure...

(this interaction of EF and Npgsql/data source is generally not simple, unfortunately)

@vonzshik
Copy link
Contributor

I'm not sure - how would an EF user using ConfigureDataSource(), be able to use both prefer-standby (for read-only loads) and primary (for write loads), with the same underlying connection pools (i.e. same instance of NpgsqlMultiHostDataSource)?

I imagine that less than 1% of developers actually need something like this, as it's mostly going to be "I want to only connect to primary". But even then, the main idea is for BuildMultiHost (and Build) to support having a default TargetSessionAttribute, which will be used by calling OpenConnection. And if you want to get a connection with other TargetSessionAttribute, we also support that. Should make the usual experience much smoother.

If you have two UseNpgsql() configurations in your application with ConfigureDataSource(), where the only difference between them is the value of the Target Session Attributes in the connection string, then you'd get two completely separate instances of NpgsqlMultiHostDataSource, and not share connections between them... In fact, it's maybe better to keep the exception as-is, to help users avoid falling into this pit of failure...

Mm, you mean in case of legacy, yeah? That indeed makes sense, kinda forgot about that.

@roji
Copy link
Member Author

roji commented Mar 11, 2025

I imagine that less than 1% of developers actually need something like this, as it's mostly going to be "I want to only connect to primary".

Interesting, I'd think the majority just want Any, not Primary, which IIRC is also the default (in both libpq and Npgsql). That's already well-supported via EF too - just don't specify Target Session Attributes and everything works.

Though I guess you're right, and there's no really good reason to block Target Session Attributes with Build(), and for applications where just one attribute is needed (e.g. Primary) it smooths things out. I'm still a bit worried that EF users will unknowingly end up having two connection pools instead of one though...

Mm, you mean in case of legacy, yeah? That indeed makes sense, kinda forgot about that.

In legacy (i.e. no data source) we don't throw (create the wrapper internally) - maybe we're talking about different things?

@vonzshik
Copy link
Contributor

vonzshik commented Mar 11, 2025

Interesting, I'd think the majority just want Any, not Primary, which IIRC is also the default (in both libpq and Npgsql).

The problem with this is, most of the apps do write stuff into the database, and since vanilla postgres doesn't support multi-master... getting a standby will most likely just completely break the app. And speaking from experience, the app I'm working on doesn't really care whether there are multiple hosts or not, as long as we connect to primary. So allowing us just calling Build will make things much-much easier.

In legacy (i.e. no data source) we don't throw (create the wrapper internally) - maybe we're talking about different things?

We might mix things a bit. I was mostly talking about npgsql/npgsql#6046 where I did remove that wrapper (it's already back + I added a test just in case).

Though I guess you're right, and there's no really good reason to block Target Session Attributes with Build(), and for applications where just one attribute is needed (e.g. Primary) it smooths things out. I'm still a bit worried that EF users will unknowingly end up having two connection pools instead of one though...

I'm not sure whether you're talking about legacy here or not. In case of NpgsqlDataSourceBuilder, each builder will create separate NpgsqlMultiHostDataSource instances, which do not affect each other. In case of legacy (connection string), right now they will create a singular instance of NpgsqlMultiHostDataSource and share it (even if through a wrapper). Now that you reminded me of that behavior, I brought it back in that pr, so pretty much the only change there (aside of tests) is that you can specify TargetSessionAttribute on connection string and then create an NpgsqlDataSource out of it, which will return connections according to TargetSessionAttribute as long as you call OpenConnection().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants