Is your feature request related to a problem? Please describe.
It would be ideal if python-bigquery-sqlalchemy allowed for a separation between GCP projects, with the data living in one project and the jobs running in a different project. Often the datasets live in a project which is notionally owned by a different team. Allowing for this provides:
- better isolation of concerns and therefore a better understanding of security roles and permissions
- a clear separation of usage where the visualisation cost of a product (namely BQ slots) can be clearly recognised and budgeted for
- simplified incident resolution in case of 'slowness', where BQ backend usage can be correlated to a given product instance
This has mainly come from using superset, which relies heavily on sqlalchemy for its support for different data platforms.
Describe the solution you'd like
Ideally, there would be a mechanism where a job-project-id can be attached as an optional engine parameter; if present, it is used in preference to the project_id present in the bigquery:// URL.
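As a rough sketch of the proposed interface (the `job_project_id` keyword and both project names below are hypothetical, named only for illustration), the engine would point at the data project while jobs run in a separate project:

```python
from sqlalchemy import create_engine, text

# Hypothetical: data lives in "data-project", query jobs run (and are billed)
# in "jobs-project". The job_project_id keyword does not exist in the dialect
# today; it is the optional engine parameter this issue proposes.
engine = create_engine(
    "bigquery://data-project",
    job_project_id="jobs-project",  # proposed parameter, not current behaviour
)

with engine.connect() as conn:
    # Table references stay resolved against the data project;
    # only the job/billing side would change.
    rows = conn.execute(text("SELECT 1")).fetchall()
```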
There are two list_datasets() calls in the codebase at present which do not specify the target project; passing the project explicitly there does most of the work.
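For context, the underlying google-cloud-bigquery client already supports this split: the client's own project is where jobs run and are billed, while list_datasets() accepts an explicit project for where the data lives. A minimal sketch, with placeholder project/dataset/table names:

```python
from google.cloud import bigquery

# The client's project is where query jobs run and are billed.
client = bigquery.Client(project="jobs-project")

# Datasets can be listed from a different project by passing it explicitly.
datasets = list(client.list_datasets(project="data-project"))

# Queries reference the data project via fully qualified table names,
# while the job itself runs in "jobs-project".
job = client.query("SELECT COUNT(*) FROM `data-project.my_dataset.my_table`")
print(list(job.result()))
```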
Describe alternatives you've considered
The superset issue apache/superset#32789 outlines the work I have attempted to hotwire this logic into superset. However, all of that logic feels very brittle and is unlikely to work reliably, and the scope of the changes required in superset would be significant.
Also, implementing it within the library makes it available not only to superset but to any other project that relies on sqlalchemy to support pluggable data backends.
I've attached a PR for discussion since it's a pretty small change.