This repository was archived by the owner on Oct 15, 2022. It is now read-only.

Arrow Flight and/or API support #13

Open
ghost opened this issue May 27, 2021 · 4 comments

ghost commented May 27, 2021

Hi @fabrice-etanchaud,

Has there been any work on Arrow Flight support? And is there a timeline for it?

We're considering forking and adding the support ourselves, but if there is existing work, building on it seems like the better option.

Thanks!

Owner

fabrice-etanchaud commented May 27, 2021

Hello @mrietveld-leap, thank you for your interest in the adapter!
No, I have had no time to work on the project these last months, sorry.
I am starting a rewrite to stay close to the conventions of the Spark adapter (maintained by Fishtown).

As dbt does not process data and only uses the connection to send commands to the server, what do you need Arrow Flight for? Getting rid of ODBC would be a good thing, for sure.

By the way, I strongly feel the need to remove the view overlay on the table materialization. What do you think?

Looking forward to hearing from you.

fabrice-etanchaud added the enhancement (New feature or request) label Jan 12, 2022
Owner

fabrice-etanchaud commented Apr 1, 2022

As dbt does not process data itself (the only data flows I can think of are upstream when loading a seed, or downstream when querying the information_schema), switching to Arrow Flight would not bring performance benefits, but it would remove all the ODBC-related requirements.

Adding API support instead would bring space/folder management (CREATE SCHEMA) and external tables (creating sources from scratch).

fabrice-etanchaud changed the title from "Arrow Flight support" to "Arrow Flight and/or API support" Apr 28, 2022
@fabrice-etanchaud
Owner

I have started looking at https://github.com/rymurr/dremio_client to switch to an API connection.
Even though that project is read-only, it is a good starting point!
I am currently trying to wire SQL query execution to the API.
Then I will add folder creation.

Finally, we could envision:

  • source creation by configuration (à la dbt-external-tables)
  • documenting Dremio (tags and wiki) from dbt
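
To make the first two steps concrete (SQL execution through the API, then folder creation), here is a minimal sketch using only the Python standard library. It is not the adapter's code and does not use dremio_client. It assumes Dremio's REST endpoints `/apiv2/login`, `/api/v3/sql` and `/api/v3/catalog`, the `_dremio<token>` authorization header, and the default REST port 9047; the function names and the example space/folder names are hypothetical.

```python
import json
import urllib.request


def login_request(base_url, user, password):
    """Build the login call; the JSON response is assumed to carry a 'token' field."""
    body = json.dumps({"userName": user, "password": password}).encode()
    return urllib.request.Request(
        f"{base_url}/apiv2/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def auth_headers(token):
    # Assumption: Dremio software expects the token prefixed with "_dremio".
    return {"Content-Type": "application/json", "Authorization": f"_dremio{token}"}


def sql_request(base_url, token, sql):
    """Build the call that submits a SQL statement as a job."""
    body = json.dumps({"sql": sql}).encode()
    return urllib.request.Request(
        f"{base_url}/api/v3/sql", data=body, headers=auth_headers(token), method="POST"
    )


def create_folder_request(base_url, token, path):
    """Build the call that creates a folder; path is a list such as [space, folder]."""
    body = json.dumps({"entityType": "folder", "path": path}).encode()
    return urllib.request.Request(
        f"{base_url}/api/v3/catalog", data=body, headers=auth_headers(token), method="POST"
    )


def send(request):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

Against a running server, usage would look like `token = send(login_request(url, user, password))["token"]` followed by `send(sql_request(url, token, "SELECT 1"))`. Building the requests separately from sending them keeps the sketch easy to check without a Dremio instance.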

fabrice-etanchaud self-assigned this Jun 8, 2022
@fabrice-etanchaud
Owner

I have started implementing a Flight connection using:
https://docs.dremio.com/software/client-applications/python/
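
For readers following along, a minimal sketch of what sending a statement over Arrow Flight could look like, replacing the ODBC connection. This is not the adapter's implementation. It assumes the third-party `pyarrow` package, username/password authentication, and Dremio's default Flight port 32010; `flight_location` and `run_sql` are hypothetical names.

```python
def flight_location(host, port=32010, tls=False):
    """Build the Flight endpoint URI (32010 is assumed to be the Flight port)."""
    scheme = "grpc+tls" if tls else "grpc+tcp"
    return f"{scheme}://{host}:{port}"


def run_sql(host, user, password, sql, port=32010, tls=False):
    """Send one SQL statement over Arrow Flight and return the result as a pyarrow Table."""
    # Imported lazily so the URI helper above works without pyarrow installed.
    from pyarrow import flight

    client = flight.FlightClient(flight_location(host, port, tls))
    # Basic authentication returns a (header name, bearer token) pair
    # that has to accompany every later call.
    token_pair = client.authenticate_basic_token(user, password)
    options = flight.FlightCallOptions(headers=[token_pair])

    # The SQL text itself is the Flight command descriptor.
    descriptor = flight.FlightDescriptor.for_command(sql)
    info = client.get_flight_info(descriptor, options)

    # Fetch the result stream from the first endpoint and materialize it.
    reader = client.do_get(info.endpoints[0].ticket, options)
    return reader.read_all()
```

Since dbt mostly sends commands and reads back small result sets, a single blocking call such as `run_sql("localhost", user, password, "SELECT 1")` is likely enough for a first version.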
