
Sqlalchemy schema support #328

Open
johalnes opened this issue Feb 2, 2024 · 4 comments

Comments

johalnes commented Feb 2, 2024

When reading tables with the SQLAlchemy plugin, one has to use pd.read_sql_query in order to specify a schema.

When using table (which goes through pd.read_sql_table), one is stuck with the login user's default schema. Could I change the table branch to also look for a schema in the source_config?

That is, just change the table part to:

          if "table" in source_config:
              table = source_config["table"]
          else:
              table = source_config.table_name()
          
          schema = source_config["schema"] if "schema" in source_config else None
          
          with self.engine.connect() as conn:
              return pd.read_sql_table(table, con=conn, schema=schema)

Or could this cause trouble for other kinds of databases than Oracle?

jwills (Collaborator) commented Feb 2, 2024

I am honestly not sure— let me ask chatgpt

jwills (Collaborator) commented Feb 3, 2024

yes that looks fine!

johalnes (Author) commented Feb 4, 2024

@jwills I've made the change for read_sql_table now, but I also wondered about the write side.

I tried just adding the schema via config(options={"schema": schema_name}), but got an error that this is not an option for write_parquet. I added a db_options key instead and made the plugin pick it up, which worked like a charm. That is: {{ config(materialized='external', plugin='oracle_db', db_options={"schema": "ORACLE_SANDBOX"}) }}

def store(self, target_config: TargetConfig):
    # first, load the data frame from the external location
    df = pd_utils.target_to_df(target_config)
    table_name = target_config.relation.identifier
    db_options = target_config.config.get("db_options", {})
    schema = db_options.get("schema", None)

    # then, write it to the database
    df.to_sql(table_name, self.engine, if_exists="replace", index=False, schema=schema)
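
For reference, a standalone sketch of the same write path outside the plugin; the engine URL, data, and schema name are made up, and the db_options dict stands in for what config(db_options=...) would pass through:

# Illustrative only: engine URL, data, and schema name are invented;
# db_options stands in for the value passed via config(db_options=...).
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("oracle+oracledb://scott:tiger@db-host:1521/?service_name=XEPDB1")

df = pd.DataFrame({"id": [1, 2], "amount": [9.5, 12.0]})
db_options = {"schema": "ORACLE_SANDBOX"}
schema = db_options.get("schema", None)

# if_exists="replace" drops and recreates the table, mirroring store() above;
# schema=None would fall back to the login user's default schema.
df.to_sql("orders", engine, if_exists="replace", index=False, schema=schema)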

Is this a welcome change? Or is config something one should try to keep minimal, with few options and high consistency between plugins?

jwills (Collaborator) commented Feb 5, 2024

Hey @johalnes, I would be inclined not to add that, since the overarching idea of plugins is for people to customize things as they see fit; my goal is for Python-oriented dbt developers to be able to create and load their own plugins, like I do in my jaffle shop example project here: https://github.com/jwills/jaffle_shop_duckdb/blob/duckdb/profiles.yml
