Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SNOW-977384: write_pandas does not seem to re-use the provided open connection. #1814

Open
ChannyClaus opened this issue Nov 23, 2023 · 7 comments
Assignees
Labels
enhancement The issue is a request for improvement or a new feature triaged

Comments

@ChannyClaus
Copy link

ChannyClaus commented Nov 23, 2023

Python version

3.8.17

Operating system and processor architecture

macOS-13.5-arm64-arm-64bit

Installed packages

asn1crypto==1.5.1
attrs==23.1.0
certifi==2023.11.17
cffi==1.16.0
charset-normalizer==3.3.2
cryptography==41.0.5
fancycompleter==0.9.1
filelock==3.13.1
idna==3.4
numpy==1.24.4
packaging==23.2
pandas==2.0.3
pdbpp==0.10.3
platformdirs==3.11.0
pyarrow==14.0.1
pycparser==2.21
Pygments==2.17.2
PyJWT==2.8.0
pyOpenSSL==23.3.0
pyrepl==0.9.0
python-dateutil==2.8.2
pytz==2023.3.post1
requests==2.31.0
six==1.16.0
snowflake-connector-python==3.5.0
sortedcontainers==2.4.0
tomlkit==0.12.3
typing_extensions==4.8.0
tzdata==2023.3
urllib3==1.26.18
wmctrl==0.5

What did you do?

$ cat good.py 
import os

import snowflake.connector

conn = snowflake.connector.connect(
    user=os.environ.get("SF_USERNAME"),
    password=os.environ.get("SF_PASSWORD"),
    account=os.environ.get("SF_ACCOUNT"),
    warehouse=os.environ.get("SF_WAREHOUSE"),
    database=os.environ.get("SF_DATABASE"),
    schema=os.environ.get("SF_SCHEMA"),
    role=os.environ.get("SF_ROLE"),
)

conn.cursor().execute(f"USE DATABASE {os.environ.get('SF_DATABASE')}")
conn.cursor().execute(f"USE SCHEMA {os.environ.get('SF_SCHEMA')}")
conn.cursor().execute("CREATE TEMPORARY TABLE test (a int, b int, c int)")
conn.cursor().execute("INSERT INTO test VALUES (1, 2, 3), (4, 5, 6)")
print(conn.cursor().execute("SELECT * FROM test").fetchall())

the simple script above behaves as expected, i.e. it is able to:

  1. create the temporary table named test.
  2. insert the dummy data into the aforementioned temporary table.
  3. and pull the data from it.

however, when the write portion of this script has been switched out with write_pandas it fails with this:

Traceback (most recent call last):
  File "bad.py", line 34, in <module>
    write_pandas(conn, df, "test", table_type="temp")
  File "/Users/chan.kang/test/snowflake-connector/venv/lib/python3.8/site-packages/snowflake/connector/pandas_tools.py", line 433, in write_pandas
    copy_results = cursor.execute(copy_into_sql, _is_internal=True).fetchall()
  File "/Users/chan.kang/test/snowflake-connector/venv/lib/python3.8/site-packages/snowflake/connector/cursor.py", line 920, in execute
    Error.errorhandler_wrapper(self.connection, self, error_class, errvalue)
  File "/Users/chan.kang/test/snowflake-connector/venv/lib/python3.8/site-packages/snowflake/connector/errors.py", line 290, in errorhandler_wrapper
    handed_over = Error.hand_to_other_handler(
  File "/Users/chan.kang/test/snowflake-connector/venv/lib/python3.8/site-packages/snowflake/connector/errors.py", line 345, in hand_to_other_handler
    cursor.errorhandler(connection, cursor, error_class, error_value)
  File "/Users/chan.kang/test/snowflake-connector/venv/lib/python3.8/site-packages/snowflake/connector/errors.py", line 221, in default_errorhandler
    raise error_class(
snowflake.connector.errors.ProgrammingError: 001757 (42601): 01b0876d-0403-d123-0030-1f830aa032c6: SQL compilation error:
Table '"test"' does not exist

it should be reproducible via:

$ cat bad.py
import os

import pandas as pd
import snowflake.connector
from snowflake.connector.pandas_tools import write_pandas

conn = snowflake.connector.connect(
    user=os.environ.get("SF_USERNAME"),
    password=os.environ.get("SF_PASSWORD"),
    account=os.environ.get("SF_ACCOUNT"),
    warehouse=os.environ.get("SF_WAREHOUSE"),
    database=os.environ.get("SF_DATABASE"),
    schema=os.environ.get("SF_SCHEMA"),
    role=os.environ.get("SF_ROLE"),
)

conn.cursor().execute(f"USE DATABASE {os.environ.get('SF_DATABASE')}")
conn.cursor().execute(f"USE SCHEMA {os.environ.get('SF_SCHEMA')}")
conn.cursor().execute("CREATE TEMPORARY TABLE test (a int, b int, c int)")
df = pd.DataFrame(columns=["a", "b", "c"], data=[[1, 2, 3], [4, 5, 6]])
write_pandas(conn, df, "test")
print(conn.cursor().execute("SELECT * FROM test").fetchall())

write_pandas call fails to find the temporary table that was created right before.

What did you expect to see?

included in the "what did you do"

@github-actions github-actions bot changed the title write_pandas does not seem to re-use the provided open connection. SNOW-977384: write_pandas does not seem to re-use the provided open connection. Nov 23, 2023
@ChannyClaus
Copy link
Author

seems somewhat related to #1313 but not entirely sure if it's the same issue given the level of details provided.

@ChannyClaus
Copy link
Author

ChannyClaus commented Nov 23, 2023

i did just discover from looking at the source that table_type argument exists (is it intentionally missing from the documentation at https://docs.snowflake.com/en/developer-guide/python-connector/python-connector-api#id13?) but setting it to 'temp' doesn't seem to actually create a temporary table either? (not super confident about this though)

EDIT: seems... that there are auto_create_table, create_temp_table boolean args as well?

@tekumara
Copy link
Contributor

The Table '"foobar"' does not exist occurs for me even when its not a temporary table, eg:

import pandas as pd
import snowflake.connector.pandas_tools

cur.execute("create or replace table foobar (id int)")
# confirm it exists
cur.execute("select * from foobar").fetchall()

df = pd.DataFrame.from_records(
    [
        {"ID": 1},
        {"ID": 2},
    ]
)
snowflake.connector.pandas_tools.write_pandas(conn, df, "foobar")

The issue is write_pandas is quoting the table name during the copy, ie: its doing a

COPY INTO "foobar" 

By default, Snowflake treats quoted identifiers as case-sensitive. So its looking for a lowercase "foobar" table name, but the table was created as FOOBAR. As a workaround try uppercasing the table name when calling write_pandas.

It would be nice to make the write_pandas behaviour less surprising.

@ChannyClaus
Copy link
Author

ChannyClaus commented Jan 21, 2024

ah sweet - capitalizing the table name and column names does seem to make my script from above work, thanks!

but as you point out it is somewhat confusing to be able to create the table via a statement like CREATE TEMPORARY TABLE test (a int, b int, c int) where both the table names and column names are specified in lower case letters but unable to query that table using exactly the same table name and columns.

would be helpful if we could either:

  1. fail the CREATE statement if the table / column names aren't in upper case letters.
  2. make the SELECT query and others be able to refer to the table with lower letters.

seems like 2 is how it works on Snowflake Worksheets UI so likely is the preferable solution?

@tekumara
Copy link
Contributor

Ah so there's a quote_identifiers argument:

If False, prevents the connector from putting double quotes around identifiers before sending the identifiers to the server. By default, the connector puts double quotes around identifiers.

Seems like using quote_identifiers=False is more aligned with the behaviour expected here (and would be a better default).

@sfc-gh-aling
Copy link
Collaborator

it seems like you have figured out the root cause and found the parameter to control the quotes.
I'm afraid we can not make the change as it would be a breaking change, but I think we can make some improves in the doc to make it clear.

is there any other help you need?

@sfc-gh-aling sfc-gh-aling added enhancement The issue is a request for improvement or a new feature triaged and removed bug needs triage labels Mar 5, 2024
@ChannyClaus
Copy link
Author

all good on my end - used some other workaround even prior to reporting this issue :.)

would it be a breaking change, though? i thought making this change would not break any of the current usage (it would become more permissive, right?).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement The issue is a request for improvement or a new feature triaged
Projects
None yet
Development

No branches or pull requests

3 participants