SNOW-1025489: inconsistent timestamp downscaling #1868

Open
jwyang-qraft opened this issue Jan 31, 2024 · 1 comment
jwyang-qraft commented Jan 31, 2024

Python version

Python 3.11.5 (main, Sep 11 2023, 13:54:46) [GCC 11.2.0]

Operating system and processor architecture

Linux-5.4.0-165-generic-x86_64-with-glibc2.31

Installed packages

numba==0.58.1
numpy @ file:///work/mkl/numpy_and_numpy_base_1682953417311/work
pandas==2.1.4
python-dateutil @ file:///tmp/build/80754af9/python-dateutil_1626374649649/work
pytz==2022.7.1
requests==2.31.0
snowballstemmer @ file:///tmp/build/80754af9/snowballstemmer_1637937080595/work
snowflake-connector-python==3.7.0
snowflake-sqlalchemy==1.5.1
SQLAlchemy==1.4.50
tqdm==4.66.1

What did you do?

Fetched a TIMESTAMP_NTZ(9) column with values that overflow int64 at ns precision (e.g. '9999-12-31 00:00:00.000').
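
A minimal sketch of the kind of fetch that hits this, assuming hypothetical connection parameters and table name (DT is the TIMESTAMP_NTZ(9) column that appears in the error below):

import snowflake.connector

# Hypothetical connection; any TIMESTAMP_NTZ(9) column whose values span both
# the ns-representable range and dates past 2262-04-11 should reproduce this.
conn = snowflake.connector.connect(
    account="...", user="...", password="...",
    warehouse="...", database="...", schema="...",
)
cur = conn.cursor()
cur.execute("SELECT DT FROM my_table")   # my_table is a placeholder
table = cur.fetch_arrow_all()            # raises pyarrow.lib.ArrowInvalid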

What did you expect to see?

I tried to fetch a column of TIMESTAMP_NTZ(9) type whose maximum datetime is '9999-12-31 00:00:00.000' and whose minimum is '1987-01-30 23:59:59.000'. I expected the fetch to succeed.

Instead, I get the following error when I select from that column:

  File "/home/jwyang/anaconda3/lib/python3.11/site-packages/snowflake/connector/result_batch.py", line 79, in _create_nanoarrow_iterator
    else PyArrowTableIterator(
         ^^^^^^^^^^^^^^^^^^^^^
  File "src/snowflake/connector/nanoarrow_cpp/ArrowIterator/nanoarrow_arrow_iterator.pyx", line 239, in snowflake.connector.nanoarrow_arrow_iterator.PyArrowTableIterator.__cinit__
  File "pyarrow/table.pxi", line 4116, in pyarrow.lib.Table.from_batches
  File "pyarrow/error.pxi", line 154, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 91, in pyarrow.lib.check_status
  
  pyarrow.lib.ArrowInvalid: Schema at index 2 was different:
  DT: timestamp[us]
  vs
  DT: timestamp[ns]

Because '9999-12-31 00:00:00.000' doesn't fit in an int64 at ns precision, it appears to be downcast to us precision on a per-batch basis during result-batch conversion.

I am guessing the downcast is not applied to all batches, so the batches end up with different data types, which pyarrow does not allow when assembling the table.
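
For illustration only (this reproduces the pyarrow schema check, not the connector's conversion logic): '9999-12-31' is far beyond the int64-nanosecond ceiling of 2262-04-11, so a batch containing it can only be expressed at us precision, and pyarrow.Table.from_batches refuses to combine batches whose schemas disagree.

import pyarrow as pa
from datetime import datetime

# One batch whose values fit in timestamp[ns], one that had to be stored as timestamp[us].
ns_batch = pa.record_batch(
    [pa.array([datetime(1987, 1, 30, 23, 59, 59)], type=pa.timestamp("ns"))],
    names=["DT"],
)
us_batch = pa.record_batch(
    [pa.array([datetime(9999, 12, 31)], type=pa.timestamp("us"))],
    names=["DT"],
)

# Raises pyarrow.lib.ArrowInvalid: Schema at index 1 was different:
# DT: timestamp[us] vs DT: timestamp[ns]
pa.Table.from_batches([ns_batch, us_batch])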

Can you set logging to DEBUG and collect the logs?

import logging
import os

# Attach a DEBUG-level stream handler to the Snowflake connector logger
for logger_name in ('snowflake.connector',):
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    ch = logging.StreamHandler()
    ch.setLevel(logging.DEBUG)
    ch.setFormatter(logging.Formatter('%(asctime)s - %(threadName)s %(filename)s:%(lineno)d - %(funcName)s() - %(levelname)s - %(message)s'))
    logger.addHandler(ch)
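
Attaching the handler before calling snowflake.connector.connect() captures the whole session at DEBUG, including the result-batch fetch and Arrow conversion where this error surfaces.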
@github-actions github-actions bot changed the title inconsistent timestamp downscaling SNOW-1025489: inconsistent timestamp downscaling Jan 31, 2024
sfc-gh-aling (Collaborator) commented

thanks @jwyang-qraft for reaching out! we will look into the issue
