I keep getting this error when querying a table created from a Dask DataFrame that reads a CSV file. A couple of columns in the CSV file are strings. I've tried multiple ways to convert the pyarrow string type, but none of them worked and the type remained unchanged. How should I proceed?
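import dask.dataframe as dd
from dask_sql import Context

df = dd.read_csv("../sales.csv")
print(df.dtypes)  # the string columns show up with a pyarrow string dtype
c = Context()
c.create_table("sales", df)
result = c.sql("SELECT * FROM sales").compute()
print(result)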
/ArrowFlightService/lib/python3.9/site-packages/dask_sql/mappings.py", line 120, in python_to_sql_type
    raise NotImplementedError(
NotImplementedError: The python type string is not implemented (yet)
My assumption here is that we're getting bitten by Dask's eager conversion of object columns to pyarrow strings, which we haven't been able to fully support yet (working on this in #1220). Are you able to disable this eager conversion with dask.config.set({"dataframe.convert-string": False})? I'd be interested to know whether that unblocks things for you.
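For anyone trying the workaround, here is a minimal sketch of two ways to apply the setting. It assumes the option has to be in effect before the DataFrame is created (i.e. before the dd.read_csv call), which is my understanding of how the eager conversion is applied rather than something confirmed in this thread. Only dask.config.set and dask.config.get are used; the table-building steps are left as comments.

```python
import dask

# Remember the current value so the scoped form below can be checked.
before = dask.config.get("dataframe.convert-string", None)

# Scoped form: the override is active only inside the block.
# Assumption: create the DataFrame and register/query the table inside it.
with dask.config.set({"dataframe.convert-string": False}):
    inside = dask.config.get("dataframe.convert-string")
    # df = dd.read_csv(...); c.create_table(...); c.sql(...).compute()
after = dask.config.get("dataframe.convert-string", None)

# Global form: stays in effect for the rest of the process.
# Run it before dd.read_csv(...) or any other DataFrame constructor.
dask.config.set({"dataframe.convert-string": False})
global_value = dask.config.get("dataframe.convert-string")

print(inside, global_value)  # → False False
```

The global form is the simplest for a script or notebook. The scoped form may be preferable if other code in the same process relies on pyarrow strings.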
As discussed in Discourse, the basic documentation example reproduces this error, but disabling eager conversion fixes it.
import dask.datasets
df = dask.datasets.timeseries()
from dask_sql import Context
c = Context()
c.create_table("timeseries", df, persist=True)
result = c.sql(""" SELECT name, SUM(x) AS "sum" FROM timeseries WHERE x > 0.5 GROUP BY name""")
result.compute()