[BUG] on Starter example #1315

Open
tiraldj opened this issue Mar 15, 2024 · 1 comment
Labels: bug (Something isn't working), documentation (Improvements or additions to documentation), good first issue (Good for newcomers)

Comments

tiraldj commented Mar 15, 2024

Easy to fix. On https://dask-sql.readthedocs.io/en/latest/, the last line of the Example code is:

# ...or use it for another computation
result.sum.mean().compute()

This throws an error because `sum` was accidentally left in.

The code should be:

# ...or use it for another computation
result.mean().compute()
@tiraldj tiraldj added bug Something isn't working needs triage Awaiting triage by a dask-sql maintainer labels Mar 15, 2024
Collaborator

ayushdg commented Mar 15, 2024

Thanks for raising this, @tiraldj. It looks like the SQL query creates a column called `sum`, but it cannot be accessed via `result.sum` because that name is reserved for the `sum` method.
As you mentioned, `result.mean().compute()` or `result["sum"].mean().compute()` probably accomplishes what the example is going for.
@tiraldj is this something you would be comfortable submitting a PR for? It's okay if you can't as well 😄 .
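
The attribute-versus-column distinction described in this comment can be illustrated without dask at all. The sketch below uses a hypothetical `ToyFrame` class (not the pandas, dask, or dask-sql API) and assumes only that column attribute access is implemented through `__getattr__`, which Python calls only when normal attribute lookup fails. Under that assumption, a column named like an existing method is reachable through brackets only.

```python
# Toy stand-in for a DataFrame. ToyFrame is a hypothetical class for
# illustration only; it is not part of pandas, dask, or dask-sql.
class ToyFrame:
    def __init__(self, columns):
        self._columns = dict(columns)

    def sum(self):
        """Aggregate every column, like a frame-level reduction."""
        return {name: sum(values) for name, values in self._columns.items()}

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so it never
        # sees names that already exist as methods, such as "sum".
        try:
            return self._columns[name]
        except KeyError:
            raise AttributeError(name) from None

    def __getitem__(self, name):
        # Bracket access always looks up the column.
        return self._columns[name]


frame = ToyFrame({"x": [1.0, 2.0], "sum": [1.0, 3.0]})

by_attribute = frame.sum   # the bound method, not the column
by_bracket = frame["sum"]  # the column values
other_column = frame.x     # works, because no method is named "x"
```

Here `frame.sum` is a bound method, so chaining `.mean()` onto it fails with an `AttributeError`, while `frame["sum"]` returns the column. This is why bracket access is the safer spelling whenever a column name collides with a method name.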

@ayushdg ayushdg added documentation Improvements or additions to documentation good first issue Good for newcomers and removed needs triage Awaiting triage by a dask-sql maintainer labels Mar 15, 2024
filabrazilska added a commit to filabrazilska/dask-sql that referenced this issue Apr 10, 2024
The example raised an error because `.sum` was left in the last line.
Replacing it with `result["sum"].mean().compute()` computes
the mean correctly:

```python
>>> import dask.datasets
>>> dask.config.set({"dataframe.convert-string": False})
<dask.config.set object at 0x7feccbf70a10>
>>> from dask_sql import Context
>>> df = dask.datasets.timeseries()
>>> c = Context()
>>> c.create_table("timeseries", df)
>>> result = c.sql("""
... SELECT name, SUM(x) as "sum" FROM timeseries GROUP BY name
... """)
>>> result["sum"].mean().compute()
7.956598887824248
```

Eager string conversion (`dataframe.convert-string`) also needs to be
turned off to avoid the following error:
```python
>>> result = c.sql("""
... SELECT name, SUM(x) as "sum" FROM timeseries GROUP BY name
... """)
Traceback (most recent call last):
  File "/home/fila/miniconda3/envs/dask-sql/lib/python3.11/site-packages/dask_sql/mappings.py", line 118, in python_to_sql_type
    return DaskTypeMap(_PYTHON_TO_SQL[python_type])
                       ~~~~~~~~~~~~~~^^^^^^^^^^^^^
KeyError: string[pyarrow]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/fila/miniconda3/envs/dask-sql/lib/python3.11/site-packages/dask_sql/context.py", line 517, in sql
    rel, _ = self._get_ral(sql)
             ^^^^^^^^^^^^^^^^^^
  File "/home/fila/miniconda3/envs/dask-sql/lib/python3.11/site-packages/dask_sql/context.py", line 826, in _get_ral
    schemas = self._prepare_schemas()
              ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fila/miniconda3/envs/dask-sql/lib/python3.11/site-packages/dask_sql/context.py", line 773, in _prepare_schemas
    column_type_mapping = list(
                          ^^^^^
  File "/home/fila/miniconda3/envs/dask-sql/lib/python3.11/site-packages/dask_sql/mappings.py", line 120, in python_to_sql_type
    raise NotImplementedError(
NotImplementedError: The python type string is not implemented (yet)
```
charlesbluca added a commit that referenced this issue Apr 17, 2024
Co-authored-by: Charles Blackmon-Luca <20627856+charlesbluca@users.noreply.github.com>