ENH: explore(): skip if fields/index are Timestamp #2378

Closed
raybellwaves opened this issue Mar 14, 2022 · 12 comments · Fixed by #3261

@raybellwaves
Contributor

Is your feature request related to a problem?

Running gdf.explore() raises TypeError: Object of type Timestamp is not JSON serializable when the GeoDataFrame has a column of dtype datetime64.

Describe the solution you'd like

Raise a warning and have output behave like explore without the datetime fields.

Perhaps a skip_datetime_fields argument in explore()?

API breaking implications

None

Describe alternatives you've considered

This one liner may fix it:

import geopandas as gpd
from datetime import datetime

gdf = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
gdf["date"] = datetime(2022, 1, 1)
gdf.explore()  # raises TypeError; traceback below

# Workaround: drop the datetime columns (column order is preserved)
datetime_cols = gdf.select_dtypes(include="datetime").columns
gdf.drop(columns=datetime_cols).explore()
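
The warn-and-skip behaviour requested in this issue could be sketched roughly as follows. `drop_datetime_columns` is a hypothetical helper, not part of geopandas, and a plain pandas DataFrame stands in for a GeoDataFrame here (assuming the same `select_dtypes` and `drop` calls apply, since GeoDataFrame subclasses DataFrame).

```python
import warnings

import pandas as pd


def drop_datetime_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return df without datetime columns, warning about what was dropped.

    Hypothetical helper for illustration; not a geopandas API.
    """
    dt_cols = df.select_dtypes(include=["datetime", "datetimetz"]).columns
    if len(dt_cols) > 0:
        warnings.warn(
            f"Dropping datetime columns before plotting: {list(dt_cols)}",
            UserWarning,
            stacklevel=2,
        )
    # drop() keeps the remaining columns in their original order
    return df.drop(columns=dt_cols)


df = pd.DataFrame(
    {
        "name": ["a", "b"],
        "date": pd.to_datetime(["2022-01-01", "2022-01-02"]),
    }
)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cleaned = drop_datetime_columns(df)
```

With a real GeoDataFrame the call would presumably be `drop_datetime_columns(gdf).explore()`. The warning makes it visible to the user which fields were left out.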

Additional context

The traceback goes into folium/IPython, so it may be worth reporting the issue upstream (or finding the existing issue, if there is one).

Traceback:

TypeError                                 Traceback (most recent call last)
File ~/miniconda3/envs/main/lib/python3.9/site-packages/IPython/core/formatters.py:343, in BaseFormatter.__call__(self, obj)
    341     method = get_real_method(obj, self.print_method)
    342     if method is not None:
--> 343         return method()
    344     return None
    345 else:

File ~/miniconda3/envs/main/lib/python3.9/site-packages/folium/folium.py:299, in Map._repr_html_(self, **kwargs)
    297     self._parent = None
    298 else:
--> 299     out = self._parent._repr_html_(**kwargs)
    300 return out

File ~/miniconda3/envs/main/lib/python3.9/site-packages/branca/element.py:331, in Figure._repr_html_(self, **kwargs)
    322 def _repr_html_(self, **kwargs):
    323     """Displays the Figure in a Jupyter notebook.
    324 
    325     Percent-encoded HTML is stored in data-html attribute, which is used to populate
   (...)
    329 
    330     """
--> 331     html = urllib.parse.quote(self.render(**kwargs))
    332     onload = (
    333         'this.contentDocument.open();'
    334         'this.contentDocument.write('
   (...)
    337         'this.contentDocument.close();'
    338     )
    340     if self.height is None:

File ~/miniconda3/envs/main/lib/python3.9/site-packages/branca/element.py:319, in Figure.render(self, **kwargs)
    317 """Renders the HTML representation of the element."""
    318 for name, child in self._children.items():
--> 319     child.render(**kwargs)
    320 return self._template.render(this=self, kwargs=kwargs)

File ~/miniconda3/envs/main/lib/python3.9/site-packages/folium/folium.py:368, in Map.render(self, **kwargs)
    349 figure.header.add_child(Element(
    350     '<style>html, body {'
    351     'width: 100%;'
   (...)
    355     '}'
    356     '</style>'), name='css_style')
    358 figure.header.add_child(Element(
    359     '<style>#map {'
    360     'position:absolute;'
   (...)
    365     '}'
    366     '</style>'), name='map_style')
--> 368 super(Map, self).render(**kwargs)

File ~/miniconda3/envs/main/lib/python3.9/site-packages/folium/elements.py:21, in JSCSSMixin.render(self, **kwargs)
     18 for name, url in self.default_css:
     19     figure.header.add_child(CssLink(url), name=name)
---> 21 super().render(**kwargs)

File ~/miniconda3/envs/main/lib/python3.9/site-packages/branca/element.py:643, in MacroElement.render(self, **kwargs)
    639     figure.script.add_child(Element(script(self, kwargs)),
    640                             name=self.get_name())
    642 for name, element in self._children.items():
--> 643     element.render(**kwargs)

File ~/miniconda3/envs/main/lib/python3.9/site-packages/folium/features.py:626, in GeoJson.render(self, **kwargs)
    623     if self.highlight:
    624         self.highlight_map = mapper.get_highlight_map(
    625             self.highlight_function)
--> 626 super(GeoJson, self).render()

File ~/miniconda3/envs/main/lib/python3.9/site-packages/branca/element.py:639, in MacroElement.render(self, **kwargs)
    637 script = self._template.module.__dict__.get('script', None)
    638 if script is not None:
--> 639     figure.script.add_child(Element(script(self, kwargs)),
    640                             name=self.get_name())
    642 for name, element in self._children.items():
    643     element.render(**kwargs)

File ~/miniconda3/envs/main/lib/python3.9/site-packages/jinja2/runtime.py:814, in Macro.__call__(self, *args, **kwargs)
    808 elif len(args) > self._argument_count:
    809     raise TypeError(
    810         f"macro {self.name!r} takes not more than"
    811         f" {len(self.arguments)} argument(s)"
    812     )
--> 814 return self._invoke(arguments, autoescape)

File ~/miniconda3/envs/main/lib/python3.9/site-packages/jinja2/runtime.py:828, in Macro._invoke(self, arguments, autoescape)
    825 if self._environment.is_async:
    826     return self._async_invoke(arguments, autoescape)  # type: ignore
--> 828 rv = self._func(*arguments)
    830 if autoescape:
    831     rv = Markup(rv)

File <template>:238, in macro(l_1_this, l_1_kwargs)

File ~/miniconda3/envs/main/lib/python3.9/site-packages/jinja2/filters.py:1673, in do_tojson(eval_ctx, value, indent)
   1670     kwargs = kwargs.copy()
   1671     kwargs["indent"] = indent
-> 1673 return htmlsafe_json_dumps(value, dumps=dumps, **kwargs)

File ~/miniconda3/envs/main/lib/python3.9/site-packages/jinja2/utils.py:736, in htmlsafe_json_dumps(obj, dumps, **kwargs)
    732 if dumps is None:
    733     dumps = json.dumps
    735 return markupsafe.Markup(
--> 736     dumps(obj, **kwargs)
    737     .replace("<", "\\u003c")
    738     .replace(">", "\\u003e")
    739     .replace("&", "\\u0026")
    740     .replace("'", "\\u0027")
    741 )

File ~/miniconda3/envs/main/lib/python3.9/json/__init__.py:234, in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    232 if cls is None:
    233     cls = JSONEncoder
--> 234 return cls(
    235     skipkeys=skipkeys, ensure_ascii=ensure_ascii,
    236     check_circular=check_circular, allow_nan=allow_nan, indent=indent,
    237     separators=separators, default=default, sort_keys=sort_keys,
    238     **kw).encode(obj)

File ~/miniconda3/envs/main/lib/python3.9/json/encoder.py:199, in JSONEncoder.encode(self, o)
    195         return encode_basestring(o)
    196 # This doesn't pass the iterator directly to ''.join() because the
    197 # exceptions aren't as detailed.  The list call should be roughly
    198 # equivalent to the PySequence_Fast that ''.join() would do.
--> 199 chunks = self.iterencode(o, _one_shot=True)
    200 if not isinstance(chunks, (list, tuple)):
    201     chunks = list(chunks)

File ~/miniconda3/envs/main/lib/python3.9/json/encoder.py:257, in JSONEncoder.iterencode(self, o, _one_shot)
    252 else:
    253     _iterencode = _make_iterencode(
    254         markers, self.default, _encoder, self.indent, floatstr,
    255         self.key_separator, self.item_separator, self.sort_keys,
    256         self.skipkeys, _one_shot)
--> 257 return _iterencode(o, 0)

File ~/miniconda3/envs/main/lib/python3.9/json/encoder.py:179, in JSONEncoder.default(self, o)
    160 def default(self, o):
    161     """Implement this method in a subclass such that it returns
    162     a serializable object for ``o``, or calls the base implementation
    163     (to raise a ``TypeError``).
   (...)
    177 
    178     """
--> 179     raise TypeError(f'Object of type {o.__class__.__name__} '
    180                     f'is not JSON serializable')

TypeError: Object of type Timestamp is not JSON serializable
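
The last frames of the traceback show that the failure comes from the standard library `json` encoder, not from folium itself. A minimal sketch of the same failure, using a stdlib `datetime` as a stand-in for the pandas Timestamp (an assumption made to keep the example dependency-free):

```python
import json
from datetime import datetime

record = {"name": "Fiji", "date": datetime(2022, 1, 1)}

try:
    json.dumps(record)
except TypeError as err:
    # Same kind of error as in the traceback above
    message = str(err)

# Converting the value to an ISO 8601 string first makes it serializable.
record["date"] = record["date"].isoformat()
encoded = json.dumps(record)
```

Any fix therefore has to convert datetime values before they reach `json.dumps`, or give the encoder a way to convert them.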
@martinfleis
Member

This is related to #1906. I think that we should make sure these dtypes can be serialised as explored in #1920 rather than adding a keyword here. You can always specify which columns you want to avoid it.

@raybellwaves
Contributor Author

Thanks for linking the issue and the WIP PR. Closing this as a duplicate.

@raybellwaves
Contributor Author

Apologies, this is not related to the API here, but other interactive plotting libraries seem to handle the date dtype fine. I am linking them in case someone wants to dig into their source code for ideas. I understand this is outside of JSON encoding, but it may help with the plotting issue.

import plotly.express as px
px.choropleth_mapbox(
    gdf,
    geojson=gdf.geometry,
    locations=gdf.index,
    mapbox_style="carto-positron",
    hover_data=["date"],
)

import geoviews as gv
gv.Polygons(gdf, vdims=["index", "date"]).opts(tools=['hover'])

import hvplot.pandas
gdf.hvplot(geo=True, hover_cols="all")
# Note: passing only "date" in hover_cols (or in vdims above) fails, because it tries to create a colorbar from it

@martinfleis
Member

@raybellwaves can you move this to one of the open issues linked above? It'll get lost here.

@raybellwaves
Contributor Author

@martinfleis if that's OK, I'll reopen this.

To summarize: this issue is for discussing how to use explore() today when the GeoDataFrame has a datetime field.

#1906 covers how to handle low-level JSON serialization of datetime and geometry fields (the core underlying issue here).

@raybellwaves raybellwaves reopened this Mar 15, 2022
@olsgaard

@raybellwaves why do you propose that timestamps should be removed?

It would make more sense to convert them to numeric when the column argument points to a Timestamp column, and otherwise to convert them to strings before (or during) serialization.

This way, you can read the date on the pop-up, and you can get a color change indicating how far along the timeline you are.

I have started to use the following work-around:

import folium
import geopandas as gpd
import numpy as np
import pandas as pd


def explore(gdf: gpd.GeoDataFrame, *args, column=None, **kwargs) -> folium.Map:
    """Version of the `GeoDataFrame.explore()` method that handles datetime64
    columns somewhat reasonably. All arguments are passed on to `.explore()` after
    handling datetime64 columns in `gdf` or `column`.

    Can be used in a pipe like so:
        gdf.pipe(explore, m=m, column="datetime64_col", name="time")"""

    dt_columns = gdf.select_dtypes(["datetime64", "timedelta64"]).columns

    if isinstance(column, str):
        column = gdf[column]
    if pd.api.types.is_datetime64_any_dtype(column):
        column = column.astype("int64")
        # Rescale to [0, 1]; this keeps branca from hitting an overflow in long_scalars
        column = column - column.min()
        column = column / column.max()

    # Work on a copy so the caller's GeoDataFrame keeps its datetime columns
    gdf = gdf.copy()
    gdf[dt_columns] = gdf[dt_columns].astype("str")
    return gdf.explore(*args, column=column, **kwargs)

xs, ys = np.random.rand(20) + np.linspace(0, 5, 20), np.random.rand(20) + np.linspace(0, 5, 20)

gdf = gpd.GeoDataFrame({
    "id": "a", 
    "timestamp1": pd.date_range("2020-01-01", "2020-10-10", 20),
    "timestamp2": pd.date_range("2020-01-01", "2020-10-10", 20),
    "geometry": gpd.points_from_xy(xs, ys)
    }
)
gdf.pipe(explore, column="timestamp1")

image

As you can see in the screenshot, additional arguments are passed on without problems.

Of course, things like the colorbar need better labels, but it is good enough that one can use .pipe(explore) instead of .explore() in day-to-day EDA.

@mjohenneken

The same situation applies to other non-serializable types. In my case it was a column of UUID values. Skipping fields hides from the user of the function why the fields are missing. How about conversion to string if data is not json serializable?

@martinfleis
Member

> How about conversion to string if data is not json serializable?

That would be ideal. Would you, or anyone else, be willing to do a PR for that?
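
A minimal sketch of the string-fallback idea, using only the standard library. `stringify_unserializable` is a hypothetical name chosen for illustration, and this is not necessarily how geopandas ended up implementing it.

```python
import json
import uuid
from datetime import datetime


def stringify_unserializable(value):
    """Return value unchanged if json can encode it, else its str() form."""
    try:
        json.dumps(value)
    except TypeError:
        return str(value)
    return value


row = {
    "id": uuid.UUID("12345678-1234-5678-1234-567812345678"),
    "date": datetime(2022, 1, 1),
    "pop_est": 889953,
}
# Only the values json cannot handle are converted; numbers stay numbers.
safe_row = {key: stringify_unserializable(val) for key, val in row.items()}
encoded = json.dumps(safe_row)
```

A shorter variant is `json.dumps(row, default=str)`, which applies `str()` to anything the encoder cannot handle. The explicit helper is useful when the conversion has to happen before the data is handed to another library that calls `json.dumps` itself.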

@dshean

dshean commented Sep 14, 2023

Just ran into this issue again, working with a GeoDataFrame with a DatetimeIndex.

Have there been any more recent efforts to fix this? The explore functionality is great, but this is an unfortunate limitation.
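
Until a fix lands, one possible workaround for a datetime index is to format it as strings on a copy before plotting. The sketch below uses a plain pandas DataFrame as a stand-in for a GeoDataFrame, and assumes the index is what trips the serializer.

```python
import json

import pandas as pd

df = pd.DataFrame(
    {"value": [1, 2, 3]},
    index=pd.date_range("2020-01-01", periods=3, freq="D"),
)

# Copy first so the original datetime index stays available for analysis.
plot_df = df.copy()
plot_df.index = plot_df.index.strftime("%Y-%m-%d %H:%M:%S")

# The formatted index is now plain strings, which json can encode.
index_json = json.dumps(list(plot_df.index))
```

With a GeoDataFrame, `plot_df.explore()` would then presumably be called on the copy, while `df` keeps its DatetimeIndex for time-based selection.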

@martinfleis
Member

@dshean sorry, not yet. It will be fixed by 1.0.

@anitagraser

@martinfleis is it fixed in 1.0?

@martinfleis
Member

@anitagraser Will be once #3261 is merged. It is not in the alpha but will be in the RC.
