What summarization method is used in geopandas.GeoDataFrame.plot for multiple values? #2980
I was doing a distance analysis with geopandas, using a hexagonal planar subdivision of a region of interest (hexagons_gdf) and a set of points of interest (poi_gdf). First I calculated the distances between the hexagon centroids and the points, which gave me a heatmap_data like this:
Then I plotted heatmap_data with this code:
So I was wondering how geopandas determines the value used in the plot, since heatmap_data has many distance values for each hexagon. Is it the mean value? Where can I find this? Thanks!
Replies: 1 comment 1 reply
Hi @lcoandrade, thanks for the question. I hope I'm understanding it right based on the snippets you've shown, but no summarization is done when plotting: all duplicates are laid on top of each other, with the last record shown on top. This might help illustrate:

```python
from geodatasets import get_path
import pandas as pd
import geopandas as gpd

# Build five copies of the same polygon, each with a different distance value.
gdf = gpd.read_file(get_path('nybb'))
gdf2 = pd.concat([gdf.iloc[[0]].assign(distance=i) for i in range(5)])

# Plotted as-is, the last row (distance=4) is the one shown on top.
gdf2.plot(column='distance', cmap='viridis', legend=True)

# Sorted descending, the last row is distance=0, so that value is shown instead.
gdf2.sort_values(by='distance', ascending=False).plot(column='distance', cmap='viridis', legend=True)
```

So you would need to take care of deduplication / aggregation yourself before plotting.
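As a rough sketch of what that aggregation step could look like: the original heatmap_data snippet isn't shown here, so the example below assumes one row per (hexagon, point) pair, with a hypothetical `hex_id` column identifying the hexagon and a `distance` column. Two square cells stand in for the hexagons. It reduces the distances to one value per cell with a pandas `groupby` (minimum distance here, but `mean` or `max` would work the same way) and merges that value back onto the geometries.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import box

# Hypothetical stand-in for hexagons_gdf: two square cells with an id column.
hexagons_gdf = gpd.GeoDataFrame(
    {"hex_id": [0, 1]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
)

# Hypothetical stand-in for the distance table: one row per (cell, point) pair.
distances = pd.DataFrame(
    {"hex_id": [0, 0, 0, 1, 1], "distance": [5.0, 2.0, 9.0, 4.0, 7.0]}
)

# Reduce to a single value per cell; min() gives the distance to the nearest
# point. Replace with .mean(), .max(), etc. depending on the analysis.
per_hex = distances.groupby("hex_id")["distance"].min().reset_index()

# Attach the aggregated value to the geometries: one row per cell.
aggregated = hexagons_gdf.merge(per_hex, on="hex_id", how="left")
```

With one row per cell, `aggregated.plot(column='distance', cmap='viridis', legend=True)` then colors each cell by the value you chose, rather than by whichever duplicate happened to be drawn last.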