-
Notifications
You must be signed in to change notification settings - Fork 901
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: The dissolve feature generates a new geometry #3230
Comments
Simpler code snippet: from geopandas import GeoDataFrame
from shapely import LineString
gdf1 = GeoDataFrame([
{"geometry": LineString([ (355041.15, 6688781.25, 0), (355040.9629213488, 6688781.437078651, 9.7) ])},
{"geometry": LineString([ (355041.1500000001, 6688781.25, 0), (354841.1500000001, 6688781.25, 0) ])}
], crs="epsg:2154")
gdf2 = gdf1.dissolve()
assert (len(gdf2.loc[0, "geometry"].geoms) == len(gdf1)), "Bug" |
@thomas-leduc I took a quick look, and will aim to return later. It seems to be a rounding error happening somewhere, resulting in a tiny small extra snippet. If I truncate the first 5 digits from the x/y coords of the line string: gdf1 = GeoDataFrame([
{"geometry": LineString([ (1.15, 81.25, 0), (0.9629213488, 81.437078651, 9.7) ])},
{"geometry": LineString([ (1.1500000001, 1.25, 0), (1.1500000001, 81.25, 0) ])}
], crs="epsg:2154")
gdf2 = gdf1.dissolve()
print(len(gdf2.loc[0, "geometry"].geoms) == len(gdf1))
# True With the original geometry where the bug occurs: gdf3.geometry.length
gdf1 = GeoDataFrame([
{"geometry": LineString([ (355041.15, 6688781.25), (355040.9629213488, 6688781.437078651) ])},
{"geometry": LineString([ (355041.1500000001, 6688781.25), (354841.1500000001, 6688781.25) ])}
], crs="epsg:2154")
gdf2 = gdf1.dissolve()
gdf3 = gpd.GeoDataFrame(geometry=[g for g in gdf2.loc[0, 'geometry'].geoms])
print(gdf3.geometry.length)
0 2.645692e-01
1 5.820766e-11
2 2.000000e+02
dtype: float64 |
The same behaviour as in shapely/shapely#2028 is happening here, because under the hood from geopandas import GeoDataFrame
import shapely
from shapely import LineString
gdf1 = GeoDataFrame([
{"geometry": LineString([ (355041.15, 6688781.25, 0), (355040.9629213488, 6688781.437078651, 9.7) ])},
{"geometry": LineString([ (355041.15, 6688781.25, 0), (354841.1500000001, 6688781.25, 0) ])}
], crs="epsg:2154")
gdf2 = gdf1.dissolve()
print(f"{gdf2.loc[0, 'geometry']=}")
# gdf2.loc[0, 'geometry']=<MULTILINESTRING Z ((355041.15 6688781.25 0, 355040.963 6688781.437 9.7), (3...>
merged = shapely.line_merge(gdf1.geometry)
print(f"{merged[0]=}")
# merged[0]=<LINESTRING Z (355041.15 6688781.25 0, 355040.963 6688781.437 9.7)> |
@theroggy Thanks for tracing this upstream. I'll wait on discussion in libgeos/geos#1063 for now. |
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of geopandas.
(optional) I have confirmed this bug exists on the main branch of geopandas.
Note: Please read this guide detailing how to provide the necessary information for us to reproduce your bug.
Code Sample, a copy-pastable example
Problem description
The dissolve feature produces a MultiLineString with not 2 but 3 LineStrings.
Expected Output
The expected length is 2, not 3.
Output of
geopandas.show_versions()
SYSTEM INFO
python : 3.10.13 | packaged by conda-forge | (main, Oct 26 2023, 18:07:37) [GCC 12.3.0]
executable : /home/tleduc/miniconda3/envs/gpd/bin/python3.10
machine : Linux-6.5.0-26-generic-x86_64-with-glibc2.35
GEOS, GDAL, PROJ INFO
GEOS : 3.12.1
GEOS lib : None
GDAL : 3.8.4
GDAL data dir: /home/tleduc/miniconda3/envs/gpd/share/gdal
PROJ : 9.3.1
PROJ data dir: /home/tleduc/miniconda3/envs/gpd/share/proj
PYTHON DEPENDENCIES
geopandas : 0.14.3
numpy : 1.26.4
pandas : 2.2.1
pyproj : 3.6.1
shapely : 2.0.3
fiona : 1.9.6
geoalchemy2: None
geopy : 2.4.1
matplotlib : 3.8.3
mapclassify: 2.6.1
pygeos : None
pyogrio : None
psycopg2 : 2.9.9 (dt dec pq3 ext lo64)
pyarrow : None
rtree : 1.2.0
The text was updated successfully, but these errors were encountered: