Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Write Unknown cartesian CRS when saving gdf without a CRS to GPKG #368

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

theroggy
Copy link
Member

@theroggy theroggy commented Mar 1, 2024

Questions/remarks:

  • I only put it in the code path for GeoDataFrames and not in general as we only know for these that it should be cartesian I think?
  • If I remember correctly I think it was decided in the meeting not to set the crs to None for that crs when reading?

resolves #367

@theroggy theroggy added this to the 0.8.0 milestone Mar 1, 2024
@theroggy theroggy marked this pull request as ready for review March 1, 2024 19:09
elif driver == "GPKG":
# In GPKG, None crs must be replaced by "Undefined Cartesian SRS", otherwise
# the default "Undefined geographic SRS" will be used.
crs = 'LOCAL_CS["Undefined Cartesian SRS"]'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this enough? My understanding was that this is required:

ENGCRS["Undefined Cartesian SRS with unknown unit",
    EDATUM["Unknown engineering datum"],
    CS[Cartesian,2],
        AXIS["X",unspecified,
            ORDER[1],
            LENGTHUNIT["unknown",0]],
        AXIS["Y",unspecified,
            ORDER[2],
            LENGTHUNIT["unknown",0]]]

See the dicsussion in r-spatial/sf#2049.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not an expert on projections, but as far as I can tell both are equivalent: LOCAL_CS["Undefined Cartesian SRS"] is just a shorter name for the full wkt.

I did a manual check on this before, but to be sure I added an explicit check in the test that the srs_id in the GPKG written is -1, which is the main thing we are trying to do here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference is that LOCAL_CS["Undefined Cartesian SRS"] assumes meters as units and axes as easting and northing, which is incorrect:

<Engineering CRS: LOCAL_CS["Undefined Cartesian SRS",UNIT["Meter",1] ...>
Name: Undefined Cartesian SRS
Axis Info [cartesian]:
- [east]: Easting (Meter)
- [north]: Northing (Meter)

The WKT above is explicit about making no assumptions about anything, because we don't know anything.

Copy link
Member Author

@theroggy theroggy Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using that WKT does not comply to the GeoPackage specs: https://www.geopackage.org/spec/#r11. Using that WKT leads to a new custom CRS being "created" in the GeoPackage with srs_id = 100.000 (or whatever is the first unused srs_id in that range in the geopackage) instead of using srs_id = -1, as should be used according to the specs.

I wonder what is the rational that they went for this solution in r-spatial? I don't really found one in the post above. Maybe it should be changed in GDAL that ENGCRS["Undefined Cartesian SRS with unknown unit" is (also) mapped to srs_id = -1 in geopackage?

I did some extra tests and proj does seem to treat them all as equivalent, so I suppose that in practice it probably won't matter to much with the different crs's being interused? Would this lead to it being relatively unimportant that r-spatial choose something different than the gpkg specs (and supposedly other libraries following that) and that the end user should have trouble with those differences, or is that a bit naive? Possibly other libraries than proj don't treat them as equal?

from pyproj import CRS

crs_undefined = CRS('LOCAL_CS["Undefined Cartesian SRS"]')
crs_undefined_unknown = CRS('LOCAL_CS["Undefined Cartesian SRS with unknown unit"]')
try:
    crs_undefined_unknown_eng = CRS(
        'ENGCRS["Undefined Cartesian SRS with unknown unit"]'
    )
except Exception as ex:
    print(f"Exception raised: {ex}")

crs_undefined_unknown_eng_wkt = """
    ENGCRS["Undefined Cartesian SRS with unknown unit",
        EDATUM["Unknown engineering datum"],
        CS[Cartesian,2],
            AXIS["X",unspecified,
                ORDER[1],
                LENGTHUNIT["unknown",0]],
            AXIS["Y",unspecified,
                ORDER[2],
                LENGTHUNIT["unknown",0]]]
"""
crs_undefined_unknown_eng2 = CRS(crs_undefined_unknown_eng_wkt)

# Check differences
print(f"{crs_undefined_unknown.equals(crs_undefined_unknown_eng2)=}")
print(f"{crs_undefined_unknown.to_wkt() == crs_undefined_unknown_eng2.to_wkt()=}")

print(f"{crs_undefined_unknown.equals(crs_undefined)=}")
print(f"{crs_undefined_unknown.to_wkt() == crs_undefined.to_wkt()=}")

print(f"{crs_undefined_unknown_eng2.equals(crs_undefined)=}")
print(f"{crs_undefined_unknown_eng2.to_wkt() == crs_undefined.to_wkt()=}")

Output:

Exception raised: Invalid projection: ENGCRS["Undefined Cartesian SRS with unknown unit"]: (Internal Proj Error: proj_create: Missing EDATUM / ENGINEERINGDATUM node)
crs_undefined_unknown.equals(crs_undefined_unknown_eng2)=True
crs_undefined_unknown.to_wkt() == crs_undefined_unknown_eng2.to_wkt()=False
crs_undefined_unknown.equals(crs_undefined)=True
crs_undefined_unknown.to_wkt() == crs_undefined.to_wkt()=False
crs_undefined_unknown_eng2.equals(crs_undefined)=True
crs_undefined_unknown_eng2.to_wkt() == crs_undefined.to_wkt()=False

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tricky. When using the WKT, it correctly roundtrips with all the fields Unknown but the srid is 100000. I am not sure which is better.

@edzer are you sure the way this is handled in {sf} is actually correct? Meaning it correctly roundtrips and follows the spec?

Copy link
Member Author

@theroggy theroggy Mar 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reference to GDAL code handling srs_id=-1, as looked for during the community meeting by @jorisvandenbossche :

@edzer
Copy link

edzer commented Mar 28, 2024

@edzer are you sure the way this is handled in {sf} is actually correct? Meaning it correctly roundtrips and follows the spec?

@martinfleis which spec are you referring to?

@martinfleis
Copy link
Member

@edzer I think I meant GeoPackage spec. From what I understand when @theroggy explained that, undefined SRS shall be saved as -1. But if we replicate the WKT from the sf issue linked above, it is not the case and it is saved as a custom SRS instead with ID 10000 or something like that (@theroggy please correct me here).

@edzer
Copy link

edzer commented Mar 28, 2024

IIRC, the problem with the undefined SRS in GeoPackages is that they are assumed (and, after roundtrip read as) geographic coordinates with undefined datum. If you want to specify they are Cartesian but otherwise unknown, you can use the WKT

ENGCRS["Undefined Cartesian SRS with unknown unit",
  EDATUM["Unknown engineering datum"],
  CS[Cartesian, 2],
  AXIS["X",unspecified,ORDER[1],LENGTHUNIT["unknown",0]],   
  AXIS["Y",unspecified,ORDER[2],LENGTHUNIT["unknown",0]]
]

which is what R package sf writes to geopackages when the CRS is NA: https://github.com/r-spatial/sf/blob/main/R/read.R#L548

@edzer
Copy link

edzer commented Mar 28, 2024

See also r-spatial/sf#2049 (comment)

@martinfleis
Copy link
Member

the undefined SRS in GeoPackages is that they are assumed (and, after roundtrip read as) geographic coordinates with undefined datum. If you want to specify they are Cartesian but otherwise unknown, you can use the WKT

Undefined in sense of SRS=0 is assumed geographic but SRS=-1 is assumed Cartesian. Citing Requirement 11 from the version 1.4 of the spec:

The record with an srs_id of -1 SHALL be used for undefined Cartesian coordinate reference systems. The record with an srs_id of 0 SHALL be used for undefined geographic coordinate reference systems.

That is where I think that the sf solution with custom WKT is not exactly reflecting the spec.

GDAL correctly understands -1 as Cartesian but, for unknown reasons, sets units to meters here. We discusses on the dev call earlier today that this should probably be fixed to resolve as Undefined Cartesian SRS with unknown units rather than with meters. Then setting SRS to -1 should do what we want it to without specifying custom WKT.

@edzer
Copy link

edzer commented Mar 28, 2024

So, maybe raise an issue with GDAL?

…artesian-CRS-when-saving-gdf-without-a-CRS-to-GPKG
@theroggy
Copy link
Member Author

As planned in the community meeting and suggested by @edzer , I opened an issue to discuss this further in GDAL: OSGeo/gdal#9580

…artesian-CRS-when-saving-gdf-without-a-CRS-to-GPKG
@brendan-ward
Copy link
Member

Just checking back on the status of this PR; it looks like this is potentially waiting on upstream changes to GDAL possibly in 3.9? But then what is the path forward on this for GDAL < 3.9?

@brendan-ward brendan-ward removed this from the 0.8.0 milestone Apr 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Write Unknown cartesian CRS when saving gdf without a CRS to GPKG
4 participants