Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow gdal config via GDAL_INCLUDE/LIBRARY_PATH env variables (Simplify building on Windows with environment variables) #47

Merged
merged 6 commits into from Mar 9, 2022

Conversation

jorisvandenbossche
Copy link
Member

Related to #8

Currently, for installing on source on Windows, we indicate that you need to pass additional flags (to the include and library dirs): https://pyogrio.readthedocs.io/en/latest/install.html#windows. Passing those additional flags gets complicated if not using python setup.py build_ext, but eg pip, as noted in #8

So this PR adds an alternative way to do this, by specifying environment variables.
This is similar to how pygeos/shapely do it (eg https://github.com/pygeos/pygeos/blob/0835379a5bc534be63037696b1b70cf4337e3841/setup.py#L66-L72)

As a result, I can build locally on Windows with

set GDAL_INCLUDE_PATH=%CONDA_PREFIX%\Library\include
set GDAL_LIBRARY_PATH=%CONDA_PREFIX%\Library\lib

python setup.py build_ext -i

instead of

python setup.py build_ext -i -I"%CONDA_PREFIX%\Library\include" -lgdal_i -L"%CONDA_PREFIX%\Library\lib"

(and with pip it's even more a simplification, because of not needing the --install-option)

Still need to update the docs if we want this.

setup.py Outdated
@@ -89,7 +89,7 @@ def read_response(cmd):
try:
gdalinfo_path = None
for path in os.getenv("PATH", "").split(os.pathsep):
matches = list(Path(path).glob("**/gdalinfo*"))
matches = list(Path(path).glob("**/gdalinfo.exe"))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made this change because in my latest conda env, this was not working (while gdalinfo is available in a conda env on windows). The reason for this is that the current glob also found other gdalinfo occurences:

> c:\users\joris\scipy\pyogrio\setup.py(93)<module>()
-> matches = list(Path(path).glob("**/gdalinfo*"))
(Pdb) path
'C:\\Users\\joris\\miniconda3\\envs\\geo-dev'
(Pdb) n
> c:\users\joris\scipy\pyogrio\setup.py(94)<module>()
-> if matches:
(Pdb) matches
[WindowsPath('C:/Users/joris/miniconda3/envs/geo-dev/Lib/site-packages/osgeo_utils/samples/gdalinfo.py'), WindowsPath('C:/Users/joris/miniconda3/envs/geo-dev/Lib/site-packages/osgeo_utils/samples/__pycache__/gdalinfo.cpython-310.pyc'), WindowsPath('C:/Users/joris/miniconda3/envs/geo-dev/Library/bin/gdalinfo.exe')]

@jorisvandenbossche
Copy link
Member Author

To make this more generic, we might also want to allow setting those variables on linux / mac ? (pygeos does that as well)

@martinfleis
Copy link
Member

To make this more generic, we might also want to allow setting those variables on linux / mac ? (pygeos does that as well)

Yeah, I can imagine it can be useful in some cases.

@brendan-ward
Copy link
Member

Thanks for working on this @jorisvandenbossche ! This will be helpful.

allow setting those variables on linux / mac

Yes, this is a good idea, and in terms of precedence we should probably check those before and instead of using gdal-config.

Presumably we'd want precedence in this order?

  1. via flags to python setup.py
  2. via environment variables
  3. via gdal-config

But we should also think about trying to keep those as an atomic unit, to simplify the logic. Meaning, GDAL_INCLUDE_PATH, GDAL_LIBRARY_PATH, and GDAL_VERSION should all be set; otherwise if we derive some via env vars and some via gdal-config or gdalinfo, they might not be consistent with each other.

@jorisvandenbossche
Copy link
Member Author

That precedence order looks good. I am only not sure if we can control the precedence for the flags passed to setup.py or pip. From looking at setup.py, I assume that setuptools adds those automatically (as we don't have code for that, only to specify extra options from the gdal-config or env variables). But in practice that might already mean that they take precedence (or actually get overridden, if we want to be sure, probably need to test this).

But we should also think about trying to keep those as an atomic unit, to simplify the logic. Meaning, GDAL_INCLUDE_PATH, GDAL_LIBRARY_PATH, and GDAL_VERSION should all be set;

Yes, I think that is fine (I now already did that for the two path variables, to only use them if both are specified. And could add a warning if only some are specified)


There is one issue for that, and that's the name of the GDAL library. On Linux (and Mac?), this is gdal (so the flag you need to pass is -lgdal), while on Windows this is gdal_i (which I hardcoded in the current version of this PR).
Are we fine with keeping this hardcoded (platform-dependent)? (can it ever be something else?) Adding yet another environment variable for this seems a bit annoying for the default case where it's always gdal or gdal_i.

@jorisvandenbossche
Copy link
Member Author

Is it actually needed to require to specify GDAL_VERSION as well if GDAL_INCLUDE_PATH/GDAL_LIBRARY_PATH are specified? Because on Windows, that could be taken from gdalinfo, which could be available on the path.

Comment on lines +50 to +58
include_dir = os.environ.get("GDAL_INCLUDE_PATH")
library_dir = os.environ.get("GDAL_LIBRARY_PATH")

if include_dir and library_dir:
return {
"include_dirs": [include_dir],
"library_dirs": [library_dir],
"libraries": ["gdal_i" if platform.system() == "Windows" else "gdal"],
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the question is whether in this case we want to require that GDAL_VERSION env variable is set as well? (in which case we can check that to raise an error it too old)

Or do we only want to require this linux/mac? (and on windows still try gdalinfo --version first)

@brendan-ward
Copy link
Member

I don't know how the file name for GDAL library on Windows is determined, so I'm OK with that being hard-coded here. Can always be addressed later if we discover another variant of the name.

My concern with not requiring GDAL_VERSION if the other two are set is having a disconnect between the version of GDAL that you'd retrieve via gdalinfo vs the version to which the other two point - assuming that it is theoretically possible to have multiple installations of GDAL at the same time.

However, we're really only using GDAL_VERSION to check that it is at least our minimum supported version before attempting to do any compilation, since compilation may fail in bad and confusing ways if trying to use an unsupported version of GDAL. This originally was adapted from Fiona, but there, knowing the version was critical at compile time to determine which set of files to build (wrappers for GDAL 1.x vs 2.x).

We have a few options:

  1. always require it if the other two env vars are present, just to be explicit.
  2. use it if present, but allow to compilation to proceed without it being set. Maybe issue a warning to user that version is unknown and compilation may fail if version is too old.
  3. stop using altogether, and just treat minimum supported version as documentation. It is user's responsibility to ensure correct version.
  4. use ctypes to obtain GDAL version from the GDAL library in setup.py (based on the env vars, flags, etc to determine the correct library - so it will always be the one we compile against) and verify that it is correct version.

At this point, I'm wondering if we want to simplify and go with option 3; none of our actions (i.e., what files to include) at compile time depend on this.

@jorisvandenbossche
Copy link
Member Author

Option 2 might also still be useful? Or at least check the GDAL version if it can be obtained from the gdal-config / gdalinfo, and otherwise raise a warning if GDAL_VERSION is not specified.
Code-wise, option 3 would certainly be simpler though.

@brendan-ward
Copy link
Member

I like the idea of simplifying, esp. since we're not tightly coupled to only the latest version of GDAL. The likelihood of a user trying to install on GDAL <2.4 is hopefully low, and that's really all taht GDAL_VERSION is going to catch.

Since you are already making changes to setup.py here, would you mind ripping out GDAL_VERSION handling?

@jorisvandenbossche
Copy link
Member Author

Yes, I can certainly do that.
Do we still want to keep the version check in case of linux where we get the flags for gdal-config, so its only a small effort to check the version?

@brendan-ward
Copy link
Member

Do we still want to keep the version check in case of linux

Sure, I guess we can keep this for now; probably slightly more likely to encounter incompatible GDAL versions there without expecting it, vs the more deliberate effort required to install GDAL on Windows...

@jorisvandenbossche
Copy link
Member Author

I suppose I should actually have asked whether to keep the version check for windows in case of a gdalinfo being somewhere on the path (as this is much more lines of code compared to the linux version)

@brendan-ward
Copy link
Member

whether to keep the version check for windows

I think we can drop this, and just let minimum version be stated in documentation.

Copy link
Member

@brendan-ward brendan-ward left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor docstring comment, but otherwise looks good to me. I don't have a way of testing on Windows at the moment.

Thanks for your work on this, @jorisvandenbossche !

setup.py Outdated
def get_gdal_paths():
"""Obtain the paths for compiling and linking with the GDAL C-API

First the presence of the GDAL_INCLUDE_PATH and GDAL_LIBRARY_PATH environment
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion:

GDAL_INCLUDE_PATH and GDAL_LIBRARY_PATH environment variables are used if both are present.

@jorisvandenbossche jorisvandenbossche changed the title Simplify building on Windows with environment variables Allow gdal config via GDAL_INCLUDE/LIBRARY_PATH env variables (Simplify building on Windows with environment variables) Mar 9, 2022
@jorisvandenbossche jorisvandenbossche merged commit d517938 into geopandas:main Mar 9, 2022
@jorisvandenbossche jorisvandenbossche deleted the build-windows branch March 9, 2022 08:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants