Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Force the InVEST python application to run in UTF-8 mode #1268

Open
wants to merge 42 commits into
base: main
Choose a base branch
from

Conversation

phargogh
Copy link
Member

This PR addresses issues arising from #1167, in which a cryptic UnicodeDecodeError would present in some users computers but we were unable to reproduce it locally.

My best understanding about the problem is that there is a discrepancy in the user's locale and the user's filesystem encoding, which, on Windows, would cause python to use the locale's encoding as the filesystem encoding unless we specify otherwise. This would ultimately cause problems when filepaths made it down to GDAL, where paths must be UTF-8. Although I cannot pinpoint precisely which GDAL function is failing, I strongly suspect that it is this difference between expected and real encodings that is causing the error.

This is only an issue on Windows. Mac and Linux are unaffected by the bug, as these are both POSIX systems.

As a solution, I have the workbench forcing the python application to start up in UTF-8 mode on Windows. This does have a side effect of causing text files to be opened as UTF-8 files by default, but this should not be a problem for InVEST because we generally tell people to use UTF-8 anyways. UTF-8 mode is scheduled to become the default in another 4 years, with the release of python 3.15.

While debugging this, I also discovered how to enable GDAL's debug logging, so I have added a workbench setting to enable this in case it's useful in the future. Note that to make full use of this logging, the user will also need to lower the logging threshold to DEBUG, which is then passed to the python application as an environment variable.

@emlys since the GDAL setting adds a new string to translate, is there anything else I will need to do to make sure it is ready for future translation?

@davemfish @emlys I took a stab at implementing this new setting, but please let me know if there is a better way to do any of this!

Fixes #1167

Checklist

  • Updated HISTORY.rst and link to any relevant issue (if these changes are user-facing)
  • Updated the user's guide (if needed)
  • Tested the affected models' UIs (if relevant)

I'm hoping that this will allow us to capture more information about
what's going on.  RE:natcap#1167
Also adding CPL_DEBUG (copied from GDAL logging example) to increase
verbosity of the gdal logging, and mirroring the log levels that GDAL
has. RE:natcap#1167
The user is re-running the model on the same workspace and while the
error message is consistent, I'm not getting it in the same point of
execution in the model.

RE:natcap#1167
There is a possibility that the issue here is UTF-16 strings being
provided to GDAL.
RE:natcap#1167
Now we don't return if there are an invalid number of parameters, just
log what we know and put in an obvious placeholder on what we don't
know. RE:natcap#1167
It turns out this can be configured via an environment variable.
RE:natcap#1167
Copy link
Member

@emlys emlys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat, I like that the GDAL debug setting is now exposed in the workbench! The translated string looks correct, we don't need to do anything else with it. I just had one minor code style suggestion. Also looks like there are a couple of files in the diff that are unrelated to this issue?

import hashlib
import inspect
import itertools
import logging
import os
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there aren't any functional changes to this file?

Copy link
Member Author

@phargogh phargogh Mar 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops! Yes, these were left over from a few changes I made to pollinaton, and isort sorted the imports. Unpatched in 0565160

workbench/src/main/setupInvestHandlers.js Outdated Show resolved Hide resolved
Copy link
Contributor

@davemfish davemfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @phargogh , thanks for explaining all this in so much detail. I think this is a nice solution!

I'm not sure if the changes in utils.py are causing this somehow, but test_utils.py seems like it never properly exits. I re-ran the jobs once and they all seem to be hanging again.

onChange={this.handleChange}
>
<option value="NORMAL" key="NORMAL">NORMAL</option>
<option value="DEBUG" key="DEBUG">DEBUG</option>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should wrap the displayed strings to get translated <...>{t('NORMAL')}</...>

Copy link
Member Author

@phargogh phargogh Mar 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

patched in bd9a9b6

"Skipping.")
return
except Exception as e:
LOGGER.exception(str(e))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we get here, some improperly typed args or kwargs? If so, do we want to proceed to with logging, or should we return/raise here?

Copy link
Member Author

@phargogh phargogh Mar 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is super unclear because I'm having a tough time reproducing it on my system, but the user I'm working with on the forums was seeing something in len raising a UnicodeDecodeError. And because of the nature of this error, I'm having a really hard time actually logging the object itself, or even the type! It's very bizarre.

I added a note about this in c883dc1 and am now returning from the logging function. You're right, if we're seeing this same error, we probably won't be able to do much about the rest of the logging.

phargogh and others added 3 commits March 30, 2023 09:31
This will fix the tests hanging.

RE:natcap#1167
Refactor of how we create the arguments for spawning the InVEST process for clarity and brevity. RE:natcap#1167

Co-authored-by: Emily Soth <43770515+emlys@users.noreply.github.com>
@phargogh
Copy link
Member Author

phargogh commented Mar 31, 2023

@davemfish @emlys thank you both for your review and good comments! I have addressed them and fixed the builds hanging issue, but I'm still trying to understand why the user's getting the error she's getting. So I'll re-request your reviews once we've figured that out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

GDAL-level SystemError for some users
3 participants