New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
jsonschema should not change its behavior depending on which packages are installed #1035
Comments
It sounds like the schema upstream was invalid -- if anything that's a reason format validation should be more strict is it not?
I suspect you'll find many other cases of this unfortunately -- optionally enabling functionality when another package is installed is not uncommon in Python, and whilst I agree it can lead to this kind of annoying downstream issue, you're pinning your dependencies right? So you should at least not be encountering it until you've manually made a change that should tip you off. Regardless though, this behavior is essentially constrained partly by the specification, which says that format validation is off by default, but even when on, unknown formats must be ignored. -- and partly by backwards compatiblity, where this behavior has worked like this forever. I agree that stricter behavior would be less error prone, but it's only now that In short: yes the behavior around format isn't ideal, but there isn't really a good alternative that's backwards compatible, spec compliant, and future-proof. |
Pinging @binste as he's the one working on a fix for Altair in vega/altair#2771. Several people got bit by this before I was, apparently.
I don't disagree. The problem is not how strict the validation is. The problem is that the level of strictness suddenly changed for no apparent reason just because some package happened to be installed on the system. The level of strictness should be under the control of the caller code, not a side effect of the package manager.
I'll admit I'm not that familiar with what's idiomatic in the Python ecosystem, but my understanding is that, usually, something starts working when a package is installed, which seems fine. Something breaking because a package is installed is way more objectionable. Because of the way jsonschema is used, increasing strictness because a package happens to be present is likely to do more harm than good in many situations.
I don't see how that relates to the issue at hand? What I'm suggesting is a change in the jsonschema user API where validators have to be explicitly specified instead of popping up by surprise through the set of installed packages. It's a change in the details of the mechanics of how jsonschema format validators are enabled - what does that has to do with the spec, and what does that has to do with ignoring unknown formats? |
This library is a validation library, it's meant to help developers catch problems -- "starts working" in this context is precisely "raises errors", the definitions are backwards from the usual ones --a buggy schema which was silently passing suddenly started raising errors it sounds like.
I don't think this is an accurate characterization. Right now, other than for schema validation (your case), format validation is opt-in already. For schema validation, indeed it's on by default if packages are installed, and otherwise off. In general, I don't think users want to be manually configuring format validation for schema validation, certainly not in this specific case, they likely don't care or even know what formats are used in the metaschema, they just want to know whether their schema is correct or not, where here it seems it wasn't. What I was saying about the spec was that if the spec didn't mandate that format be off by default for the general case (i.e. for data, not schema validation), then it'd be even more critical to essentially always install the format dependencies, and you likely would have noticed the invalid schema even earlier.
The level of strictness by the way is indeed under your control, you're free to turn it on or off by not enabling format validation, if you indeed want to keep the buggy schema as-is. |
This is being handled upstream it seems. The general behavior as I mentioned I don't think can or will change -- format validation in general will stay opt-in, but in the case of invalid schemas, it's currently "best effort", which indeed may change with the set of installed packages, unfortunate as that might be. If format dependencies become mandatory, which also seems unlikely given some of them are not trivial deps, that may change, but given I'm not sure there's anything else to do here, going to close. Additional thoughts are of course still welcome regardless, if I'm indeed still missing the point. |
According to the docs, depending on which packages are installed, jsonschema will validate certain formats but not others.
I would like to suggest that changing the behavior of a piece of code depending on what packages appear to be present (for whatever reason) in the current environment might not be a good idea, because it opens up potential for "spooky action at a distance" where a seemingly unrelated change to the environment can suddenly cause failures that can be pretty difficult to understand because they seem to come out of nowhere.
For example, consider the case of two packages:
altair
andjupyter
, which are frequently used together but are otherwise unrelated.jupyter_events
, a dependency ofjupyter
, started depending onjsonschema[format-nongpl]
a few months ago in jupyter/jupyter_events@decd0ec. This causedrfc3986-validator
to be installed, which in turn changed the behavior ofjsonschema
to start validatinguri-reference
s. This had the surprising consequence of turning a benign schema error in Altair (which also pullsjsonschema
, but without[format-nongpl]
) into a serious problem that suddenly broke my code (see vega/altair#2794). It should not be possible for a change to the dependency spec of package A to cause a breaking change in behavior in completely unrelated package B.I would suggest that
jsonschema
users should be required to explicitly list in caller code which format validators they want to use, instead of having validators magically "pop up" as soon as some package is installed.The text was updated successfully, but these errors were encountered: