Help understanding change from application/xml to text/xml #158

Closed
focusaurus opened this issue May 24, 2017 · 2 comments

Comments

@focusaurus

Hi! I help maintain superagent and we had some tests that broke/changed between mime 1.3.5 and 1.3.6 where mime.types.xml changed from application/xml to text/xml. I'm just curious if those with more mime DB experience understand the back story as to why this changed. I'm not requesting any change to node-mime with this issue, just asking a question. I would think these types would have been stable for many many years. Anyone know why this change came about? Is it normal for a change like this to be released as semver patch?

@broofa
Owner

broofa commented May 24, 2017

Hi @focusaurus, my apologies for the trouble. The long-winded explanation of what happened follows, but the tl;dr is "mime-db changed".

Long version: I'll start by noting that this module is just a thin API that sits on top of the canonical mime-db database. This is a great thing, as it keeps me out of the business of dealing with the constant stream of changes needed to keep such a db current (an ever-present thorn in my side prior to mime-db becoming a thing). So, this module just has a silly little build script that converts mime-db into the more compact types.json map of type->extension this module ships with.

Unfortunately, the world of mime types is not without contradictions. E.g. IANA says that application/xml and text/xml are both valid types for the xml extension, but provides no guidance on how to prioritize one over the other. And that's just IANA. mime-db also pulls from Apache and NGINX, which introduce the possibility of yet more confusion. Suffice it to say... when types.json gets built, it uses a naive first-one-wins approach to resolving inconsistencies.
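To make the first-one-wins behavior concrete, here is a minimal sketch. It is not node-mime's actual build script: `sampleDb` is a made-up sample shaped like mime-db entries, and `buildExtensionMap` is a hypothetical helper. It only illustrates how the winner for an extension depends on entry order.

```javascript
// Illustrative sample shaped like mime-db entries (type -> { source, extensions }).
// The entries and their order are invented for this sketch.
const sampleDb = {
  'application/xml': { source: 'iana', extensions: ['xml', 'xsl'] },
  'text/xml': { source: 'iana', extensions: ['xml'] },
  'text/plain': { source: 'iana', extensions: ['txt'] },
};

// Hypothetical helper: build an extension -> type map in which the first type
// to claim an extension keeps it (first-one-wins).
function buildExtensionMap(db) {
  const byExtension = Object.create(null);
  for (const [type, entry] of Object.entries(db)) {
    for (const ext of entry.extensions || []) {
      if (!(ext in byExtension)) byExtension[ext] = type;
    }
  }
  return byExtension;
}

// The same entries with the two xml types swapped.
const reordered = {
  'text/xml': sampleDb['text/xml'],
  'application/xml': sampleDb['application/xml'],
  'text/plain': sampleDb['text/plain'],
};

console.log(buildExtensionMap(sampleDb).xml);  // application/xml
console.log(buildExtensionMap(reordered).xml); // text/xml
```

Nothing about the types themselves changed between the two calls; only their order did, which is enough to flip the result for `xml`.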

This has always bugged me, because it means resolving inconsistencies is non-deterministic. It depends on the order entries appear in the mime-db file... and I'm pretty sure there's no real logic to that on their end. So, recently, I tried to take a more nuanced approach by prioritizing types based on their "facet". I briefly pushed that out as v1.3.5, before having it pointed out that it broke loading of custom mime types. 10 hours later, I pushed out v1.3.6 that reverted that change while incorporating a couple minor improvements from PRs that had been languishing for too long. Codewise, v1.3.6 is almost identical to v1.3.4.
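For illustration only, a facet-based priority could look roughly like the sketch below. This is not the code that shipped in v1.3.5: the ranking in `FACET_RANK`, the helper names, and the alphabetical tie-break are all assumptions made for the example.

```javascript
// Hypothetical ranking, lower wins. Types with no facet prefix (standards
// tree) rank 0; the order of the remaining facets is an assumption.
const FACET_RANK = [
  ['vnd.', 1],
  ['x-', 2],
  ['x.', 2],
  ['prs.', 3],
];

// Rank a type by the facet prefix of its subtype, e.g. "vnd." in
// "application/vnd.rar".
function facetRank(type) {
  const subtype = type.slice(type.indexOf('/') + 1);
  const hit = FACET_RANK.find(([prefix]) => subtype.startsWith(prefix));
  return hit ? hit[1] : 0;
}

// Pick one type from several candidates for the same extension. Ties are
// broken alphabetically (an arbitrary choice) so that the result does not
// depend on input order.
function preferredType(candidates) {
  return [...candidates].sort(
    (a, b) => facetRank(a) - facetRank(b) || (a < b ? -1 : a > b ? 1 : 0)
  )[0];
}

console.log(preferredType(['application/x-rar', 'application/vnd.rar'])); // application/vnd.rar
console.log(preferredType(['text/xml', 'application/xml']));              // application/xml
```

Note that facets alone cannot separate application/xml from text/xml, since neither has a facet prefix. Some extra tie-break rule is still needed for that pair.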

... except where types.json is concerned. A while back I'd updated the mime-db dependency from v1.2.0 to v1.22.0 intending to publish new mappings with the next minor release. When v1.3.6 went out, it went out with a new types.json file.

Where your issue is concerned, that's where the problem was introduced. (See shell output below)

In the past, I have always treated updates to mime type mappings as a semver patch since the API of this library doesn't change, and such changes usually "felt" trivial in nature. So... historically, yeah, that's been the normal process.

Was that appropriate in this case? I honestly don't know. The semver spec speaks to software APIs... but an API isn't a dataset. I ran into a similar issue recently when we restructured node-uuid's internal file layout w/out changing the API.

Any thoughts you have on how to quantify the impact of mime-db changes so as to predictably map those to the appropriate semver changes here are welcome.


kieffer@MacBook-Pro-3$ lsf
total 0
drwxr-xr-x  10 kieffer  staff  340 May 24 14:36 mime1.3.4/
drwxr-xr-x  10 kieffer  staff  340 May 24 14:36 mime1.3.6/

kieffer@MacBook-Pro-3$ cp -pr mime1.3.6 test

kieffer@MacBook-Pro-3$ cd test

kieffer@MacBook-Pro-3$ node -e "console.log(require('.').lookup('xml'))"
text/xml

kieffer@MacBook-Pro-3$ cp ../mime1.3.4/types.json .

kieffer@MacBook-Pro-3$ node -e "console.log(require('.').lookup('xml'))"
application/xml

@focusaurus
Author

Thanks for the detailed and informative reply. I'm at least relieved to understand that the upstream sources aren't revising their mapping of what xml means years and years later. I'm not too concerned about the semver logic for this. I'll give you my thoughts, but these are just the expectations of a random user/developer with no particular authority/expertise around this. I think the general rule is: when in doubt, a bigger version bump is preferable to sneaking bigger changes into a smaller semver bump.

Thus for a module that combines both a dataset that is really prominent and a small wrapper API, I'd say:

  • only code bugfixes or real data "bugfixes": semver patch
    • I'd consider a data "bugfix" to be for example a new mime type is released and a misspelling is discovered shortly thereafter, such that tons of downstream deps haven't had enough time to get too dependent on it (as if they had caught the HTTP Referer header spelling the day after the spec finalized). The longer the "bug" is uncorrected, the bigger the semver impact of fixing it.
  • new code features/API without any semantic breaks, and new additions to the data: semver minor
  • code breaking or semantic changes, anything being removed from the data, anything in the data being changed: semver major
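The data half of the rules above could be checked mechanically. The following is a rough sketch, not an existing tool: `suggestBump` is a hypothetical helper that compares two extension -> type maps, and the sample maps are invented.

```javascript
// Hypothetical helper: compare two extension -> type maps and suggest a
// semver bump. Removed or changed mappings -> major, additions only ->
// minor, identical data -> patch.
function suggestBump(oldMap, newMap) {
  const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

  for (const ext of Object.keys(oldMap)) {
    if (!has(newMap, ext)) return 'major';           // mapping removed
    if (oldMap[ext] !== newMap[ext]) return 'major'; // mapping changed
  }

  const added = Object.keys(newMap).filter((ext) => !has(oldMap, ext));
  return added.length > 0 ? 'minor' : 'patch';
}

// Invented sample data for the example.
const before = { xml: 'application/xml', txt: 'text/plain' };

console.log(suggestBump(before, { ...before, md: 'text/markdown' })); // minor
console.log(suggestBump(before, { ...before, xml: 'text/xml' }));     // major
console.log(suggestBump(before, { ...before }));                      // patch
```

The data "bugfix" exception is a judgment call about how long a mistake has been in the wild, so this sketch always treats a changed mapping as major and leaves any downgrade to patch for a human to decide.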
