
[BUG] cache and registry in combination makes fetch unstable #23

Closed
pgsandstrom opened this issue Feb 17, 2020 · 12 comments · May be fixed by nghia23d/tattooz#1

Comments

@pgsandstrom

What / Why

Whenever I have both a cache and a registry specified in the conf object, some fetches fail. It's not 100%, but somewhere around 50% of the fetches fail. It only happens if I have both a cache and a registry; I have removed all other configuration options.

How

My conf object:
{cache: '/tmp/new-cache', registry: 'my-company-internal-registry'}

My usage of npm-registry-fetch:
const json = (await npmRegistryFetch.json(dependencyName, conf))
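
A minimal sketch of the whole setup (the registry value is a placeholder for our internal Nexus URL, and the package names below are just arbitrary examples):

const npmRegistryFetch = require('npm-registry-fetch')

const conf = { cache: '/tmp/new-cache', registry: 'my-company-internal-registry' }

// fetch a handful of packuments; with both cache and registry set,
// roughly half of these calls fail with FETCH_ERROR
async function run () {
  for (const dependencyName of ['react', 'lodash', 'express']) {
    const json = await npmRegistryFetch.json(dependencyName, conf)
    console.log(dependencyName, json['dist-tags'])
  }
}

run().catch(console.error)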

Current Behavior

Sometimes I receive this response:
{"code":"FETCH_ERROR","errno":"FETCH_ERROR","type":"invalid-json"}

It never happens unless both registry and cache have been specified. In my testing I have done thousands of fetches, and the failure is very consistent.

I'd be happy to supply more information if needed, or run more tests if that would help.

@isaacs
Contributor

isaacs commented Feb 18, 2020

It's really weird that specifying both would cause trouble, because the registry gets a default value anyway.

What's my-company-internal-registry? Maybe it's doing something unique that's messing with the cache in a way that the public registry doesn't?

@pgsandstrom
Author

It's a Nexus repository (https://www.sonatype.com/product-nexus-repository). Its configuration seems to be quite standard; this is at a largish company and no one has detected anything weird with it. But since you mentioned the default registry, I did more testing.

This does produce errors:
conf = { cache: '/tmp/extra-new-cache', registry: 'my-company-internal-registry' }

But this does not produce errors:
conf = { cache: '/tmp/extra-new-cache', registry: 'https://registry.npmjs.org' }

So the registry is the source of the issue. It still seems really strange that it fails so consistently in this scenario but works well otherwise; my npm cache seems to work perfectly when I use the npm client in my day-to-day work. But I don't know enough about the npm cache to say in what way the registry could be buggy. Is there a sensible way to debug this?

@isaacs
Contributor

isaacs commented Mar 2, 2020

Can you please update your test to show the raw output?

I.e., instead of this:

const json = (await npmRegistryFetch.json(dependencyName, conf))

Do this:

const response = await npmRegistryFetch(dependencyName, conf)
const body = await response.text()
let json
try {
  json = JSON.parse(body)
} catch (er) {
  console.error('JSON failed', response.status, response.headers, body)
  throw er
}

And then share the output.

@pgsandstrom
Author

response.headers was an object, and the body was too large to be logged to the console. So the exact code I ran was like this:

console.error(
  'JSON failed',
  response.status,
  JSON.stringify(response.headers),
  body.substr(0, 1000),
)

All the responses I got were similar to this:

JSON failed 200 {} io.com","bugs":{"url":"https://github.com/Microsoft/vsce/issues"},"keywords":["vscode","vsce","extension"],"contributors":[{"name":"Microsoft Corporation"}],"author":{"name":"Microsoft Corporation"},"license":"MIT","main":"out/main.js","bin":{"vsce":"out/vsce"},"scripts":{"test":"mocha","prepublish":"gulp compile && mocha"},"engines":{"node":">= 0.12"},"dependencies":{"commander":"^2.8.1","denodeify":"^1.2.1","glob":"^5.0.14","lodash":"^3.10.1","mime":"^1.3.4","minimatch":"^2.0.10","osenv":"^0.1.3","read":"^1.0.7","tmp":"0.0.27","url-join":"0.0.1","vso-node-api":"^0.5.0","yazl":"^2.2.2"},"devDependencies":{"event-stream":"^3.3.1","gulp":"^3.9.0","gulp-filter":"^3.0.1","gulp-tsb":"^1.6.0","mocha":"^2.3.3","rimraf":"^2.4.3","source-map-support":"^0.3.2","xml2js":"^0.4.12"},"gitHead":"1af60269feede8d74fcd52fde591c9754781ec22","_id":"vsce@0.7.1","_shasum":"d2d430805a46e7693e0cd824a535e883920ab31d","_from":".","_npmVersion":"3.3.10","_nodeVersion":"0.12.7","_npmUser":{"name":"joaomoreno.ms"

It's really strange; it is as if they are cut off at the beginning. I guess that explains why the JSON can't be parsed.

I wrote some responses to a file and inspected them; they all looked like this. Here is a pastebin of an entire response: https://pastebin.com/E7456cXz

@isaacs
Contributor

isaacs commented Mar 10, 2020

Ah, can you run it again, but don't JSON.stringify() the response.headers? Fetch headers is a Map-like object, so it'll JSON.stringify as just an empty object, but the headers might give us some indication of whether we're incorrectly handling chunked encoding or something else.
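
(One way to log them readably — assuming the Headers object is iterable over [name, value] pairs, as node-fetch-style headers are, and that Object.fromEntries is available, i.e. Node 12+ — would be:)

console.error(
  'JSON failed',
  response.status,
  // spread the Headers entries into a plain object so they print usefully
  Object.fromEntries(response.headers),
  body.substr(0, 1000),
)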

@isaacs
Contributor

isaacs commented Mar 10, 2020

Cutting down the body is fine. And yeah, it's definitely being cut off for some reason.

@pgsandstrom
Author

Maybe I'm missing something, but if I leave the headers as-is, then only [object Headers] is printed. However, I did manage to stringify them with an ugly hack. This is what I got:

{"cache-control":"public, max-age=300","cf-cache-status":"REVALIDATED","cf-ray":"5781460dbe9dd13b-GOT","connection":"keep-alive","content-encoding":"gzip","content-type":"application/json","date":"Sun, 22 Mar 2020 16:26:19 GMT","etag":"W/\"f217f19f947ba9ddec66a105b8085819\"","expect-ct":"max-age=604800, report-uri=\"https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct\"","last-modified":"Mon, 04 Feb 2019 10:34:16 GMT","server":"cloudflare","set-cookie":"__cfduid=<redacted>; expires=Tue, 21-Apr-20 16:26:19 GMT; path=/; domain=.npmjs.org; HttpOnly; SameSite=Lax","transfer-encoding":"chunked","vary":"accept-encoding, accept","x-fetch-attempts":"1"}

@isaacs
Contributor

isaacs commented May 4, 2020

Update on this! I finally found a way to reproduce this issue, and tracked it down to this bit of code in make-fetch-happen's cache.js, where we tee into the cache write stream if the response doesn't fit in memory:

      const tee = new Minipass()
      const cacheStream = cacache.put.stream(
        cachePath,
        ckey,
        cacheOpts
      )
      // XXX problem is right here:
      // if the cacheStream is slow, and there is more than one chunk
      // queued for consumption, then the first chunk is lost.  witaf?
      tee.pipe(cacheStream)
      // this works fine, though:
      // tee.on('data', d => cacheStream.write(d))
      // tee.on('end', () => cacheStream.end())
      // this fails in the same way (ie, not a cacache issue, just a minipass tee issue)
      // const cacheStream = new Minipass()
      // setTimeout(() => cacheStream.resume())
      cacheStream.promise().then(cacheWriteResolve, cacheWriteReject)
      newBody.unshift(tee)

I think this might be a bug in how Minipass handles multiple pipe destinations, so I'm going to dig into that. If so, it's a bug lurking for later anyway. If I can prove that minipass is behaving properly, then it means it's somewhere further up the stack, maybe in minipass-pipeline or minipass-flush (though those modules are super simple and just rely on the Minipass semantics, so I'm skeptical that they're to blame here).
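
(To make the suspect shape concrete — this is a hand-rolled sketch with minipass ^3 and toy data, not a confirmed repro of the bug — it is a single Minipass source with two pipe destinations:)

const Minipass = require('minipass')

const source = new Minipass()    // stands in for the response body stream
const main = new Minipass()      // stands in for the "main" body pipeline
const cacheish = new Minipass()  // stands in for the cache write stream

// one source, two pipe destinations: the tee under suspicion
source.pipe(main)
source.pipe(cacheish)

main.concat().then(body => console.log('main saw: ', body.toString()))
cacheish.concat().then(body => console.log('cache saw:', body.toString()))

source.write('hello, ')
source.write('world!')
source.end()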

@isaacs
Contributor

isaacs commented May 4, 2020

For posterity: the repro case is, in the npm v7 branch that uses arborist, doing npm i tap@14 --before=2020-01-01 with an empty cache, so that it has to fetch a large number of full-metadata packuments and cache them all. The fs streams get backed up, and the responses are large, so it hits this code path where the tee target is slower than the incoming or "main" pipeline of data.

Going to try to reduce it down to just a set of minipass streams in a synthetic environment today or tomorrow. In the meantime, I can float the "no backpressure pipe" patch on make-fetch-happen, since it's really not too much of a hazard to let the cache writes queue up in memory a bit anyway.
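
(Concretely, the "no backpressure pipe" workaround is the explicit write-through wiring from the commented-out lines in the snippet above:)

// feed the cache write stream by hand instead of pipe()ing into it, so
// cacheStream's backpressure is ignored; cache writes may queue up in
// memory if the fs is slow, but nothing upstream stalls or gets dropped
tee.on('data', d => cacheStream.write(d))
tee.on('end', () => cacheStream.end())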

@isaacs
Contributor

isaacs commented May 4, 2020

Oh, wild! It's not just dropping the first chunk, it's getting the full data, but out of order!

 FAIL  test/cache.js
 ✖ should have same body

  --- expected
  +++ actual
  @@ -1,1 +1,1 @@
  -"hello, world!"
  +"ld!hello, wor"

  test: "test/cache.js put method stream (not in memory): caches correctly"

isaacs added a commit to npm/make-fetch-happen that referenced this issue May 4, 2020
Something weird is going on here.  This SHOULD be fine as a simple
pipe(), but for some reason, backpressure from the cache stream can
cause the pipeline to drop the first chunk of data, resulting in invalid
JSON.  Until that is fixed, just write into the cache without any
backpressure.

The only hazard is that, if the fs is truly very slow, and the rest of
the consumption pipeline is very fast, then we'll back up into memory
and use more than we ought to, rather than pushing back on the incoming
stream.  However, this isn't likely to ever be a problem due to how npm
does HTTP.  Either it's fetching a JSON response, or a tarball (which is
also either unpacking to disk, or streaming directly to a tarball file
on disk).  So, if the disk is slow, and it's a tarball request, we're
likely to get backpressure from the main pipeline anyway.  It can only
become a problem if the JSON response is large enough to span multiple
chunks, and also the fs is loaded enough to start slowing down.  In the
JSON response case, we're going to load the whole thing in memory
anyway, so nothing is made particularly *worse* by this lack of
backpressure.

It is possible that the root cause of this bug exists either in cacache,
minipass-pipeline, or minipass itself.  But since we don't do a
multi-pipe tee stream anywhere else in npm's stack, this is the only
spot where it can make itself known.

Re: npm/npm-registry-fetch#23
isaacs added a commit that referenced this issue May 4, 2020
@isaacs
Contributor

isaacs commented May 4, 2020

OK, update to npm-registry-fetch 8.0.2, and it should not hit the problem any more. I'm leaving this issue open, though, as the root cause still has to be determined and fixed properly in either cacache, minipass-pipeline, or minipass.

@isaacs
Contributor

isaacs commented May 13, 2020

Alright! Removed the kludge in make-fetch-happen, and fixed this properly at the root.

That was a beast to track down.

isaacs added a commit to isaacs/minipass-pipeline that referenced this issue May 13, 2020
If the pipeline as a whole is not flowing, then it should return `false`
from any write operation.  Since the Pipeline listens to the tail
stream's `data` event, the streams in the pipeline always are in flowing
mode.  However, the Pipeline itself may not be, so it would return
`true` from writes inappropriately, allowing data to be buffered up in
the Pipeline excessively.

This would not cause any significant issues in most cases, except
excess memory usage.

Discovered while debugging npm/npm-registry-fetch#23
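
(A hand-rolled sketch of the contract described in that commit message — minipass ^3 assumed, and this is NOT the actual minipass-pipeline patch: a wrapper that forwards writes to an inner "head" stream must report its own flowing state, because the inner streams are kept flowing by the wrapper itself.)

const Minipass = require('minipass')

// usage: new PipelineSketch(head, tail), where head is piped
// (possibly through other streams) into tail
class PipelineSketch extends Minipass {
  constructor (head, tail) {
    super()
    this.head = head
    this.tail = tail
    // reading from the tail keeps the inner streams in flowing mode,
    // and re-emits their data from the wrapper
    this.tail.on('data', chunk => super.write(chunk))
    this.tail.on('end', () => super.end())
  }

  write (chunk, encoding, cb) {
    this.head.write(chunk, encoding, cb)
    // the fix, in words: return the wrapper's own flowing state rather
    // than the head's, so callers see backpressure while the wrapper
    // is buffering data
    return this.flowing
  }

  end (chunk, encoding, cb) {
    this.head.end(chunk, encoding, cb)
    return this
  }
}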