Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

file:// URLs #75

Closed
mgax opened this issue Feb 6, 2016 · 44 comments
Closed

file:// URLs #75

mgax opened this issue Feb 6, 2016 · 44 comments
Labels

Comments

@mgax
Copy link

mgax commented Feb 6, 2016

Are there plans to support local filesystem URLs?

@Klortho
Copy link

Klortho commented Feb 12, 2016

+1
Really surprised to discover this doesn't work.

@TimothyGu
Copy link
Collaborator

There might be some security implications.

@bitinn
Copy link
Collaborator

bitinn commented Feb 15, 2016

It's not on my roadmap at the moment, node-fetch sits on server-side and local url usage is limited.

@stevenvachon
Copy link

stevenvachon commented Oct 4, 2016

This should work as it works in the browser Firefox.

Edit: This is not defined in the spec https://fetch.spec.whatwg.org/#basic-fetch

@TimothyGu
Copy link
Collaborator

I guess we've decided that file:// will not be supported.

@bitinn
Copy link
Collaborator

bitinn commented Oct 16, 2016

Just to clarify, if this works consistently across major browsers we will re-open.

But more as a clear warning that we don't support file:// protocol, just like we don't support relative url.

@matthewadams
Copy link

@bitinn @TimothyGu I don't I understand the aversion to file:/// urls.

If we're on the server side (using Node.js + Express), and I have to call another rest endpoint whose swagger is not easily obtainable, why can't I take care to put it under source control and simply use it instead of the undownloadable spec from the service I'm calling?

This limitation seems arbitrary. Please clarify its justification.

@TimothyGu
Copy link
Collaborator

TimothyGu commented May 5, 2017

matthew-andrews@matthewadams That sounds like a problem with how node-fetch is used, not with node-fetch. FYI Node.js v8 will be released with support for file URLs in fs module methods.

@jimmywarting
Copy link
Collaborator

The Response constructor can take a Readable stream so why not simply do something like this?

let {Response} = fetch
let stream = fs.createReadStream('myfile.txt')
let promise = new Response(stream).text()

I don't think file protocol should be supported, sorry
I for once use node-fetch as a proxy, user can request any url they want with my api
wouldn't want them to get access to the filesystem...

@stevenvachon
Copy link

I don't see why it can't be supported behind an option to enable it.

@matthewadams
Copy link

matthewadams commented May 7, 2017

NB: I'm @matthewadams, not @matthew-andrews. Sorry for the noise, @matthew-andrews

@TimothyGu I'm confused by your response. Please see swagger-api/swagger-js#1044 for justification. Do you have a pointer for Node.js v8 supporting file:// URLs in fs?

@matthewadams
Copy link

@jimmywarting I still don't understand why node-fetch shouldn't support file:// URLs. You could easily prevent your users from using file:// URLs simply by testing for it in your API.

@jimmywarting
Copy link
Collaborator

You could easily prevent your users from using

As you could easily switch to using fs by testing the URL in your api...
IMO I think security comes first. The file protocol isn't part of the spec

@stevenvachon
Copy link

@jimmywarting the spec is already broken here: https://github.com/bitinn/node-fetch/blob/master/LIMITS.md

@Klortho
Copy link

Klortho commented May 7, 2017

+1 again ... it still seems a no-brainer, to me. URL stands for Uniform Resource Locator, after all. Tim Berners Lee would be rolling over in his grave. (If he were dead, that is.)

@TimothyGu
Copy link
Collaborator

Much thought have gone into this issue, for sure. I'd like to condense the most important reasons I believe URLs of file scheme should not be implemented in node-fetch. I would of course be thankful if @bitinn or @zkat would like to weigh in and offer their opinions here as well.

1. Accessing the local filesystem eases security breaches.

Take @jimmywarting's (admittedly anecdotal and contrived) use case:

I for once use node-fetch as a proxy, user can request any url they want with my api
wouldn't want them to get access to the filesystem...

Really, I cannot believe many people are using node-fetch as a proxy (I might be wrong of course, but IMO there are many much better ways to create a proxy in Node.js). But I find it easy to imagine people who are using a URL from an untrusted source (the network for example) and just plug it into fetch(). In many cases, that practice is fine. It would not be fine if file:///etc/passwd is accessed.

@stevenvachon proposed using an option to allow file URL scheme. That may solve this issue, but see my response below.

2. There is no recognized standards for interpretation of file-scheme URLs.

As outlined in sindresorhus/got#227 filed by @stevenvachon, WHATWG's Fetch Standard offers no discussion of how file URL schemes should be interpreted on different platforms. In fact, in the PR adding documentation for Node.js fs module file URL support, I explicitly recommended documenting how URLs are interpreted on different platforms, precisely because of the competing interpretations.

Now that file URL scheme is supported officially in Node.js, I foresee a de-facto standard being established at least in the Node.js community. But then in less than a month when Node.js 8.0.0 is released, users can use fs.readFile(new URL(fileURL), (err, buf) => { if (err) ... } ) (or any promisified variant), which I consider to be an improvement over fetch(fileURL, { allowFileScheme: true }).then(res => res.buffer()).then(buf => { ... }).catch(err => { ... });.

3. There is no precedent for support of file-scheme URLs in Node.js community.

None of the most popular URL-to-response libraries I can think of in under a minute do not support file-scheme URLs:

4. Browser fetch() does not support file-scheme URLs.

There are a lot of people who use node-fetch as a server-side polyfill to allow for isomorphic requests. It would be a surprising behavior if node-fetch has the capability to access the file system browsers' fetch() cannot. However, neither Chrome nor Firefox supports fetch() with file-scheme URLs. GitHub's fetch polyfill refuses to do so, also: JakeChampion/fetch#92 (comment)


@matthewadams

NB: I'm @matthewadams, not @matthew-andrews. Sorry for the noise, @matthew-andrews

Oops, apologies. Shouldn't have depended on the autocompletion.

I'm confused by your response. Please see swagger-api/swagger-js#1044 for justification.

What I meant to express is that Swagger should be aware of node-fetch's limitations, and decide if it wants to allow file URLs. I do not see any indication of such a decision made in the linked issue by the maintainers of Swagger, and if I were to design the Swagger API, I would allow both HTTP(S) URLs and a plain file path instead of resorting to file-scheme URLs.

Do you have a pointer for Node.js v8 supporting file:// URLs in fs?

Code was added in nodejs/node#10739. Docs were added in nodejs/node#12670.


@stevenvachon

I don't see why it can't be supported behind an option to enable it.

Adding an option that defaults to true would not change issue 1 above.

If it defaults to false, issues 1 and 4 would indeed be resolved. However I would prefer not to add to the list of non-spec options we have unless absolutely necessary. With the exception of size, the options we have all aim to reimplement browser defaults that users on Node.js may or may not want.

the spec is already broken here: https://github.com/bitinn/node-fetch/blob/master/LIMITS.md

This issue is not an issue of spec-compliance. Even if it is, as the steward of node-fetch, I would exercise my best decision when it comes to following the spec, which according to the reasons I outlined above leads me to the side of "against."


@Klortho

URL stands for Uniform Resource Locator, after all. Tim Berners Lee would be rolling over in his grave. (If he were dead, that is.)

Yes, URL indeed stands for that. But it should come as no surprise that certain applications only support certain URL schemes. And speaking of Tim Berners-Lee, I would like to add a shameless plug to a particular tweet of mine. 😄

@bitinn
Copy link
Collaborator

bitinn commented May 8, 2017

I still pretty against it. I don't see why we want to wrap around fs module (we wrap around http module and that's enough of a maintenance headache given the changing Fetch Spec).

Plus the issue 2 in Timothy's reply: Fetch Spec gave us nothing to work with on file protocol.

@joshua-gould
Copy link

How to enable file URLs - https://gist.github.com/joshua-gould/58e1b114a67127273eef239ec0af8989

@masaeedu
Copy link

@bitinn Could you please reconsider this and unix socket support? Interacting with things like the REST API on the Docker daemon socket are impossible without it.

@bitinn
Copy link
Collaborator

bitinn commented Nov 1, 2017

@masaeedu Still a nope on File URL.

But if you really need UNIX socket support, try got, they are modelled after Fetch. https://github.com/sindresorhus/got#unix-domain-sockets

@masaeedu
Copy link

masaeedu commented Nov 1, 2017

Unfortunately I can't use got or anything else that doesn't conform to the Fetch API, as I am using a code generator which generates clients that accept an implementation of fetch. I could try wrapping it to conform to the Fetch API but I may as well implement my own fetch at that point.

@j0nimost

This comment has been minimized.

@est31
Copy link

est31 commented Apr 28, 2019

@TimothyGu replying to your comment #75 (comment)

However, neither Chrome nor Firefox supports fetch() with file-scheme URLs.

That's news to me. Yes, Chrome disallows fetch-ing any file:// scheme based url, but Firefox definitely allows fetch('file://') as long as their same origin policy is followed.

As your comment is from 2017 and my reply from 2019 there is the potential that something has changed since. But I definitely have used fetch on file:// in 2017 so I guess nothing much has.

@bitinn bitinn added the wontfix label May 13, 2019
@bitinn bitinn mentioned this issue May 13, 2019
@Kreijstal
Copy link

Kreijstal commented Jan 10, 2021

so if I want to fetch a ftp:// file, how should I go around doing this?
Is there a fork that allows this?

@3p3r
Copy link

3p3r commented Apr 26, 2023

Every once in a while I have the misfortune of using node-fetch directly or indirectly through another package. What happens every single time is spending 2hrs+ only to find out file:// scheme is not supported in a package named "NODE-fetch". Now read that again. It says Node in the name. Which is supposed to run in Node, right? Because the name says so! It does not say web-fetch. It does not say browser-fetch. It says NODE-fetch.

Where is Node ran? server side. right? What do you have access to in the server side always, a filesystem, right?? right???

SO, why exactly does this not support local filesystems? Why does this have to work like a browser? Who cares if this works like a browser?? If I wanted browser behavior, I would not have pulled a package called NODE-fetch. I would have pulled something called browser-fetch or something else.

And what security implications exactly does this lack of support refer to? People like to throw this word around like a magic statement that makes bad decisions and architecture go away. If you can't point exactly what security implication you're trying to address, you are addressing absolutely nothing, just a waste of everyone's time.

This is not even the only GH issue on this for this package. Clearly lots of people are suffering from this problem. On another thread a maintainer cited the reason for this due to "servers not having an origin URL". Excuse me, but then what is process.cwd() in the context of a local filesystem for a package THAT IS SUPPOSED TO RUN SERVER SIDE ALWAYS as its name implies?

@jimmywarting
Copy link
Collaborator

I understand that you're frustrated @3p3r and have had a difficult experience with node-fetch and just want to vent a little bit. However, it's important to keep a respectful tone when discussing issues with others on open source projects.

Regarding your question about why node-fetch have decided not to support the file:// scheme, there are several reasons for this. First and foremost, allowing fetch requests to access local files can introduce security vulnerabilities. This is because it opens up the possibility of a malicious user accessing files on your computer that they shouldn't have access to.

such as in the example of building CORS proxies \w the fetch api.

While I understand that you would like node-fetch to support the file:// scheme, it's important to consider the potential risks and drawbacks of doing so also.


you can get around this with a simple wrapper around fetch that first checks the protocol and if it starts with file:// then you could create a blob backed up by the disc and construct a new Response with the blob. Using blob instead of a stream will give you the added content-length and content-type header as a bonus for using blob (thanks to blob.size and blob.type)

import fetch, { Response, blobFrom } from 'node-fetch'
import fs from 'node:fs'
import process from 'node:process'

const base = process.cwd()

export default async function fetchWrapper(url, options) {
  const parsedUrl = new URL(url, base)
  if (parsedUrl.protocol === 'file:') {
    // If the URL starts with 'file://', read the file from the local file system
    const filePath = decodeURIComponent(parsedUrl.pathname)

    if (!fs.existsSync(filePath)) {
      return new Response(null, { status: 404, statusText: 'NOT FOUND' })
    }

    const blob = await blobFrom(filePath, { type: 'application/octet-stream' })
    return new Response(blob, { status: 200, statusText: 'OK', url })
  } else {
    // For all other requests, use node-fetch as usual
    return fetch(url, options)
  }
}

@3p3r
Copy link

3p3r commented Apr 27, 2023

I am absolutely not trying to bash the maintainers. I am so insignificant in the pool of 36M+ people that download this package that whatever I say makes no dent in anything.

I am trying to point out a design flaw here. In a server environment where this package is used 99% of the time (and should be 100% of the time!), accessing resources over a network is most definitely less secure than accessing the local filesystem. You cannot argue that fetching network resources in Node is safer than accessing local filesystem.

This behavior only makes sense in browser, because there are further safeguards for networks access. The browser platform fundamentally does not allow file access and its networking stack is strict due to all layers of standards that went into it all these years.

But those networking safeguards do not exist in Node, so arguing that accessing network is safer in Node vs accessing local filesystem is moot. A CORS proxy built with node-fetch, hosted somewhere online should not have sensitive files next to it. Or it should not have filesystem permission setup in a way that you can fetch /etc/shadows. That's not on this package to solve. That's on users to solve. Let the user handle edge cases. Inversion of Control please.

That won't make this package any less secure. If I could use node-fetch to cause stack overflow to execute arbitrary shell commands, now that would make this package insecure.

@guest271314
Copy link

Chrome (Chromium-based browsers) extensions support fetch() with file: protocol passed Issue 1227761: Fetch() should support file scheme for extensions

manifest.json

{
  "name": "Service Worker-based background script",
  "description": "Test that fetching a file scheme URL succeeds",
  "version": "1",
  "manifest_version": 3,
  "host_permissions": ["file:///*"],
  "background": {"service_worker": "service_worker_background.js"}
}

@guest271314
Copy link

Keep in mind Chromium-based browser supports fetch() for chrome-extension:, too. Chromium might arbitrarily create any protocol, e.g., quic-transport:, then remove the protocol and create another protocol, e.g., isolated-app:. There is also ipfs: scheme support https://chromestatus.com/feature/5105580464668672.

@guest271314
Copy link

guest271314 commented Nov 20, 2023

Looks like node v22.0.0-nightly202311151d8483e713 executable implementation of fetch() supports file: protocol

>  var scriptText = await (await fetch("file:///home/user/exports.js")).text();
undefined
> scriptText
"const fn = () => 123;\nexport {fn};"
> const mod = await import(URL.createObjectURL(new Blob([scriptText], {type:"text/javascript"})));
undefined
> mod
[Module: null prototype] { fn: [Function: fn] }
> mod.fn()
123

@Kreijstal
Copy link

Kreijstal commented Nov 20, 2023

Looks like node v22.0.0-nightly202311151d8483e713 executable implementation of fetch() supports file: protocol

>  var scriptText = await (await fetch("file:///home/user/exports.js")).text();
undefined
> scriptText
"const fn = () => 123;\nexport {fn};"
> const mod = await import(URL.createObjectURL(new Blob([scriptText], {type:"text/javascript"})));
undefined
> mod
[Module: null prototype] { fn: [Function: fn] }
> mod.fn()
123

time to migrate to built in fetch

@jimmywarting
Copy link
Collaborator

jimmywarting commented Nov 23, 2023

As you may already know file: are left as an exercise for the reader (in the spec)

Some of all the original members and authors have all agreed to not support file: cuz of security concerns.
it's unclear what the response should be if it find the file, what header it should have and what should it return when it's not found.
it's very bad if a fetch call makes a request to a server and the server responds with a redirect header going from something like https: to file:

@node-fetch node-fetch deleted a comment from guest271314 Nov 23, 2023
@node-fetch node-fetch deleted a comment from guest271314 Nov 23, 2023
@guest271314
Copy link

file: is a secure context per the controlling W3C specification, or rather is no specified as insecure.

It makes sense that it is more "secure" to fetch local files than to expose the entire system to remote networking requests.

it's unclear what the response should be if it find the file, what header it should have and what should it return when it's not found.

Handle the request like any other protocol. 404 if not found.

The developer should be in control over the API surface. The user determines what is "secure" and what is "insecure", not some external specification controlled by a handful of individuals who are not in control of the users' machine.

it's very bad if a fetch call makes a request to a server and the server responds with a redirect header going from something like https: to file:

That's possible anyway. In the browser using an extension. Outside of the browser using a proxy.

Again, the user makes that decision on their own machine.

Full-duplex streaming is a little more involved specification-wise, though shouldn't/doesn't have to be. Folks have implemented full-duplex streaming to the browser. Specification language is W.I.P. by certain parties. Both node and deno support full-duplex streaming. bun doesn't.

My point is I'm not a proponent of censorship. Everybody thinks their own ideas are great.

Goebbels was in favor of free speech for views he liked. So was Stalin. If you’re really in favor of free speech, then you’re in favor of freedom of speech for precisely the views you despise. Otherwise, you’re not in favor of free speech.

-Noam Chomsky

@jimmywarting
Copy link
Collaborator

jimmywarting commented Nov 23, 2023

It makes sense that it is more "secure" to fetch local files than to expose the entire system to remote networking requests.

Makes no sense to me. cuz why would you exposing a file system?
if you can do fetch('file://') then you have as much read access as having access to any file system.

sure i may be more secure in a way that you can not list, delete or edit files but still...

Handle the request like any other protocol. 404 if not found.

why not throw? and what about the response? should it contain content-type, content-length, last modified date, accept byte range? how much should it expose?

That's possible anyway. In the browser using an extension. Outside of the browser using a proxy.

I know that deno supports file, and it have protections agains redirect attacks like https:// -> file://
if we can have that then all kudos to that!

and if we can implement something like a permission control to allow fetch('file://...') by passing some explicit option to enable it or having a --flag then sure by all means we could reconsider it. it should not be enabled by default.

TimothyGu have already gone trough why supporting file:// would be bad

Why are we talking about duplex streams? it's have nothing to do with file:

And you can say whatever you like, but plz do it elsewhere so that the person dose not get offended by it.
i don't want to keep doing open source stuff if ppl is going to have a "bad" atitude

@jimmywarting
Copy link
Collaborator

I haven't tested node v22.0.0-nightly202311151d8483e713 cuz it's not released?
but i tried latest undici which is what nodejs are using... and fetch did not have any support for file://

So i have my doubts if it's working

@guest271314
Copy link

Makes no sense to me. cuz why would you exposing a file system?
if you can do fetch('file://') then you have as much read access as having access to any file system.

Then you can not fetch file: on your own machine.

Extending your own personal preferences to an entire API is selfish and to me denotes you want to control what users do on their own machine if they use your library.

Why?

You are not responsible for what users' fetch on their own machines.

Relevant specification issue w3c/webappsec-secure-contexts#66.

why not throw? and what about the response? should it contain content-type, content-length, last modified date, accept byte range? how much should it expose?

You can throw or not throw. Whatever. The point is arbitrarily barring fetching file: protocol is something you can do on your own machine. Yes, you can include ordinary headers you would for any other request.

I haven't tested node v22.0.0-nightly202311151d8483e713 cuz it's not released?
but i tried latest undici which is what nodejs are using... and fetch did not have any support for file://

It is released. Fetch Node.js nightly archive and test for yourself to resolve any doubts you might have.

Why are we talking about duplex streams? it's have nothing to do with file:

I'm just saying the reason for node-fetch has passed. We can full-duplex stream and fetch file: using the lastest node executable standalone, without any libraries.

And you can say whatever you like, but plz do it elsewhere so that the person dose not get offended by it.

i don't want to keep doing open source stuff if ppl is going to have a "bad" atitude

I think maintainers need to learn how to handle constructive feedback.

You have done great work in the open source software domain.

Editing honest user feedback is problematic on many levels. IMO you would have done better to just mark the comment as off-topic rather than editing the users' comment because you felt "offended". I don't think the author of the comment was trying to throw shade. node-fetch probably did help a lot of developers wanting to use fetch() in Node.js environment. Now Fetch Standard is implemented. Why use node-fetch? Because of legacy dependencies?

Anyway, I ain't trying to give you grief. I was startled by you editing a users' comments here and conveyed my opinion on that act.

Have a great day, cheers.

@Kreijstal
Copy link

I mean, yeah, at least, disable it by default, but write options to enable it if you want, doesn't node has permissions now? if you want to be safe you can use those permissions.

@guest271314
Copy link

I mean, yeah, at least, disable it by default, but write options to enable it if you want, doesn't node has permissions now?

If you are using node the question must be asked, why are you using node-fetch? node implements fetch(), marked as "stable" now nodejs/node@740ca54 even though there is some discussion still ongoing based on lack of input, allegedly nodejs/node#49867.

I didn't use any special flags when I tested. I fetch the Node.js nightly archive every day or so though, so the example I posted is using the node nightly release.

Isn't the whole point of polyfills that native support is not shipped?

@jimmywarting
Copy link
Collaborator

Node-fetch was created long before when fetch was only a browser thing and was very new
Saying that is's a pollyfill for what nodejs have built in today is "wrong". It dose not rip out what exist in node today and and put it on npm or anything like that. It dose not try to adhere to nodejs implementation. We are just following the fetch specification.

we created a server side implementation first and our goal where to give browser fetch to the server. And prefixed it with node cuz it was specifically built for node:http in mind

@guest271314
Copy link

I see. So node-fetch and fetch() implementation in node executable are not intended to be, and are comparable. They do different things for different contexts and use cases, are not mutually exclusive.

@guest271314
Copy link

Looks like fetching file: protocol isn't working anymore in node v22.0.0-nightly202311250bb5d88871.

@guest271314
Copy link

For anybody interested in making this happen for Undici fetch() here are the relevant lines that throw when passing a file: URL https://gist.github.com/guest271314/2779cfbb4a4f8ba3496e253e546c0104.

@Kreijstal
Copy link

For anybody interested in making this happen for Undici fetch() here are the relevant lines that throw when passing a file: URL https://gist.github.com/guest271314/2779cfbb4a4f8ba3496e253e546c0104.

at new URL (node:internal/url:804:36)

hmm maybe somthing that can be patched?

@guest271314
Copy link

@Kreijstal A few ways to do this for Node.js' Undici fetch() implementation https://gist.github.com/guest271314/a4f005d9a6b5b433ae6d6e6c5c6d7595.

At https://github.com/nodejs/undici/blob/main/lib/web/fetch/index.js#L129 add the following to short-circuit for file: protocol

  const url = new URL(input);
  if (url.protocol === "file:") {
    return import("node:fs").then((fs) => {
      p.resolve(new Response(fs.readFileSync(url)));
      return p.promise;
    });   
  }

Using curl

import { spawn } from "node:child_process";
import { Duplex } from "node:stream";

async function fetchFile(path) {
  // https://github.com/chcunningham/atomics-post-message/blob/main/server.js
  const mimeTypes = {
    "html": "text/html",
    "jpeg": "image/jpeg",
    "jpg": "image/jpeg",
    "png": "image/png",
    "js": "text/javascript",
    "wasm": "application/wasm",
    "css": "text/css",
  };
  try {
    const { stdout, stderr } = spawn("./bin/curl", ["-NSs", path]);
    if (stderr) {
      // Handle file not found: "curl: (37) Couldn't open file /home/user/bin/nm_hosts.js\n"
      const err = await new Response(Duplex.toWeb(stderr).readable).text();
      if (err) {
        throw err;
      }
    }
    return new Response(Duplex.toWeb(stdout).readable, {
      headers: {
        "Content-Type": mimeTypes[path.split(".").pop()] || "text/plain",
      },
    });
  } catch (e) {
    throw e;
  }
}

export { fetchFile };

dd command

  const url = new URL(path);
  try {
     const { stdout, stderr } = spawn("dd", [`if=${url.pathname}`, `count=${1024**2}`, "status=none"]);

GNU Coreutils head

  const url = new URL(path);
  try {
    const { stdout, stderr } = spawn("head", ["-c", 1024**2, url.pathname]);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests