Blob URL store partitioning #153

annevk · 2020-05-05T09:54:13Z

https://privacycg.github.io/storage-partitioning/ has some general background here and https://trac.torproject.org/projects/tor/ticket/15502 is much more specific.

@bakulf was thinking that we could restrict blob URL lookup to the agent cluster (in addition to origin, that is). The one tweak I would suggest to that is that navigating a top-level browsing context (including a noopener one) to a blob URL still ought to work.

Concretely, this would mean that if you have https://example.com/ open twice, in separate browsing context groups, any blob URLs they mint cannot be used by the other.

The one gotcha with the tweak I suggested is that the other could observe existence through a popup then. Now that's an attack that's unlikely to yield anything useful in practice, but we could break that too by forcing noopener or a version of COOP that never matches (and thus always creates a new browsing context group).

We suspect this to be web-compatible and are happy to try it out in Firefox.

cc @mkruisselbrink @hober @SubhamoyS

The text was updated successfully, but these errors were encountered:

mkruisselbrink · 2020-05-21T18:42:35Z

Limiting it like that doesn't seem too crazy to me, although I'm not entirely sure I understand the attack/threat model that defends against.

A blob URL can only be resolved when the page that created it is still alive, and when the page trying to fetch it is same origin with the original page. If that is the case, you might as well just BroadcastChannel to talk to the original page, rather than jumping through hoops with blob URLs?

Of course if storage partitioning blocks BroadcastChannel because of different top-level URLs things would be different, but in that case wouldn't it make more sense to partition blob URLs the same way as other communication mechanisms, rather than having them be partitioned even more?

mkruisselbrink · 2020-05-21T20:32:44Z

Having said all that, for MediaSource blob URLs chrome already limits them to their agent cluster. And scoping all blob URLs to their agent cluster certainly would make the implementation a whole lot simpler (well, the top-level navigation case would still be tricky). So as an implementer agent cluster seems reasonable, as a spec editor I question if that is the right scoping level.

annevk · 2020-05-22T06:19:47Z

The keying for storage seems like it will end up being dynamic (i.e., storage access). I think if we can avoid things being dynamically keyed that's preferable.

To expand on that, if the key were to change it would mean old blob URLs in that environment would no longer be accessible, which would be quite weird, especially since blob URLs also serialize some origins into their identifier. And also, it seems that if two browsing context groups needed to share a blob they could/would share the object instead of the URL.

smaug---- · 2020-08-13T20:59:19Z

It would be a bit odd inconsistency if sharing objects worked, but sharing blob URLs didn't.

mkruisselbrink · 2020-08-13T21:02:58Z

That doesn't seem that different from today? Blob objects can be shared across origins, while blob URLs can only be resolved by same origin originators. As such there are already plenty of cases where sharing a blob URL doesn't work but sharing a blob, and having the receiver create a blob URL from it does work.

smaug---- · 2020-08-13T21:56:31Z

That is somewhat different inconsistency tough, since it is about cross-origin.

annevk · 2020-08-14T14:38:39Z

It seems that blob objects could always be widely shared and that does not change. They essentially have no scope. Blob URLs had a scope that was undefined to some extent (see #135). This proposes to scope blob URLs more clearly, to the agent cluster they are created in as well as any new top-level browsing contexts (which we'd ideally force noopener on).

wanderview · 2021-06-25T17:40:51Z

I don't understand why we would do this instead of just making blob URLs only loadable by contexts with the same StorageKey. It would seem better to lean on the partitioning we are pursuing in other APIs instead of introducing a new kind of isolation.

annevk · 2021-06-25T17:50:52Z

Apart from the issues discussed above that would not work well for top-level navigations to partitioned blob: URLs.

wanderview · 2021-06-25T18:01:14Z

Apart from the issues discussed above that would not work well for top-level navigations to partitioned blob: URLs.

What are the issues above? I looked at the links but it was not obvious to me. Dynamic keying? Not all browsers are proposing to do that.

And I don't see why its problematic for top-level navigations. Can you explain that? If we have a StorageKey for the blob we can propagate that to the context. Just like how we are going to have partitioned service workers if they are registered with a partitioned StorageKey.

But even if we wanted to prevent blob URLs from partitioned StorageKeys from being navigated to, I don't see why we need to block a context with the same StorageKey as the blob from loading it as a subresource.

asutherland · 2021-06-25T23:11:31Z

An upside of limiting Blob URL's to agent clusters is that it limits the potential for browser compat issues related to races for resolving Blob URLs against the Blob URLs being revoked. Right now resolve a Blob URL references the Blob URL Store like it's something that's synchronously accessible across all agent clusters. (I believe this leads to the situation described in #157 where Chrome has to use sync IPC when creating Blobs.)

That could alternately be addressed by specifying Blob resolution and Blob URL Store manipulations in a more multi-process-aware way. The previously referenced #157 would likely entail this because of the integration with storage.

mkruisselbrink · 2021-06-29T04:25:29Z

(kind of aside, but Chrome doesn't actually need to use sync IPCs when creating Blobs anymore, at least not for any web-exposed API. We still do because some internal and chrome extension usage of blobs might otherwise have race conditions).

Blob URL creation however does indeed need sync IPCs in particular to solve situations like: 1) agent1 registers a blob URL; 2) agent1 postMessages said URL to some other agent2; 3) agent2 tries to resolve the blob URL. With chrome's IPC there is no guarantee that the registration in step 1 arrives in the browser process before the attempt to resolve from step 3 arrives. As you say, if all blob URL registrations were scoped to the agent cluster they are created in this would no longer be an issue, and we could possibly eliminate these as sync IPCs.

So with my implementer hat on I think scoping by agent cluster makes a lot of sense. I don't have a good idea how web compatible that would be, but it shouldn't be too hard to collect metrics for that.

With my spec editor hat on, I'm not sure what makes more sense; scoping by storage key seems like it would solve all the same issues, while probably being more web compatible. On the other hand it does feel a bit weird to me to use storage key for this as blob URLs don't really seem like anything storage related to me (but then we're also using storage key for things like broadcast channel, so that isn't much of an argument). Also I'd say that anything that drives down usage of blob URLs might be a good thing, so being as restrictive as we can while not breaking too much does seem attractive. Overall I don't feel particularly strongly either way.

wanderview · 2021-06-29T15:07:47Z

FWIW, if there are other reasons to restrict blob URL loading to same agent, I'm not objecting to that. But lets be clear that's the reason its being done. I just want to avoid conflating agent isolation with storage partitioning because they are conceptually different things.

arichiv · 2021-07-19T16:26:48Z

Heads up that I'm examining the potential breakage this would cause if implemented in chrome: https://bugs.chromium.org/p/chromium/issues/detail?id=1224926

annevk · 2021-07-20T13:24:14Z

@wanderview agreed, though I would add that agent cluster isolation does obviate the need for partitioning.

(To explain the scenario above: if A embeds B and B mints a blob URL, you'd want the user to be able to copy that URL and navigate to it. It's not clear to me how that would work if lookup for blob URLs would use the storage key.)

wanderview · 2021-07-20T14:36:30Z

Sure, but the first post explaining why you want to do agent clustering isolation uses 3rd party partitioning as its primary and only motivation. An explainer laying out the additional motivations would be useful for clarifying why we are attempting a bigger hammer than needed for 3rd party partitioning.

(To explain the scenario above: if A embeds B and B mints a blob URL, you'd want the user to be able to copy that URL and navigate to it. It's not clear to me how that would work if lookup for blob URLs would use the storage key.)

(Just to answer this, but not suggesting we need to do this if agent cluster isolation is preferred for other reasons, but:)

A blob URL has a uuid in it, correct? That can be used to look up the storage key associated with the blob and apply it to the context created by navigation. This seems just like how a browser has to use the uuid to lookup the origin of the blob based on the URL today.

mkruisselbrink · 2021-07-20T17:08:47Z

A blob URL has a uuid in it, correct? That can be used to look up the storage key associated with the blob and apply it to the context created by navigation. This seems just like how a browser has to use the uuid to lookup the origin of the blob based on the URL today.

I think we're all pretty much describing the same thing, just in a different manner. We want to change the "map" we look up blob URLs in to not only be keyed on blob URL/UUID, but also be keyed on storage key or agent cluster. However for navigations we still need to be able to look up an entry in this map while ignoring the second part of the key, i.e. we'll still need the existing blob URL -> blob map as well.

mkruisselbrink · 2021-10-19T20:34:15Z

FWIW, we landed metrics in Chrome a while ago to determine potential breakage if we'd partition blob URLs by agent clusters: https://www.chromestatus.com/metrics/feature/timeline/popularity/3963. So about ~0.1% of page loads might be broken if blob URLs are tied to an agent cluster.

annevk · 2021-10-21T10:29:44Z

@mkruisselbrink does that exclude top-level navigations to blob URLs? As those could result in a new agent cluster but would nonetheless remain working.

arichiv · 2021-10-21T14:11:49Z

I believe those accesses are split out here: https://www.chromestatus.com/metrics/feature/timeline/popularity/3964
(The CL: https://chromium-review.googlesource.com/c/chromium/src/+/3043367)

mkruisselbrink · 2021-10-21T14:34:23Z

3963 is only for subresources, 3964 would be for top-level navigations where the initiators agent cluster doesn't match the blob URLs agent cluster (not sure if we would block those as well or not)

annevk · 2021-10-21T14:41:59Z

Interesting, I don't think we ran into any issues with subresources. We did ran into issues with top-level navigations as we broke those and have yet to try again. (As you note in #153 (comment) those should work.)

annevk · 2022-06-01T11:59:04Z

Discussing this with @artines1 again today he pointed out that we hit a problem in Firefox with an A nests (sandboxed) B scenario and B then creating a blob URL and attempting to download it. For some reason that doesn't use the agent cluster of B. He'll look into it a bit more.

(If that ends up being a blocker I suppose we might be stuck with the "storage key" here, which is unfortunate given the IPC calls that would be needed still in certain scenarios. What's less of a concern these days is unpartitioning as all browsers seem aligned on always partitioning storage.)

https://bugs.webkit.org/show_bug.cgi?id=260035 rdar://problem/113705298 Reviewed by Alex Christensen and Sihui Liu. Public blob URLs are only accessible from same-origin dcuments, but access is not restricted by the top-level origin. This means that Blob URLs can be used as a cross-origin tracking mechanism within iframes. In this patch we partition public blob URLs within the Blob Registry by top-level origin. This partitioning is controlled by a feature flag that is disabled by default. I took a few approaches at solving this. The most difficult challenge was finding a solution that allowed retrieving BlobData using a public blob URL from WKWebView APIs. In that case, the relevant top document may not be obvious, or may not exist. As a result, the design of this partitioning is more like access control rather than adding another key into the hashmap. Two alternative designs I considered include creating a second hashmap that is keyed by <URL, SecurityOriginData> and we lookup the BlobData in that map if we have a SecurityOriginData, otherwise we use the unpartitioned map. Or, we create a new map from URL -> SecurityOriginData where we can lookup the associated top origin SecurityOriginData if we don't already know it. However, both of these options are more complex than the chosen implementation, and neither of them seemed safer. This change also enforces a noopener policy on new windows when the top origin of the opener is cross-origin with the blob's security origin. This is a mitigation that was discussed in the blob URL storage partitioning issue [0] with cross-engine support, and that seemed reasonable to me. [0] w3c/FileAPI#153 * LayoutTests/TestExpectations: * LayoutTests/http/tests/local/blob/download-blob-from-iframe-expected.txt: Added. * LayoutTests/http/tests/local/blob/download-blob-from-iframe.html: Added. * LayoutTests/http/tests/local/blob/navigate-blob-expected.txt: Added. * LayoutTests/http/tests/local/blob/navigate-blob.html: Added. * LayoutTests/http/tests/local/blob/resources/broadcast-channel-proxy.html: Added. * LayoutTests/http/tests/local/blob/resources/iframe-creating-or-downloading-blob.html: Added. * LayoutTests/http/tests/local/blob/resources/iframe-for-creating-and-navigating-to-blob.html: Added. * LayoutTests/http/tests/local/blob/resources/main-frame-with-iframe-creating-or-navigating-to-blob.html: Added. * LayoutTests/http/tests/local/blob/resources/main-frame-with-iframe-downloading-blob.html: Added. * LayoutTests/http/tests/security/blob-null-url-location-origin-expected.txt: * LayoutTests/http/tests/security/blob-null-url-location-origin.html: * LayoutTests/http/tests/security/cross-origin-blob-transfer-expected.txt: Added. * LayoutTests/http/tests/security/cross-origin-blob-transfer.html: Added. * LayoutTests/http/tests/security/resources/iframe-cross-origin-blob-transfer.html: Added. * LayoutTests/http/tests/security/top-level-unique-origin2.https.html: * LayoutTests/platform/gtk-wk2/http/tests/local/blob/download-blob-from-iframe-expected.txt: Added. * LayoutTests/platform/mac-wk1/TestExpectations: * Source/WTF/Scripts/Preferences/UnifiedWebPreferences.yaml: * Source/WebCore/fileapi/BlobURL.cpp: (WebCore::BlobURL::isInternalURL): * Source/WebCore/fileapi/BlobURL.h: * Source/WebCore/fileapi/ThreadableBlobRegistry.cpp: (WebCore::ThreadableBlobRegistry::registerInternalFileBlobURL): (WebCore::ThreadableBlobRegistry::registerInternalBlobURL): (WebCore::ThreadableBlobRegistry::registerInternalBlobURLOptionallyFileBacked): (WebCore::ThreadableBlobRegistry::registerInternalBlobURLForSlice): (WebCore::isInternalBlobURL): Deleted. * Source/WebCore/loader/FrameLoader.cpp: (WebCore::FrameLoader::loadURL): (WebCore::FrameLoader::loadPostRequest): (WebCore::createWindow): * Source/WebCore/platform/network/BlobRegistryImpl.cpp: (WebCore::BlobRegistryImpl::registerBlobURLOptionallyFileBacked): (WebCore::BlobRegistryImpl::unregisterBlobURL): (WebCore::BlobRegistryImpl::getBlobDataFromURL const): (WebCore::BlobRegistryImpl::addBlobData): (WebCore::BlobRegistryImpl::registerBlobURLHandle): (WebCore::BlobRegistryImpl::unregisterBlobURLHandle): * Source/WebCore/platform/network/BlobRegistryImpl.h: Canonical link: https://commits.webkit.org/267172@main

annevk added normative change privacy-tracker Group bringing to attention of Privacy, or tracked by the Privacy Group but not needing response. security-tracker Group bringing to attention of security, or tracked by the security Group but not needing response. labels May 5, 2020

This was referenced May 5, 2020

Blob URL store partitioning w3cping/tracking-issues#99

Open

Blob URL store partitioning w3c/security-review#37

Open

annevk mentioned this issue Jan 12, 2021

Convoluted blob: URL issue w3c/webappsec-secure-contexts#71

Open

annevk mentioned this issue Jun 25, 2021

consider allow late importScripts/import for local URLs w3c/ServiceWorker#1595

Open

sysrqb mentioned this issue Jan 26, 2023

Give URLKeepingBlobAlive a top-origin SecurityOriginData WebKit/WebKit#9184

Merged

sysrqb mentioned this issue Aug 10, 2023

Partition Blob Registry by the top-level main document origin WebKit/WebKit#7549

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Blob URL store partitioning #153

Blob URL store partitioning #153

annevk commented May 5, 2020

mkruisselbrink commented May 21, 2020

mkruisselbrink commented May 21, 2020

annevk commented May 22, 2020 •

edited

smaug---- commented Aug 13, 2020

mkruisselbrink commented Aug 13, 2020

smaug---- commented Aug 13, 2020

annevk commented Aug 14, 2020

wanderview commented Jun 25, 2021

annevk commented Jun 25, 2021

wanderview commented Jun 25, 2021

asutherland commented Jun 25, 2021 •

edited

mkruisselbrink commented Jun 29, 2021

wanderview commented Jun 29, 2021

arichiv commented Jul 19, 2021

annevk commented Jul 20, 2021

wanderview commented Jul 20, 2021

mkruisselbrink commented Jul 20, 2021

mkruisselbrink commented Oct 19, 2021

annevk commented Oct 21, 2021

arichiv commented Oct 21, 2021 •

edited

mkruisselbrink commented Oct 21, 2021

annevk commented Oct 21, 2021

annevk commented Jun 1, 2022

Blob URL store partitioning #153

Blob URL store partitioning #153

Comments

annevk commented May 5, 2020

mkruisselbrink commented May 21, 2020

mkruisselbrink commented May 21, 2020

annevk commented May 22, 2020 • edited

smaug---- commented Aug 13, 2020

mkruisselbrink commented Aug 13, 2020

smaug---- commented Aug 13, 2020

annevk commented Aug 14, 2020

wanderview commented Jun 25, 2021

annevk commented Jun 25, 2021

wanderview commented Jun 25, 2021

asutherland commented Jun 25, 2021 • edited

mkruisselbrink commented Jun 29, 2021

wanderview commented Jun 29, 2021

arichiv commented Jul 19, 2021

annevk commented Jul 20, 2021

wanderview commented Jul 20, 2021

mkruisselbrink commented Jul 20, 2021

mkruisselbrink commented Oct 19, 2021

annevk commented Oct 21, 2021

arichiv commented Oct 21, 2021 • edited

mkruisselbrink commented Oct 21, 2021

annevk commented Oct 21, 2021

annevk commented Jun 1, 2022

annevk commented May 22, 2020 •

edited

asutherland commented Jun 25, 2021 •

edited

arichiv commented Oct 21, 2021 •

edited