Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

All cacheKey override on Request. #8211

Open
or-else opened this issue Jan 20, 2024 · 19 comments
Open

All cacheKey override on Request. #8211

or-else opened this issue Jan 20, 2024 · 19 comments
Labels
enhancement Feature not a bug

Comments

@or-else
Copy link

or-else commented Jan 20, 2024

Signed URLs, such as https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-presigned-url.html, include a transient token as a query parameter. Okhttp provides no facility to filter out transient query parameters (or, in fact, no facility at all to customize cache lookups) for the purpose of caching. Such objects are effectively never cached by okhttp. Furthermore, nearly every class in okhttp is final making it impossible to add such functionality by subclassing.

Javascript in browsers has this facility. You should too.

See this asked (and ignored) at stackoverflow: https://stackoverflow.com/questions/40783613/how-to-custom-cache-in-okhttp3

@yschimke
Copy link
Collaborator

Can you link to how browsers remove the transient params?

@yschimke
Copy link
Collaborator

As a kludgy workaround, you can use application and network interceptors to modify what gets evaluated for caching, and what gets sent on the wire.

But we should consider a proper cache key API. Or learn from browsers on transient params?

  data class OriginalUrl(val url: HttpUrl)

  val client = OkHttpClient.Builder().eventListener(object : EventListener() {
      override fun cacheMiss(call: Call) {
        println("cacheMiss ${call.request().url}")
      }

      override fun cacheHit(call: Call, response: Response) {
        println("cacheHit ${call.request().url}")
      }

      override fun cacheConditionalHit(call: Call, cachedResponse: Response) {
        println("cacheConditionalHit ${call.request().url}")
      }
    }).addInterceptor {
      val request = it.request()
      val url = request.url
      // only strip the key param for caching, not other params
      val newRequest = if (url.queryParameter("key") != null) {
        request.newBuilder().tag<OriginalUrl>(OriginalUrl(url)).url(url.newBuilder().removeAllQueryParameters("key").build()).build()
      } else {
        request
      }
      it.proceed(newRequest).newBuilder().request(request).build()
    }.addNetworkInterceptor {
      val request = it.request()
      val originalUrl = request.tag<OriginalUrl>()?.url
      // if we modified the url for caching, send the real one
      if (originalUrl != null) {
        val networkRequest = request.newBuilder().url(originalUrl).build()
        it.proceed(networkRequest).newBuilder().request(request).build()
      } else {
        it.proceed(request)
      }
    }.cache(Cache("/cache".toPath(), Long.MAX_VALUE, FakeFileSystem())).build()

  // cache miss - first request
  client.newCall(Request("https://httpbin.org/cache/3600?key=123".toHttpUrl())).execute().use {
    println(it.body.string())
  }
  // cache hit - cached without key
  client.newCall(Request("https://httpbin.org/cache/3600?key=345".toHttpUrl())).execute().use {
    println(it.body.string())
  }

  // cache miss - another=123
  client.newCall(Request("https://httpbin.org/cache/3600?another=123".toHttpUrl())).execute().use {
    println(it.body.string())
  }
  // cache miss - another=345
  client.newCall(Request("https://httpbin.org/cache/3600?another=345".toHttpUrl())).execute().use {
    println(it.body.string())
  }

@or-else
Copy link
Author

or-else commented Jan 20, 2024

Can you link to how browsers remove the transient params?

See ignoreSearch in Cache.match():
https://developer.mozilla.org/en-US/docs/Web/API/Cache/match#ignoresearch

@or-else
Copy link
Author

or-else commented Jan 20, 2024

As a kludgy workaround, you can use application and network interceptors to modify what gets evaluated for caching, and what gets sent on the wire.

Thank you for the suggestion. I'll see if I can make it work.

@yschimke
Copy link
Collaborator

That Cache seems like a specific web workers thing, slightly detached from fetch.

https://developer.mozilla.org/en-US/docs/Web/API/Cache

So not sure we want to make our Cache the same thing. And it doesn't seem like it applies to the fetch API calls.

But still worth working out how to support

@or-else
Copy link
Author

or-else commented Jan 20, 2024

That Cache seems like a specific web workers thing, slightly detached from fetch.

https://developer.mozilla.org/en-US/docs/Web/API/Cache

So not sure we want to make our Cache the same thing. And it doesn't seem like it applies to the fetch API calls.

But still worth working out how to support

Here is an example how ignoreSearch is used to check cache while executing HTTP GET (an event is sent to the service worker, service worker looks up cached response first):

https://github.com/tinode/webapp/blob/5faedc35f3b27bff11f50fc1a388f5bef1ebd2fc/service-worker.js#L154

self.addEventListener('fetch', event => {
  if (event.request.method != 'GET') {
    return;
  }

  event.respondWith((async _ => {
    //  Try to find the response in the cache.
    const cache = await caches.open(PACKAGE_VERSION);

    const reqUrl = new URL(event.request.url);
    // Using ignoreSearch=true to read cached images and docs despite different auth signatures.
    const cachedResponse = await cache.match(event.request, {ignoreSearch: (self.location.origin == reqUrl.origin)});
    if (cachedResponse) {
      return cachedResponse;
    }
    // Not found in cache.
    const response = await fetch(event.request);
    if (!response || response.status != 200 || response.type != 'basic') {
      return response;
    }
    if (reqUrl.protocol == 'http:' || reqUrl.protocol == 'https:') {
      await cache.put(event.request, response.clone());
    }
    return response;
  })());
});

@yschimke
Copy link
Collaborator

Yep, understood.

It's not a HTTP cache (honouring cache headers)

It's a specific web worker cache for Request/Response.

Might be interesting to allow for a cache only request without going through with a call.

@yschimke
Copy link
Collaborator

I'm going to close, I don't think we can copy the WebWorker Cache, and there is an ugly workaround.

@or-else
Copy link
Author

or-else commented Jan 28, 2024

I'm going to close, I don't think we can copy the WebWorker Cache, and there is an ugly workaround.

Is there a plan to address it in any ("right") way or no plan at all? I need to make the decision to keep Picasso (which relies on okhttp cache) or switch to Glide. If there is no plan then I have to byte the bullet and switch.

Thanks.

@or-else
Copy link
Author

or-else commented Jan 28, 2024

The fix in okhttp could be trivially simple. Make class Cache open and make key method not static.

@yschimke
Copy link
Collaborator

yschimke commented Jan 28, 2024

We have a similar requirement for a cache key for DoH DNS caching. #8113

So maybe as part of that.

This might be some optional tag for a cache key if it's not just POST.

@yschimke
Copy link
Collaborator

We have a similar requirement for a cache key for DoH DNS caching. #8113

So maybe as part of that.

@swankjesse any thoughts?

@or-else
Copy link
Author

or-else commented Jan 28, 2024

Yes, looks similar. The simplest solution would be to allow users to provide a custom implementation of the Cache.key().

@yschimke
Copy link
Collaborator

Similar requests #1605

@yschimke
Copy link
Collaborator

I'll reopen for further discussion

@yschimke yschimke reopened this Jan 28, 2024
@or-else
Copy link
Author

or-else commented Jan 28, 2024

BTW, #8113 would likely require passing the entire Request to Cache.key which I guess should not be a problem.

@yschimke
Copy link
Collaborator

BTW, #8113 would likely require passing the entire Request to Cache.key which I guess should not be a problem.

It's unlikely we would open up Cache for subclassing, we are very careful about new public API surface. So more likely some way to specify the cache key for a request.

@swankjesse
Copy link
Member

I’m glad this issue was reported before we proceeded on #8113.

I’m tempted to do a cacheKey: ByteString parameter on the Request itself: if specified, we hash that instead of the request URL. We could even cache POST requests that have a cache key.

@yschimke
Copy link
Collaborator

yschimke commented Jan 30, 2024

Yep, let's do it.

Does the override on otherwise non cacheable requests affect just the initial cache search, or also update the cache if a miss? or stays uncacheable?

@yschimke yschimke changed the title Option to ignore some or all URL query parameters when doing cache lookup All cacheKey override on Request. Jan 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Feature not a bug
Projects
None yet
Development

No branches or pull requests

3 participants