Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HttpKernel] ResponseCacheStrategy does not allow client (= browser) cache #26245

Closed
aschempp opened this issue Feb 20, 2018 · 12 comments
Closed

Comments

@aschempp
Copy link
Contributor

Q A
Bug report? yes
Feature request? no
BC Break report? no
RFC? no
Symfony version 3.4

This issue is a follow-up to #25902. @leofeyer, @Toflar and me re-analyzed the problem: The ResponseCacheStrategy sets the response to no-cache, must-revalidate if any of the responses is set to private. This is incorrect, as private means it should be cacheable by the browser, just not by any shared cache.

Because the Response::isCacheable method returns false if a response is private, the ResponseCacheStrategy replaces the private header with a no-cache, must-revalidate.

RFC 7234 says:

The "private" response directive indicates that the response message is intended for a single user and MUST NOT be stored by a shared cache. A private cache MAY store the response and reuse it for later requests, even if the response would normally be non-cacheable.

/cc @nicolas-grekas

@nicolas-grekas
Copy link
Member

Your analysis looks good to me. Can you think of any PR you could provide?

@Toflar
Copy link
Contributor

Toflar commented Feb 20, 2018

I'm wondering about the best way to implement a bugfix, honestly.
The problem we face is that Response::getMaxAge() as well as Response::isCacheable() returns the correct values from the reverse proxy point of view. Both should behave differently (getMaxAge() should ignore s-max-age and isCacheable() should ignore private) when asking from the client point of view.

However, we cannot just adjust these for BC reasons so I wonder how we should best implement this. Shall we just add comments that the existing methods are from the reverse proxy point of view and add new ones? Or not adding new ones and just access the stuff we need in ResponseCacheStrategy, basically leaving the trap open for other people? Or should we have a parameter $forReverseProxy = true for both of these methods so you can pass false if you want a different result for the client point of view?

WDYT?

@Toflar
Copy link
Contributor

Toflar commented Feb 28, 2018

Any feedback on the preferred solution?

@nicolas-grekas
Copy link
Member

Please review #26352

@mpdude
Copy link
Contributor

mpdude commented Mar 21, 2018

Tried this on Symfony 2.8:

<?php

use Sensio\Bundle\FrameworkExtraBundle\Configuration\Route;
use Symfony\Component\HttpFoundation\Response;

class TestController
{
    /**
     * @Route("/test/private")
     */
    public function privateAction()
    {
        $response = new Response("foo");
        $response->setPrivate();

        return $response;
    }

    /**
     * @Route("/test/embeds")
     */
    public function embedsAction()
    {
        $response = new Response('<esi:include src="/test/private" />');
        $response->setPrivate();

        return $response;
    }

}

Then, with the HttpCache enabled:

$ curl -I -X GET http://localhost/test/embeds 
HTTP/1.1 200 OK
Date: Wed, 21 Mar 2018 21:55:29 GMT
Cache-Control: must-revalidate, no-cache, private
...
Content-Length: 3
X-Symfony-Cache: GET /test/embeds: miss; GET /test/private: miss
Content-Type: text/html; charset=UTF-8

Am I getting you right that you're looking for the private Cache-Control in the combined/merged response?

@aschempp
Copy link
Contributor Author

Well, yes. But also there should not be a no-cache, because that renders private obsolete. I'm also looking for a max-age and s-maxage merged from multiple responses (picking the lowest value).

@mpdude
Copy link
Contributor

mpdude commented Mar 22, 2018

Regarding (s)max-age, we'd need to make sure that smax-age is for shared caches. private responses probably must look at max-age only?

What do you mean with no-cache rendering private obsolete?

@aschempp
Copy link
Contributor Author

Sorry, I forgot to mention that #26532 is a replacement for this ticket. I have added extensive information about the RFC to that pull request.

Quoting RFC 2616, Section 14.9.1

private: Indicates that all or part of the response message is intended for a single user and MUST NOT be cached by a shared cache.

no-cache: […] a cache MUST NOT use the response to satisfy a subsequent request without successful revalidation with the origin server. […] A private (non-shared) cache MAY cache the response.

A no-cache directive essentially tells all caches (and that includes the Browser, which is a private cache) to no cache the response. The private directive can never be used.

@mpdude
Copy link
Contributor

mpdude commented Mar 23, 2018

I started here because #26532 mentioned it as a fixed ticket, so I thought it might help me understand your requirements there.

A no-cache directive essentially tells all caches (and that includes the Browser, which is a private cache) to no cache the response.

I think you're misreading the RFC you quoted? no-cache means "don't serve this from a cache without revalidation", private means "store in private caches only". no-store is the way of stating "don't keep this in caches".

Does that make sense?

@aschempp
Copy link
Contributor Author

Are you ok with discussing this in #26532 ?

@mpdude
Copy link
Contributor

mpdude commented Mar 25, 2018

Sure

@aschempp
Copy link
Contributor Author

Closing in favor of #26532

fabpot added a commit that referenced this issue Feb 25, 2019
…he/ResponseCacheStrategy (aschempp)

This PR was squashed before being merged into the 3.4 branch (closes #26532).

Discussion
----------

[HttpKernel] Correctly merging cache directives in HttpCache/ResponseCacheStrategy

| Q             | A
| ------------- | ---
| Branch?       | 3.4
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | #26245, #26352, #28872
| License       | MIT
| Doc PR        | -

This PR is a first draft to fix the incorrect merging of private and other cache-related headers that are not meant for the shared cache but the browser (see mentioned issues).

The existing implementation of `HttpFoundation\Response` is very much tailored to the `HttpCache`, for example `isCacheable` returns `false` if the response is `private`, which is not true for a browser cache. That is why my implementation does not longer use much of the response methods. They are however still used by the `HttpCache` and we should keep them as-is. FYI, the `ResponseCacheStrategy` does **not** affect the stored data of `HttpCache` but is only applied to the result of multiple merged subrequests/ESI responses.

I did read up a lot on RFC2616 as a reference. [Section 13.4](https://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.4) gives an overall view of when a response MAY be cached. [Section 14.9.1](https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.1) has more insight into the `Cache-Control` directives.

Here's a summary of the relevant information I applied to the implementation:

 - > Unless specifically constrained by a cache-control (section 14.9) directive, a caching system MAY always store a successful response (see section 13.8) as a cache entry, MAY return it without validation if it is fresh, and MAY return it after successful validation.

    A response without cache control headers is totally fine, and it's up to the cache (shared or private) to decide what to do with it. That is why the implementation does not longer set `no-cache` if no `Cache-Control` headers are present.

 - > A response received with a status code of 200, 203, 206, 300, 301 or 410 MAY be stored […] unless a cache-control directive prohibits caching.

    > A response received with any other status code (e.g. status codes 302 and 307) MUST NOT be returned […] unless there are cache-control directives or another header(s) that explicitly allow it.

    This is what `ResponseCacheStrategy::isUncacheable` implements to decide whether a response is not cacheable at all. It differs from `Response::isCacheable` which only returns true if there are actual `Cache-Control` headers.

 - > [Section 13.2.3](https://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.2.3): When a response is generated from a cache entry, the cache MUST include a single Age header field in the response with a value equal to the cache entry's current_age.

    That's why the implementation **always** adds the `Age` header. It takes the oldest age of any of the responses as common denominator for the content.

 - > [Section 14.9.3](https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.3): If a response includes an s-maxage directive, then for a shared cache (but not for a private cache), the maximum age specified by this directive overrides the maximum age specified by either the max-age directive or the Expires header.

    This effectively means that `max-age`, `s-maxage` and `Expires` must all be kept on the response. My implementation assumes that we can only do that if they exist in **all** of the responses, and then takes the lowest value of any of them. Be aware the implementation might look confusing at first. Due to the fact that the `Age` header might come from another subresponse than the lowest expiration value, the values are stored relative to the current response date and then re-calculated based on the age header.

The Symfony implementation did not and still does not implement the full RFC. As an example, some of the `Cache-Control` headers (like `private` and `no-cache`) MAY actually have a string value, but the implementation only supports boolean. Also, [Custom `Cache-Control` headers](https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.6) are currently not merged into the final response.

**ToDo/Questions:**

 1. [Section 13.5.2](https://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.5.2) specifies that we must add a [`Warning 214 Transformation applied`](https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.46) if we modify the response headers.

 2. Should we add an `Expires` headers based on `max-age` if none is explicitly set in the responses? This would essentially provide the same information as `max-age` but with support for HTTP/1.0 proxies/clients.

 3. I'm not sure about the implemented handling of the `private` directive. The directive is currently only added to the final response if it is present in all of the subresponses. This can effectively result in no cache-control directive, which does not tell a shared cache that the response must not be cached. However, adding a `private` might also tell a browser to actually cache it, even though non of the other responses asked for that.

 4. > [Section 14.9.2](https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.2): The purpose of the `no-store` directive is to prevent the inadvertent release or retention of sensitive information […]. The `no-store` directive applies to the entire message, and MAY be sent either in a response or in a request. If sent in a request, a cache MUST NOT store any part of either this request or any response to it. If sent in a response, a cache MUST NOT store any part of either this response or the request that elicited it.

    I have not (yet) validated whether the `HttpCache` implementation respects any of this.

 5. As far as I understand, the current implementation of [`ResponseHeaderBag::computeCacheControlValue`](https://github.com/symfony/symfony/blob/master/src/Symfony/Component/HttpFoundation/ResponseHeaderBag.php#L313) is incorrect. `no-cache` means a response [must not be cached by a shared or private cache](https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.1), which overrides `private` automatically.

 5. The unit tests are still very limited and I want to add plenty more to test and sort-of describe the implementation or assumptions on the RFC.

/cc @nicolas-grekas

#SymfonyConHackday2018

Commits
-------

893118f [HttpKernel] Correctly merging cache directives in HttpCache/ResponseCacheStrategy
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants