Add response attribute to indicate when content changes #744

Open
masavini opened this issue Dec 12, 2022 · 4 comments

Comments

@masavini

Hi,
If a cached page expires and a new one is fetched, it would be useful to know whether the new content differs from the cached one.
In my view, the from_cache response attribute should be True if the page content has not changed, regardless of whether the cache had expired and a new page was fetched.
What do you think?

@JWCook
Member

JWCook commented Dec 14, 2022

I think what you're describing is best handled by conditional requests. For any servers that support it, requests-cache will send a conditional request, and if the remote content hasn't changed, from_cache will still be True because no new data was received from the server. Here's an example: https://requests-cache.readthedocs.io/en/stable/user_guide/headers.html#conditional-requests

Otherwise, from_cache is meant to indicate where a given response object came from, not necessarily what the contents are. Do you have a case where you want to do something only when response content changes?
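To illustrate the conditional-request behavior described in this comment, here is a minimal sketch of how a scraper might use from_cache to skip unchanged pages. The helper name needs_processing, the cache name demo_cache, and the httpbin URL are assumptions chosen for illustration. The approach only helps when the server supports validators such as ETag or Last-Modified.

```python
import time


def needs_processing(response) -> bool:
    """Return True when the response body was newly downloaded.

    Relies on the `from_cache` attribute that requests-cache sets on its
    responses. Objects without the attribute are treated as new downloads.
    (Hypothetical helper, not part of requests-cache.)
    """
    return not getattr(response, "from_cache", False)


def demo(url: str = "https://httpbin.org/etag/my-etag-abcdefg") -> None:
    """Fetch the same URL twice, letting the cached copy expire in between.

    Needs requests-cache installed and network access; the URL is only an
    example of an endpoint that returns an ETag.
    """
    # Imported here so the helper above stays usable without requests-cache.
    from requests_cache import CachedSession

    session = CachedSession("demo_cache")  # creates a local SQLite cache file
    first = session.get(url, expire_after=1)  # cached copy expires after 1 second
    time.sleep(1.5)
    second = session.get(url)  # expired copy should trigger a conditional request
    print("first needs processing:", needs_processing(first))
    print("second needs processing:", needs_processing(second))
```

Calling demo() against a server that honors conditional requests should report that the second response does not need processing, since no new body was received. Against a server without validators, the second response would be a fresh download whether or not the content changed.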

@JWCook
Member

JWCook commented Jan 13, 2023

Another relevant piece of info: in 1.0 (beta), there is now also a CachedResponse.revalidated attribute that indicates if the response was revalidated by a conditional request.

I'll close this issue for now, but let me know if you have any other questions.
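As a rough sketch of how the revalidated attribute mentioned above could be combined with from_cache, the helper below sorts a response into one of three cases. The function name describe_origin and its labels are hypothetical. It assumes a requests-cache 1.0 response and falls back to False for objects that lack either attribute.

```python
def describe_origin(response) -> str:
    """Summarize where a response came from (hypothetical helper).

    Reads `from_cache` and `revalidated` defensively, so plain `requests`
    responses that lack these attributes are reported as downloads.
    """
    from_cache = getattr(response, "from_cache", False)
    revalidated = getattr(response, "revalidated", False)

    if not from_cache:
        # A new body was received from the server.
        return "downloaded"
    if revalidated:
        # The cached body was reused after a conditional request.
        return "cached, revalidated"
    # The cached body was reused as-is.
    return "cached, not revalidated"
```

In a scraper, only the "downloaded" case would normally call for reprocessing. Note that a download does not by itself prove the content differs from the previous copy.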

@JWCook JWCook closed this as completed Jan 13, 2023
@masavini
Author

masavini commented Jan 13, 2023

> Otherwise, from_cache is meant to indicate where a given response object came from, not necessarily what the contents are. Do you have a case where you want to do something only when response content changes?

Well, working with scrapers, I find it hard to imagine a use case where knowing whether the content has actually changed is irrelevant: if a response's content has not changed since it was last processed, I can skip reprocessing it, and that's a great improvement by itself. An attribute like CachedResponse.has_content_changed would help a lot.
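Until such an attribute exists, one possible workaround is to compare a hash of each response body with the hash stored from the previous run. The sketch below uses only the standard library. The names content_fingerprint and has_content_changed, and the in-memory dict used as a store, are assumptions for illustration and not part of requests-cache.

```python
import hashlib
from typing import Dict


def content_fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest of the raw response body."""
    return hashlib.sha256(content).hexdigest()


def has_content_changed(url: str, content: bytes, seen: Dict[str, str]) -> bool:
    """Compare the body with the last fingerprint recorded for this URL.

    `seen` maps URL -> fingerprint and is updated in place. A URL with no
    recorded fingerprint is reported as changed, so it gets processed once.
    """
    new_fingerprint = content_fingerprint(content)
    changed = seen.get(url) != new_fingerprint
    seen[url] = new_fingerprint
    return changed
```

A scraper could call has_content_changed(response.url, response.content, seen) after each fetch and skip reprocessing when it returns False. The seen mapping would need to be persisted between runs, for example in a small database, for this to work across sessions.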

@JWCook JWCook reopened this Jan 13, 2023
@JWCook
Member

JWCook commented Jan 13, 2023

I see. Adding a new attribute would be reasonable. I'll keep this open, then.

@JWCook JWCook changed the title from "from_cache response attribute behaviour" to "Add response attribute to indicate when content changes" Jan 13, 2023