Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid uploading empty files as chunked transfers #15036

Closed

Conversation

eivindt
Copy link

@eivindt eivindt commented Oct 31, 2023

Several web servers return a 400 Bad Request for transfers using "Transfer-Encoding: chunked" if the file is empty.

This change avoids this by passing a fixed string as upload data if the file is empty.

The problem is described here: #13672

Changelog: Bugfix: Avoid uploading empty files as chunked transfers as this causes problems for some web servers

  • Refer to the issue that supports this Pull Request.
  • If the issue has missing info, explain the purpose/use case/pain/need that covers this Pull Request.
  • I've read the Contributing guide.
  • I've followed the PEP8 style guides for Python code.
  • I've opened another PR in the Conan docs repo to the develop branch, documenting this one.

Several web servers return a 400 Bad Request for transfers
using "Transfer-Encoding: chunked" if the file is empty.

This change avoids this by passing a fixed string as upload
data if the file is empty.
@CLAassistant
Copy link

CLAassistant commented Oct 31, 2023

CLA assistant check
All committers have signed the CLA.

Copy link
Member

@memsharded memsharded left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution @eivindt !

This seems it could be a bit risky change, but lets investigate what is possible. Are we always sure that it is the same to send an empty file (possibly binary?) than a file with an empty text string? Sounds it is not exactly the same?

This avoids an issue with several web servers that cannot handle
"Transfer-Encoding: chunked" for empty files."""
if os.stat(abs_path).st_size == 0:
return contextlib.nullcontext("")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems this is breaking our, check CI: https://ci.conan.io/blue/organizations/jenkins/ConanTestSuitev2/detail/PR-15036/1/pipeline/41. Please note that things must work with Python 3.6 too.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can fix that, if this is a good way to fix the issue. I see @SSE4 is asking some good questions.

@SSE4
Copy link
Contributor

SSE4 commented Nov 2, 2023

@memsharded if we know file size, why do we use chunked transfer encoding in the first place?

@eivindt
Copy link
Author

eivindt commented Nov 2, 2023

@memsharded if we know file size, why do we use chunked transfer encoding in the first place?

This is a good question. Especially since this only happens for an empty file, not for a file with data.

I checked the requests library, and I think the problem is in requests.models.PreparedRequest's prepare_body():

        if is_stream:
            try:
                length = super_len(data)
            except (TypeError, AttributeError, UnsupportedOperation):
                length = None

         [...]
            if length:
                self.headers["Content-Length"] = builtin_str(length)
            else:
                self.headers["Transfer-Encoding"] = "chunked"

as super_len(data) will correctly set the length to 0 if data is a file handle, but the if length check should preferably have been a if length is None.

@SSE4
Copy link
Contributor

SSE4 commented Nov 2, 2023

zero Content-Length is valid value per HTTP RFC (Any Content-Length greater than or equal to zero is a valid value.).
we may try to modify headers instead, something like:

if st_size == 0:
    del headers['Transfer-Encoding']
    headers['Content-Length'] = 0

but I am still not 100% sure all the web-servers will be happy with that kind of requests (need some testing)

@memsharded
Copy link
Member

@memsharded if we know file size, why do we use chunked transfer encoding in the first place?

That is the thing, Conan doesn't. This is done by python-requests library, Conan is not defining this.
As I understand this, python-requests should have good default behavior. If that is not the case, it seems this might be contributed to python-requests. Or maybe to those special servers that are not behaving as expected.

My concern is that this kind of change could be breaking working systems, to try to workaround possible issues outside of Conan. I am fine with exploring possibilities, but we cannot guarantee that it will be moved forward.

@SSE4
Copy link
Contributor

SSE4 commented Nov 3, 2023

My concern is that this kind of change could be breaking working systems, to try to workaround possible issues outside of Conan. I am fine with exploring possibilities, but we cannot guarantee that it will be moved forward.

@memsharded that's valid concern, but this change could be hidden by an additional paywall. e.g. alternative FileUploader implementation could be activated only based on some opt-in or opt-out; for example, it could be environment variable, command line option or configuration entry. in other words, only activated by advanced users who know what are they doing and why with some big warning. as it's only needed for a rare unfortunate combination of HTTP proxy servers (bottle + gunicorn + conan_server), maybe less that 1% of total conan use-cases.

@eivindt
Copy link
Author

eivindt commented Nov 6, 2023

I've posted a pull-request to psf/requests (psf/requests#6568), maybe that will fix it.

As for the conan situation, it is a bit of a shame that the docker file for conan server defines a setup that breaks whenever conaninfo.txt is empty (i.e. header only packages).

Perhaps rewriting the conan server to use fastapi would be a way to fix it (doesn't seem to have problems with chunked transfers) or change the entrypoint script to use bottle without gunicorn?

@memsharded
Copy link
Member

As for the conan situation, it is a bit of a shame that the docker file for conan server defines a setup that breaks whenever conaninfo.txt is empty (i.e. header only packages).

That is indeed something that is worth changing, I was not fully aware about that this situation happened with the docker image, and in fact, I think the plans were not to distribute via that docker image anymore. I'll follow up this with the team. Can you please point me to the docker image and also if you got documentation pointing to it?

@memsharded memsharded added this to the 2.0.15 milestone Nov 6, 2023
@eivindt
Copy link
Author

eivindt commented Nov 6, 2023

We're using the one defined here: https://github.com/conan-io/conan-docker-tools/tree/master/legacy/conan_server (I think?) served via docker.io/conanio/conan_server:latest

I'm afraid I don't know if we had any documentation pointing to it.

@memsharded
Copy link
Member

Ah, ok, that makes sense. legacy is in the URL 🙂

This image is legacy, not really maintained or recommended for prod anymore.

@eivindt
Copy link
Author

eivindt commented Nov 6, 2023

Ah, ok, that makes sense. legacy is in the URL 🙂

This image is legacy, not really maintained or recommended for prod anymore.

Ok, fair enough. Not easily visible at e.g:

https://hub.docker.com/r/conanio/conan_server

You may want to add a disclaimer there, if possible.

@memsharded
Copy link
Member

Asking help to @uilianries to tag the images in dockerhub with some warning or similar that says that those images are legacy and are not recommended anymore. Maybe even taking them down if they don't really work?

We have re-reviewed this, and we are still quite concerned about trying to workaround something in the Conan codebase that seems to be an issue in some webservers, the risk of breaking existing users is not small.

The issue has been around for a few months already, and no other server vendor (there are at least other 3, besides conan_server without those specific frontends, Artifactory and ConanCenter) has reported any issue either.

So taking all into account, we think that is better not to try this workaround, and fixing the issue should be pursued against the specific webservers (probably better than against python-requests lib). Also, in any case we recommend the usage of the free ArtifactoryCE, it has way better performance than conan_server, plus many other advantages.

Many thanks in any case for your contribution and your feedback, really appreciated.

@memsharded memsharded removed this from the 2.0.15 milestone Dec 13, 2023
@memsharded memsharded closed this Dec 13, 2023
@uilianries
Copy link
Member

Hello, each folder in Conan Docker tools has a README file with a description of what is maintained and not. The main README contains:

Also, all non-x64 images are no longer supported, as we do not use them on Conan Center, we do not have maintainers/experts enough to keep them, and we have only a few users looking for them. For personal projects, avoid the legacy folder.

Talking about legacy: https://github.com/conan-io/conan-docker-tools/tree/master/legacy#readme

This folder contains all legacy docker recipes. Some of them are still used in Conan Center CI, therefore they are still maintained.

Older compilers like GCC 5 and GCC 7 are distributed from this legacy folder and we can´t update to newer one due glibc version installed in the base linux distribution.

Specifically about conan-server image, it's not used by ConanCenterIndex, but still generated for users, because it was asked by users many times, that's why is still there. It's only generated to Conan 1.x version and follows the documentation https://docs.conan.io/1/uploading_packages/running_your_server.html. If it's failing, I guess is because you are trying to mix with Conan 2.x.

I'll add conan-server image as deprecated, because it's not used in Conan Center and it's only related to Conan 1.x, which goes against the flow of Conan 2.x.

Maybe we can add conan-server docker image for Conan 2.x in the future, but it will require a high number of users asking about, because is something which requires more time, resource and maintenance.

@memsharded
Copy link
Member

I think the main issue is that in https://hub.docker.com/r/conanio/conan_server there is no single indicator of all of that, no warnings about it being legacy, no information that it is only for 1.X, etc, so it would be worth if it was possible that https://hub.docker.com/r/conanio/conan_server shows some of this information in that page.

@uilianries
Copy link
Member

@memsharded it makes sense, yes. I'll check it out.

@uilianries
Copy link
Member

It's possible to update Dockerhub repositories, but it requires administrator permission over the organization, first. We have 168 Docker images available on https://hub.docker.com/u/conanio (most deprecated), so updating one-by-one would be very tedious using the web interface. As second option, we could automate it via Dockerhub API: https://docs.docker.com/docker-hub/api/latest/

I tried to update one of my personal images using the API, and is not working. I found similar errors with others users, so, or I'm missing something (I hope so) or, there is a bug.

I opened a question on Stackoverflow asking for help because is more active. I checked the Docker forum first, and there is the very same question, but no answers in 4 years.

Please, up-vote the question, so it should be more visible there:
https://stackoverflow.com/questions/77660439/update-dockerhub-repository-description-via-api

@Ousret
Copy link

Ousret commented Dec 20, 2023

Hello there,

Sorry to barge in,
I don't know how far this could interest you (e.g. conan dev team)
But you may be interested in looking closely at Niquests.

Issues like those are things of the past with this.

Regards,

@memsharded
Copy link
Member

Hi @Ousret

Thanks for the pointer. Still https://github.com/jawah/niquests with only 35 stars doesn't seem to be getting much traction, so we prefer to stay with more "mainstream" dependencies whenever possible. It is not requests the one being problematic here, but some web servers.

@Ousret
Copy link

Ousret commented Dec 20, 2023

I understand. Thank you for your answer.

with only 35 stars doesn't seem to be getting much traction so we prefer to stay with more "mainstream" dependencies whenever possible.

It is a weird paradox. I've yet to see that statement in "mainstream" sciences. Earth is no longer seen as flat. 😄

It is not requests the one being problematic here, but some web servers.

I can assure you that what requests do in that case is problematic.

Good continuation,

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants