Using email module to parse multipart instead of the deprecated cgi module #1437

aisk · 2023-11-28T16:22:14Z

Since cgi will be removed, the Python change log recommends using email.message or the PyPI package multipart instead. bottle does not allow external dependencies, and vendoring multipart is not good practice, so I think the email package is the better option.

I haven't checked compatibility in much depth yet. If a maintainer thinks this approach is okay, I'll invest more time in it. test_multipart does pass on my local machine, though (some other tests fail because I'm on Windows; they fail on the master branch as well).

defnull · 2023-11-28T16:56:27Z

Unfortunately, all data parsed by email.parser.FeedParser ends up in memory-buffered Messages. Uploading large files (or many small ones) would likely trigger a MemoryError on a busy server. To be useful in a web context, the parser needs a way to offload large file uploads into temporary files. I'm not sure whether the email.parser package supports that use case.
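To make the concern concrete, here is a minimal sketch that parses a hand-built multipart/form-data body with the standard library parser. The helper `parse_form_data`, the `BOUNDARY` marker, and the sample body are hypothetical names chosen for illustration, not bottle code. Every part comes back as an in-memory payload, so memory use grows with the size of the upload.

```python
from email.parser import BytesFeedParser


def parse_form_data(content_type, body):
    """Parse a multipart/form-data body with the stdlib email parser.

    Hypothetical helper: returns {field name: payload bytes}. Every
    payload is held in memory; nothing is written to disk.
    """
    parser = BytesFeedParser()
    # The email parser expects headers first, so feed a synthetic
    # Content-Type header followed by the blank separator line.
    parser.feed(b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n")
    parser.feed(body)
    message = parser.close()
    if not message.is_multipart():
        raise ValueError("body is not a multipart message")

    fields = {}
    for part in message.get_payload():
        name = part.get_param("name", header="content-disposition")
        # decode=True returns the payload as bytes (still in memory).
        fields[name] = part.get_payload(decode=True)
    return fields


# Sample request body, assumed for this sketch.
BODY = (
    b"--BOUNDARY\r\n"
    b'Content-Disposition: form-data; name="title"\r\n'
    b"\r\n"
    b"hello\r\n"
    b"--BOUNDARY\r\n"
    b'Content-Disposition: form-data; name="upload"; filename="a.bin"\r\n'
    b"Content-Type: application/octet-stream\r\n"
    b"\r\n"
    b"\xff\xfe binary\r\n"
    b"--BOUNDARY--\r\n"
)

if __name__ == "__main__":
    print(parse_form_data("multipart/form-data; boundary=BOUNDARY", BODY))
```

With a multi-gigabyte upload in place of the short sample body, the same call would have to hold the whole file in memory at once.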

aisk · 2023-11-28T17:17:18Z

email.parser.FeedParser has an optional argument, _factory, which specifies the Message class used for the parsed result. So we could subclass email.message.Message and override set_payload (and any other methods needed) to offload large files to disk.

I haven't spent much time checking whether this will work. If you haven't looked into it either, I'd like to investigate it.
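A rough sketch of that idea, under stated assumptions: `SpoolingMessage`, `SPOOL_THRESHOLD`, and `parse_with_spooling` are hypothetical names, and the threshold value is arbitrary. It only shows that the `_factory` hook lets a Message subclass intercept `set_payload`; it is not a tested design for bottle.

```python
import tempfile
from email.message import Message
from email.parser import BytesFeedParser

SPOOL_THRESHOLD = 1024  # hypothetical cut-off, in characters


class SpoolingMessage(Message):
    """Hypothetical Message subclass that moves large payloads to disk."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.spooled_file = None

    def set_payload(self, payload, charset=None):
        # Note: when this method runs, the complete payload string
        # already exists in memory; this only avoids keeping it there.
        if isinstance(payload, str) and len(payload) > SPOOL_THRESHOLD:
            spool = tempfile.TemporaryFile()
            # Reverse the parser's bytes-to-str decoding to get raw bytes.
            spool.write(payload.encode("ascii", "surrogateescape"))
            spool.seek(0)
            # The spooled copy may still include the line break that
            # precedes the next boundary; callers would need to trim it.
            self.spooled_file = spool
            payload = ""
        super().set_payload(payload, charset)


def parse_with_spooling(content_type, body):
    """Parse a multipart body, spooling large parts to temporary files."""
    parser = BytesFeedParser(_factory=SpoolingMessage)
    parser.feed(b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n")
    parser.feed(body)
    return parser.close()
```

Parts below the threshold keep their payload on the message as usual; larger parts expose an open file object on the `spooled_file` attribute instead.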

defnull · 2023-11-28T17:31:00Z

Does not really help for large uploads, as those are still collected as a list of strings in memory before set_payload is even called. That behavior is hard-coded in the parser. The parser is also string-based: binary data is passed in as data.decode('ascii', 'surrogateescape') and copied multiple times. It was designed for emails (where you need an error-tolerant and lax parser) and not for the internet (where you need a fast and strict parser that bails immediately if it sees something fishy). I would love to use that parser; it would be my first choice if it were an option. But I do not think it is suitable for this use case.
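The string-based handling can be illustrated in isolation. This sketch only demonstrates the decode call quoted above on sample bytes; `to_parser_text` is a hypothetical helper name, and the size comparison is a CPython observation rather than a guarantee.

```python
import sys


def to_parser_text(data):
    """Decode bytes with the call quoted above (hypothetical helper)."""
    return data.decode("ascii", "surrogateescape")


raw = bytes(range(256)) * 4  # sample binary data, including non-ASCII bytes
text = to_parser_text(raw)

# Bytes >= 0x80 are mapped to lone surrogates (U+DC80 to U+DCFF), so the
# conversion is lossless and can be reversed with the same error handler.
restored = text.encode("ascii", "surrogateescape")

if __name__ == "__main__":
    print("round trip ok:", restored == raw)
    # On CPython, a str holding surrogates generally takes more memory
    # than the bytes it was decoded from.
    print("bytes size:", sys.getsizeof(raw), "str size:", sys.getsizeof(text))
```

Each such conversion creates another full copy of the upload, which is part of why the approach scales poorly for large binary files.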

aisk · 2023-11-29T14:00:56Z

Thanks for the kind reply, I get the point!

Using email module to parse multipart instead of the deprecated cgi m…

97cd222
