Handle very large feeds #1040

Open
stakats opened this issue Feb 13, 2019 · 4 comments
stakats commented Feb 13, 2019

A very large feed will hang retrieval with `PHP Fatal error: Allowed memory size of NNNNNNN bytes exhausted (tried to allocate NNNNNNNNN bytes) in /websites/dhnow/www/wp-includes/wp-db.php on line 1978`.

See, for example, the feed at http://kbender.blogspot.com/feeds/posts/default, which is 9.5 MB.

AramZS commented Jul 8, 2019

Yikes. I'm not sure why a site would have a 9.5 MB feed, but this is a server-level setting. PressForward can't change the allowed memory size of a server's PHP configuration.

AramZS commented Jul 15, 2019

@stakats We can use `ini_set( 'memory_limit', '24M' );`, for example, but the server may not allow a WordPress plugin to escalate that setting. Could you try adding that line to your wp-config.php file and see if it resolves the issue?
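For reference, this is roughly what that would look like in wp-config.php. The `256M` value is illustrative (the `24M` above is just an example value, and a 9.5 MB feed will likely need considerably more); the host's own PHP configuration may cap or ignore the call entirely:

```php
// wp-config.php (fragment) -- value is illustrative; place it near the top,
// before wp-settings.php is loaded. The host may override or disallow this.
ini_set( 'memory_limit', '256M' );
```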

lordmatt commented Aug 6, 2019

It should be possible to load a large file in chunks as this plugin does: https://en-gb.wordpress.org/plugins/tuxedo-big-file-uploads/ - in theory, the steps would be (1) detect memory limit reached (2) handle error (3) switch to large file upload. Of course, processing a file of that size is another matter altogether. It might be best just to gracefully acknowledge the error and mark the feed as broken?
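One cheap way to implement the "mark the feed as broken" fallback is a pre-flight size check: before parsing, look at the `Content-Length` header (e.g. from a HEAD request via `get_headers()`) and refuse feeds over a threshold instead of letting the parser exhaust memory. A minimal sketch; the threshold and the function name are hypothetical, not existing PressForward code:

```php
// Hypothetical pre-flight check: refuse to parse feeds whose declared size
// exceeds a cap, so retrieval fails gracefully instead of fatally.
// Takes the headers array as returned by get_headers( $url, true ).
function feed_is_too_large( array $headers, int $max_bytes = 5 * 1024 * 1024 ): bool
{
    if ( ! isset( $headers['Content-Length'] ) ) {
        return false; // size unknown; fall through to the normal fetch
    }
    $length = $headers['Content-Length'];
    if ( is_array( $length ) ) { // redirect chains report one value per hop
        $length = end( $length );
    }
    return (int) $length > $max_bytes;
}
```

A feed that fails this check could then be flagged in the plugin's feed list rather than silently hanging retrieval. Note that `Content-Length` is not guaranteed to be present (chunked responses omit it), so this only catches the well-behaved cases.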

I've checked the feed the OP mentioned. I have no idea what they're doing, but the embed code they're using on the blog is something else: very heavy. I've never seen anything like it before.

@boonebgorges boonebgorges added this to the 5.6.0 milestone Dec 16, 2022
@boonebgorges
Contributor
As suggested by @lordmatt, the issue here is not really the fetching of the XML file, but the parsing of that file.

There are some PHP libraries for reading XML files in chunks, and PHP natively has the XMLReader class for streaming an XML file. But these techniques are not compatible with the SimplePie library that WordPress uses to parse feeds, and unfortunately it's not likely to be addressed in SimplePie. See e.g. simplepie/simplepie#598, https://core.trac.wordpress.org/ticket/45303, and simplepie/simplepie#731.

As such, the only viable way forward is to move away from SimplePie. I'm having a hard time finding a PHP library that uses XMLReader to stream feeds in a scalable way (here's a very old proof of concept: https://stackoverflow.com/questions/925300/parsing-media-rss-using-xmlreader), so we'd probably have to write our own, or contribute to a project like SimplePie to help them make the improvement. This is going to be a very large task.
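For the record, a stream-based parse with PHP's native XMLReader looks roughly like this. It's a sketch assuming a standard RSS 2.0 `<item>`/`<title>` structure, not a drop-in SimplePie replacement:

```php
// Sketch: stream a feed with XMLReader so memory use stays bounded.
// Only one <item> is ever expanded into memory at a time, so a 9.5 MB
// feed never has to fit in RAM all at once (unlike SimplePie's approach).
function stream_feed_titles( string $xml ): array
{
    $titles = array();
    $reader = new XMLReader();
    $reader->XML( $xml ); // for a real feed, use $reader->open( $url ) instead

    $found = $reader->read();
    while ( $found ) {
        if ( XMLReader::ELEMENT === $reader->nodeType && 'item' === $reader->name ) {
            // Expand just this one <item> into a small in-memory fragment.
            $item     = new SimpleXMLElement( $reader->readOuterXML() );
            $titles[] = (string) $item->title;
            $found    = $reader->next(); // jump past the subtree we just consumed
        } else {
            $found = $reader->read();
        }
    }
    $reader->close();

    return $titles;
}
```

Wiring something like this into WordPress would still mean replacing `fetch_feed()`'s SimplePie pipeline, which is exactly the large task described above.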

So, for the time being, I think it'll have to be a documented issue: if you have problems with your server conking out due to large feeds, you'll have to use WP's tools for increasing the memory limit (i.e. the `WP_MEMORY_LIMIT` constant in your wp-config.php).
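Concretely, that means adding something like this near the top of wp-config.php (the value is illustrative; it must appear before wp-settings.php is loaded):

```php
// wp-config.php -- illustrative value; raise as needed for your largest feed.
define( 'WP_MEMORY_LIMIT', '256M' );
```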
