No CDATA block in content block of atom feed #7

Open
weierophinney opened this issue Dec 31, 2019 · 7 comments

Comments

@weierophinney

Hello,

I wanted to provide feeds (via the Zend Feed Writer) with the full content of an article (including some HTML5 markup) and preferred Atom over RSS. But the writer behaves differently for the two formats, which is causing me some trouble.

My code:

    $entry = $feed->createEntry();
    $entry->setContent($news->getText());

Output for RSS:

    <item>
      <content:encoded><![CDATA[<p>My content ...</p>]]></content:encoded>

Output for Atom:

    <entry xmlns:xhtml="http://www.w3.org/1999/xhtml">
      <content xmlns:xhtml="http://www.w3.org/1999/xhtml" type="xhtml">
        <xhtml:div xmlns:xhtml="http://www.w3.org/1999/xhtml">
          <xhtml:p>My content ...</xhtml:p>
        </xhtml:div>
      </content>
    </entry>

And if I add any image to it in HTML5-Style <img src="myimage.jpg"> instead of XHTML-Style <img src="myimage.jpg" />, I get a warning:

DOMDocument::loadXML(): Opening and ending tag mismatch: img line 1 and p in Entity, line: 1
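
The warning can be reproduced outside of zend-feed, since an unclosed `<img>` is valid HTML5 but not well-formed XML. The following is a minimal sketch using only PHP's DOM extension; the helper name `xmlParseErrors` is made up for illustration and is not part of zend-feed.

```php
<?php
// Hypothetical helper: returns the libxml error messages produced
// while parsing $markup as XML (an empty array means it parsed cleanly).
function xmlParseErrors(string $markup): array
{
    // Collect errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $dom = new DOMDocument();
    $dom->loadXML($markup);

    $messages = array_map(
        static fn (LibXMLError $error): string => trim($error->message),
        libxml_get_errors()
    );

    // Restore the previous error-handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $messages;
}

$html5 = '<p><img src="myimage.jpg"></p>';   // HTML5 style, <img> never closed
$xhtml = '<p><img src="myimage.jpg" /></p>'; // XHTML style, self-closed

var_dump(xmlParseErrors($html5) !== []); // true: the tag mismatch is reported
var_dump(xmlParseErrors($xhtml) === []); // true: well-formed XML
```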

In the atom example in the documentation there is the output:

        <content type="html">
            <![CDATA[I am not writing the article.
                     The example is long enough as is ;).]]>
        </content>

In _setDescription I found $dom->createCDATASection (Entry\Rss and Entry\Atom). But in Atom it's just the summary and in Rss the Content.

In Entry\Atom the _setContent is relevant for the content block, which I wanted to use to output the full content and not just a summary. And there I found $element->setAttribute('type', 'xhtml') in _setContent.

I doubt that the Atom output of the example in the documentation is even possible with Zend Feed, or am I wrong? It would be great if the Atom feed also used a CDATA block instead of XHTML for the content.
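
To illustrate the output being asked for, here is a rough sketch in plain DOM code. It is not zend-feed's renderer and not a proposed patch; the function name `buildHtmlContent` is hypothetical, and it assumes the HTML does not contain the sequence `]]>`, which would end a CDATA section early.

```php
<?php
// Hypothetical helper, plain DOM only: builds <content type="html">
// whose body is a CDATA section holding the unmodified HTML.
// Assumption: $html does not contain "]]>".
function buildHtmlContent(DOMDocument $dom, string $html): DOMElement
{
    $content = $dom->createElement('content');
    $content->setAttribute('type', 'html');
    $content->appendChild($dom->createCDATASection($html));

    return $content;
}

$dom   = new DOMDocument('1.0', 'UTF-8');
$entry = $dom->createElement('entry');
$dom->appendChild($entry);

// HTML5-style markup passes through untouched, no Tidy involved.
$entry->appendChild(
    buildHtmlContent($dom, '<p><img src="myimage.jpg">My content ...</p>')
);

echo $dom->saveXML($entry), "\n";
```

With `type="html"` the markup is carried as character data, so it does not have to be well-formed XML, unlike the `type="xhtml"` variant.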


Originally posted by @av3 at zendframework/zend-feed#82

@weierophinney

@av3
If you install the PHP extension "Tidy", zend-feed will convert your HTML to XHTML.

Example:

$tidy = new \tidy;
$tidy->parseString(
    '<p><img src="foo.jpg"></p>',
    [
        'output-xhtml'   => true,
        'show-body-only' => true,
        'quote-nbsp'     => false,
    ]
);
$tidy->cleanRepair();

var_dump((string) $tidy); // <p><img src="foo.jpg" /></p>

https://github.com/zendframework/zend-feed/blob/b3d847afc0830a0ca7841a8ecc409c175dfea49d/src/Writer/Renderer/Entry/Atom.php#L383-L396
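
Where the Tidy extension is not available, a similar conversion can be approximated with PHP's DOM extension. This is only a sketch of an alternative technique, not something zend-feed does; the function name `htmlFragmentToXhtml` is hypothetical, and it assumes ASCII or entity-encoded input, because `loadHTML()` does not default to UTF-8.

```php
<?php
// Hypothetical helper: parses an HTML fragment leniently and
// re-serializes it as XML, so void elements come out self-closed.
function htmlFragmentToXhtml(string $html): string
{
    // The HTML parser reports unknown HTML5 tags as errors; keep them quiet.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // Wrap in a <div> so the fragment always has a single root element.
    $doc->loadHTML(
        '<div>' . $html . '</div>',
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Serialize the children of the wrapper with XML rules.
    $xhtml = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $xhtml .= $doc->saveXML($child);
    }

    return $xhtml;
}

$xhtml = htmlFragmentToXhtml('<p><img src="foo.jpg"></p>');
var_dump($xhtml);

// The result can now be parsed as XML without a tag mismatch.
$check = new DOMDocument();
var_dump($check->loadXML('<div>' . $xhtml . '</div>'));
```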


Originally posted by @froschdesign at zendframework/zend-feed#82 (comment)

@weierophinney

Thanks for your reply, @froschdesign. With Tidy it's working, even if it's not very beautiful:

<content xmlns:xhtml="http://www.w3.org/1999/xhtml" type="xhtml">
  <xhtml:div xmlns:xhtml="http://www.w3.org/1999/xhtml"><xhtml:img src="myimage" />
    <xhtml:p>My content</xhtml:p>
  </xhtml:div>
</content>

But is this really necessary? Wouldn't it be better to use $dom->createCDATASection? Then we wouldn't need tidy to create the content section. Or is there a specific reason why _setDescription (of Rss and Atom) creates a CDATA section and _setContent does not?

But this would also mean that the atom output example of the documentation is wrong, right?

If I don't want the XHTML output: would it be possible to write my own Writer extension in which I could override the _setContent method? Are there any examples of how to register custom Writers? In the documentation there is just a "TODO" for that chapter.


Originally posted by @av3 at zendframework/zend-feed#82 (comment)

@weierophinney

With Tidy it's working, even if it's not very beautiful:

Why isn't it beautiful? The generated code works and is correct.

Then we wouldn't need tidy to create the content section.

The content of atom:content should be suitable for handling as HTML or XHTML - depending on the specified type. Tidy helps us here to meet the specifications.

But this would also mean that the atom output example of the documentation is wrong, right?

Right!

In the documentation there is just a "TODO" for that chapter.

Oh, this is a mistake. Thanks for the hint!


Originally posted by @froschdesign at zendframework/zend-feed#82 (comment)

@weierophinney

Please have a look at content:encoded: http://www.rssboard.org/rss-profile#namespace-elements-content-encoded

zend-feed also provides an extension for this element: Zend\Feed\Writer\Extension\Content\Renderer\Entry

Writer extensions are used the same way as described for the reader: https://docs.zendframework.com/zend-feed/reader/#extending-feed-and-entry-apis


Originally posted by @froschdesign at zendframework/zend-feed#82 (comment)

@weierophinney

Why isn't it beautiful? The generated code works and is correct.

Yes, I know by now that the XHTML is correct. It looks unusual to me, and I thought it would be better to provide the content without modification (faster and smaller), but this isn't important for a feed.

Please have a look at content:encoded: http://www.rssboard.org/rss-profile#namespace-elements-content-encoded

There it says:

The content MUST be suitable for presentation as HTML and be encoded as character data in the same manner as the description element.

and in description it says:

HTML markup MUST be encoded as character data either by employing the HTML entities &lt; ("<") and &gt; (">") or a CDATA section.

No word about xhtml content, but I know that it's also a valid solution for Atom feeds. But when it says "same manner as the description element" and the _setDescription method of the Renderer\Entry\Atom is correct with its createCDATASection inside, it should also be suitable for _setContent. I'm just wondering, because for me it's not consistent.
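
The two encodings named in the quoted profile text carry the same character data, which a short DOM sketch can show. This uses plain `DOMDocument` calls only and is not zend-feed code; the element names simply mirror the RSS example.

```php
<?php
$html = '<p>My content ...</p>';

$dom  = new DOMDocument('1.0', 'UTF-8');
$item = $dom->appendChild($dom->createElement('item'));

// Variant 1: a text node, serialized with &lt; and &gt; entities.
$escaped = $dom->createElement('description');
$escaped->appendChild($dom->createTextNode($html));
$item->appendChild($escaped);

// Variant 2: a CDATA section, serialized verbatim.
$cdata = $dom->createElement('description');
$cdata->appendChild($dom->createCDATASection($html));
$item->appendChild($cdata);

echo $dom->saveXML($escaped), "\n";
echo $dom->saveXML($cdata), "\n";

// After a round trip through the parser, both decode to the same string.
$parsed = new DOMDocument();
$parsed->loadXML($dom->saveXML());
$nodes = $parsed->getElementsByTagName('description');

var_dump($nodes->item(0)->textContent === $nodes->item(1)->textContent);
```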

zend-feed also provides an extension for this element

But this works only for RSS feeds, not for Atom.

Writer extensions are used the same way as described for the reader: https://docs.zendframework.com/zend-feed/reader/#extending-feed-and-entry-apis

Thank you, but unfortunately I wasn't successful with this. Writing my own Renderer\Entry with a _setContent called in the constructor would cause a second content:encoded block in my Atom feed if Tidy is enabled.

In addition to this, I tried to write my own extension to optimize my feed for Feedly. I started with my own Writer\Feed class, adding methods for an accentColor and for registering the namespace. But there is no Zend\Feed\Writer\Extension\AbstractFeed. Extending my class from Zend\Feed\Writer\AbstractFeed causes an error:

…/vendor/zendframework/zend-feed/src/Writer/StandaloneExtensionManager.php40:

Maximum function nesting level of '256' reached, aborting!

Next attempt: not extending AbstractFeed caused another error:

…/vendor/zendframework/zend-feed/src/Writer/AbstractFeed.php845:

call_user_func_array() expects parameter 1 to be a valid callback, class 'Webfeed\Writer\Feed' does not have a method 'getItunesAuthors'

I don't know why it's checking for a method of the iTunes extension in my own extension. But okay, this is another problem. Maybe you (or someone else) could provide a "JungleBooks" extension example for the Writer in the documentation.


Originally posted by @av3 at zendframework/zend-feed#82 (comment)

@weierophinney

Sorry, the topic was Atom and not RSS. My mistake. 🤦‍♂️

No word about xhtml content, but I know that it's also a valid solution for Atom feeds.

See the specification: https://tools.ietf.org/html/rfc4287#page-14

Maybe you (or someone else) could provide a "JungleBooks" extension example for the Writer in the documentation.

Maybe tomorrow. I will definitely give feedback.


Originally posted by @froschdesign at zendframework/zend-feed#82 (comment)

@weierophinney

@av3
An example for registering a writer extension can be found at #86


Originally posted by @froschdesign at zendframework/zend-feed#82 (comment)
