Produce 'citation_reference' nodes instead of plain 'reference' nodes #275

Closed
brechtm opened this issue Oct 5, 2021 · 11 comments

@brechtm

brechtm commented Oct 5, 2021

As a follow-up to #273, rendering a Sphinx project involving sphinxcontrib-bibtex using the rinoh builder is now closer to producing the proper output (brechtm/rinohtype#298 (comment)), but it's not quite there yet. I want to ask you to consider making another change to sphinxcontrib-bibtex.

The extension produces a plain reference node instead of a citation_reference node [1]. Since you are outputting citation nodes, it makes sense to also output citation_reference nodes. This may be complicated by the presence (and order) of the Sphinx transforms. For example, it's possible that you need to output a pending_xref node (equivalent to what a plain Sphinx citation produces) instead of a citation_reference node.

@rappdw

[1] I would demonstrate this using output from Sphinx's pseudoxml builder, but the issue is not visible there due to the involvement of Sphinx transforms. Please see brechtm/rinohtype#262 (comment) for some hints on what Sphinx is doing internally.

mcmtroffaes added a commit that referenced this issue Oct 6, 2021
@mcmtroffaes
Owner

Yes, sphinx does some, let's say, "interesting" things with citation references. It's really fun to trace through all the transforms. 😄 I've documented the relevant parts for citations and citation references here: https://github.com/mcmtroffaes/sphinxcontrib-bibtex/blob/develop/test/test_debug.txt

Currently, citation references through any cite role produce a pending_xref, and the cited keys are then processed in the resolve_xref stage, which is part of the sphinx post-transforms. As part of that all the necessary text is rendered (i.e. author names, years, labels, brackets, ... whatever is required by the citation reference style), and eventually also the reference to the citation is rendered. This reference can contain any text as specified by the citation reference style (this is what children = super().render(backend) does). At this point of the transformation process, sphinx expects a reference node to citations for all backends, not a citation_reference node. Well, all backends, except for latex. But latex does not support custom text in citation_reference nodes. So for latex I've settled on generating a hyperlink command that produces a working reference to the citation. This is not a very well documented latex feature (I think I found it somewhere on stackoverflow) but it works.

In a nutshell, by the time sphinxcontrib-bibtex resolves citation references (at resolve_xref stage), no citation_reference nodes are present anymore in the doctree (except for latex, due to special casing in sphinx). In the post resolve_xref stage, sphinxcontrib-bibtex has produced exactly the same doctree structure as if one was to use regular sphinx citations (again, except for latex). Yes, the extension could produce citation_reference nodes instead of reference nodes at this point, however sphinx does some extra formatting on citation_reference nodes (in particular, it adds square brackets), which break the formatting... This is really unfortunate. You can test it with this branch: https://github.com/mcmtroffaes/sphinxcontrib-bibtex/tree/feature/generate-citation-reference
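The bracket problem described above can be illustrated with a toy model. This is only a sketch: `Node`, `render_style` and `write` are hypothetical stand-ins, not the docutils or Sphinx API, and the "writer" here simply assumes that every citation_reference node gets square brackets added automatically.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Toy stand-in for a doctree node (not the docutils API)."""
    kind: str  # "reference" or "citation_reference"
    text: str  # link text chosen by the citation reference style


def render_style(kind):
    """Toy reference style: it emits its own brackets around the link node."""
    return ["[", Node(kind, "Nelson, 1987"), "]"]


def write(parts):
    """Toy writer that brackets citation_reference nodes by itself."""
    out = []
    for part in parts:
        if isinstance(part, Node) and part.kind == "citation_reference":
            out.append(f"[{part.text}]")  # automatic brackets (assumed behaviour)
        elif isinstance(part, Node):
            out.append(part.text)  # plain reference: text is left untouched
        else:
            out.append(part)  # surrounding text produced by the style
    return "".join(out)


print(write(render_style("reference")))           # [Nelson, 1987]
print(write(render_style("citation_reference")))  # [[Nelson, 1987]]
```

With a plain reference node the style stays in full control of the brackets; with a citation_reference node the brackets are doubled.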

@mcmtroffaes
Owner

mcmtroffaes commented Oct 7, 2021

I had a further think about this, and what I think could solve your problem with rinohtype is a special separate citation reference style that generates citation_reference nodes in the formatting stage, instead of regular reference nodes. The limitations are that these nodes get "automatic brackets" as in regular docutils/sphinx, and that you can only use this specific citation reference style. But this way you can keep the stylesheet system consistent between regular sphinx citations and sphinxcontrib-bibtex citations.

This might also be useful for those folks that want to produce \sphinxcite commands instead of \hyperlink commands for citations, for whatever reason.

@brechtm
Author

brechtm commented Oct 8, 2021

Thanks for looking into this! I haven't had much time to spend on this, unfortunately.

I didn't dive deep into this yet (and therefore couldn't follow your analysis above fully), but I want to share the following:

So, sphinxcontrib-bibtex could perhaps produce the same kind of pending_xref node as Sphinx does for rST citation references, with:

  • setting reftarget to the citation reference label, and
  • adding a "bibtex" class to the pending_xref node so that it can be styled differently in rinohtype (leaving out the brackets)

I understand this will not work for the other builders. You mention a "special separate citation reference style". Could this be set independently for each builder? I'm asking because people may want to generate both HTML and PDF from the same sources, without having to change this option when building both formats. Perhaps you could also check for the "rinoh" builder and output the different pending_xref just for it?

mcmtroffaes added a commit that referenced this issue Oct 8, 2021
… (will enable different text classes to be used by different styles, so different sorts of docutils nodes can be generated on rendering depending on the pybtex node used, see #275).
@mcmtroffaes
Owner

So, sphinxcontrib-bibtex could perhaps produce the same kind of pending_xref node as Sphinx does for rST citation references

The reference content is only known after all documents have been processed, in the env-updated stage, and thus only after the pending_xref is created. For this reason, sphinxcontrib-bibtex produces its own pending_xrefs (in separate domains, cite for regular citations and footcite for footnote citations). It then produces the required reference nodes in the resolve stage. Post-resolve, this results in the same doctree structure as Sphinx does for regular citations (except for LaTeX because the LaTeX builder does not support regular reference nodes to citations).
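The ordering constraint described above (placeholder first, content only once every document has been read, then resolution) can be sketched with a small toy pipeline. All names here (`PendingXref`, `Env`, `read_document`, `env_updated`, `resolve`) are hypothetical and only mirror the stages mentioned in the comment; numeric labels are an assumption made to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class PendingXref:
    """Toy placeholder left in a document while it is being read."""
    key: str
    docname: str


@dataclass
class Env:
    """Toy build environment that collects data across all documents."""
    cited: list = field(default_factory=list)   # keys in order of first citation
    labels: dict = field(default_factory=dict)  # key -> label, filled in late


def read_document(env, docname, keys):
    """Stage 1: record cited keys and emit placeholders; labels are unknown."""
    placeholders = []
    for key in keys:
        if key not in env.cited:
            env.cited.append(key)
        placeholders.append(PendingXref(key, docname))
    return placeholders


def env_updated(env):
    """Stage 2: every document has been read, so labels can be assigned."""
    env.labels = {key: str(i) for i, key in enumerate(env.cited, start=1)}


def resolve(env, xref):
    """Stage 3: replace a placeholder with the final reference text."""
    return f"[{env.labels[xref.key]}]"


env = Env()
doc_a = read_document(env, "a", ["nelson87", "knuth84"])
doc_b = read_document(env, "b", ["knuth84"])
env_updated(env)
print([resolve(env, x) for x in doc_a + doc_b])  # ['[1]', '[2]', '[2]']
```

Calling `resolve` before `env_updated` would fail with a `KeyError`, which is the toy equivalent of the reference content not being known yet when the placeholder is created.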

I understand this will not work for the other builders. You mention a "special separate citation reference style". Could this be set independently for each builder?

The style is independent of the builder. But the rendering of the style can depend on the builder. For example, in all styles, currently, :cite: references are rendered into reference nodes (surrounded by other nodes for authors, brackets, ...) on all builders except latex, where it produces raw_latex nodes to create a custom hyperlink. A similar route could be pursued for rinohtype if rinohtype subjects itself to the same limitations as the LaTeX builder. I'm not sure what to generate though. Under the rinohtype builder, what is the doctree structure in post-resolve stage for citation references, with custom content (i.e. content not necessarily equalling the label)?
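As a rough sketch of "one style, builder-dependent rendering": the function below is hypothetical and returns a simple (node kind, payload) pair rather than real docutils nodes. The only real-world piece it relies on is the hyperref command `\hyperlink{target}{text}`; the target name used in the example is made up.

```python
from typing import Tuple


def render_reference(builder_name: str, target: str, text: str) -> Tuple[str, str]:
    """Return a (node kind, payload) pair for one citation reference."""
    if builder_name == "latex":
        # Raw LaTeX: \hyperlink{target}{text} allows custom link text.
        return ("raw_latex", rf"\hyperlink{{{target}}}{{{text}}}")
    # Every other builder gets an ordinary reference node carrying the text.
    return ("reference", text)


print(render_reference("html", "cite-nelson87", "Nelson, 1987"))
print(render_reference("latex", "cite-nelson87", "Nelson, 1987"))
```

A rinoh-specific branch would be one more case in the same dispatch, which is why the open question is what node that branch should produce.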

@brechtm
Author

brechtm commented Mar 25, 2022

I finally found some time for a deeper dive into this problem. To my surprise, the fix is fairly simple because rinohtype will use the text specified for a citation_reference node (unlike the LaTeX builder?). All that is needed is adding the following lines to the body of the SphinxReferenceText.render() method:

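        elif info.builder.name == 'rinoh':
            # rinohtype uses the text given for a citation_reference node,
            # so wrap the rendered reference in one instead of a plain reference.
            key = f'%{info.todocname}#{info.citation_id}'
            children = super().render(backend)
            # The first rendered child becomes the node text; the rest are appended.
            citnode = docutils.nodes.citation_reference(text=children[0],
                                                        refid=key,
                                                        reftitle=info.title)
            citnode.extend(children[1:])
            return [citnode]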

One issue that remains is that both pybtex and rinohtype add brackets (https://github.com/brechtm/rinoh_error):
[image: screenshot of the rendered citation references]
The last reference here (CIT2002) is a plain docutils reference.

This can be solved by overriding the pre/suffix in rinohtype's stylesheet:

[citation marker]
label_prefix=''
label_suffix=''

This will also remove the brackets for docutils citation references, so I think it would be better to have sphinxcontrib-bibtex omit the brackets when the rinoh builder is used.

P.S. I wasn't able to find out how to get "Nel87a"-style referencing working. Setting bibtex_reference_style = 'author_year' in conf.py shows numbers in HTML output for the bibliography.

@mcmtroffaes
Owner

Thanks for doing the research, it's good to know that it comes down to creating citation_reference nodes at the render stage. I think what suits best is a separate citation style that produces citation_reference nodes - the rinoh (and other) builders can then use that one single citation style. Overriding the bracket behaviour based on the builder is not on the table: most styles render much more than just a citation_reference node, so some users might still want some bracket support, e.g. for multiple citations that are grouped together.

I've been super busy but I promise I'll eventually get around to adding the style!

@brechtm
Author

brechtm commented Mar 25, 2022

I think what suits best is a separate citation style that produces citation_reference nodes - the rinoh (and other) builders can then use that one single citation style.

I may be misunderstanding, but why is a separate citation style necessary? With the proposed code, rinohtype produces citation references in the configured style.

Overriding the bracket behaviour based on the builder is not on the table: most styles render much more than just a citation_reference node, so some users might still want some bracket support, e.g. for multiple citations that are grouped together.

I see. I didn't think of that case! I suppose we could add extra information to the citation_reference node to instruct rinohtype not to add brackets. For example, prefix and suffix attributes to override those specified in the style sheet.
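A minimal sketch of that idea, assuming per-node `prefix` and `suffix` attributes that take precedence over the style sheet's `label_prefix` and `label_suffix`. The function and the dictionary are hypothetical and are not rinohtype's API; only the two style sheet option names come from this thread.

```python
# Style sheet defaults, standing in for the brackets the renderer adds.
STYLESHEET = {"label_prefix": "[", "label_suffix": "]"}


def format_marker(text, attributes=None):
    """Format a citation marker; node attributes override the style sheet."""
    attributes = attributes or {}
    prefix = attributes.get("prefix", STYLESHEET["label_prefix"])
    suffix = attributes.get("suffix", STYLESHEET["label_suffix"])
    return f"{prefix}{text}{suffix}"


# A plain docutils citation reference keeps the style sheet brackets.
print(format_marker("CIT2002"))  # [CIT2002]

# A node that carries empty overrides is rendered without extra brackets.
print(format_marker("Nelson, 1987", {"prefix": "", "suffix": ""}))  # Nelson, 1987
```

The point of the design is that the override travels with the individual node, so plain docutils citation references in the same document are unaffected.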

I've been super busy but I promise I'll eventually get around to adding the style!

No pressure. I also took 5 months to get back to you! 😄

@mcmtroffaes
Owner

mcmtroffaes commented Mar 28, 2022

I wasn't able to find out how to get "Nel87a"-style referencing working. Setting bibtex_reference_style = 'author_year' in conf.py shows numbers in HTML output for the bibliography.

For future reference: you're overriding the conf.py style in the rst document so the conf.py setting gets ignored. Just remove the :style: option from your rst and it should use the conf.py style again.

[EDIT: Also mind there's a difference between the citation label and the actual thing that appears in the reference link, unless one uses the label style. The new cit_ref_label style ensures these are always in sync as with standard sphinx citations.]

@brechtm
Author

brechtm commented Mar 28, 2022

For future reference: you're overriding the conf.py style in the rst document so the conf.py setting gets ignored. Just remove the :style: option from your rst and it should use the conf.py style again.

👍 I didn't realize this.

Also mind there's a difference between the citation label and the actual thing that appears in the reference link, unless one uses the label style. The new cit_ref_label style ensures these are always in sync as with standard sphinx citations.

I think this may be confusing me. When I render HTML with bibtex_reference_style = 'author_year', the reference labels change (1987 or Nelson, 1987), but the labels in the bibliography remain the same (Nel87). That makes it confusing and harder to find the corresponding citation in the bibliography, doesn't it?

@mcmtroffaes
Owner

I think this may be confusing me. When I render HTML with bibtex_reference_style = 'author_year', the reference labels change (1987 or Nelson, 1987), but the labels in the bibliography remain the same (Nel87). That makes it confusing and harder to find the corresponding citation in the bibliography, doesn't it?

Yes. It would be possible to have labels corresponding to the author and year, but that makes for very ugly formatting in the bibliography, unfortunately.

Just bear in mind that the reference style refers to how the citation reference is rendered (be it via labels, author/year, etc.), whilst the label style on the other hand refers to how the labels (as appearing in the bibliography) are rendered; the label style is declared as part of the bibliography style. These two styles are handled almost completely independently from each other. The only link is that some reference styles will use the labels (such as the "label" style), whilst others do not (such as the "author_year" style). The reference style handles a lot more, like how multiple citations are rendered together, where the author names appear, the shape of the brackets, etc.
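To make the two independent settings concrete, a conf.py fragment might look like the sketch below. The file name `refs.bib` is a placeholder, and pairing `author_year` with the `alpha` bibliography style is only an assumption chosen to reproduce the combination discussed here (author/year references next to "Nel87"-style labels).

```python
# conf.py (fragment)
extensions = ["sphinxcontrib.bibtex"]

# Placeholder file name; point this at your own BibTeX database.
bibtex_bibfiles = ["refs.bib"]

# Reference style: how each citation reference is rendered in the text,
# e.g. with author and year instead of the bibliography label.
bibtex_reference_style = "author_year"

# Bibliography style: how entries and their labels appear in the
# bibliography itself. It is configured separately from the reference style.
bibtex_default_style = "alpha"
```

A :style: option given in the rst document takes precedence over the conf.py style, as noted earlier in the thread.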

@mcmtroffaes
Owner

Fixed by #291. Thank you again for reporting, and for your assistance in making this work.


2 participants