Xml

  • New in v1.21.0

This represents an HTML or an XML node. It is a helper class intended to access the DOM (Document Object Model) content of a Story object.

There is no need to ever directly construct an Xml object: after creating a Story, simply take Story.body -- which is an Xml node -- and use it to navigate your way through the story's DOM.
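As a minimal sketch of that entry point, the helper below takes a Story, fetches its body node and lists the tag names of the body's direct children. The function name top_level_tags is hypothetical; it relies only on the body, first_child, next and tagname members described on this page.

```python
def top_level_tags(story):
    """List the tag names of the direct children of the story's body.

    `story` is assumed to be a PyMuPDF Story object. Text nodes have no
    tag name, so they show up as None in the result.
    """
    body = story.body  # an Xml node; no Xml object is constructed by hand
    tags = []
    child = body.first_child
    while child is not None:
        tags.append(child.tagname)
        child = child.next
    return tags


# Usage sketch, assuming PyMuPDF is installed and importable as "pymupdf"
# (older releases use "import fitz" instead):
#   import pymupdf
#   story = pymupdf.Story(html="<h1>Title</h1><p>Some text</p>")
#   print(top_level_tags(story))
```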

====================== ==========================================================
Method / Attribute     Description
====================== ==========================================================
add_bullet_list        add a ul tag - bulleted list, context manager.
add_codeblock          add a pre tag, context manager.
add_description_list   add a dl tag, context manager.
add_division           add a div tag (renamed from "section"), context manager.
add_header             add a header tag (one of h1 to h6), context manager.
add_horizontal_line    add an hr tag.
add_image              add an img tag.
add_link               add an a tag.
add_number_list        add an ol tag, context manager.
add_paragraph          add a p tag, context manager.
add_span               add a span tag, context manager.
add_subscript          add subscript text (sub tag) - inline element, treated like text.
add_superscript        add superscript text (sup tag) - inline element, treated like text.
add_code               add code text (code tag) - inline element, treated like text.
add_var                add variable text (var tag) - inline element, treated like text.
add_samp               add sample output text (samp tag) - inline element, treated like text.
add_kbd                add keyboard input text (kbd tag) - inline element, treated like text.
add_text               add a text string. Line breaks \n are honored as br tags.
set_align              sets the alignment using a CSS style spec. Only works for block-level tags.
set_attribute          sets an arbitrary key to some value (which may be empty).
set_bgcolor            sets the background color. Only works for block-level tags.
set_bold               sets bold on or off or to some string value.
set_color              sets text color.
set_columns            sets the number of columns. Argument may be any valid number or string.
set_font               sets the font-family, e.g. "sans-serif".
set_fontsize           sets the font size. Either a float or a valid HTML/CSS string.
set_id                 sets an id. A check for uniqueness is performed.
set_italic             sets italic on or off or to some string value.
set_leading            set inter-block text distance (-mupdf-leading), only works on block-level nodes.
set_lineheight         set height of a line. Float like 1.5, which sets to 1.5 * fontsize.
set_margins            sets the margin(s), float or string with up to 4 values.
set_pagebreak_after    insert a page break after this node.
set_pagebreak_before   insert a page break before this node.
set_properties         set any or all desired properties in one call.
add_style              set (add) some "style" attribute not supported by its own set_ method.
add_class              set (add) some "class" attribute.
set_text_indent        set indentation for first textblock line. Only works for block-level nodes.
tagname                either the HTML tag name like p or None if a text node.
text                   either the node's text or None if a tag node.
is_text                check if the node is a text.
first_child            contains the first node one level below this one (or None).
last_child             contains the last node one level below this one (or None).
next                   the next node at the same level (or None).
previous               the previous node at the same level.
root                   the top node of the DOM, which hence has the tagname html.
====================== ==========================================================

Class API

add_bullet_list

Add a ul tag - bulleted list, context manager. See ul.

add_codeblock

Add a pre tag, context manager. See pre.

add_description_list

Add a dl tag, context manager. See dl.

add_division

Add a div tag, context manager. See div.

add_header(value)

Add a header tag (one of h1 to h6), context manager. See headings.

arg int value

a value 1 - 6.
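A minimal sketch of how a header and a following paragraph might be added together. The helper name add_titled_section and its parameters are hypothetical; the level is passed positionally, and the node methods used (add_header, add_paragraph, add_text) are the ones listed on this page.

```python
def add_titled_section(body, title, text, level=2):
    """Append a heading followed by one paragraph to `body`.

    `body` is assumed to be an Xml node such as Story.body.
    """
    if level not in range(1, 7):
        raise ValueError("header level must be between 1 and 6")
    header = body.add_header(level)  # creates one of h1 ... h6
    header.add_text(title)
    para = body.add_paragraph()
    para.add_text(text)
    return header, para


# Usage sketch, assuming PyMuPDF is installed as "pymupdf":
#   import pymupdf
#   story = pymupdf.Story()
#   add_titled_section(story.body, "Introduction", "First paragraph.")
```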

add_horizontal_line

Add an hr tag. See hr.

add_image(name, width=None, height=None)

Add an img tag. This causes the inclusion of the named image in the DOM.

arg str name

the filename of the image. This must be the member name of some entry of the Archive parameter of the Story constructor.

arg width

if provided, either an absolute (int) value, or a percentage string like "30%". A percentage value refers to the width of the specified where rectangle in Story.place. If this value is provided and height is omitted, the image will be included keeping its aspect ratio.

arg height

if provided, either an absolute (int) value, or a percentage string like "30%". A percentage value refers to the height of the specified where rectangle in Story.place. If this value is provided and width is omitted, the image's aspect ratio will be honored.
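The width and height rules can be illustrated with a small wrapper that passes only a percentage width, so that the aspect ratio is kept as described above. The helper name add_image_by_width is hypothetical, and "logo.png" in the usage comment is a placeholder file name.

```python
def add_image_by_width(node, name, percent):
    """Insert image `name`, scaled to `percent` of the layout width.

    `name` must be a member name in the Archive given to the Story
    constructor. Only the width is passed, so the height follows from
    the image's aspect ratio.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")
    return node.add_image(name, width=f"{percent}%")


# Usage sketch, assuming PyMuPDF is installed as "pymupdf" and the
# current folder contains a file "logo.png" (placeholder name):
#   import pymupdf
#   archive = pymupdf.Archive(".")
#   story = pymupdf.Story(archive=archive)
#   add_image_by_width(story.body, "logo.png", 30)
```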

add_link(href, text=None)

Add an a tag - inline element, treated like text.

arg str href

the URL target.

arg str text

the text to display. If omitted, the href text is shown instead.

add_number_list

Add an ol tag, context manager.

add_paragraph

Add a p tag, context manager.

add_span

Add a span tag, context manager. See span.

add_subscript(text)

Add "subscript" text (sub tag) - inline element, treated like text.

add_superscript(text)

Add "superscript" text (sup tag) - inline element, treated like text.

add_code(text)

Add "code" text (code tag) - inline element, treated like text.

add_var(text)

Add "variable" text (var tag) - inline element, treated like text.

add_samp(text)

Add "sample output" text (samp tag) - inline element, treated like text.

add_kbd(text)

Add "keyboard input" text (kbd tag) - inline element, treated like text.

add_text(text)

Add a text string. Line breaks \n are honored as br tags.

set_align(value)

Set the text alignment. Only works for block-level tags.

arg value

either one of the TextAlign values or a valid CSS text-align value.

set_attribute(key, value=None)

Set an arbitrary key to some value (which may be empty).

arg str key

the name of the attribute.

arg str value

the (optional) value of the attribute.

get_attributes()

Retrieve all attributes of the current node as a dictionary.

returns

a dictionary with the attributes and their values of the node.

get_attribute_value(key)

Get the attribute value of key.

arg str key

the name of the attribute.

returns

a string with the value of key.

remove_attribute(key)

Remove the attribute key from the node.

arg str key

the name of the attribute.
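The attribute methods above can be combined as in the following sketch, which replaces an attribute and reports its previous value. The helper name replace_attribute is hypothetical. Removing the old attribute before setting the new one is a deliberate choice here, so the sketch does not depend on how set_attribute treats a key that already exists.

```python
def replace_attribute(node, key, value):
    """Set attribute `key` of `node` to `value` and return the old value.

    `node` is assumed to be a tag node (not a text node). The result is
    None if the attribute was not present before.
    """
    old = node.get_attributes().get(key)
    if old is not None:
        node.remove_attribute(key)  # drop the previous value first
    node.set_attribute(key, value)
    return old


# Usage sketch, assuming `para` is an Xml paragraph node:
#   previous = replace_attribute(para, "title", "Summary")
```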

set_bgcolor(value)

Sets the background color. Only works for block-level tags.

arg value

either an RGB value like (255, 0, 0) (for "red") or a valid background-color value.

set_bold(value)

Sets bold on or off or to some string value.

arg value

True, False or a valid font-weight value.

set_color(value)

Set the color of the text following.

arg value

either an RGB value like (255, 0, 0) (for "red") or a valid color value.

set_columns(value)

Sets the number of columns.

arg value

a valid columns value.

Note

Currently ignored - supported in a future MuPDF version.

set_font(value)

Set the font-family.

arg str value

e.g. "sans-serif".

set_fontsize(value)

Set the font size for text following.

arg value

a float or a valid font-size value.

set_id(unqid)

Sets an id. This serves as a unique identification of the node within the DOM. Use it to easily locate the node to inspect or modify it. A check for uniqueness is performed.

arg str unqid

id string of the node.

set_italic(value)

Sets italic on or off or to some string value for the text following it.

arg value

True, False or some valid font-style value.

set_leading(value)

Set inter-block text distance (-mupdf-leading), only works on block-level nodes.

arg float value

the distance in points to the previous block.

set_lineheight(value)

Set height of a line.

arg value

a float like 1.5 (which sets to 1.5 * fontsize), or some valid line-height value.

set_margins(value)

Sets the margin(s).

arg value

float or string with up to 4 values. See CSS documentation.

set_pagebreak_after

Insert a page break after this node.

set_pagebreak_before

Insert a page break before this node.

set_properties(align=None, bgcolor=None, bold=None, color=None, columns=None, font=None, fontsize=None, indent=None, italic=None, leading=None, lineheight=None, margins=None, pagebreak_after=False, pagebreak_before=False, unqid=None, cls=None)

Set any or all desired properties in one call. Each argument has the same meaning as the value of the corresponding set_ method.

Note

The properties set by this method are directly attached to the node, whereas every set_ method generates a new span below the current node that has the respective property. So to e.g. "globally" set some property for the body, this method must be used.
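A minimal sketch of the "global" case mentioned in the note: document-wide defaults are attached to the body node itself through set_properties. The helper name apply_document_defaults and the default values are hypothetical.

```python
def apply_document_defaults(story, font="sans-serif", fontsize=11):
    """Attach a default font family and font size to the story's body.

    `story` is assumed to be a PyMuPDF Story object. set_properties is
    used rather than set_font / set_fontsize, because those would open a
    new span below the body instead of styling the body node itself.
    """
    body = story.body
    body.set_properties(font=font, fontsize=fontsize)
    return body


# Usage sketch, assuming PyMuPDF is installed as "pymupdf":
#   import pymupdf
#   story = pymupdf.Story(html="<p>Some text</p>")
#   apply_document_defaults(story, fontsize=10)
```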

add_style(value)

Set (add) some style attribute not supported by its own set_ method.

arg str value

any valid CSS style value.

add_class(value)

Set (add) some "class" attribute.

arg str value

the name of the class. Must have been defined in either the HTML or the CSS source of the DOM.

set_text_indent(value)

Set indentation for the first textblock line. Only works for block-level nodes.

arg value

a valid text-indent value. Please note that negative values do not work.

append_child(node)

Append a child node. This is a low-level method used by other methods like Xml.add_paragraph.

arg node

the Xml node to append.

create_text_node(text)

Create direct text for the current node.

arg str text

the text to append.

rtype

Xml

returns

the created element.

create_element(tag)

Create a new node with a given tag. This is a low-level method used by other methods like Xml.add_paragraph.

arg str tag

the element tag.

rtype

Xml

returns

the created element. To actually bind it to the DOM, use Xml.append_child.
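The three low-level methods can be combined as sketched below: an element is created, given a text child, and only then bound to the DOM. The helper name append_element_with_text is hypothetical.

```python
def append_element_with_text(parent, tag, text):
    """Create a `tag` element holding `text` and append it to `parent`.

    `parent` is assumed to be an Xml node. Returns the new element.
    """
    element = parent.create_element(tag)  # not yet part of the DOM
    element.append_child(parent.create_text_node(text))
    parent.append_child(element)  # binds the element to the DOM
    return element


# Usage sketch, assuming `body` is Story.body: fill a bulleted list
# with "li" elements built from the low-level methods.
#   with body.add_bullet_list() as ul:
#       for item in ("first", "second"):
#           append_element_with_text(ul, "li", item)
```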

insert_before(elem)

Insert the given element elem before this node.

arg elem

some Xml element.

insert_after(elem)

Insert the given element elem after this node.

arg elem

some Xml element.

clone()

Make a copy of this node, which then may be appended (using Xml.append_child) or inserted (using one of Xml.insert_before, Xml.insert_after) in this DOM.

returns

the clone (Xml) of the current node.
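A common use of clone is to treat one node as a template and append a filled-in copy per data record. The following is a sketch under the assumption that the template and the field inside it carry id attributes; the helper name fill_template and all of its parameters are hypothetical.

```python
def fill_template(body, template_id, field_id, values):
    """Append one copy of a template node to `body` per entry in `values`.

    The template is the node with id `template_id`. Inside each copy, the
    node with id `field_id` receives the value as text. The template
    itself is removed at the end. Returns the list of copies.
    """
    template = body.find(None, "id", template_id)
    if template is None:
        raise ValueError(f"no node with id {template_id!r}")
    copies = []
    for value in values:
        item = template.clone()
        field = item.find(None, "id", field_id)
        if field is not None:
            field.add_text(str(value))
        body.append_child(item)
        copies.append(item)
    template.remove()  # the empty template should not be rendered
    return copies


# Usage sketch, assuming PyMuPDF is installed as "pymupdf":
#   import pymupdf
#   story = pymupdf.Story(html='<div id="tpl"><p id="name"></p></div>')
#   fill_template(story.body, "tpl", "name", ["Alice", "Bob"])
```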

remove()

Remove this node from the DOM.

debug()

For debugging purposes, print this node's structure in a simplified form.

find(tag, att, match)

Under the current node, find a node with the given tag, attribute att and value match.

arg str tag

restrict search to this tag. May be None for unrestricted search.

arg str att

check this attribute.

arg str match

the desired attribute value to match.

rtype

Xml

returns

None if nothing found, otherwise the first matching node.

find_next(tag, att, match)

Continue a previous Xml.find with the same values.

rtype

Xml

returns

None if none more found, otherwise the next matching node.
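The find / find_next pair lends itself to a loop that visits all matches. The generator below is a sketch: its name find_all is hypothetical, and calling find_next on the previous hit is an assumption based on the description "continue a previous find".

```python
def find_all(node, tag, att, match):
    """Yield matching nodes, starting with the first match under `node`.

    `node` is assumed to be an Xml node. The same search values are
    passed to every find_next call, as the method requires.
    """
    hit = node.find(tag, att, match)
    while hit is not None:
        yield hit
        # Continue the search from the last hit with the same criteria.
        hit = hit.find_next(tag, att, match)


# Usage sketch, assuming `body` is Story.body and some nodes carry
# class="note":
#   for note in find_all(body, None, "class", "note"):
#       note.debug()
```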

tagname

Either the HTML tag name like p or None if a text node.

text

Either the node's text or None if a tag node.

is_text

Check if a text node.

first_child

Contains the first node one level below this one (or None).

last_child

Contains the last node one level below this one (or None).

next

The next node at the same level (or None).

previous

The previous node at the same level.

root

The top node of the DOM, which hence has the tagname html.
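The navigation attributes are sufficient to walk a complete subtree. As a sketch, the generator below collects the text of all text nodes at or below a given node; the name iter_text is hypothetical, and only is_text, text, first_child and next are used.

```python
def iter_text(node):
    """Yield the text of every text node at or below `node`, in document order.

    `node` is assumed to be an Xml node.
    """
    if node.is_text:
        yield node.text
        return
    child = node.first_child
    while child is not None:
        yield from iter_text(child)  # descend one level
        child = child.next  # then move on to the next sibling


# Usage sketch, assuming `body` is Story.body:
#   plain_text = "".join(iter_text(body))
```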

Setting Text properties

In HTML, tags can be nested so that the innermost text inherits the properties of all tags that enclose it. For example <p><b>some bold text<i>this is bold and italic</i></b>regular text</p>.

To achieve the same effect, methods like Xml.set_bold and Xml.set_italic each open a temporary span with the desired property underneath the current node.

In addition, these methods return their parent node, so they can be chained with each other.

Context Manager support

The standard way to add nodes to a DOM is this:

body = story.body
para = body.add_paragraph()  # add a paragraph
para.set_bold()  # text that follows will be bold
para.add_text("some bold text")
para.set_italic()  # text that follows will additionally be italic
para.add_text("this is bold and italic")
para.set_italic(False).set_bold(False)  # all following text will be regular
para.add_text("regular text")

Methods that are flagged as "context managers" can conveniently be used in this way:

body = story.body
with body.add_paragraph() as para:
    para.set_bold().add_text("some bold text")
    para.set_italic().add_text("this is bold and italic")
    para.set_italic(False).set_bold(False).add_text("regular text")
    para.add_text("more regular text")