test_xhtml.txt
    >>> from lxml.html import document_fromstring, fragment_fromstring, tostring

lxml.html has two parsers, one for HTML, one for XHTML:

    >>> from lxml.html import HTMLParser, XHTMLParser
    >>> html = "<html><body><p>Hi!</p></body></html>"

    >>> root = document_fromstring(html, parser=HTMLParser())
    >>> print(root.tag)
    html

    >>> root = document_fromstring(html, parser=XHTMLParser())
    >>> print(root.tag)
    html
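
The sample document used here is well-formed, so both parsers accept it. The following is a minimal sketch of where they are expected to differ, assuming XHTMLParser keeps the strict, non-recovering defaults of lxml's XML parser. The `broken` input and the `xhtml_error` name are illustrative and not part of the original test:

```python
from lxml import etree
from lxml.html import HTMLParser, XHTMLParser, document_fromstring

# Illustrative input: <br> is never closed, which is valid HTML but not valid XML.
broken = "<html><body><p>Hi!<br></p></body></html>"

# The HTML parser accepts the void <br> element and still builds a tree.
root = document_fromstring(broken, parser=HTMLParser())
print(root.tag)

# The XHTML parser is XML-based, so the same input is expected to be rejected
# (assumption: default parser options, i.e. no recover=True).
xhtml_error = None
try:
    document_fromstring(broken, parser=XHTMLParser())
except etree.XMLSyntaxError as exc:
    xhtml_error = exc
print(type(xhtml_error).__name__)
```

If lenient handling of such input is needed, the HTML parser is the safer choice; the XHTML parser suits input that is already known to be well-formed XML.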

There are two functions for converting between HTML and XHTML:

    >>> from lxml.html import xhtml_to_html, html_to_xhtml

    >>> doc = document_fromstring(html, parser=HTMLParser())
    >>> tostring(doc)
    b'<html><body><p>Hi!</p></body></html>'

    >>> html_to_xhtml(doc)
    >>> tostring(doc)
    b'<html:html xmlns:html="http://www.w3.org/1999/xhtml"><html:body><html:p>Hi!</html:p></html:body></html:html>'

    >>> xhtml_to_html(doc)
    >>> tostring(doc)
    b'<html xmlns:html="http://www.w3.org/1999/xhtml"><body><p>Hi!</p></body></html>'
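
Both conversion functions change the tree in place and return nothing, which is why the calls in this section print no output. The following small sketch checks the same round trip on element tags rather than on serialized bytes. `XHTML_NS` and the `tags_*` names are introduced here for illustration only:

```python
from lxml.html import document_fromstring, html_to_xhtml, xhtml_to_html

# The XHTML namespace URI, as it appears in the serialized output.
XHTML_NS = "http://www.w3.org/1999/xhtml"

doc = document_fromstring("<html><body><p>Hi!</p></body></html>")
tags_before = [el.tag for el in doc.iter()]

# Move every element into the XHTML namespace (in place).
result_to_xhtml = html_to_xhtml(doc)
tags_xhtml = [el.tag for el in doc.iter()]

# Strip the namespace again (in place).
result_to_html = xhtml_to_html(doc)
tags_after = [el.tag for el in doc.iter()]

print(tags_before)
print(tags_xhtml)
print(tags_after)
```

The round trip restores the plain tag names, although, as the final serialized output in this section shows, the `xmlns:html` declaration stays on the root element.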