New html parser #888

q0w · 2022-02-06T13:45:30Z

Is your feature request related to a problem? Please describe.

As I mentioned earlier, html5lib will be removed from pip (pip already does not use html5lib by default). So maybe switch to html.parser, or add a third-party parser library like html5lib.
Or use a faster parser, like selectolax, to improve performance.

Describe the solution you'd like

I think html.parser is slow, and so is html5lib, so adding selectolax could be a solution.

frostming · 2022-02-06T14:37:47Z

IMO html parser shouldn't be the performance bottleneck unless you can provide some proof.
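
One way to gather that kind of evidence is to time the standard-library parser on a page shaped like a package index listing. The sketch below is only an illustration: the synthetic page, the `example.invalid` URLs, and the helper names (`AnchorCounter`, `make_index_page`, `parse_once`) are assumptions, not anything PDM or pip ships, and the number it prints depends on the machine.

```python
import timeit
from html.parser import HTMLParser


class AnchorCounter(HTMLParser):
    """Counts <a> start tags, the elements an index-page parser must visit."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.count += 1


def make_index_page(n_links):
    # Synthetic page; the package name and URLs are made up for illustration.
    links = "\n".join(
        f'<a href="https://example.invalid/demo-{i}.tar.gz">demo-{i}.tar.gz</a><br/>'
        for i in range(n_links)
    )
    return f"<!DOCTYPE html><html><body>{links}</body></html>"


def parse_once(page):
    parser = AnchorCounter()
    parser.feed(page)
    parser.close()
    return parser.count


page = make_index_page(2000)
# Average wall-clock time over five full parses of the same page.
seconds = timeit.timeit(lambda: parse_once(page), number=5) / 5
print(f"html.parser: {seconds * 1000:.1f} ms per parse of 2000 links")
```

Comparing this figure with the time spent on network requests and dependency resolution for the same package would show whether parsing is actually the bottleneck.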

frostming · 2022-02-06T14:47:03Z

At present, PDM is reusing the ability of pip for index parsing so there isn't room for customizing the parser. But we are in the process of dropping pip and third-party HTML parsers may be worth considering.

q0w · 2022-02-06T15:41:48Z

Do you think PDM will drop pip before pip fully removes html5lib from its vendored packages?

abersheeran · 2022-06-27T10:40:17Z

https://peps.python.org/pep-0691/

Maybe a new HTML parser isn't needed.
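
For context, PEP 691 defines a JSON form of the simple index, so a client that receives it needs no HTML parser at all. A minimal sketch of reading such a response, assuming a hard-coded sample payload (the project name, URLs, and hash value are made up, and `iter_files` is a hypothetical helper):

```python
import json

# Sample body in the shape PEP 691 describes for a project detail page
# (served as application/vnd.pypi.simple.v1+json). All values are illustrative.
SAMPLE = """
{
  "meta": {"api-version": "1.0"},
  "name": "demo",
  "files": [
    {
      "filename": "demo-1.0.tar.gz",
      "url": "https://files.example.invalid/demo-1.0.tar.gz",
      "hashes": {"sha256": "abc123"},
      "requires-python": ">=3.7"
    },
    {
      "filename": "demo-1.1-py3-none-any.whl",
      "url": "https://files.example.invalid/demo-1.1-py3-none-any.whl",
      "hashes": {}
    }
  ]
}
"""


def iter_files(payload):
    """Yield (filename, url, requires_python) for each file in the response."""
    data = json.loads(payload)
    for entry in data.get("files", []):
        # "requires-python" is optional, so fall back to None when absent.
        yield entry["filename"], entry["url"], entry.get("requires-python")


for filename, url, requires_python in iter_files(SAMPLE):
    print(filename, url, requires_python)
```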

q0w · 2022-06-27T17:09:06Z

But the existing API will not be deprecated any time soon.

frostming · 2022-06-28T02:14:42Z

On PDM 2.0 we switched from pip to unearth, which uses html.parser.
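
To illustrate what index parsing with html.parser involves, here is a minimal sketch that collects anchor links from a simple index page. It is not unearth's actual implementation: the `LinkCollector` class, the sample page, and the `example.invalid` URLs are assumptions made for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects (absolute_url, requires_python) pairs from <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href:
            # Resolve relative hrefs against the page URL; HTMLParser has
            # already unescaped entities such as &gt; in attribute values.
            url = urljoin(self.base_url, href)
            self.links.append((url, attributes.get("data-requires-python")))


# Illustrative page in the style of a simple repository listing.
PAGE = """
<!DOCTYPE html>
<html><body>
<a href="../../packages/demo-1.0.tar.gz#sha256=abc" data-requires-python="&gt;=3.7">demo-1.0.tar.gz</a>
<a href="https://files.example.invalid/demo-1.1-py3-none-any.whl">demo-1.1-py3-none-any.whl</a>
</body></html>
"""

collector = LinkCollector("https://index.example.invalid/simple/demo/")
collector.feed(PAGE)
collector.close()

for url, requires_python in collector.links:
    print(url, requires_python)
```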

frostming · 2022-06-29T05:29:07Z

Please test it on 2.0.0a1

q0w added the ⭐ enhancement Improvements for existing features label Feb 6, 2022

frostming added this to the version 2.0 milestone May 7, 2022

frostming closed this as completed Jun 29, 2022
