Closed
Most users will use Cheerio with documents loaded from the web, which can lead to decoding issues; see #1785. Cheerio should provide a method to load a buffer that properly handles encodings. JSDom uses https://github.com/jsdom/whatwg-encoding to do this. I have started working on a solution at https://github.com/fb55/encoding-sniffer, which will support streams.
One current user-land implementation of this is https://github.com/ktty1220/cheerio-httpcli (in Japanese).
We should add three functions for NodeJS users:
- `load(buffer, options)`: sniffs the encoding of the passed buffer and returns the loaded document; an overload of the existing `load` function.
- `stream(cb, options)` (see `.stream(cb)` method #99): returns a writable stream that will (1) sniff the encoding, (2) parse the document as chunks arrive, and (3) call the callback with a loaded Cheerio instance once the stream has ended.
  - It would be nice to have the return value of `stream` be both a writable stream and a promise that allows users to await the response.
  - An alternative interface might be `stream(readableStream, options)`, which returns a promise and automatically consumes the readable stream. Note that this is against NodeJS conventions.
- `request(url, options)`: fetches the document at `url` and pipes it into `stream`. Returns a promise for the loaded document.
  - Not named `fetch`, to avoid a name collision with the official `fetch` API.
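The callback-based stream interface described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the proposed implementation: `sniffBom` and `decodeStream` are hypothetical names, the sniffing only looks at byte-order marks (a real sniffer would also consider `<meta charset>` and HTTP headers), and the sketch buffers all chunks and hands a decoded string to the callback instead of parsing incrementally into a Cheerio instance.

```typescript
import { Writable } from "node:stream";

// Hypothetical helper: detect the encoding from a byte-order mark only.
// Returns the encoding and how many BOM bytes to skip.
function sniffBom(buf: Buffer): { encoding: BufferEncoding; offset: number } {
  if (buf.length >= 3 && buf[0] === 0xef && buf[1] === 0xbb && buf[2] === 0xbf) {
    return { encoding: "utf8", offset: 3 };
  }
  if (buf.length >= 2 && buf[0] === 0xff && buf[1] === 0xfe) {
    return { encoding: "utf16le", offset: 2 };
  }
  // Assumption for this sketch: fall back to UTF-8 when no BOM is present.
  return { encoding: "utf8", offset: 0 };
}

// Hypothetical stand-in for the proposed `stream(cb)`: returns a writable
// stream and calls `cb` with the decoded document once the stream has ended.
function decodeStream(cb: (err: Error | null, html?: string) => void): Writable {
  const chunks: Buffer[] = [];
  return new Writable({
    write(chunk: Buffer, _encoding, done) {
      // Collect raw bytes; decoding is deferred until the encoding is known.
      chunks.push(chunk);
      done();
    },
    final(done) {
      const all = Buffer.concat(chunks);
      const { encoding, offset } = sniffBom(all);
      cb(null, all.subarray(offset).toString(encoding));
      done();
    },
  });
}
```

A caller would write or pipe bytes into the returned stream, for instance `sink.write(...)` followed by `sink.end(...)`, and receive the decoded text in the callback. Wrapping the callback in a `Promise` would give the awaitable variant mentioned in the list.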
For me, the big open question is how much of this we can bring to other platforms as well. For example, Deno users will no doubt have similar requirements.