Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow String objects as source #329

Open
piranna opened this issue Oct 2, 2021 · 4 comments
Open

Allow String objects as source #329

piranna opened this issue Oct 2, 2021 · 4 comments
Labels
documentation Improvements or additions to documentation help-wanted External contributions welcome question Contains open questions that need to be answered spec:HTML spec:IDL types Anything regarding Typescript

Comments

@piranna
Copy link

piranna commented Oct 2, 2021

At

if (source && typeof source === 'string') {
is being checked that source is a string literal, but the spec dictates is a DOMString object, that in Javascript, it maps directly to the corresponding String class. So, String objects, should be allowed, but instead they are directly rejected.

Here we have two options, call to .toString() to all values provided in the source argument, or check that source is a string literal or a String object.

@karfau karfau added bug Something isn't working documentation Improvements or additions to documentation good first issue Good for newcomers help-wanted External contributions welcome types Anything regarding Typescript spec:DOM-Level-2 https://www.w3.org/TR/DOM-Level-2-Core/ labels Oct 2, 2021
@karfau karfau added this to the 0.8.x milestone Oct 2, 2021
@karfau karfau added spec:HTML spec:IDL and removed spec:DOM-Level-2 https://www.w3.org/TR/DOM-Level-2-Core/ bug Something isn't working labels Oct 2, 2021
@karfau karfau modified the milestones: 0.8.x, planning 1.0.0 Oct 2, 2021
@karfau karfau added question Contains open questions that need to be answered and removed good first issue Good for newcomers labels Oct 2, 2021
@karfau
Copy link
Member

karfau commented Oct 2, 2021

I double checked and we are not talking about strings created using the String function, but by using the String constructor like this:

var S = new String('hello');

Just for later reference:

  • typeof S // object
  • S instanceof String //true
  • S.valueOf() // 'hello' (primitive)
  • String(S) // 'hello' (primitive)

After reading through the specs a bit more I have my doubts.
Here is what MDN says:

A DOMString is a sequence of 16-bit unsigned integers, typically interpreted as UTF-16 code units. This corresponds exactly to the JavaScript primitive String type.

✔️

When a DOMString is provided to JavaScript, it maps directly to the corresponding String.

❓ What does "is provided to Javascript" mean? There seems to be some transition from values outside of javascript?

When a Web API accepts a DOMString, the value provided will be stringified, using the ToString abstract operation. (For types other than Symbol, this has the same behavior as the String() function.)

If I understand this correctly any API accepting DOMString in the spec should accept all the possible values listed in the ToString specification.
Which we could implement using String for all input other than Symbols.
If we take this seriously it will affect the whole exposed API, not just new DOMParser().parseFromString().
It would be an API much more similar to "native automatic value conversion" in non strict javascript, since we would basically allow "any" input value. There are even specific APIs that convert null to an empty string.

I also checked what browsers are doing:
image
So there is even a different in handling XML and HTML...

Even the types typescript provides for the same API you reference only allow the primitive string type.

interface DOMParser {
    parseFromString(str: string, type: SupportedType): Document;
}

I must say that I personally prefer a more clearly typed API that only allows strings as input (explicitly stating when it allows null or undefined).

With to goal to still allow users to use our lib as described in the spec, it could make sense to implement and expose a IDLToString method, but I would not want to change our whole API to implicitly always call it.

@piranna What do you think?

@piranna
Copy link
Author

piranna commented Oct 3, 2021

Good analisys, it made me doubt and double check myself too :-)

A DOMString is a sequence of 16-bit unsigned integers, typically interpreted as UTF-16 code units. This corresponds exactly to the JavaScript primitive String type.

When a DOMString is provided to JavaScript, it maps directly to the corresponding String.

I find this two quotes from the docs conflicting, since there's no String primitive, but instead string one, String is an object class. Can you point me to the first one so we can propose to fix the typo?

What does "is provided to Javascript" mean? There seems to be some transition from values outside of javascript?

That's my interpretation. MDN documentation is confusing on this point, but seems that when a DOMString is provided to Javascript from the platform (C++, Rust...), they are directly mapped to Javscript String objects, not string literals.

If I understand this correctly any API accepting DOMString in the spec should accept all the possible values listed in the ToString specification.

Yes... but it's as simple as calling to .toString() method on the input data to sanitize it, that's what I'm proposing on this issue :-) This would also works for Symbols, so there's no need to re-implement the ToString protocol as you propose.

It would be an API much more similar to "native automatic value conversion" in non strict javascript, since we would basically allow "any" input value.

Yes, maybe... but seems it's how the spec dictates it must work (at least in Chrome):

Captura de pantalla -2021-10-03 09-02-18

I must say that I personally prefer a more clearly typed API that only allows strings as input (explicitly stating when it allows null or undefined).

With to goal to still allow users to use our lib as described in the spec, it could make sense to implement and expose a IDLToString method, but I would not want to change our whole API to implicitly always call it.

@piranna What do you think?

My opinion is to stick to the spec and not add non-standard APIs or methods. That means:

  1. change the Typescript types to allow any as spec behaviour works, or left them as string as they currently are to provide a help to Typescript only developers
  2. calls to .toString() method internally on all input data to sanitize it, as it seems the Javascript spec works

@karfau
Copy link
Member

karfau commented Oct 3, 2021

I will have to think about it.
Still not convinced that we should allow any as input for all those APIs and call .toString() on all the input.

@piranna
Copy link
Author

piranna commented Oct 3, 2021

Still not convinced that we should allow any as input for all those APIs and call .toString() on all the input.

As I said, the other alternative is to have it typed as string, since this will be endorsed only by Typescript, and call .toString() so it works properly with vanilla Javascript.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation help-wanted External contributions welcome question Contains open questions that need to be answered spec:HTML spec:IDL types Anything regarding Typescript
Projects
None yet
Development

No branches or pull requests

2 participants