URL validator does not permit underscores in the domain #1351

Open
1 of 3 tasks
richard-jones opened this issue Feb 11, 2021 · 7 comments


@richard-jones

What kind of issue is this? (put 'x' between the square brackets)

I am having problems validating URLs that contain an underscore in the domain: Parsley reports them as invalid.

I have checked the regex Parsley uses: https://github.com/guillaumepotier/Parsley.js/blob/master/src/parsley/validator_registry.js#L53

It does appear that it disallows an underscore.

I went on to try to find out what the deal with underscores in URLs is, and I have not been able to find an absolutely clear answer:

  • RFC1738 says they are not allowed
  • The URI spec says that they are allowed
  • In various places I have read that hostnames do not allow underscores but domain names do.

The conclusion is that it's a mess, and I genuinely have no idea if they are valid, after spending most of a day trying to find out.
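The hostname versus domain name distinction above can be illustrated with a small sketch. It assumes the "letters, digits, hyphen" (LDH) rule for hostnames from RFC 952 and RFC 1123; the helper names `isLdhLabel` and `isLdhHostname` are made up for this example and are not part of Parsley.

```javascript
// Sketch of the LDH hostname rule (RFC 952 / RFC 1123).
// isLdhLabel and isLdhHostname are hypothetical helpers, not Parsley APIs.
function isLdhLabel(label) {
  // 1-63 characters, alphanumeric at both ends, hyphens only in the middle.
  return /^[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?$/i.test(label);
}

function isLdhHostname(hostname) {
  // Every dot-separated label must satisfy the LDH rule.
  return hostname.length <= 253 && hostname.split('.').every(isLdhLabel);
}

console.log(isLdhHostname('imukeng.pnzgu.ru'));   // true
console.log(isLdhHostname('imuk_eng.pnzgu.ru'));  // false: "_" is outside the LDH set
console.log(isLdhHostname('_dmarc.example.com')); // false, although such names exist in DNS
```

Names such as `_dmarc.example.com` are in everyday use for DNS records, which is probably where the "domain names allow underscores, hostnames do not" statements come from.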

But, there is a more practical issue, which is that nonetheless URLs with underscores do exist, and my browser will resolve them, so that seems like a fairly good test that they are allowed, if not in spec, then at least "in the wild". Here's an example of such a URL: https://imuk_eng.pnzgu.ru/

So, I need to get my form to permit them, which it currently doesn't.

The best outcome would be for Parsley's type=url validator to allow underscores in URLs, though I don't know how feasible that is.

The second-best answer is that I could override the Parsley URL regex with my own weakened version which allows underscores. I haven't found a clean way to do that yet, short of removing the type validator and replacing it with my own, which I don't want to do, as I think I'd need to copy a lot of Parsley internal code.
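For illustration, a weakened check of this kind might look like the sketch below. The regex is an assumption written for this example, not the one Parsley ships, and the names `LABEL`, `LENIENT_URL`, `isLenientUrl` and `lenienturl` are hypothetical. It is registered through Parsley's `addValidator` as a separate validator, so it does not replace the built-in `type="url"` check.

```javascript
// Hypothetical lenient URL check: host labels may contain "_".
// Illustrative only; this is not Parsley's built-in URL regex.
const LABEL = '[a-z0-9_](?:[a-z0-9_-]{0,61}[a-z0-9_])?';
const LENIENT_URL = new RegExp(
  '^https?://' +            // scheme (http or https only in this sketch)
  '(?:' + LABEL + '\\.)+' + // one or more labels followed by a dot
  LABEL +                   // final label
  '(?::\\d{1,5})?' +        // optional port
  '(?:[/?#]\\S*)?$',        // optional path, query or fragment
  'i'
);

function isLenientUrl(value) {
  return LENIENT_URL.test(value);
}

// Register only when Parsley is loaded (i.e. in the browser).
// "lenienturl" is an assumed validator name for this sketch.
if (typeof window !== 'undefined' && window.Parsley) {
  window.Parsley.addValidator('lenienturl', {
    validateString: function (value) {
      return isLenientUrl(value);
    },
    messages: { en: 'This value should be a valid URL.' }
  });
}

console.log(isLenientUrl('https://imuk_eng.pnzgu.ru/')); // true
console.log(isLenientUrl('not a url'));                  // false
```

A field would opt in with a `data-parsley-lenienturl` attribute; whether that helps depends on being able to keep the built-in URL validator from also running on the same field.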

Is there a way to replace https://github.com/guillaumepotier/Parsley.js/blob/master/src/parsley/validator_registry.js#L31 at runtime?

Thanks!

@marcandre
Collaborator

> Here's an example of such a URL: https://imuk_eng.pnzgu.ru/

I note that https://imukeng.pnzgu.ru/ works too.

> Is there a way to replace https://github.com/guillaumepotier/Parsley.js/blob/master/src/parsley/validator_registry.js#L31 at runtime?

Not right now, no. These regexes could be stored in a global and made accessible though.

@marcandre
Collaborator

FWIW, I am not convinced to change the regex in Parsley.

@richard-jones
Author

> These regexes could be stored in a global and made accessible though.

That would be great. Is that something you would add to your to-do list, or would you like me to provide a PR? I can try the latter, though I'm not sure I would do it the way you would like, as I don't have much experience with the internals of Parsley.

> FWIW, I am not convinced to change the regex in Parsley.

If we can override the regex ourselves, that becomes much less of an issue.

One other observation: the HTML type=url attribute triggers Parsley to bind the URL validation to the element. This is fine, except that type=url then becomes semantically overloaded, as it is also used in the browser as an accessibility hint. As a result, I cannot remove it from the element and maintain my accessibility level, and I also cannot keep it if I want to use Parsley and validate the URLs the way I want.

One way forward would be to break the tight coupling between the type attribute and the validator, or at least allow Parsley to be configured to break that coupling on demand. That would allow me to use a simple Parsley pattern validator for the URL form that I want to use.

@richard-jones
Author

> One other observation: the HTML type=url attribute triggers Parsley to bind the URL validation to the element. This is fine, except that type=url then becomes semantically overloaded, as it is also used in the browser as an accessibility hint. As a result, I cannot remove it from the element and maintain my accessibility level, and I also cannot keep it if I want to use Parsley and validate the URLs the way I want.

For completeness, I tested what happens when I submit a form without Parsley but with type=url, to see whether the browser's own validation rejects the URL with underscores. It does not: the form submitted fine.
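That result is consistent with how the WHATWG `URL` parser treats underscores in a host. The sketch below assumes an environment with the standard `URL` constructor (browsers, Node.js); `parsesAsAbsoluteUrl` is a hypothetical helper name, and a browser's type=url check may differ in detail from this.

```javascript
// Rough approximation of "is this an absolute URL?" using the standard URL parser.
// parsesAsAbsoluteUrl is a hypothetical helper, not a browser or Parsley API.
function parsesAsAbsoluteUrl(value) {
  try {
    new URL(value); // throws a TypeError when the value cannot be parsed
    return true;
  } catch (err) {
    return false;
  }
}

console.log(parsesAsAbsoluteUrl('https://imuk_eng.pnzgu.ru/')); // true
console.log(new URL('https://imuk_eng.pnzgu.ru/').hostname);    // "imuk_eng.pnzgu.ru"
console.log(parsesAsAbsoluteUrl('imuk_eng.pnzgu.ru'));          // false: no scheme
```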

@marcandre
Collaborator

You can use type="url" or data-parsley-type="url", your choice.

I can make the changes, and that should be easy. My issue is that I need to fix the build first, and that can take more time...

@richard-jones
Author

> You can use type="url" or data-parsley-type="url", your choice.

My issue is that my UX requirements tell me I must use type=url, but I can't prevent Parsley from applying its URL validation. Hm, or would setting data-parsley-type to an unexpected value disable the URL validation? I may experiment.

> I can make the changes, should be easy, my issue is that I need to fix the build and that can take more time...

No problem, I fully understand the challenges and time demands of managing an open source library, and I really appreciate your time on this.

@twiggy

twiggy commented Jul 14, 2021

This probably belongs in a separate ticket, but underscores are not allowed in email domains and I think Parsley accepts them as valid email addresses. The HTML5 type="email" validator will complain if an _ is part of a domain. It seems the only people who would use an _ in the domain are phishers or RFC nerds. We have been using Parsley for 10 years and this is the first time I've had someone put an _ in their email. My backend Java code blew up; I think it uses Apache Commons behind the scenes, though that could have a bug as well. Honestly, even if _ is valid, I'd say from a security perspective we should make the end user explicitly state that they are okay with it.
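For reference, the HTML type="email" behaviour mentioned here can be reproduced with the pattern the HTML Standard gives for a valid email address. The sketch below copies that pattern; `HTML_EMAIL` and `isHtmlValidEmail` are names made up for this example, and none of this reflects what Parsley's own email validator does.

```javascript
// Pattern from the HTML Standard's definition of a valid email address.
// HTML_EMAIL and isHtmlValidEmail are hypothetical names for this sketch.
const HTML_EMAIL = /^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/;

function isHtmlValidEmail(value) {
  return HTML_EMAIL.test(value);
}

console.log(isHtmlValidEmail('user@example.com'));       // true
console.log(isHtmlValidEmail('first_last@example.com')); // true: "_" is fine before the "@"
console.log(isHtmlValidEmail('user@exa_mple.com'));      // false: "_" in the domain
```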


3 participants