Email validator does not allow double hyphen #283

Closed
dAnjou opened this issue Mar 17, 2017 · 2 comments

Comments

@dAnjou

dAnjou commented Mar 17, 2017

lorem@i--ipsum.com is a valid email address, but the current email validator rejects it.

Here's a working email validator I wrote, using code borrowed from Django:

import re

from colander import Invalid


class Email(object):
    """
    Email validator shamelessly stolen from Django
    """
    USER_REGEX = (
        r"(^[-!#$%&'*+/=?^_`{}|~0-9A-Z]+(\.[-!#$%&'*+/=?^_`{}|~0-9A-Z]+)*\Z"  # dot-atom
        r'|^"([\001-\010\013\014\016-\037!#-\[\]-\177]|\\[\001-\011\013\014\016-\177])*"\Z)'  # quoted-string
    )
    # max length for domain name labels is 63 characters per RFC 1034
    DOMAIN_REGEX = r'((?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+)(?:[A-Z0-9-]{2,63}(?<!-))\Z'

    def __init__(self, msg=None):
        if msg is None:
            msg = "Invalid email address"
        self.msg = msg
        self.user_regex = re.compile(self.USER_REGEX, re.IGNORECASE)
        self.domain_regex = re.compile(self.DOMAIN_REGEX, re.IGNORECASE)

    def __call__(self, node, value):
        if not value or '@' not in value:
            raise Invalid(node, self.msg)

        user_part, domain_part = value.rsplit('@', 1)

        if not self.user_regex.match(user_part):
            raise Invalid(node, self.msg)

        if not self.domain_regex.match(domain_part):
            # Try for possible IDN domain-part
            try:
                domain_part = domain_part.encode('idna').decode('ascii')
                if self.domain_regex.match(domain_part):
                    return
            except UnicodeError:
                pass
            raise Invalid(node, self.msg)
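
As a quick check of the domain pattern on its own, here is a minimal sketch that needs only the standard library. `domain_ok` is a hypothetical helper introduced for illustration; the pattern is copied unchanged from the validator above.

```python
import re

# Domain pattern copied from the validator above.
DOMAIN_REGEX = r'((?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+)(?:[A-Z0-9-]{2,63}(?<!-))\Z'
domain_regex = re.compile(DOMAIN_REGEX, re.IGNORECASE)


def domain_ok(domain):
    """Hypothetical helper: True if the domain part matches the pattern."""
    return domain_regex.match(domain) is not None


# Hyphens are allowed inside a label, including two in a row.
print(domain_ok("i--ipsum.com"))  # True
# A label may not start with a hyphen.
print(domain_ok("-ipsum.com"))    # False
```
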

@robertknight

In fixing this for our project, I opted to use an email regex taken from Chrome and based on the HTML spec, on the basis that the pattern is widely used and pretty simple.

Would this be accepted upstream?
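
For reference, a rough sketch of what that could look like in Python. The pattern below is the "valid e-mail address" regex published in the WHATWG HTML spec, which I'm assuming is the one meant here; `is_valid_email` is a hypothetical name, and this is not the code from any actual patch.

```python
import re

# "Valid e-mail address" pattern from the WHATWG HTML spec, written for
# Python's re module. The spec's ^ and $ anchors are replaced by fullmatch().
HTML_EMAIL_REGEX = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)


def is_valid_email(value):
    """Hypothetical helper: True if value matches the HTML spec pattern."""
    return HTML_EMAIL_REGEX.fullmatch(value) is not None


print(is_valid_email("lorem@i--ipsum.com"))  # True
print(is_valid_email("lorem@-ipsum.com"))    # False
```

Note that this pattern is looser than the Django one in some respects: it does not require a dot in the domain, so an address like `a@b` passes.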

@digitalresistor
Member

I'd be willing to accept that regex, since that is what the browser uses for validation.

digitalresistor added a commit that referenced this issue Feb 1, 2019
This is the original case that prompted the ticket/change in #283.
seanh added a commit to hypothesis/h that referenced this issue May 17, 2019
This is no longer necessary since Colander 1.7.0 changed its default
email regex to match the one from the WhatWG HTML spec.

See:

* 759e0b9
* Pylons/colander#324
* Pylons/colander#283
seanh added a commit to hypothesis/h that referenced this issue Jun 4, 2019
This is no longer necessary since Colander 1.7.0 changed its default
email regex to match the one from the WhatWG HTML spec.

See:

* 759e0b9
* Pylons/colander#324
* Pylons/colander#283