Email validator does not allow double hyphen #283

dAnjou · 2017-03-17T17:33:29Z

lorem@i--ipsum.com is a valid email address, but the current email validator rejects it.

Here's a working email validator that I wrote with code that I stole from Django:

import re

from colander import Invalid


class Email(object):
    """
    Email validator shamelessly stolen from Django
    """
    USER_REGEX = (
        r"(^[-!#$%&'*+/=?^_`{}|~0-9A-Z]+(\.[-!#$%&'*+/=?^_`{}|~0-9A-Z]+)*\Z"  # dot-atom
        r'|^"([\001-\010\013\014\016-\037!#-\[\]-\177]|\\[\001-\011\013\014\016-\177])*"\Z)'  # quoted-string
    )
    # max length for domain name labels is 63 characters per RFC 1034
    DOMAIN_REGEX = r'((?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+)(?:[A-Z0-9-]{2,63}(?<!-))\Z'

    def __init__(self, msg=None):
        if msg is None:
            msg = "Invalid email address"
        self.msg = msg
        self.user_regex = re.compile(self.USER_REGEX, re.IGNORECASE)
        self.domain_regex = re.compile(self.DOMAIN_REGEX, re.IGNORECASE)

    def __call__(self, node, value):
        if not value or '@' not in value:
            raise Invalid(node, self.msg)

        user_part, domain_part = value.rsplit('@', 1)

        if not self.user_regex.match(user_part):
            raise Invalid(node, self.msg)

        if not self.domain_regex.match(domain_part):
            # Try for possible IDN domain-part
            try:
                domain_part = domain_part.encode('idna').decode('ascii')
                if self.domain_regex.match(domain_part):
                    return
            except UnicodeError:
                pass
            raise Invalid(node, self.msg)
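
The domain branch above can be exercised without Colander. The following is a minimal sketch, assuming Python 3 and its built-in `idna` codec; `domain_is_valid` is a hypothetical helper introduced only for illustration and is not part of the posted validator. It shows that the domain pattern accepts a double hyphen and that a non-ASCII domain only passes after the IDNA fallback.

```python
import re

# Domain pattern copied from the validator above.
DOMAIN_REGEX = re.compile(
    r'((?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+)(?:[A-Z0-9-]{2,63}(?<!-))\Z',
    re.IGNORECASE,
)


def domain_is_valid(domain_part):
    """Hypothetical helper mirroring the domain branch of __call__."""
    if DOMAIN_REGEX.match(domain_part):
        return True
    # Fall back to the ASCII-compatible (punycode) form for IDN domains.
    try:
        ascii_domain = domain_part.encode('idna').decode('ascii')
    except UnicodeError:
        return False
    return bool(DOMAIN_REGEX.match(ascii_domain))


print(domain_is_valid('i--ipsum.com'))         # double hyphen inside a label
print(domain_is_valid('b\u00fccher.example'))  # "bücher.example", needs the IDNA step
print(domain_is_valid('-bad-.com'))            # label starts and ends with a hyphen
```

The first two calls should report a valid domain and the third an invalid one, since labels may contain hyphens but may not start or end with one.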


robertknight · 2017-09-21T08:59:22Z

In fixing this for our project I opted to use an email regex taken from Chrome and based on the HTML spec, on the basis that the pattern is widely used and pretty simple.

Would this be accepted upstream?
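
For reference, here is a rough sketch of what validation with the HTML spec's pattern could look like. The regex is transcribed from the WHATWG HTML spec's definition of a valid e-mail address as I understand it; the thread does not confirm that this is the exact pattern used in the project or later merged into Colander, so check it against the spec before relying on it. `is_valid_email` is a hypothetical name, and `fullmatch` stands in for the spec's `^`/`$` anchors.

```python
import re

# Assumed transcription of the WHATWG HTML spec e-mail pattern,
# with the leading ^ and trailing $ dropped in favour of fullmatch().
WHATWG_EMAIL_REGEX = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)


def is_valid_email(value):
    """Hypothetical helper: True if the whole string matches the pattern."""
    return WHATWG_EMAIL_REGEX.fullmatch(value) is not None


print(is_valid_email('lorem@i--ipsum.com'))  # the address from this issue
print(is_valid_email('lorem@-ipsum.com'))    # label starting with a hyphen
```

The pattern allows hyphens anywhere inside a domain label, so the address from this issue passes, while a label that begins with a hyphen does not.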

digitalresistor · 2017-09-21T21:12:20Z

I'd be willing to accept that regex, since that is what the browser uses for validation.

This is the original case that prompted the ticket/change in #283.

This is no longer necessary since Colander 1.7.0 changed its default email regex to match the one from the WHATWG HTML spec. See:

* 759e0b9
* Pylons/colander#324
* Pylons/colander#283

robertknight mentioned this issue Sep 20, 2017

Sign up form rejects valid international email addresses (Unicode & ACE) hypothesis/h#4662

Closed

digitalresistor mentioned this issue Feb 1, 2019

Fix: email regex #324

Merged

mmerickel closed this as completed in #324 Feb 1, 2019

pyup-bot mentioned this issue Jun 30, 2020

Pin colander to latest version 1.7.0 camptocamp/c2cgeoportal#6618

Closed
