Reduce usage of regex #2644

Merged
merged 11 commits into from Dec 1, 2021
1 change: 1 addition & 0 deletions CHANGES.md
@@ -6,6 +6,7 @@

- Fixed Python 3.10 support on platforms without ProcessPoolExecutor (#2631)
- Fixed `match` statements with open sequence subjects, like `match a, b:` (#2639)
+- Reduce usage of the `regex` dependency (#2644)
Collaborator

Is this worth mentioning in the changelog? It doesn't really affect end users, since we still depend on `regex` unconditionally, so all of the problems that come with that dependency will persist. Not a big deal, but I wanted to flag it.

Collaborator Author

My thinking was that there's some chance it does have an impact on end users, so it's worth mentioning.


## 21.11b1

2 changes: 1 addition & 1 deletion src/black/__init__.py
@@ -10,7 +10,7 @@
import os
from pathlib import Path
from pathspec.patterns.gitwildmatch import GitWildMatchPatternError
-import regex as re
+import re
import signal
import sys
import tokenize
2 changes: 1 addition & 1 deletion src/black/comments.py
@@ -1,7 +1,7 @@
import sys
from dataclasses import dataclass
from functools import lru_cache
-import regex as re
+import re
from typing import Iterator, List, Optional, Union

if sys.version_info >= (3, 8):
4 changes: 2 additions & 2 deletions src/black/strings.py
@@ -2,7 +2,7 @@
Simple formatting on strings. Further string formatting code is in trans.py.
"""

-import regex as re
+import re
import sys
from functools import lru_cache
from typing import List, Pattern
@@ -156,7 +156,7 @@ def normalize_string_prefix(s: str, remove_u_prefix: bool = False) -> str:
# performance on a long list literal of strings by 5-9% since lru_cache's
# caching overhead is much lower.
@lru_cache(maxsize=64)
-def _cached_compile(pattern: str) -> re.Pattern:
+def _cached_compile(pattern: str) -> Pattern[str]:
    return re.compile(pattern)


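The `_cached_compile` change above can be tried in isolation. The following is a minimal sketch using only the standard library; `cached_compile` is a hypothetical stand-in for the private helper in `strings.py`, and `maxsize=64` is taken from the diff. The PR does not say why the annotation changed. One likely reason, offered here as an assumption, is that `typing.Pattern` was importable on every Python version Black supported at the time.

```python
import re
from functools import lru_cache
from typing import Pattern


@lru_cache(maxsize=64)
def cached_compile(pattern: str) -> Pattern[str]:
    """Compile `pattern` once and reuse the compiled object on later calls."""
    return re.compile(pattern)


first = cached_compile(r"\d+")
second = cached_compile(r"\d+")

# The second call is served from the cache, so both names refer to one object.
print(first is second)
print(cached_compile.cache_info())
```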
2 changes: 1 addition & 1 deletion src/black/trans.py
@@ -4,7 +4,7 @@
from abc import ABC, abstractmethod
from collections import defaultdict
from dataclasses import dataclass
-import regex as re
+import regex as re  # We need recursive patterns here (?R)
from typing import (
    Any,
    Callable,
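The comment added in `trans.py` says that file keeps `regex` because it needs recursive patterns. A minimal sketch of the difference follows. `BALANCED` is an illustrative pattern chosen for this note, not the one used in `trans.py`, and the `regex` import is optional because the third-party package may not be installed.

```python
import re

# Illustrative recursive pattern for balanced parentheses;
# (?R) re-enters the whole pattern. Not taken from trans.py.
BALANCED = r"\((?:[^()]|(?R))*\)"

try:
    re.compile(BALANCED)
    stdlib_supports_recursion = True
except re.error as exc:
    # The standard library does not recognise the (?R) extension.
    stdlib_supports_recursion = False
    print(f"stdlib re: {exc}")

try:
    import regex  # third-party package; may not be installed
except ImportError:
    regex = None

if regex is not None:
    # With `regex`, the same pattern compiles and handles nesting.
    print(bool(regex.fullmatch(BALANCED, "(a(b)c)")))
```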
2 changes: 1 addition & 1 deletion src/blib2to3/pgen2/conv.py
@@ -29,7 +29,7 @@
"""

# Python imports
-import regex as re
+import re

# Local imports
from pgen2 import grammar, token
4 changes: 2 additions & 2 deletions src/blib2to3/pgen2/tokenize.py
@@ -52,7 +52,7 @@
__author__ = "Ka-Ping Yee <ping@lfw.org>"
__credits__ = "GvR, ESR, Tim Peters, Thomas Wouters, Fred Drake, Skip Montanaro"

-import regex as re
+import re
from codecs import BOM_UTF8, lookup
from blib2to3.pgen2.token import *

@@ -86,7 +86,7 @@ def _combinations(*l):
Comment = r"#[^\r\n]*"
Ignore = Whitespace + any(r"\\\r?\n" + Whitespace) + maybe(Comment)
Name = (  # this is invalid but it's fine because Name comes after Number in all groups
-    r"\w+"
+    r"[^\s#\(\)\[\]\{\}+\-*/!@$%^&=|;:'\",\.<>/?`~\\]+"
Collaborator

At this point we might need to add an FAQ entry describing why Black is so inconsistent at detecting invalid syntax. We don't promise that Black will fail on all invalid code, but people do reasonably assume consistency. We don't need to get into the nitty-gritty; we can simply explain that this approach requires less work while achieving a high degree of compatibility.

Collaborator Author

Yes, I can add that separately.
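To make the trade-off in the new `Name` alternative concrete, here is a small sketch that compares the old and new patterns under the standard library's `re`. Both pattern strings are copied from the diff; the sample inputs are assumptions picked for illustration, and the PR itself does not state which inputs motivated the change.

```python
import re

# Old and new alternatives for `Name`, copied from the diff.
NAME_OLD = r"\w+"
NAME_NEW = r"[^\s#\(\)\[\]\{\}+\-*/!@$%^&=|;:'\",\.<>/?`~\\]+"

# "e" followed by U+0301 COMBINING ACUTE ACCENT (a decomposed "é").
decomposed = "e\u0301"

samples = [
    ("ascii", "name"),
    ("combining mark", decomposed),
    ("leading digit", "1abc"),
    ("operator inside", "a$b"),
]

for label, text in samples:
    old = re.fullmatch(NAME_OLD, text) is not None
    new = re.fullmatch(NAME_NEW, text) is not None
    print(f"{label}: isidentifier={text.isidentifier()} old={old} new={new}")
```

The negated character class accepts any run of characters that are neither whitespace nor operator or delimiter characters. It is therefore looser than Python's identifier rules, which fits the "this is invalid but it's fine" comment in the diff.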

)

Binnumber = r"0[bB]_?[01]+(?:_[01]+)*"
4 changes: 2 additions & 2 deletions tests/test_black.py
@@ -31,7 +31,7 @@

import click
import pytest
-import regex as re
+import re
from click import unstyle
from click.testing import CliRunner
from pathspec import PathSpec
@@ -70,7 +70,7 @@
R = TypeVar("R")

# Match the time output in a diff, but nothing else
-DIFF_TIME = re.compile(r"\t[\d-:+\. ]+")
+DIFF_TIME = re.compile(r"\t[\d\-:+\. ]+")
Collaborator

Nice catch!
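A short sketch of why the hyphen in `DIFF_TIME` needs escaping once the test module uses the standard library's `re`: a `-` placed directly after `\d` inside a character class is parsed as the start of a range, which `re` rejects. The header line below is a hypothetical example, and its timestamp format is an assumption.

```python
import re

OLD = r"\t[\d-:+\. ]+"   # hyphen directly after \d
NEW = r"\t[\d\-:+\. ]+"  # hyphen escaped, so it is a literal

try:
    re.compile(OLD)
    old_compiles = True
except re.error as exc:
    old_compiles = False
    print(f"stdlib re rejects the old pattern: {exc}")

DIFF_TIME = re.compile(NEW)

# Hypothetical diff header line; the timestamp format is an assumption.
header = "--- src/example.py\t2021-12-01 10:00:00.000000 +0000"
print(DIFF_TIME.sub("", header))
```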



@contextmanager