Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: reduce some false positive matches #23

Merged

Conversation

antongolub
Copy link
Contributor

@antongolub antongolub commented May 25, 2022

  • fix: handle some false positive matches
    * feat: introduce SEMVER_REGEX const
  • test: check what exactly is captured by regex
  • test: verify test cases via semver
  • chore: up deps, apply xo

closes #21

@antongolub
Copy link
Contributor Author

@sindresorhus, rfr.

@sindresorhus
Copy link
Owner

I think it's important to ensure we don't introduce another ReDoS vulnerability with these changes. I have already had a few: https://github.com/sindresorhus/semver-regex/commits/main

https://www.google.com/search?client=safari&rls=en&q=redos+check&ie=UTF-8&oe=UTF-8

@antongolub
Copy link
Contributor Author

antongolub commented May 25, 2022

Reasonable. https://devina.io/redos-checker, the results are almost identical:

Before
image

After
image

@antongolub antongolub force-pushed the fix-false-positive-matches branch from 4763bce to 9ac980b Compare May 25, 2022 18:48
@antongolub
Copy link
Contributor Author

@sindresorhus, can you trigger tests again?

@antongolub
Copy link
Contributor Author

antongolub commented May 26, 2022

Mb add named semverRegex export instead? We'd like to avoid default exports in our codebase.

export function semverRegex() {
	return /(?<=^v?|\sv?)(?:(?:0|[1-9]\d{0,9}?)\.){2}(?:0|[1-9]\d{0,9})(?:-(?:0|[1-9]\d*|[\da-z-]*?[a-z-][\da-z-]*?)){0,100}?(?=$| |\+|\.)(?:(?<=-\S+)(?:\.(?:[\da-z-]*?[a-z-][\da-z-]*|0|[1-9]\d*)){1,100}?)?(?!\.)(?:\+(?:[\da-z-]+(?:\.[\da-z-]+)*){1,100}?(?!\w))?(?!\+)/gi;
}

export default semverRegex;

@antongolub
Copy link
Contributor Author

antongolub commented May 27, 2022

I have to admit, that I have no idea how to implement trully redos-safe checker: we have a fairly wide range of chunk combinations and the semver spec does not impose their length restrictions.

@antongolub
Copy link
Contributor Author

@sindresorhus,

Anyway, is any chance to land this?

@sindresorhus
Copy link
Owner

we have a fairly wide range of chunk combinations and the semver spec does not impose their length restrictions.

But we can restrict them to some realistic values.

@sindresorhus
Copy link
Owner

Anyway, is any chance to land this?

To be honest, I don't know where to start reviewing. The regex is just way too complicated now. Maybe if you could split up the regex into manageable parts. Like the main version component, prerelease, and build.

If you need it right away, I would suggest just doing:

npm install 'antongolub/semver-regex#fix-false-positive-matches'

@antongolub
Copy link
Contributor Author

antongolub commented May 29, 2022

We have dozens packages depending on semver-regex, so I'd like to provide the fix here. The value of OSS is not in forks, but in improving existing tools (IMO)

The regex is just way too complicated now.

Like email regex, url regex, strict iso8601, etc. They are all complicated because the formats they describe are complicated too. I believe someday robots will write regexes (https://github.com/MaLeLabTs/RegexGenerator)

In general, the patch looks optimistic: the old tests work and the new ones also pass. Maybe we just need to extnd redos tests.

If the new regex completely covers the spec, perhaps this will be its last change)

@antongolub
Copy link
Contributor Author

Leaving things as they are seems wrong:
image
This misbehavior has been here for so long time and probably has a bunch of side effects, that the fix will have to be published as 5.0.0.

@antongolub
Copy link
Contributor Author

antongolub commented May 29, 2022

@sindresorhus,

https://devina.io/redos-checker

/(?<=^v?|\sv?)(?:(?:0|[1-9]\d{0,9}?)\.){2}(?:0|[1-9]\d{0,9})(?:-(?:--?|0|[1-9]\d*|\d*[a-z]+\d*)){0,100}(?=$| |\+|\.)(?:(?<=-\S+)(?:\.(?:--?|[\da-z-]*[a-z-]\d*|0|[1-9]\d*)){1,100}?)?(?!\.)(?:\+(?:[\da-z]\.?-?){1,100}?(?!\w))?(?!\+)/gi

image

@antongolub antongolub force-pushed the fix-false-positive-matches branch from de5b63d to 92d7791 Compare May 29, 2022 14:44
@antongolub
Copy link
Contributor Author

antongolub commented May 30, 2022

npm install 'antongolub/semver-regex#fix-false-positive-matches'

@sindresorhus, let's publish this as beta, so we could override the current version via yarn resolutions.

@antongolub
Copy link
Contributor Author

The execution time is strongly related to the hardware. I could just up the threshold, but I have no idea where the optimum lies.

@antongolub antongolub requested a review from sindresorhus June 1, 2022 20:20
@sindresorhus
Copy link
Owner

CI is failing

@antongolub
Copy link
Contributor Author

So close)

Execution time: 20
t.true(difference < 20, `Execution time: ${difference}`);

@antongolub
Copy link
Contributor Author

✖ invalid version does not cause catatrophic backtracking Execution time: 26
t.true(difference < 25, Execution time: ${difference});

@antongolub
Copy link
Contributor Author

antongolub commented Jun 2, 2022

I'll set the timeout to 50.

@antongolub
Copy link
Contributor Author

@sindresorhus, your turn

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
@sindresorhus
Copy link
Owner

Mb add named semverRegex export instead? We'd like to avoid default exports in our codebase.

I prefer using default export when it makes sense and it makes no sense to export both a default and named export with the same thing.

@sindresorhus sindresorhus merged commit e93d9c8 into sindresorhus:main Jun 5, 2022
@sindresorhus
Copy link
Owner

Thanks :)

@sindresorhus
Copy link
Owner

https://github.com/sindresorhus/semver-regex/releases/tag/v4.0.4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Semantic version tests are not being passed
2 participants