Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/50 fix gitlab pathmatch 3 0 0 #52

Closed
wants to merge 10 commits into from

Conversation

mikemimik
Copy link

Closes #50

Related to #51

@mikemimik mikemimik added the bug label Sep 28, 2019
@mikemimik mikemimik requested a review from a team September 28, 2019 06:07
@mikemimik mikemimik added this to In progress in Release - v6.12.0 via automation Sep 28, 2019
@@ -64,6 +64,7 @@ function fromUrl (giturl, opts) {
var pathmatch = gitHostInfo.pathmatch
var matched = parsed.path.match(pathmatch)
if (!matched) return
/* istanbul ignore else */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be tested with a url where the host matches, but the path isn't a match? Like https://github.com/isaacs?

Possibly I'm misreading this bit, and it is actually just a safety guard that never hits. But in that case, it'd be preferable to make that explicit.

/*istanbul ignore if */
if (matched[1] === null || matched[1] === undefined) {
  throw new Error('this should be impossible')
}

@isaacs
Copy link
Contributor

isaacs commented Sep 28, 2019

Generally looks good to me. I'm usually not a huge fan of programmatically generated tests, but in this case, I can see how these would get hella repetitive (and already had some generated tests anyway).

git-host-info.js Outdated
@@ -25,7 +25,7 @@ var gitHosts = module.exports = {
'bugstemplate': 'https://{domain}/{user}/{project}/issues',
'httpstemplate': 'git+https://{auth@}{domain}/{user}/{projectPath}.git{#committish}',
'tarballtemplate': 'https://{domain}/{user}/{project}/repository/archive.tar.gz?ref={committish}',
'pathmatch': /^[/]([^/]+)[/](.+?)(?:[.]git|[/])?$/
'pathmatch': /^[/]([^/]+)[/](?!.*\/-\/)(.+?)(?:[.]git|[/])?$/
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will also incorrectly match URLs like https://gitlab.com/user/foo/bar/baz/repository/archive.tar.gz

@isaacs
Copy link
Contributor

isaacs commented Oct 1, 2019

It looks like these two suffixes also need to be excluded:

\/repository\/archive.tar.gz(\?ref=...)?$
\/repository\/[^\/]\/archive.tar.gz$

@isaacs
Copy link
Contributor

isaacs commented Oct 1, 2019

Seems like this pathmatch change should do the right thing:

diff --git a/git-host-info.js b/git-host-info.js
index 4dec801..04d5f63 100644
--- a/git-host-info.js
+++ b/git-host-info.js
@@ -25,7 +25,7 @@ var gitHosts = module.exports = {
     'bugstemplate': 'https://{domain}/{user}/{project}/issues',
     'httpstemplate': 'git+https://{auth@}{domain}/{user}/{projectPath}.git{#committish}',
     'tarballtemplate': 'https://{domain}/{user}/{project}/repository/archive.tar.gz?ref={committish}',
-    'pathmatch': /^[/]([^/]+)[/](?!.*\/-\/)(.+?)(?:[.]git|[/])?$/
+    'pathmatch': /^[/]([^/]+)[/](?!.*(\/-\/|\/repository\/archive\.tar\.gz|\/repository\/[^\/]+\/archive.tar.gz))(.+)(?:[.]git|[/])?$/
   },
   gist: {
     'protocols': [ 'git', 'git+ssh', 'git+https', 'ssh', 'https' ],

@lpinca
Copy link

lpinca commented Oct 1, 2019

Not sure if allowed by GitLab but that regex will incorrectly match something like https://gitlab.com/user/repository/archive.tar.gz/projname. Anyway it is probably an edge case not worth fixing.

@isaacs
Copy link
Contributor

isaacs commented Oct 1, 2019

Can a repo be just a single path part on GitLab? I thought that they were always at least 2.

@lpinca
Copy link

lpinca commented Oct 1, 2019

I think so if you mean something like https://gitlab.com/user/group/subgroup/repo.

@isaacs
Copy link
Contributor

isaacs commented Oct 1, 2019

Right, so you could have a user named user, then a group named repository, then a subgroup named archive.tar.gz and a project named foo. I think that's possibly valid, unless GitLab doesn't allow repository or archive.tar.gz as group names (which would be a good idea!)

But the requirement here isn't to match every and only valid gitlab URLs. Just to match all the ones that are valid and likely to be used. If we treat an invalid gitlab url as valid, and then it blows up, that's fine, because it was never going to work anyway. As long as we're not trying to download a valid tarball url but using git, making it fail.

@lpinca
Copy link

lpinca commented Oct 1, 2019

As long as we're not trying to download a valid tarball url but using git, making it fail.

Yes that's my point. https://gitlab.com/user/repository/archive.tar.gz/foo would be treated as tarball url instead of a git one no?

But the requirement here isn't to match every and only valid gitlab URLs. Just to match all the ones that are valid and likely to be used.

Agreed.

@isaacs
Copy link
Contributor

isaacs commented Oct 1, 2019

Yes that's my point. https://gitlab.com/user/repository/archive.tar.gz/foo would be treated as tarball url instead of a git one no?

Oh, I see. Hm. This seems to do the trick:

/^[/]([^/]+)[/](?!.*(\/-\/|\/repository\/archive\.tar\.gz\?=.*|\/repository\/[^\/]+\/archive.tar.gz$))(.+)(?:[.]git|[/])?$/

Wonder if it might be too limiting to have it anchored at the end, though.

isaacs pushed a commit that referenced this pull request Oct 1, 2019
@isaacs isaacs closed this in e3e3054 Oct 7, 2019
Release - v6.12.0 automation moved this from In progress to Done Oct 7, 2019
@wraithgar wraithgar deleted the feature/50_fix-gitlab-pathmatch-3-0-0 branch September 21, 2021 14:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

Pathmatch for GitLab treats tarball archives as git repos
4 participants