Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to letting go get mutate go.mod #3590

Closed
wants to merge 16 commits into from
Closed

Switch to letting go get mutate go.mod #3590

wants to merge 16 commits into from

Conversation

jeffwidman
Copy link
Member

@jeffwidman jeffwidman commented Apr 29, 2021

Previously, we were manually modifying go.mod for the dep we wanted to
bump, then running go get to update go.sum and run the MVS algorithm
to determine if any other changes needed to be made to go.mod.

However, this caused a few problems:

  • the go docs/comments on github issues consistently recommend not
    editing go.mod directly, but instead use go get to update it.
  • We have to copy/paste in private methods from upstream, and it's a
    pain to keep those in sync: Sync copy/pasted private methods with upstream #3580
  • When go get sees that go.mod is in a state that has a version
    which it can't find in go.sum, then it appears (although I'm not
    100% certain) to try to recover by doing a bunch of additional checks to
    ensure it fixes the dep graph correctly. I saw @bcmills mention
    something along these lines in a github issue although I'm afraid I
    can't find the exact reference now. Compare to if go get is run
    against a go.mod that has a matching (tidy'd) go.sum, it appears
    to skip those checks. That seems to be the root cause of Unfixable failure when the dep tree includes a dependency circle and one of those deps at one time referenced a broken module #3526 where
    those checks result in an impossible-to-recover-from-situation. (Side
    note: that problem looks like it will get resolved in go 1.17 via the
    new lazy module loading.)
  • In some cases, we're doing an unecessary write to disk since go.mod
    gets written by us and then re-written by go get. That could be a
    single write.

Instead of doing all this munging, we can simply pass our desired
version to go get -d <dep>@<version> and it will handle updating
go.mod and go.sum.

Experimenting locally, this fixes #3526 which I was previously hitting
in multiple repos. They all work now.

It also lets us do some code cleanup, skip an uneccessary write-to-disk,
make all deeply nested dep trees process much faster, becomes more
idiomatic, and reduces the risk of our logic for updating go.mod from
diverging from upstream.

Previously, we were manually modifying `go.mod` for the dep we wanted to
bump, then running `go get` to update `go.sum` and run the MVS algorithm
to determine if any other changes needed to be made to `go.mod`.

However, this caused a few problems:
* the `go` docs/comments on github issues consistently recommend _not_
editing `go.mod` directly, but instead use `go get` to update it.
* We have to copy/paste in private methods from upstream, and it's a
pain to keep those in sync: #3580
* When `go get` sees that `go.mod` is in a state that has a version
which it can't find in `go.sum`, then it _appears_ (although I'm not
100% certain) to try to recover by doing a bunch of additional checks to
ensure it fixes the dep graph correctly. I saw @bcmills mention
something along these lines in a github issue although I'm afraid I
can't find the exact reference now. Compare to if `go get` is run
against a `go.mod` that has a matching (`tidy`'d) `go.sum`, it appears
to skip those checks. That seems to be the root cause of #3526 where
those checks result in an impossible-to-recover-from-situation. (Side
note: that problem looks like it will get resolved in `go 1.17` via the
new [lazy module loading](http://golang.org/design/36460-lazy-module-loading).)
* In some cases, we're doing an unecessary write to disk since `go.mod`
gets written by us and then re-written by `go get`. That could be a
single write.

Instead of doing all this munging, we can simply pass our desired
version to `go get -d <dep>@<version>` and it will handle updating
`go.mod` and `go.sum`.

Experimenting locally, this fixes #3526 which I was previously hitting
in multiple repos. They all work now.

It also lets us do some code cleanup, skip an uneccessary write-to-disk,
become more idiomatic, and reduce the risk of our logic for updating
`go.mod` from diverging from upstream.
@jeffwidman jeffwidman requested a review from a team as a code owner April 29, 2021 07:45
@jeffwidman
Copy link
Member Author

I realize tests are failing, but before spending any time on those, can I get a yes/no on the overall direction of this PR?

@jurre
Copy link
Member

jurre commented Apr 29, 2021

In general I think the approach is an improvement 👍 , I'm not entirely sure if there is a technical reason that Dependabot manually updates the go.mod file though 🤔, the test-failures seem like they are just errors that manifest slightly differently now, but they may reveal the true reason behind the way things are?

Copy link
Contributor

@thepwagner thepwagner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of reducing the scope of these helpers, and I respect any non-rubyist crawling through this code ❤️ .

My concerns are around things the go_modules plugin doesn't do correctly today (module replacements, ignored versions), and wondering if this change makes that harder/easier in the future.

In particular with the recent-ish focus on ignoring versions, the latter may be improved shortly...

@jeffwidman
Copy link
Member Author

jeffwidman commented Apr 29, 2021

I'm not entirely sure if there is a technical reason that Dependabot manually updates the go.mod file though

I also wondered, and before I put up this PR I spent two hours spelunking through all the old commits to this folder. I never saw commit messages indicating it was written this way to workaround specific issues/bugs. Instead, I suspect the following all played a role:

  • this code first landed as a port of dep.
  • the go get/go mod commands were pretty immature at the time of the port.
  • the idiomatic ways to manage modules weren't very clear in the community, so the author sometimes just had to hack his way through it leading to some surprising code.
  • this pipeline originally did not git clone the repo, which I suspect resulted in more of a mindset of "surgically modify specific lines in specific files to replicate go".

Again, I'm just spitballin' here, so may be wrong.

This is tricky.
*
aa67f7d
removed catching the error, because `go mod tidy` didn't flag it.
* #3233 put it back
because in `go 1.16`, `go get -d` started flagging the error again.
* Now that we're switching from `go get -d` to `go get -d
<dep>@<version>` this error is no longer thrown. Instead the
`go.mod` file gets updated.
# So I manually ran those in the test fixture, and it resulted in
# this line appearing. But when the test executes, this line is missing.
# I'm not sure why, and not sure how to run the Ruby debugger
# to step through it.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test was failing. The backstory is pretty complicated:

  • aa67f7d removed catching the error, because go mod tidy didn't flag it.
  • Upgrade golang to v1.16 #3233 put it back because in go 1.16, go get -d started flagging the error again.
  • Now that we're switching from go get -d to go get -d <dep>@<version> this error is no longer thrown. Instead the go.mod file gets updated.

I manually replicated the go get -d github.com/googleapis/gnostic@v0.5.1 followed by go mod tidy -e and then updated the test to match what I'm seeing in the go.mod file.

Surprisingly though, the test is still failing complaining that the line isn't present.

So now I'm unclear if the test is wrong somehow, or if there's some wrong code, or what. I'm hesitant to "fix" the test to something that doesn't match the output from when I manually ran the steps.

Anyone else want to take a stab at it?

# See also discussion here where it seems the author/reviewers agreed on one solution
# but then (confusingly) merged a different one w/o explaining why:
# https://github.com/dependabot/dependabot-core/pull/3482#discussion_r612343046
# So I'm unclear if this needs the regex, fixture, test, or something else to be updatd.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Help?

I'm not really sure what needs to happen here...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeffwidman
Copy link
Member Author

I've taken this about as far as I can, need some help here on the tests side. I am hesitant to make them go green until we are comfortable that they're validating what we actually want 😀 .

@jurre
Copy link
Member

jurre commented Apr 30, 2021

I'm not entirely sure if there is a technical reason that Dependabot manually updates the go.mod file though

I also wondered, and before I put up this PR I spent two hours spelunking through all the old commits to this folder. I never saw commit messages indicating it was written this way to workaround specific issues/bugs. Instead, I suspect the following all played a role:

  • this code first landed as a port of dep.
  • the go get/go mod commands were pretty immature at the time of the port.
  • the idiomatic ways to manage modules weren't very clear in the community, so the author sometimes just had to hack his way through it leading to some surprising code.
  • this pipeline originally did not git clone the repo, which I suspect resulted in more of a mindset of "surgically modify specific lines in specific files to replicate go".

Again, I'm just spitballin' here, so may be wrong.

Yep, that sounds like the most likely case to me as well! 👍

@jurre
Copy link
Member

jurre commented Apr 30, 2021

Thanks @jeffwidman! One of us will can dive into the failing tests but we'll need to make a bit of time for it, I hope to have a few slots next week, or maybe someone else from the team can get to it sooner. Really appreciate your thoroughness on this though 🎉

@jeffwidman
Copy link
Member Author

Thanks @jeffwidman! One of us will can dive into the failing tests but we'll need to make a bit of time for it, I hope to have a few slots next week, or maybe someone else from the team can get to it sooner. Really appreciate your thoroughness on this though 🎉

Fantastic! I'll still be around to help out as needed, but will let you folks drive it across the line. And to be clear--I am very okay if you choose to go a different route with the design after investigating the test failures, my feelings will not be hurt. 😀

I just want the problem that we're hitting in #3526 to be resolved, w/o having to wait for lazy-loading in go 1.17 (especially as I'm not even sure that will fix it!). Another week or two is fine though.

One other thing that I'm curious on is benchmarks... I'd be very curious if this speeds things up quite a bit. I'm certainly seeing that in my local runs, but that's also running against a dep graph that has circular loops, which may simply be a special case that go get -d <dep@version> seems to short-circuit.

@jeffwidman
Copy link
Member Author

Circling back on this--any chance someone on the team has time to look into the failing / confusing tests? I just fixed the merge conflicts...

I've got a couple of teams at my day job who are eager to see this land because dependabot is failing on half their repos due to #3526 (which would be fixed by this PR).

@jeffwidman
Copy link
Member Author

Gentle nudge on this one. Anything I can do to help move this forward?

Comment on lines 183 to 171
_, stderr, status = Open3.capture3(ENVIRONMENT, "go get -d")
# TODO: go 1.18 will make `-d` the default behavior, so remove the flag then
command = +"go get -d"
# `go get` accepts multiple packages, each separated by a space
dependencies.each do |dep|
# Use version pinning rather than `latest` just in case
# a new version gets released in the middle of our run.
version = "v" + dep.version.sub(/^v/i, "")
command << " #{dep.name}@#{version}"
end
_, stderr, status = Open3.capture3(ENVIRONMENT, command)
Copy link
Member

@jurre jurre May 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spec/dependabot/go_modules/file_updater/go_mod_updater_spec.rb:245 is failing because we now no longer run go get -d on all the modules, so basically we're only running the code for update_go_mod (but now using go get instead of our custom helper), we still need to run a bare go get -d to verify that the packages we have downloaded have the module that we expect to find etc.

I verified the tests pass after adding a go get -d after this command, but what might be nice is keeping the original def run_go_get method, and moving the contents of this method into def update_go_mod

I think the tests should pass after that

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so that was intentional...

I'd thought the generic go get -d wasn't needed because we now rely on go get lib@version to be smart enough to know how to upgrade a particular library and if some other part of the dep tree is hosed, then we let go get -d lib@version decide whether that is relevant or not. I don't expect (or want) dependabot to validate my existing dep tree, I only expect it to validate that version bumps don't put the dep tree in a worse state.

I still think that actually.

So let's do this, I'll break this PR in two. The first will keep this generic go get -d command, and simply swap out the custom helper for go get and then the second will be a discussion of whether this generic command is within the scope of what Dependabot should be doing.

That should keep the majority of this PR in a simple, non-controversial PR, and then the second PR (which will be more controversial) can be much more easily reasoned about/researched/discussed.

I'm also reasonably certain this first PR will not solve #3526 which was my original impetus for all this work. That'd require the second PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I get what you're saying, however:

I only expect it to validate that version bumps don't put the dep tree in a worse state.

I think that running the generic go get -d still protects against this, for the case the failing test was validating, it prevents updating to a version when the package has been renamed, right?

Maybe it is an edge-case we should not worry about, assume peoples CI will fail and they'll figure out what the issue is and update the import statements etc manually, but I definitely appreciate you pulling these out into separate PRs so we can discuss more focussed 🙇

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, a few notes to myself when I get time to dig into this:

  1. IIRC, the other reason this was originally here was because we'd manually munged go.mod w/o touching go.sum, so we needed this go get to pickup those go.sum changes. (This could have also been done via go mod tidy, but go get was handling it well enough). This is why I originally thought this was no longer needed.
  2. The transitory resolving stuff will likely get cleaned up a lot more as part of lazy loading in go 1.17.
  3. There's still the open question though of should dependabot fail when something deep upstream fails... eg, this started failing yesterday 4 steps up our dep chain: tencentcloud dependency seems to be gone hashicorp/go-discover#172. I can see valid arguments both pro/con on this.
  4. it prevents updating to a version when the package has been renamed, right ...double-check whether this test checks a dep we are actually updating, or a different dep that's also present in go.mod, but not the one we are updating. However, regardless whether found in Dependabot or the user's CI, this is still an issue. Unlike Unfixable failure when the dep tree includes a dependency circle and one of those deps at one time referenced a broken module #3526 which is spurious because go get simply doesn't handle cyclical deps very well, and doesn't actually need handling.

@jeffwidman jeffwidman marked this pull request as draft May 18, 2021 05:42
@jeffwidman
Copy link
Member Author

Converting to Draft for now... I'm breaking this out as 3 PRs:

@jeffwidman
Copy link
Member Author

Closing for now, this will need complete re-evaluation once go 1.17 drops later this month with the new lazy-loading feature.

@jeffwidman jeffwidman closed this Aug 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants