"master upgrade should maintain a functioning cluster" failing #103697
Comments
current blocking failure is |
that looks related to kubernetes/release@908c081 cc @spiffxp |
for reference, the auth upgrade jobs which upgrade from ci/latest-1.20 to ci/latest are currently green and are upgrading from v1.20.9-rc.0.20+66e6d5ee1fa946 to v1.22.0-beta.1.157+e375563732a6f5 |
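The minor-version skew between the two builds named above can be checked mechanically. This is a minimal sketch, assuming CI version strings follow the `vMAJOR.MINOR.PATCH[-PRERELEASE][+COMMIT]` shape seen in this thread; `parse_ci_version` and `minor_skew` are hypothetical helper names, not part of any Kubernetes tooling.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape, inferred only from the version strings quoted in this thread.
_VERSION_RE = re.compile(
    r"^v(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)"
    r"(?:-(?P<pre>[0-9A-Za-z.]+))?"
    r"(?:\+(?P<commit>[0-9A-Za-z]+))?$"
)


class CIVersion(NamedTuple):
    major: int
    minor: int
    patch: int
    pre: Optional[str]     # e.g. "rc.0.20", or None if absent
    commit: Optional[str]  # abbreviated commit hash, or None if absent


def parse_ci_version(text: str) -> CIVersion:
    """Split a CI version string into its components (hypothetical helper)."""
    m = _VERSION_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return CIVersion(
        int(m["major"]), int(m["minor"]), int(m["patch"]), m["pre"], m["commit"]
    )


def minor_skew(old: str, new: str) -> int:
    """Return how many minor versions separate `old` from `new`."""
    a, b = parse_ci_version(old), parse_ci_version(new)
    if a.major != b.major:
        raise ValueError("major versions differ; minor skew is not meaningful")
    return b.minor - a.minor
```

For the two versions quoted above, `minor_skew("v1.20.9-rc.0.20+66e6d5ee1fa946", "v1.22.0-beta.1.157+e375563732a6f5")` evaluates to 2, i.e. the auth upgrade jobs skip a minor version.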
This is super strange. From example failure:
So clearly, the problem is connected to calling ... Now, this is how the auth upgrade job looks:
vs the failing cluster upgrade:
They literally look the same to me, with one failing and the other passing. I will keep looking... |
Ironically - the tests seem to actually be running, e.g. from the run above from logs:
So it seems that the tests are actually running (and passing). |
I can't find a job that sets k8s-beta, so I will copy it, but it will be pointing to a stale build anyway:
This is our usual fun dance of |
I forget exactly how this all plays with job rotation once release-branch jobs are cut. I would personally be fine with moving away from the /assign @cpanato @puerco @justaugustus |
It looks like kubernetes/test-infra#22790 is what stopped writing |
That wasn't it, but thanks for tagging me in, because I was definitely the final straw on this particular camel's back. The extract step failing on 2021-07-09 means it's either kubernetes/test-infra#22840 or one of its followup PRs mentioned in kubernetes/k8s.io#2318 (comment) |
#99857 (comment) maybe this is the culprit for the first problem? |
See #103697 (comment) above [that change was reverted in one of subsequent PRs] |
Namely - this one: #101118 |
I guess I know what's happening - #101118 has to be cherrypicked back to 1.21. |
Thanks, latest run passed the extract step and we're back to the ginkgo describe error. That should be fixed by #103712. Once https://storage.googleapis.com/k8s-release-dev/ci/k8s-stable1.txt updates to v1.21.3-rc.0.28+4aa451e8458a7c, can you push that to |
Just wanted to note that for the kubeadm upgrade jobs we stopped testing
against stable and instead we test against the tips of release branches or
master. This allows us to also catch problems in backports before a patch
release is made.
To do that we have to use the latest and latest-foo markers. The foo
calculations are currently in the tooling, since there are no latest-1
markers (ala stable1).
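The "foo" calculation described here can be sketched as follows. This is an illustration only, not the kubeadm tooling's actual code: `latest_marker` and `marker_url` are hypothetical helpers, the `ci/latest-1.20` naming and the bucket URL are taken from other comments in this thread, and the `.txt` suffix is assumed by analogy with `ci/k8s-stable1.txt`.

```python
# Bucket quoted elsewhere in this thread for CI version markers.
BUCKET_URL = "https://storage.googleapis.com/k8s-release-dev"


def latest_marker(current_minor: str, skew: int = 0) -> str:
    """Name the marker `skew` minor versions behind `current_minor`.

    `current_minor` is the minor version that master currently builds,
    e.g. "1.22". Hypothetical helper, for illustration only.
    """
    major_s, minor_s = current_minor.split(".")
    major, minor = int(major_s), int(minor_s)
    if skew < 0 or skew > minor:
        raise ValueError(f"skew {skew} is out of range for {current_minor}")
    if skew == 0:
        return "ci/latest"  # tip of master
    return f"ci/latest-{major}.{minor - skew}"  # tip of a release branch


def marker_url(marker: str) -> str:
    """Build the marker file URL (the .txt suffix is an assumption)."""
    return f"{BUCKET_URL}/{marker}.txt"
```

With master at 1.22, a "latest-1 to latest" upgrade job would start from `latest_marker("1.22", 1)`, which is `ci/latest-1.21`, and upgrade to `latest_marker("1.22")`, which is `ci/latest`.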
|
it looks like the |
I guess it would. I may have been misled by the 'stable' in the name. In practice the tooling can now do these calculations and it's easy to say 'this upgrade job does latest-1 to latest upgrades', but one still needs to PR test infra. And IIRC the original intent of the *stable markers was mainly to not PR test infra on each release. |
looks like CI build v1.21.3-rc.0.28+4aa451e8458a7c is available now |
Done
|
awesome, thanks |
If we're going to make a decision on what we think version markers should look like, that's what kubernetes/sig-release#850 is for. IMO the contributor clarity gained by hardcoding version numbers would far outweigh the toil of updating them every N months. |
I agree since some markers have been a bit cryptic. Tooling can help with the regular updates too. |
/assign |
OK - so cluster upgrade tests became green again after the change: But it seems that master upgrade is panicking (I don't know how it worked for me before). It's a typo - going to send out a fix soon. |
#103734 is merged, should resolve the last failure as long as the job points to ci/k8s-beta.txt, I guess we'll need one more bump of that file once https://storage.googleapis.com/k8s-release-dev/ci/k8s-stable1.txt updates to v1.21.4-rc.0.3+0e1bd6ab564... |
also opened kubernetes/test-infra#22915 to fix up the testgrid tab names and make these tests use ci/latest instead of ci/k8s-beta (happy to re-rework that in the future if k8s-beta starts being automatically populated again) |
looks like it's ready now |
I got pulled away from keyboard, I'll sync shortly though the test-infra PR will probably obviate |
|
OK - so the upgrade itself works now. However, there are gazillions of storage tests that started failing after Jordan upgraded to use latest. @liggitt - should we close this one and open a separate bug for the failing storage tests? |
probably so |
Opened #103822 Closing this one as resolved. |
Which jobs are failing:
ci-kubernetes-e2e-gce-stable1-beta-upgrade-master
Which test(s) are failing:
"master upgrade should maintain a functioning cluster"
Since when has it been failing:
Since #99857 merged
On 2021-07-09, the extract step also started failing.
Testgrid link:
testgrid titles are misleading, these are upgrading from 1.21 to 1.22
Reason for failure:
Refactor broke ginkgo usage.
/assign @wojtek-t
/cc @zshihang
Anything else we need to know: