HOSTEDCP-1542: cmd/cluster: refactor to remove example fixtures #4018

Open. stevekuznetsov wants to merge 4 commits into main from skuznets/delegate-resource-creation.

stevekuznetsov
Contributor

The goal of this refactor is to reduce the complexity in the command-line tooling. Overall, the changes here remove duplicative structures that copied data around and co-locate the logic with the data, instead of first aggregating all data into one uber-structure and then conditionally acting on that structure. These refactors have a number of benefits:

  • locality of behavior: in the past, it was very difficult, if not impossible, to determine where a value was used: it would be bound to a flag in one package, copied between container structs a couple of times, and then acted on by generic logic that used the presence or absence of the value to e.g. change a field on the HostedCluster. Simply reading the generic logic was often not enough to understand what was going on, as many of the conditional branches in the example fixture code could only ever trigger for one specific platform, and you'd never know unless you traced how the example options uber-struct had its fields set in every provider.
  • clear go-to-definition: as a knock-on effect of the above, there is now one structure that holds a command-line flag, and it's trivial to use the LSP to determine where and how that flag is used.
  • composability: as exemplified in the KubeVirt NodePool code, we are able to compose commands as necessary. When commands re-use the same arguments with the same flags and the same validation logic, there's no need to copy things around or re-implement anything; by localizing flag binding, validation and option completion, we gain small, composable parts from which we can build larger commands.
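
To make the composability point concrete, here is a minimal sketch of the pattern, not the code in this PR. It assumes each option set owns its own flag binding and validation, and that a larger command simply embeds the smaller sets. Every type, method and flag name below is hypothetical, and the sketch uses the standard library `flag` package only to stay self-contained; the real CLI may bind flags differently.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// NodePoolOptions is a hypothetical option set shared by several commands.
type NodePoolOptions struct {
	Replicas int
}

// BindFlags registers the flags for this option set, next to the data they fill.
func (o *NodePoolOptions) BindFlags(fs *flag.FlagSet) {
	fs.IntVar(&o.Replicas, "node-pool-replicas", o.Replicas, "number of replicas")
}

// Validate checks only the fields this option set owns.
func (o *NodePoolOptions) Validate() error {
	if o.Replicas < 0 {
		return errors.New("--node-pool-replicas must not be negative")
	}
	return nil
}

// KubevirtOptions is a hypothetical platform-specific option set.
type KubevirtOptions struct {
	Memory string
}

func (o *KubevirtOptions) BindFlags(fs *flag.FlagSet) {
	fs.StringVar(&o.Memory, "memory", o.Memory, "memory for each virtual machine")
}

func (o *KubevirtOptions) Validate() error {
	if o.Memory == "" {
		return errors.New("--memory must be set")
	}
	return nil
}

// CreateKubevirtNodePoolOptions composes the two sets without copying fields around.
type CreateKubevirtNodePoolOptions struct {
	NodePool NodePoolOptions
	Kubevirt KubevirtOptions
}

func (o *CreateKubevirtNodePoolOptions) BindFlags(fs *flag.FlagSet) {
	o.NodePool.BindFlags(fs)
	o.Kubevirt.BindFlags(fs)
}

func (o *CreateKubevirtNodePoolOptions) Validate() error {
	if err := o.NodePool.Validate(); err != nil {
		return err
	}
	return o.Kubevirt.Validate()
}

// parse builds the defaults, binds flags, parses the arguments and validates.
func parse(args []string) (*CreateKubevirtNodePoolOptions, error) {
	opts := &CreateKubevirtNodePoolOptions{
		NodePool: NodePoolOptions{Replicas: 2},
		Kubevirt: KubevirtOptions{Memory: "4Gi"},
	}
	fs := flag.NewFlagSet("create-nodepool-kubevirt", flag.ContinueOnError)
	opts.BindFlags(fs)
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	if err := opts.Validate(); err != nil {
		return nil, err
	}
	return opts, nil
}

func main() {
	opts, err := parse([]string{"--node-pool-replicas=3", "--memory=8Gi"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("replicas=%d memory=%s\n", opts.NodePool.Replicas, opts.Kubevirt.Memory)
}
```

With this shape, go-to-definition on a field such as `Replicas` leads to the one struct that binds, validates and consumes it.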

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 10, 2024
@openshift-ci-robot

openshift-ci-robot commented May 10, 2024

@stevekuznetsov: This pull request references HOSTEDCP-1542 which is a valid jira issue.


@openshift-ci openshift-ci bot requested review from davidvossel and nirarg May 10, 2024 19:15
@openshift-ci openshift-ci bot added area/cli Indicates the PR includes changes for CLI area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release area/testing Indicates the PR includes changes for e2e testing and removed do-not-merge/needs-area labels May 10, 2024
@stevekuznetsov stevekuznetsov force-pushed the skuznets/delegate-resource-creation branch 4 times, most recently from ac7ac9d to 7e710d2 Compare May 10, 2024 20:54
@stevekuznetsov stevekuznetsov force-pushed the skuznets/delegate-resource-creation branch 4 times, most recently from 9725406 to f265b6f Compare May 14, 2024 18:37
Contributor

@davidvossel davidvossel left a comment

kubevirt parts look accurate

@stevekuznetsov stevekuznetsov force-pushed the skuznets/delegate-resource-creation branch from f265b6f to 6d8ba1a Compare May 14, 2024 22:51
@openshift-ci openshift-ci bot added the area/ci-tooling Indicates the PR includes changes for CI or tooling label May 16, 2024
@stevekuznetsov stevekuznetsov force-pushed the skuznets/delegate-resource-creation branch from b28e521 to 66ac257 Compare June 5, 2024 20:11

netlify bot commented Jun 6, 2024

Deploy Preview for hypershift-docs ready!

Name Link
🔨 Latest commit d3f0bbd
🔍 Latest deploy log https://app.netlify.com/sites/hypershift-docs/deploys/6661dffe32468a0008205bd5
😎 Deploy Preview https://deploy-preview-4018--hypershift-docs.netlify.app

@stevekuznetsov stevekuznetsov force-pushed the skuznets/delegate-resource-creation branch from d3f0bbd to c695ed8 Compare June 6, 2024 16:19
@stevekuznetsov
Contributor Author

/retest

@sjenning
Contributor

sjenning commented Jun 6, 2024

=== RUN   TestAutoscaling/ValidateHostedCluster/EnsureNoCrashingPods
    util.go:533: Container cloud-controller-manager in pod kubevirt-cloud-controller-manager-7cc4ccf9b9-644r2 has a restartCount > 0 (1)

=== RUN   TestCreateCluster/ValidateHostedCluster/EnsureNoCrashingPods
    util.go:533: Container cloud-controller-manager in pod kubevirt-cloud-controller-manager-6d48cd658d-d6k7n has a restartCount > 0 (1)
    util.go:533: Container oauth-apiserver in pod openshift-oauth-apiserver-5c7dd4b5dc-l2s5z has a restartCount > 0 (1)

Seems like a flake but some jobs are passing now (new development)

/retest-required

@sjenning
Contributor

sjenning commented Jun 6, 2024

/approve
/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 6, 2024
Contributor

openshift-ci bot commented Jun 6, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sjenning, stevekuznetsov

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 6, 2024
@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 5518314 and 2 for PR HEAD c695ed8 in total

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD ef8c6cd and 1 for PR HEAD c695ed8 in total

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD e6407b0 and 0 for PR HEAD c695ed8 in total

@openshift-ci-robot

/hold

Revision c695ed8 was retested 3 times: holding

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 7, 2024
@sjenning
Contributor

sjenning commented Jun 7, 2024

Not sure exactly what is happening here, but this PR does seem to be causing EnsureNoCrashingPods to fail in the KubeVirt e2e. Other KubeVirt presubmits do not show this flake, and we hit it in the last 3 runs. JUnit is not working for some reason either, which makes it even more annoying.

    util.go:533: Container cloud-controller-manager in pod kubevirt-cloud-controller-manager-84fb6f744c-9kljp has a restartCount > 0 (1)
    util.go:533: Container oauth-apiserver in pod openshift-oauth-apiserver-5f964f6b-dx7lm has a restartCount > 0 (1)

https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_hypershift/4018/pull-ci-openshift-hypershift-main-e2e-kubevirt-aws-ovn/1798984858463113216/artifacts/e2e-kubevirt-aws-ovn/run-e2e-local/artifacts/TestCreateCluster/namespaces/e2e-clusters-dgknb-example-mhl8d/core/pods/logs/kubevirt-cloud-controller-manager-84f6cd4458-pfjlp-cloud-controller-manager-previous.log

@stevekuznetsov
Contributor Author

@sjenning that's ok - certainly possible it's broken, will look.

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Jun 7, 2024
Contributor

openshift-ci bot commented Jun 7, 2024

New changes are detected. LGTM label has been removed.

@stevekuznetsov
Contributor Author

/test unit

Contributor

openshift-ci bot commented Jun 7, 2024

@stevekuznetsov: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/e2e-azure | 86aeff7 | link | false | `/test e2e-azure` |


@stevekuznetsov
Contributor Author

KubeVirt is failing on:

 === NAME  TestNodePool/HostedCluster0/Main/TestNodePoolReplaceUpgrade
    nodepool_upgrade_test.go:160: Validating all Nodes have the synced labels and taints
    nodepool_upgrade_test.go:167: Updating NodePool image. Image: registry.build05.ci.openshift.org/ci-op-z247rcb1/release@sha256:fc76bbd8dbe41816d167025346ad780301266bfda93cd95f214d89e74575e906
    util.go:428: Waiting for nodepool e2e-clusters-cd69f/example-6wrhv-test-replaceupgrade to report version 4.17.0-0.ci.test-2024-06-07-190031-ci-op-z247rcb1-latest (currently 4.17.0-0.ci-2024-06-07-041707) 
    util.go:435: Failed to get nodepool: client rate limiter Wait returned an error: context deadline exceeded
    util.go:440: 
        failed waiting for nodepool version
        Unexpected error:
            <context.deadlineExceededError>: 
            context deadline exceeded
            {}
        occurred 

🤔

The goal of this refactor is to reduce the complexity in the
command-line tooling. Overall, the changes here remove duplicative
structures that copied data around and co-locate the logic with the
data, instead of first aggregating all data into one uber-structure and
then conditionally acting on that structure. These refactors have a
number of benefits:

 - locality of behavior: in the past, it was very difficult if not
   impossible to determine where a value was used, as it would be bound
   to a flag in one package, copied around between container structs a
   couple of times, then have some generic logic act on the presence or
   absence of the value to e.g. change a field on the HostedCluster.
   Simply reading the generic logic was often not enough to understand
   what was going on, as many of the conditional branches in the example
   fixture code could only ever trigger for one specific platform, and
   you'd never know unless you traced how the example options
   uber-struct had its fields set in every provider.
 - clear go-to-definition: as a knock-on effect of the above, now
   there's *one* structure that holds a command-line flag and it's
   trivial to use the LSP when determining where that flag is used and
   how
 - composability: as exemplified in the KubeVirt NodePool code, we are
   able to compose commands as necessary. When commands re-use the same
   arguments with the same flags and the same validation logic, there's
   no need to copy things around and re-implement anything; by
   localizing flag binding, validation and option completion, we gain
   small, composable parts that we can use to build larger commands with

Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
We only bind flags in one routine now, explicitly breaking out the set
of flags that should only be exposed to developers in the `hypershift`
CLI. The net effect of this change is to expose `--base-domain-prefix`
and `--external-dns-domain` to users of `hcp`.

This change also shows how to change the defaults in an option set for a
command - the `hcp create cluster` command has a unique default for the
control plane availability policy, and it's now evident that this is the
case since it has to be done explicitly after building the default set
of options.
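
As a rough illustration of the two ideas in this commit message, the sketch
below builds a shared default option set, overrides one default explicitly
for one CLI, and binds developer-only flags in a separate routine. It is a
sketch under assumptions, not the code from this commit: the type and
function names, the `--control-plane-availability-policy` and
`--control-plane-operator-image` flag names, the choice of which flag is
developer-only, and the `SingleReplica`/`HighlyAvailable` default values are
all illustrative, and the standard library `flag` package is used only to
keep the example self-contained.

```go
package main

import (
	"flag"
	"fmt"
)

// CreateOptions is a hypothetical option set for a "create cluster" command.
type CreateOptions struct {
	ControlPlaneAvailabilityPolicy string
	BaseDomainPrefix               string
	ControlPlaneOperatorImage      string // treated as developer-only in this sketch
}

// DefaultOptions builds the defaults shared by every CLI.
func DefaultOptions() *CreateOptions {
	return &CreateOptions{ControlPlaneAvailabilityPolicy: "SingleReplica"}
}

// BindFlags registers the flags that every user should see.
func (o *CreateOptions) BindFlags(fs *flag.FlagSet) {
	fs.StringVar(&o.ControlPlaneAvailabilityPolicy, "control-plane-availability-policy",
		o.ControlPlaneAvailabilityPolicy, "availability policy for the control plane")
	fs.StringVar(&o.BaseDomainPrefix, "base-domain-prefix",
		o.BaseDomainPrefix, "prefix for the base domain")
}

// BindDeveloperFlags registers the flags that only the developer CLI exposes.
func (o *CreateOptions) BindDeveloperFlags(fs *flag.FlagSet) {
	fs.StringVar(&o.ControlPlaneOperatorImage, "control-plane-operator-image",
		o.ControlPlaneOperatorImage, "override for the control plane operator image")
}

// NewDeveloperCommand stands in for the developer-facing CLI: all flags, shared defaults.
func NewDeveloperCommand() (*flag.FlagSet, *CreateOptions) {
	opts := DefaultOptions()
	fs := flag.NewFlagSet("hypershift create cluster", flag.ContinueOnError)
	opts.BindFlags(fs)
	opts.BindDeveloperFlags(fs)
	return fs, opts
}

// NewProductCommand stands in for the user-facing CLI: user flags only, one default changed.
func NewProductCommand() (*flag.FlagSet, *CreateOptions) {
	opts := DefaultOptions()
	// The differing default is visible here because it is set explicitly,
	// after the default set of options is built and before flags are bound.
	opts.ControlPlaneAvailabilityPolicy = "HighlyAvailable"
	fs := flag.NewFlagSet("hcp create cluster", flag.ContinueOnError)
	opts.BindFlags(fs)
	return fs, opts
}

func main() {
	builders := []func() (*flag.FlagSet, *CreateOptions){NewDeveloperCommand, NewProductCommand}
	for _, build := range builders {
		fs, opts := build()
		if err := fs.Parse(nil); err != nil {
			fmt.Println("error:", err)
			continue
		}
		hasDevFlag := fs.Lookup("control-plane-operator-image") != nil
		fmt.Printf("%s: policy=%s developer-flag=%t\n",
			fs.Name(), opts.ControlPlaneAvailabilityPolicy, hasDevFlag)
	}
}
```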

Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
@stevekuznetsov stevekuznetsov force-pushed the skuznets/delegate-resource-creation branch from 86aeff7 to 62120ec Compare June 7, 2024 21:15
Labels
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • area/ci-tooling: Indicates the PR includes changes for CI or tooling
  • area/cli: Indicates the PR includes changes for CLI
  • area/hypershift-operator: Indicates the PR includes changes for the hypershift operator and API - outside an OCP release
  • area/testing: Indicates the PR includes changes for e2e testing
  • do-not-merge/hold: Indicates that a PR should not merge because someone has issued a /hold command.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
5 participants