Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

馃悰: skip APIServerELB DNS name resolution if internal #4976

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

r4f4
Copy link
Contributor

@r4f4 r4f4 commented May 10, 2024

What type of PR is this?

/kind bug

What this PR does / why we need it:

Skip APIServerELB DNS name resolution check when the LB is internal. The check will never work in air-gapped systems.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #4975

Special notes for your reviewer:

Checklist:

  • squashed commits
  • includes documentation
  • includes emojis
  • adds unit tests
  • adds or updates e2e tests

Release note:

Skip DNS name resolution check for internal APIServerELB.

The check will never work in air-gapped systems.
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-priority needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels May 10, 2024
@k8s-ci-robot
Copy link
Contributor

Hi @r4f4. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign vincepri for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@r4f4
Copy link
Contributor Author

r4f4 commented May 10, 2024

/cc @mtulio @patrickdillon

@k8s-ci-robot
Copy link
Contributor

@r4f4: GitHub didn't allow me to request PR reviews from the following users: mtulio, patrickdillon.

Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @mtulio @patrickdillon

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@Ankitasw Ankitasw added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 20, 2024
Copy link
Member

@damdo damdo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 20, 2024
@damdo
Copy link
Member

damdo commented May 20, 2024

/test pull-cluster-api-provider-aws-test

@r4f4
Copy link
Contributor Author

r4f4 commented May 20, 2024

/hold
While we double check if the name is resolvable in a SC2S region.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 20, 2024
@AndiDog
Copy link
Contributor

AndiDog commented May 27, 2024

There's a chat thread about why the DNS resolving wait code was likely there. It might be to check LB availability. Did you see problems or delays after this change, such as CAPA requiring an additional wait before reconciling further (e.g. because the LB/DNS wasn't available yet)? That's the only concern I have. Otherwise LGTM.

@r4f4
Copy link
Contributor Author

r4f4 commented May 27, 2024

There's a chat thread about why the DNS resolving wait code was likely there. It might be to check LB availability. Did you see problems or delays after this change, such as CAPA requiring an additional wait before reconciling further (e.g. because the LB/DNS wasn't available yet)? That's the only concern I have. Otherwise LGTM.

When we encountered the issue we were creating a cluster in C2S via an emulator (SHIFT). The issue did not happen in another emulator (Combine). We're trying to test on a real C2S/SC2S env to determine if the internal LB name is resolvable. If so, it's an emulator bug. If not, I'll test again and pay attention to the reconcile process.

@AndiDog
Copy link
Contributor

AndiDog commented May 29, 2024

Please also test on a real, public AWS region if you can. It would be interesting to see how removing this wait behaves, in order to find out whether it could be removed overall, or if then we need some retries for the control plane setup or so.

@nrb
Copy link
Contributor

nrb commented May 31, 2024

I'll wait to review until hearing back on whether or not this happens in the real AWS environment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

APIServerLB DNS name resolution causes deploy failure on air-gapped systems
6 participants