
[occm] The pods cannot access each other when across nodes #2482

Closed
jeffyjf opened this issue Nov 28, 2023 · 8 comments · May be fixed by #2484
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@jeffyjf
Contributor

jeffyjf commented Nov 28, 2023

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:

I used CAPO to deploy a two-node cluster, and started OCCM with the configuration below:

[Global]
auth-url=http://192.168.1.29:5000
username=admin
password=123456
region=RegionOne
tenant-id=666853c5323d486cbd81d372359f8942
domain-id=default

[Networking]
public-network-name=ext-net

[LoadBalancer]
enabled=true
floating-network-id=a03550ce-f3ff-4b71-bd37-00c12bcaa3f0
floating-subnet-id=f0db7750-6d43-4259-a9f5-4b171469b2c7
lb-provider=amphora
subnet-id=6cf6198d-8e89-4bf3-9c6e-5dc79509118f
network-id=c2d42f5b-1b1e-4d25-94eb-ea1a7449a8da

[Metadata]
search-order=metadataService,configDrive

[Route]
router-id=89aac4b9-9004-464d-8e01-409149144e28  # Note: the route controller is enabled here, and OCCM is started with --allocate-node-cidrs=true and --cluster-cidr=10.255.0.0/16

And the CNI configuration on each node is like below:

{
    "cniVersion": "0.3.1",
    "name": "mynet",
    "type": "bridge",
    "bridge": "mynet0",
    "isDefaultGateway": true,
    "forceAddress": false,
    "ipMasq": true,
    "hairpinMode": true,
    "ipam": {
        "type": "host-local",
        "subnet": "10.255.0.0/24"
    }
}

(The ipam subnet is 10.255.0.0/24 on node 1 and 10.255.1.0/24 on node 2.)

But the pods on node 1 cannot access the pods on node 2, and vice versa.

What you expected to happen:

All of the pods can access each other.

How to reproduce it:

Deploy a multi-node cluster and configure OCCM and the CNI plugin as above.

Anything else we need to know?:

IMO, this is because the node's security group has no ingress rule that permits packets coming directly from the other nodes' pods.
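
For illustration, a minimal gophercloud sketch of the kind of ingress rule that lets this traffic through; the security group ID is a placeholder, the CIDR is the --cluster-cidr from above, and the credentials mirror the [Global] section:

// Sketch: allow ingress traffic from the whole cluster CIDR on the node security group.
package main

import (
	"fmt"
	"log"

	"github.com/gophercloud/gophercloud"
	"github.com/gophercloud/gophercloud/openstack"
	"github.com/gophercloud/gophercloud/openstack/networking/v2/extensions/security/rules"
)

func main() {
	// Credentials mirror the [Global] section above.
	provider, err := openstack.AuthenticatedClient(gophercloud.AuthOptions{
		IdentityEndpoint: "http://192.168.1.29:5000",
		Username:         "admin",
		Password:         "123456",
		TenantID:         "666853c5323d486cbd81d372359f8942",
		DomainID:         "default",
	})
	if err != nil {
		log.Fatal(err)
	}

	network, err := openstack.NewNetworkV2(provider, gophercloud.EndpointOpts{Region: "RegionOne"})
	if err != nil {
		log.Fatal(err)
	}

	// Placeholder: the security group attached to the cluster's node ports.
	nodeSecGroupID := "00000000-0000-0000-0000-000000000000"

	// Allow ingress traffic from any pod in the cluster CIDR (--cluster-cidr).
	// Protocol is left empty so all protocols are allowed.
	rule, err := rules.Create(network, rules.CreateOpts{
		Direction:      rules.DirIngress,
		EtherType:      rules.EtherType4,
		SecGroupID:     nodeSecGroupID,
		RemoteIPPrefix: "10.255.0.0/16",
	}).Extract()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("created security group rule", rule.ID)
}

An equivalent rule can of course be created via the Neutron API directly; the point is that nothing creates it automatically today.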

Environment:

  • openstack-cloud-controller-manager(or other related binary) version: xxx
  • OpenStack version: xxx
  • Others: xxx
@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Nov 28, 2023
@jeffyjf jeffyjf changed the title [occm] The other node's pod cannot be access [occm] The pods cannot access each other when across nodes Nov 28, 2023
@dulek
Contributor

dulek commented Nov 28, 2023

I think CPO expects CAPO or other deployment tool to manage this. Why can't it be done there?

@jeffyjf
Contributor Author

jeffyjf commented Nov 29, 2023

I think CPO expects CAPO or other deployment tool to manage this. Why can't it be done there?

As I mentioned here, it is the route controller's duty to ensure that the containers on different nodes of one Kubernetes cluster can communicate with each other.
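
For context, a rough gophercloud sketch (not the actual controller code; the router ID is from the [Route] section, the node addresses are placeholders) of the kind of static routes the route controller is expected to maintain on the router, pointing each node's pod CIDR at that node:

package sketch

import (
	"github.com/gophercloud/gophercloud"
	"github.com/gophercloud/gophercloud/openstack/networking/v2/extensions/layer3/routers"
)

// ensurePodRoutes points each node's pod CIDR at that node's address on the
// router from the [Route] section. The node addresses here are placeholders,
// and the network client is obtained as in the earlier sketch.
func ensurePodRoutes(network *gophercloud.ServiceClient) error {
	routes := []routers.Route{
		{DestinationCIDR: "10.255.0.0/24", NextHop: "192.168.1.101"}, // node 1 (placeholder IP)
		{DestinationCIDR: "10.255.1.0/24", NextHop: "192.168.1.102"}, // node 2 (placeholder IP)
	}
	_, err := routers.Update(network, "89aac4b9-9004-464d-8e01-409149144e28", routers.UpdateOpts{
		Routes: &routes, // assumes a recent gophercloud where UpdateOpts.Routes is *[]Route
	}).Extract()
	return err
}

These routes alone are not enough, though: the packets still have to get past the nodes' security groups, which is what this issue is about.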

@kayrus
Contributor

kayrus commented Dec 12, 2023

@jeffyjf are you sure that the problem lies in the node's security groups? could it be related to #2491?

@jeffyjf
Contributor Author

jeffyjf commented Dec 13, 2023

@jeffyjf are you sure that the problem lies in the node's security groups?

Yep, I'm sure. I've already tested it: adding an extra rule to the node's security group deals with this issue.

could it be related to #2491?

They are different issues. For ingress network traffic, the AllowedAddressPairs are just used to check the destination address, while the SecurityGroupRule is used to check the source address. Both must be set for a new node.
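
To illustrate the distinction, a small sketch of the allowed-address-pairs side (again gophercloud, placeholder values; the ingress-rule side is sketched in the issue description above):

package sketch

import (
	"github.com/gophercloud/gophercloud"
	"github.com/gophercloud/gophercloud/openstack/networking/v2/ports"
)

// allowPodCIDROnPort adds a node's pod CIDR to that node's port as an
// allowed address pair, so packets whose destination address falls in that
// pod CIDR are not dropped at the port (the destination-address check
// mentioned above). Port ID and CIDR are placeholders; the network client
// is obtained as in the first sketch.
func allowPodCIDROnPort(network *gophercloud.ServiceClient, nodePortID, podCIDR string) error {
	pairs := []ports.AddressPair{{IPAddress: podCIDR}}
	_, err := ports.Update(network, nodePortID, ports.UpdateOpts{
		AllowedAddressPairs: &pairs, // note: this replaces the whole allowed_address_pairs list
	}).Extract()
	return err
}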

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 12, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 11, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot k8s-ci-robot closed this as not planned (won't fix, can't repro, duplicate, or stale) May 11, 2024
@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
