Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clustermesh: fix a few issues in the new mcs api service controller #32555

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

MrFreezeex
Copy link
Member

@MrFreezeex MrFreezeex commented May 15, 2024

Please ensure your pull request adheres to the following guidelines:

  • For first time contributors, read Submitting a pull request
  • All code is covered by unit and/or runtime tests where feasible.
  • All commits contain a well written commit description including a title,
    description and a Fixes: #XXX line if the commit addresses a particular
    GitHub issue.
  • If your commit description contains a Fixes: <commit-id> tag, then
    please add the commit author[s] as reviewer[s] to this issue.
  • All commits are signed off. See the section Developer’s Certificate of Origin
  • Provide a title or release-note blurb suitable for the release notes.
  • Are you a user of Cilium? Please add yourself to the Users doc
  • Thanks for contributing!

This commit does 3 small fixes:

  • Use the correct upstream MCS-API controller. The controller used
    is now the ones that sync the service IP to the ServiceImport resources.
    The rest of the controllers are Cilium specific and will (or already is)
    be implemented soon.
  • Also add a shortcut on creation to save a delete/recreate on
    of the derived service if there is no ServiceImport and the local is
    headless.
  • Fix the watch on Services to also issue a reconcile if the locally
    exported Service has changed

Related to #27902

Fix a few issues with the newly added MCS-API controllers

@MrFreezeex MrFreezeex requested review from a team as code owners May 15, 2024 17:26
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label May 15, 2024
@MrFreezeex
Copy link
Member Author

/test

@YutaroHayakawa
Copy link
Member

CC: @giorio94

Copy link
Member

@YutaroHayakawa YutaroHayakawa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate more about how this annotation is used within the service importing/exporting flow? Currently it is very hard to follow it.

pkg/annotation/k8s.go Outdated Show resolved Hide resolved
@MrFreezeex
Copy link
Member Author

MrFreezeex commented May 16, 2024

Could you elaborate more about how this annotation is used within the service importing/exporting flow? Currently it is very hard to follow it.

Sure! Will try to explain it a bit better on the commit description/code comment but in the meantime here is an explanation:

This mcsapi implementation relies on a derived service that inherits mainly from the ServiceImport object. The spec of this derived service (named derived-$hash) is aggregated from all the remote services that has the same name/ns (+ if it's shared/has a Service export ofc).

As the derived service is already the aggregated/conflict resolved version we need to somehow have a way to find which cluster has what property before aggregating them, this is maybe a bit unclear in this PR because this part is not present yet. The aggregation/conflict resolution will be in another controller whose job is to create ServiceImport. I made already good progress on this but I need to polish it a bit more and I wanted to split a bit the review load (but then I realized that it made this PR a bit unclear sorry).

pkg/service/store/store.go Outdated Show resolved Hide resolved
@@ -236,6 +236,12 @@ func (r *mcsAPIServiceReconciler) Reconcile(ctx context.Context, req ctrl.Reques
if localSvc != nil {
svc.Spec.Selector = localSvc.Spec.Selector
svc.Spec.Ports = localSvc.Spec.Ports

// Use the local Service on creation to save up a potential round trip of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand the relationship with the comment and the code. How does setting the ClusterIP save a "roundtrip"? What was the original reason for adding this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is triggered when the ServiceImport object is not yet created and you only have a ServiceExport.

Previously in that case the Service was always created as non headless which means that if the ServiceImport that would most likely be created right after (by a controller that doesn't exist yet) would retrigger a reconcile of this controller and if it is headless the "derived service" would then be deleted and recreated.

With this new condition we rely on the local service to determine the headlessness of the derived service so that we might save up the deletion+recreation/not have to switch the headlessness. This however won't be the case if there is a conflict on this property though.

I will update the comment to be a bit more explicit there.

pkg/service/store/store.go Outdated Show resolved Hide resolved
pkg/service/store/store.go Outdated Show resolved Hide resolved
@YutaroHayakawa
Copy link
Member

@MrFreezeex Hi, thanks for your clarification!

As the derived service is already the aggregated/conflict resolved version we need to somehow have a way to find which cluster has what property before aggregating them, this is maybe a bit unclear in this PR because this part is not present yet. The aggregation/conflict resolution will be in another controller whose job is to create ServiceImport.

Ok, then my next question is we could get this disaggregated information from ServiceExport + the actual exported service. Then why do we need to introduce this new annotation?

@giorio94
Copy link
Member

giorio94 commented May 17, 2024

Ok, then my next question is we could get this disaggregated information from ServiceExport + the actual exported service. Then why do we need to introduce this new annotation?

I second Yutaro's comment, here. It would be nice to have a diagram (maybe linked to the original issue) which depicts the flows you are thinking about, both in terms of what users would do, and what Cilium will automatically do to reconcile the information. That would help a lot in being able to properly review this PR (and the follow-ups), as we are otherwise missing the context about the next steps, and it becomes difficult to provide useful comments.

pkg/clustermesh/mcsapi/service_controller.go Outdated Show resolved Hide resolved
pkg/clustermesh/mcsapi/service_controller.go Outdated Show resolved Hide resolved
pkg/service/store/store.go Outdated Show resolved Hide resolved
pkg/annotation/k8s.go Outdated Show resolved Hide resolved
@MrFreezeex MrFreezeex force-pushed the mcs-api-svcspec branch 2 times, most recently from 0eb759c to 96c5712 Compare May 17, 2024 15:17
@MrFreezeex
Copy link
Member Author

@giorio94 @YutaroHayakawa @joamaki thanks for the feedback, I will try to create a dedicated CFP issue with some schema of what the ServiceImport auto creation is doing in more details and present a few other alternatives so that we can choose one that make sense with all the info at hands 👍.

@MrFreezeex MrFreezeex changed the title clustermesh: add a MCS-API spec annotation for ServiceImport sync clustermesh: fix a few misc issue in the new mcs api controller Jun 2, 2024
@MrFreezeex MrFreezeex changed the title clustermesh: fix a few misc issue in the new mcs api controller clustermesh: fix a few issues in the new mcs api service controller Jun 2, 2024
@MrFreezeex
Copy link
Member Author

/test

@MrFreezeex
Copy link
Member Author

MrFreezeex commented Jun 2, 2024

Hi! Since we are having a separate discussion for how data should be shared across clusters to allow automatic ServiceImport creation I dropped the second commit about that and only kept the first one which brought a few fixes. I wanted to keep same PR so that the assigned reviewers don't change (although now it should probably be a sig-clustermesh only change), hopefully that's ok 😅.

This commit does 3 small fixes:
- Use the correct upstream MCS-API controller. The controller used
  is now the ones that sync the service IP to the ServiceImport resources.
  The rest of the controllers are Cilium specific and will (or already is)
  be implemented soon.
- Also add a shortcut on creation to save a delete/recreate on
  of the derived service if there is no ServiceImport and the local is
  headless.
- Fix the watch on Services to also issue a reconcile if the locally
  exported Service has changed

Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
@MrFreezeex
Copy link
Member Author

/test

@MrFreezeex
Copy link
Member Author

/ci-e2e

@MrFreezeex
Copy link
Member Author

/ci-runtime

Copy link
Member

@giorio94 giorio94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These cleanup changes look good to me, thanks!

And sorry for the delay in reviewing the CFP, I plan to share my comments by today.

@giorio94 giorio94 requested review from joamaki and removed request for ysksuzuki June 3, 2024 10:16
@giorio94 giorio94 added kind/cleanup This includes no functional changes. area/clustermesh Relates to multi-cluster routing functionality in Cilium. release-note/misc This PR makes changes that have no direct user impact. labels Jun 3, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jun 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/clustermesh Relates to multi-cluster routing functionality in Cilium. kind/cleanup This includes no functional changes. release-note/misc This PR makes changes that have no direct user impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants