Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add template support for network aliases #2250

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

benturner
Copy link

Closes #2249

@stevvooe am I on the right track here? Networks aren't part of the ContainerSpec so I think I have to do it this way... Obviously this needs tests but 'd like some guidance before I commit to this path. Thanks.

@stevvooe
Copy link
Contributor

These need to parameterize the user input, which is on https://github.com/docker/swarmkit/blob/master/api/types.proto#L503.

However, to have an effect cluster-wide, so that it works on aliases, for instance, the template expansion needs to happen in the orchestrator.

I haven't looked at this in awhile, so perhaps, @aaronlehmann can point you in the right direction.

@aaronlehmann
Copy link
Collaborator

However, to have an effect cluster-wide, so that it works on aliases, for instance, the template expansion needs to happen in the orchestrator.

@stevvooe: Not too familiar with how aliases work; can you explain why they need to be expanded on the manager side? With a quick grep I don't see anywhere that the value is interpreted on the manager.

NetworkAttachmentConfig is inside the TaskSpec - would we be comfortable changing that on the manager?

@stevvooe
Copy link
Contributor

NetworkAttachmentConfig is inside the TaskSpec - would we be comfortable changing that on the manager?

Probably not.

Aliases are network-level aliases for the task, so I assume they need to be made available across the network, unless they are announced here?

@aaronlehmann
Copy link
Collaborator

Do you mean service discovery? The managers are currently not involved in service discovery.

if err != nil {
return networks, errors.Wrap(err, "expanding alias failed")
}
aliases[indexAliases] = alias
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to be overwriting the Aliasses entry in t.Networks, not in the copy that was made above.

networks := make([]*api.NetworkAttachment, len(t.Networks))

ctx := NewContextFromTask(t)
var err error
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not necessary to predeclare this variable.

@benturner
Copy link
Author

Perhaps the expansion needs to happen here then? https://github.com/docker/swarmkit/blob/master/manager/allocator/network.go#L641

Copy link
Collaborator

@aaronlehmann aaronlehmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think doing the template substitution in the executor, as this PR does, is the right approach. I don't see anywhere that network aliases are being used in manager code. That said, it would be great if others could confirm this is the case and that it won't change in the future if we add alternatives to gossip (cc @fcrisciani @abhinandanpb @sanimej @diogomonica @aluzzardi).

Note that for this to work in the context of docker, a similar change would be needed in the executor implemented in the moby/moby repo.

@stevvooe
Copy link
Contributor

@benturner Typically, we leave the spec untouched (I was really wrong above, please accept my apology). We need to figure the right place such that the aliases are run through templating, but then can propagate to the network correctly.

@aaronlehmann has included some individuals who can help here.

@benturner
Copy link
Author

benturner commented Jun 14, 2017

I'm worried that the executor is the wrong spot though... I followed the example of @stevvooe 's hostname templatization, but that only affects the individual container. The network alias, on the other hand, needs to be visible to multiple containers on multiple nodes, right? So doing the expansion elsewhere seems to make more sense to me...

@stevvooe
Copy link
Contributor

@benturner You're spot on, but I am not sure that it is in the manager. It is in some part of the executor, perhaps in the moby/moby version.

@aaronlehmann
Copy link
Collaborator

Yes, nodes gossip amongst themselves to discover the tasks on each network. So far, the manager isn't involved in this process.

@benturner
Copy link
Author

Ah, I see. Sneaky...
So in addition to the changes I have here already should I also be modifying the task spec's Networks? I see that https://github.com/docker/swarmkit/blob/master/agent/exec/dockerapi/container.go#L71 modifies the task spec's Runtime already.

@aaronlehmann
Copy link
Collaborator

It should probably expand t.Networks (I think the PR is currently doing this, but only by accident - #2250 (comment)), but I don't think it's necessary to expand t.Spec.Networks.

@benturner benturner force-pushed the 2249-template-network-alias branch from 88e0443 to 910cbb5 Compare June 15, 2017 16:07
@codecov
Copy link

codecov bot commented Jun 15, 2017

Codecov Report

Merging #2250 into master will increase coverage by 0.08%.
The diff coverage is 73.07%.

@@            Coverage Diff             @@
##           master    #2250      +/-   ##
==========================================
+ Coverage   60.34%   60.42%   +0.08%     
==========================================
  Files         125      124       -1     
  Lines       20391    20288     -103     
==========================================
- Hits        12305    12260      -45     
+ Misses       6692     6653      -39     
+ Partials     1394     1375      -19

@benturner benturner force-pushed the 2249-template-network-alias branch from 910cbb5 to 07a90e1 Compare June 15, 2017 16:07
@benturner
Copy link
Author

@aaronlehmann I have updated the PR with your suggested changes and test coverage, please take another look? Thanks.

@aaronlehmann
Copy link
Collaborator

LGTM

It would be amazing if you have time to make the corresponding changes to https://github.com/moby/moby/tree/master/daemon/cluster/executor/container and test it in a real Docker setup, for example to confirm that the expanded aliases are propagated through the cluster.

@benturner
Copy link
Author

Hrm, I was sort of assuming that after this landed docker would just do a re-vendoring (isn't that what it's called?) of swarmkit and it would magically work... Is the code over there not related to this code?

@aaronlehmann
Copy link
Collaborator

There are multiple executor implementations. The one you're editing here uses the docker remote API to control containers. It's only used for testing purposes, since when swarmkit is used as part of docker, it's not a separate daemon communicating with docker through its HTTP API. The code I linked to is the executor actually used inside of docker. It would need a similar change after vendoring swarmkit for this to take effect in docker.

@benturner
Copy link
Author

Ok, I'll tackle from moby/moby#33408 then. Thanks!

@stevvooe
Copy link
Contributor

@benturner There are two parts to it: you need the template expansion implemented in swarmkit and then there needs to be a executor code to call that in moby.

@benturner
Copy link
Author

Got it, thanks. So now once this gets merged into swarmkit how do I pull the template change into moby?

@aaronlehmann
Copy link
Collaborator

@benturner: You don't need to wait for it to be merged to pull the change into moby. You can change the swarmkit line in moby's vendor.conf to:

github.com/docker/swarmkit 07a90e1842ea870741c44258c898dd33771fb0dc https://github.com/benturner/swarmkit

And then run vndr github.com/docker/swarmkit (you can get the vndr tool with go get github.com/LK4D4/vndr).

Copy link
Contributor

@stevvooe stevvooe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also needs to parameterize the driver opts, as was requested in the original issue.

@benturner
Copy link
Author

Hrm, where does that code live?

@stevvooe
Copy link
Contributor

@benturner It is on the same structure you are templating, the NetworkAttachment.

@benturner benturner force-pushed the 2249-template-network-alias branch from 07a90e1 to f9b7a40 Compare June 17, 2017 01:57
@benturner
Copy link
Author

@stevvooe I added the driver-opt support, please take another look? Thanks.

@benturner benturner force-pushed the 2249-template-network-alias branch from f9b7a40 to 06fa563 Compare June 17, 2017 02:25
Signed-off-by: Ben Turner <bent@silklabs.com>
@benturner benturner force-pushed the 2249-template-network-alias branch from 06fa563 to 01b18a7 Compare June 17, 2017 02:29
@benturner
Copy link
Author

benturner commented Jun 20, 2017

@aaronlehmann @stevvooe so after playing around with this a little I think we may have a problem... libnetwork expects that a service will have a consistent set of service aliases at creation time across all tasks for that service. If we allow templating in the way I have implemented here then you can easily construct aliases that are not consistent across tasks with more than one replica (e.g. {{.Service.Name}}-{{.Task.Slot}} -> foo-1 for one task, foo-3 for another). libnetwork just doesn't handle this presently (it ends up creating a single alias endpoint with whichever task happens to register first).

@fcrisciani and I discussed this a bit from the libnetwork perspective yesterday and he suggested that we should translate these per-task-aliases as task aliases (pointing to the individual container IP) rather than service aliases (pointing to the service VIP or RR dns endpoint). I think that makes sense, but then I don't know how we would call this feature a "templated service alias" any more...

Then there's the added complexity that some templated aliases will be consistent across all tasks (e.g. legacy-{{.Service.Name}}), and I would think that those should actually point to the service VIP, not each task IP...

Thoughts? Directions? I'm a bit at sea here...

@aaronlehmann
Copy link
Collaborator

I'm not sure I understand what problem templating network aliases is solving. It might help answer your questions if you can give a few example use cases.

@benturner
Copy link
Author

My primary motivation is moby/moby#33408 (comment) but @mark-church wanted something for driver-opts too in #2207

@stevvooe
Copy link
Contributor

@aaronlehmann Do we not have the concept of a per-task alias? That might be the hang up here. If the templating is not task-specific, then it doesn't make a lot of sense.

@fcrisciani
Copy link

@stevvooe @aaronlehmann in libnetwork we have already:

  • ServiceAliases: common to all tasks of a service and associated to the VIP of the service.
  • TaskAliases: unique per task and associated to the container IP.
    As for now I guess these 2 semantics should be enough to cover new aliases

@benturner
Copy link
Author

Per-task aliases exist in libnetwork, but I don't see anything like that in swarmkit...

@abhi
Copy link
Contributor

abhi commented Jun 20, 2017

@benturner aliases for service in the form of --network-opt was introduced by #2176 . The aliases were service aliases per network the service was connected to. From what I understood from the requirement the templating support was to be added for --network-opt which takes in network name , alias (service alias) and driver options per network.

@benturner
Copy link
Author

Sure, but that basically leaves us with the problems I outlined in #2250 (comment) right? libnetwork doesn't handle a service with an inconsistent set of aliases across tasks.

@benturner
Copy link
Author

@stevvooe @aaronlehmann how should I proceed here?

@thaJeztah
Copy link
Member

For per-task aliases, I think the Docker CLI would need to have something like the container-alias option as I mentioned in moby/moby#31964 (similar to container-label).

Sure, but that basically leaves us with the problems I outlined in #2250 (comment) right? libnetwork doesn't handle a service with an inconsistent set of aliases across tasks.

Doesn't the container get both the default name/alias, and the custom alias? Or is the problem that libnetwork service discovery wouldn't return the custom aliases (so they're only usable if you know the alias).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants