
Replication controller #41

Merged
ConnorDoyle merged 29 commits into master from replication_controller on Oct 29, 2014

Conversation

@jdef commented Oct 11, 2014

Development branch to support k8s replication controller.

In progress, unstable. Use at your own risk.

@jdef (Author) commented Oct 11, 2014

The deadlock was because I branched from master before merging the changes from multinode_support. After syncing up, the deadlock no longer appears.

@jdef (Author) commented Oct 11, 2014

Todo:

  • update the README with instructions for building and running controller-manager (./bin/controller-manager -master=$(hostname):8080 -v=2)
  • clean up comments in the scheduler
  • testing

@jdef (Author) commented Oct 17, 2014

Replication control looks functional at this point. @ConnorDoyle PTAL

@ConnorDoyle commented

Taking a look now, thanks @jdef

```go
	return nil
}
// build a Mesos TaskID for this pod's task and ask the driver to kill it
taskId := &mesos.TaskID{Value: proto.String(task.ID)}
return k.Driver.KillTask(taskId)
```


Does the executor wait around for the pod to be destroyed by the local kubelet before sending back a TaskStatus with state TASK_KILLED?

@jdef (Author) replied:

No, that's a TODO item. Currently the executor sends a SET message to the kubelet with the collection of all pods minus the pod to delete; this causes the kubelet to delete the pod when transitioning to the new desired end state. There appears to be an /events endpoint in the kubelet that we may be able to watch in order to pick up on such an event.
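A minimal self-contained sketch of the SET semantics described above (the `Pod` type and `desiredSetWithout` helper are illustrative stand-ins, not the actual kubelet API): because the kubelet reconciles toward exactly the pod set it receives, omitting a pod from the set is what deletes it.

```go
package main

import "fmt"

// Pod is an illustrative stand-in for the kubelet's pod type.
type Pod struct{ Name string }

// desiredSetWithout builds the full desired pod set minus the victim.
// Sending this as a SET update tells the kubelet to converge on exactly
// this set, deleting the omitted pod as a side effect.
func desiredSetWithout(all []Pod, victim string) []Pod {
	kept := make([]Pod, 0, len(all))
	for _, p := range all {
		if p.Name != victim {
			kept = append(kept, p)
		}
	}
	return kept
}

func main() {
	pods := []Pod{{Name: "frontend"}, {Name: "redis-master"}}
	fmt.Println(desiredSetWithout(pods, "redis-master")) // [{frontend}]
}
```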

@jdef (Author) replied:

@ConnorDoyle What's the right way to send the message "kill this pod that I launched but that has not yet reported RUNNING back to the master"?

I think the code here is doing the right thing, and it's the code in the executor that needs to account for the various pod states when it receives the KILL signal, right?
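To illustrate the state accounting being discussed (purely hypothetical names and states; the real executor's types differ), a kill handler might branch on how far the task has progressed, making sure a terminal TASK_KILLED status is produced even for tasks that never reported RUNNING:

```go
package main

import "fmt"

// TaskState is an illustrative stand-in for the executor's view of a task.
type TaskState int

const (
	StateLaunched TaskState = iota // handed to the kubelet, not yet confirmed running
	StateRunning                   // reported RUNNING back to the master
)

// killTask sketches state-aware kill handling: either way the master must
// receive a terminal status so it can release the task's resources.
func killTask(state TaskState) string {
	switch state {
	case StateLaunched:
		// undo the pending launch, then report terminal status
		return "TASK_KILLED (launch aborted before reporting RUNNING)"
	case StateRunning:
		// remove the pod via a SET update, then report terminal status
		return "TASK_KILLED (pod removed from kubelet's desired set)"
	}
	return "TASK_LOST (unknown state)"
}

func main() {
	fmt.Println(killTask(StateLaunched))
	fmt.Println(killTask(StateRunning))
}
```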

@jdef (Author) commented Oct 24, 2014

Current status:

  • replication control seems to work OK
  • guestbook examples (services) won't really work until we address Networking TBD #5

I'm thinking the networking piece should be resolved in another PR (it would be nice to have Vagrant set it all up for us).

James DeFelice added 2 commits October 27, 2014 15:02
@jdef (Author) commented Oct 27, 2014

@ConnorDoyle PTAL. Planning to merge this soon.

@@ -6,7 +6,15 @@ When [Google Kubernetes](https://github.com/GoogleCloudPlatform/kubernetes) meet

[![GoDoc](https://godoc.org/github.com/mesosphere/kubernetes-mesos?status.png)](https://godoc.org/github.com/mesosphere/kubernetes-mesos)

Kubernetes and Mesos are a match made in heaven. Kubernetes enables the Pod (group of co-located containers) abstraction, along with Pod labels for service discovery, load-balancing, and replication control. Mesos provides the fine-grained resource allocations for pods across nodes in a cluster, and can make Kubernetes play nicely with other frameworks running on the same cluster resources. Within the Kubernetes framework for Mesos, the framework scheduler first registers with Mesos and begins watching etcd's pod registry, and then Mesos offers the scheduler sets of available resources from the cluster nodes (slaves/minions). The scheduler matches Mesos' resource offers to unassigned Kubernetes pods, and then sends a launchTasks message to the Mesos master, which claims the resources and forwards the request on to the appropriate slave. The slave then fetches the kubelet/executor and starts running it. Once the scheduler knows that resources have been claimed for the kubelet to launch its pod, it writes a Binding to etcd to assign the pod to a specific host. The appropriate kubelet notices the assignment, pulls down the pod, and runs it.
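An illustrative sketch of the offer-matching step described above (all types and names here are hypothetical simplifications, not the project's actual scheduler code): the scheduler checks whether an offer's resources cover an unassigned pod's needs before sending launchTasks and writing the Binding.

```go
package main

import "fmt"

// Offer is a hypothetical simplification of a Mesos resource offer.
type Offer struct {
	Host string
	CPUs float64
	Mem  float64 // MB
}

// Pod is a hypothetical simplification of an unassigned Kubernetes pod.
type Pod struct {
	Name string
	CPUs float64
	Mem  float64 // MB
}

// matchOffer reports whether the offer can satisfy the pod's resource needs;
// on a match, the real scheduler would send launchTasks to the Mesos master
// and write a Binding for the chosen host to etcd.
func matchOffer(o Offer, p Pod) bool {
	return o.CPUs >= p.CPUs && o.Mem >= p.Mem
}

func main() {
	offer := Offer{Host: "minion-1", CPUs: 2, Mem: 1024}
	pod := Pod{Name: "frontend", CPUs: 0.5, Mem: 256}
	if matchOffer(offer, pod) {
		fmt.Printf("bind pod %q to host %q\n", pod.Name, offer.Host)
	}
}
```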


👍

@ConnorDoyle commented

@jdef looks great, left some comments which are mostly minor. Thanks!

@jdef (Author) commented Oct 29, 2014

@adam-mesos @ConnorDoyle Thanks for the feedback, I've pushed some commits to address the concerns.

@ConnorDoyle commented
Let's get this in! Thanks for all your hard work, James.

ConnorDoyle added a commit that referenced this pull request Oct 29, 2014
@ConnorDoyle ConnorDoyle merged commit f06bb31 into master Oct 29, 2014
@ConnorDoyle ConnorDoyle deleted the replication_controller branch October 29, 2014 04:19