Validated kernel version / filesystem plugin for 1.3? #30706
Comments
Do you have any information on the panic?
@justinsb can you post the panic? (It's generally available via the AWS console.) Are you doing heavy I/O on the docker root volume? Is aufs in the call trace? I've seen a panic with 1.2 and a newer docker daemon running on a container: #27885. If that is what you see, my workaround was to use a different backing volume and not write to the container root (but I haven't tried k8s 1.3). I don't know which version is validated, but following links from the bug above, this issue seems to be the one tracking it, and it has not been closed yet: #25893. Don't know if you already found it; it took me a while because I was looking at closed issues :)
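The backing-volume workaround described above can be sketched as a systemd drop-in pointing the docker daemon at a different graph root. This is an assumption-laden sketch: `/mnt/docker` is a hypothetical mount point for a separate volume, and `-g` is the graph-root flag of docker 1.11-era daemons.

```shell
#!/bin/sh
# Sketch of the workaround: point the Docker graph root at a separate
# volume so container I/O stays off the root disk. /mnt/docker is an
# assumed mount point; -g is the docker 1.11-era graph flag.
DROPIN_DIR=${DROPIN_DIR:-/etc/systemd/system/docker.service.d}
if ! mkdir -p "$DROPIN_DIR" 2>/dev/null; then
    DROPIN_DIR=$(mktemp -d)   # not root: write a demo copy instead
fi
cat > "$DROPIN_DIR/graph.conf" <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/docker daemon -H fd:// -g /mnt/docker
EOF
# The empty ExecStart= line clears the unit's original ExecStart before
# setting the replacement, as systemd drop-ins require.
echo "wrote $DROPIN_DIR/graph.conf; then: systemctl daemon-reload && systemctl restart docker"
```

The restart commands are left to the operator rather than run by the script, since restarting docker kills running containers.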
Meta: if anyone knows how to persuade debian / journald to log panics, that would be appreciated :-)

Thanks for the links @rata - it looks a lot like the docker issue you linked to (but it was different from your aufs issue): moby/moby#21081. Panics consistently appear to be in process scheduling. Looking at the kernel changelog, it seems very likely that a newer kernel will fix this (the source file in question changes a lot), but AFAICT GCE doesn't use a newer kernel with containervm either.
@justinsb cool. Just curious, what kernel in debian jessie (like
@rata (I think that's what you mean, right?)
@justinsb We need to investigate this. We didn't change the minimal kernel version or docker version for either the 1.3 or 1.4 release. I did observe the same kernel panics before the 1.3 release, and we even introduced a node-problem-detector daemonset to make the issue visible. But in e2e tests with kubernetes 1.3 + docker 1.11.2 on containervm, I haven't seen such kernel panics. One possibility is a difference in node configuration? For example, on GCE we switched to kubenet and don't use docker's network component at all. How about aws?
Thanks @dchen1107 AWS is not using kubenet (though it should). I can probably upgrade the cluster most affected to use kubenet and see if it makes a difference. Also, for unknown reasons, we aren't logging the kernel panic into the journald logs. Would the NPD pick up on it if it is not in the logs, but only visible e.g. in uptime? We currently get the hint from the uptime, and then confirm by looking at the AWS console output.
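For reference, switching a node to kubenet comes down to a kubelet flag. This is a sketch only (flag names as of kubelets of that era are an assumption here); with kubenet, the cloud provider must also program a route for each node's podCIDR.

```shell
#!/bin/sh
# Sketch (assumption: flag names from kubelets of this era). kubenet is
# selected via the kubelet's network-plugin flag; pod routes then come
# from the cloud provider (on AWS, the VPC route table).
if command -v kubelet >/dev/null 2>&1; then
    kubelet --network-plugin=kubenet \
            --cloud-provider=aws
            # ...remaining kubelet flags unchanged
fi
```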
cc @fandingo |
@justinsb today NPD can only pick up the issue from the logs. But which logs, and what format (with regex), can be configured easily. In the kernel panic case, is there any way to get a console screenshot during reboot, something like a startup script?
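The console route can be scripted against the AWS CLI; here is a sketch. The instance ID is a placeholder, and the regex covers only the common panic markers, not everything NPD can be configured to match.

```shell
#!/bin/sh
# Sketch: pull the EC2 console output and scan it for panic signatures.
# INSTANCE_ID is a placeholder; the patterns are the usual kernel-panic
# markers, not an exhaustive list.
PANIC_RE='Kernel panic - not syncing|BUG: unable to handle'

if command -v aws >/dev/null 2>&1; then
    aws ec2 get-console-output \
        --instance-id "${INSTANCE_ID:-i-0123456789abcdef0}" \
        --output text > /tmp/console.log 2>/dev/null
else
    echo "aws CLI not installed; skipping fetch" >&2
    : > /tmp/console.log
fi

if grep -qE "$PANIC_RE" /tmp/console.log; then
    echo "panic found in console output"
fi
```

Note that `get-console-output` returns the most recent console buffer, which EC2 only refreshes periodically, so a cron job capturing it after reboot is the likely shape of this.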
Just switched to kubenet and had another panic within a few hours (in the scheduler again).
@dchen1107 I'm going to look at whether we can be sure to collect the panic output into the journald log. I would much rather do that "normally" instead of scraping the console output, but worst case it is a good suggestion!
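A sketch of the journald side, assuming stock systemd behavior: making the journal persistent, plus the panic sysctls that give the oops a chance to reach the console before reboot. Worth noting that a hard panic stops disk writes, so the trace may still never hit the journal — which matches what was observed here.

```shell
#!/bin/sh
# Sketch: persistent journald storage plus panic sysctls. Needs root;
# as non-root it only prints what it would do.
run() {
    if [ "$(id -u)" -eq 0 ]; then
        "$@" || echo "failed: $*" >&2
    else
        echo "would run: $*"
    fi
}

# journald switches to persistent storage when /var/log/journal exists
# (the Storage=auto default in journald.conf).
run mkdir -p /var/log/journal
run systemctl restart systemd-journald

# Panic on oops, then reboot after 10s, so the trace reaches the console.
run sysctl -w kernel.panic_on_oops=1
run sysctl -w kernel.panic=10
```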
We've tried a lot of things, and the only thing that seems to work is to run a 4.4 kernel (we are running a 4.4.19 kernel I built). This also entails running overlayfs. I'm going to work on building an image that offers this as an option, and maybe we can consider making it the default on 1.4 on AWS. |
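A minimal check for the target configuration might look like the sketch below; the 4.4 threshold is the baseline being discussed in this thread, and the version parsing is a simple major.minor comparison.

```shell
#!/bin/sh
# Sketch: does a kernel release string meet the 4.4 baseline, and is
# the overlay filesystem available on this node?
kernel_at_least_44() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 4 ]; }
}

if kernel_at_least_44 "$(uname -r)"; then
    echo "kernel $(uname -r): meets the 4.4 baseline"
else
    echo "kernel $(uname -r): older than 4.4"
fi

# overlay shows up in /proc/filesystems once the module is loaded
grep -qw overlay /proc/filesystems 2>/dev/null \
    && echo "overlay filesystem available" \
    || echo "overlay not loaded"
```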
@justinsb The Google GCI team is currently working with the node team to qualify the GCI image for GKE/GCE, replacing today's debian-based container-vm image. Since you are considering rebasing your image onto a newer kernel, why not sync with what GCI has, so that we can provide better support for our aws users? Currently the GCI image is on 4.4.14. cc/ @yinghan @kubernetes/goog-image

@justinsb could you please provide detailed information on what is not working on your current image? I understand there are kernel panics; unfortunately we saw such panics at a very rare rate in house, but in your case they seem to occur very frequently. We need to understand the discrepancy and try to minimize it, so that we can consolidate our effort to provide better service to users in general. Meanwhile, since Kubernetes is an open-source project, we can only mandate a minimal kernel version (3.10?) and many other configurations, instead of forcing a homogeneous environment, especially for users running Kubernetes on private clouds.
@dchen1107 Yes, for the AWS image I will try to sync with GCI as closely as I can, if we go to 4.4 as the default on AWS. Although if you're maintaining 4.4.14 + patches, I do wonder if official upstream (4.4.19) would be better than 4.4.14 without the patches. I don't know how different these will be in practice, though. My feeling is that testing effort will inevitably focus on 4.4 kernels & overlayfs, if GCI, CoreOS & Ubuntu 16.04 are all on 4.4 kernels, and remaining on an earlier kernel is just going to cause more and more problems.

The panics are as I reported previously in the thread. The stack traces do not seem to change significantly: always in the scheduler, always segfaults slightly offset from 0x0. I have not been able to track down any root cause other than the m4 instance types, which are slightly newer chips, typically with more cores. I originally thought it was a particular version of the ixgbevf driver, but the evidence does not appear to support that (unless all the versions I tried are bad and the version in the 4.4 kernel fixes it, which is not impossible - the driver is pretty active).

I totally understand that we can't mandate 4.4, but we should not shy away from saying "here is the configuration k8s tests, and we run a lot of tests". The node team does a lot of under-appreciated work to find a working configuration, and I want to remain reasonably close to that with the default AWS configuration.

I tried getting the segfault to appear reliably in the journal and have given up. Capturing the aws console output and dumping it to a file seems more fruitful. Haven't yet had time to do that, though...
@dchen1107 Assigning to you for triage right now, as it is a 1.4 P0. If you are the wrong person, I apologize. Also @thockin and @vishh should have this on their radar.
@matchstick No need to apologize, this is on my radar all the time. And I know @justinsb is working on this actively and diligently. The only reason I didn't assign it to either me or @justinsb is that it is not in the 1.4 milestone, and I cannot do much about aws's image besides offering suggestions. @justinsb do you think this should block the 1.4 release?

@justinsb yes, you and I are on the same page about aligning the AWS node configuration with the GCE one, so that we can get rid of one more discrepancy between AWS and GKE and provide better support to our users on AWS. Thanks for all your support & effort on getting the AWS configuration as close as possible to ours. The GCI team is working on open-sourcing GCI. Once that process is finalized, we can help build an AWS kubernetes image based on GCI. Unfortunately we are not there yet. cc/ @yinghan @aronchick @mansoorj

On the other hand, the node team maintains the node e2e test infrastructure and performs the tests on a list of images. You can access the test results at https://k8s-testgrid.appspot.com/google-node. We didn't attempt to exclude any images from that project. If you build your image based on 4.4.19, you can add it to our image project and we can include it in our test metrics too. The only caveat is that we shouldn't let breakage on those images block our submit-queue.
I am leaving this in the 1.4 release as a blocking issue. Please continue to provide daily status updates to this issue or move it out of the release and into v1.4-nonblocking where it will no longer be tracked as a release blocking issue in the burndown meetings. |
I can probably take this as assigned to me. I don't think it should block the whole k8s release, but it is a pretty serious issue for AWS and I want to get it resolved for 1.4.

I'm planning on taking a look at the kernel you are running in GCI and trying to figure out how many patches you are carrying (any links greatly appreciated). My concern is that if we ship an image with our own kernel, we then have to build the kernel/AMI going forward whenever there is e.g. a security issue. It is pretty attractive to me to just build from the official 4.4 Linux LTS kernel and rely on their work, rather than trying to maintain a set of patches and get involved in kernel cherry-picks. And I believe this problem may go away around the end of the year with debian, because the next version of debian should include the next Linux LTS kernel.

And big thanks @dchen1107 for helping me figure out which kernel we should be running here! I'll update as I continue looking around (probably with more questions, I'm afraid). If you'd rather assign this to me, please do so!
@vishh Debian Jessie has an older kernel which kernel panics on m4 instance types. I suspect we just haven't been seeing it because (1) it is rare, (2) it only happens on certain instance types, and (3) it is non-trivial to capture a kernel panic on a systemd system, so it might be happening and we might just not be noticing. I don't know if the NPD would detect kernel panics on non-journald systems?

Ubuntu Trusty is really old, and it seems very likely to fall victim to the same problems (although Ubuntu does a good job of backporting kernels). However, if we're going to run Trusty with a backported kernel, at that point we might as well use Ubuntu Xenial (which is 4.4, LTS, and systemd). I think that is another option, but I also want an option for Debian: we had to go with debian for 1.2 because there was no sufficiently supported version of Ubuntu at that time, and we want some continuity.
@justinsb I left this with the 1.4 milestone and P0, but marked it as a non-blocker. I assigned it to you since you are the one doing the real work. I still leave myself as another assignee to help you figure out the short-term workaround and work on the long-term strategy for qualifying the image running on AWS. Building from the official 4.4 Linux LTS kernel sounds good to me. Again, let's add your image to the node e2e test image project and run the test suite against it daily.

@vishh Please read the initial description @justinsb wrote. They are already using debian jessie (3.16.0-4-amd64 #1 Debian 3.16.7-ckt25-2+deb8u3) on AWS, and observing kernel crashes with high frequency. The initial suggestion is to make those kernel crashes visible to the end user / cluster admin first. We added the support to NodeProblemDetector, but @justinsb ran into some issues logging the kernel panic into the journald logs on AWS nodes.
@justinsb You can find the kernel sources for GCI from their release notes at: |
Just to throw my two cents in: we have to have a tested Debian version. A custom kernel is fine, but we need the .config for a stable configuration. Multiple parties are using Debian, and we need to make this happen. Let me know if you need any support with this. We have already scheduled an upgrade in the next couple of weeks. @justinsb we are in your debt for finding this one!
@dchen1107 so here are some thoughts
Opinion? |
@chrislovecnm Agreed with you above. We on the GCE / GKE side are moving in the direction listed above: 2) and 3). Also, I'm not suggesting running GCI unless it is completely open sourced.
@justinsb what is the status of your patched kernel? Do you have that in production?
Is this still a p0 (open since Aug!)? |
@dims this is listed as non-blocking but yah.... |
/remove-priority critical-urgent |
@dims I am not sure what priority to list, but we do not have a decent answer, and I am guessing that this is not documented well. |
Issues go stale after 90d of inactivity. Prevent issues from auto-closing with a /lifecycle frozen comment. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
We're seeing some kernel panics with the default AWS images, which use Debian Jessie & aufs, with k8s 1.3 & Docker 1.11.2.
What is the validated kernel version and filesystem with k8s 1.3?
cc @dchen1107
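To make the report concrete, the configuration in question can be captured on a node with a quick sketch like this (guards allow it to run where docker isn't installed):

```shell
#!/bin/sh
# Sketch: record kernel, distro, and docker storage-driver details,
# i.e. the configuration being asked about above.
uname -r
cat /etc/debian_version 2>/dev/null || true
if command -v docker >/dev/null 2>&1; then
    docker version --format '{{.Server.Version}}' 2>/dev/null || docker -v
    docker info 2>/dev/null | grep -i 'storage driver' || true
fi
echo done
```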