Kubernetes And Kernel Panics

<p>How Netflix&rsquo;s Container Platform Connects Linux Kernel Panics to Kubernetes Pods</p> <p><em>By Kyle Anderson</em></p> <p>As part of a recent effort to reduce customer (engineers, not end users) pain on our container platform&nbsp;<a href="https://netflixtechblog.com/tagged/titus" rel="noopener ugc nofollow" target="_blank">Titus</a>, I started investigating &ldquo;orphaned&rdquo; pods. These are pods that never got to finish and had to be garbage collected without a satisfactory final status. Our Service job (think&nbsp;<a href="https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/" rel="noopener ugc nofollow" target="_blank">ReplicaSet</a>) owners don&rsquo;t care too much, but our Batch users care a lot. Without a real return code, how can they know whether it is safe to retry?</p> <p>These orphaned pods represent real pain for our users, even though they are a small percentage of the total pods in the system. Where are they going, exactly? Why did they go away?</p> <p><a href="https://netflixtechblog.com/kubernetes-and-kernel-panics-ed620b9c6225"><strong>Click Here</strong></a></p>
Tags: Kernel Panics