At Moss, Java apps are at the core of our microservices. However, from time to time, they encounter issues. This is where the Moss Platform Team steps in. We are the team responsible for ensuring the smooth operation of applications on Kubernetes.
A few weeks ago, some of our microservice pods began restarting unexpectedly, at varying hours and for no apparent reason. Something was clearly wrong.
Our Service Level Objectives (SLOs) remained intact, thanks to the multiple layers of redundancy in place for our workloads. Even so, our monitoring systems showed that certain pods were restarting.