ABSTRACT
Kubernetes is a complex system but not a complicated one. Its layered architecture leaves room for failures caused by networking, security, configuration, and cloud provider limitations. Learning from others’ mistakes has become essential to succeeding with cloud applications. This applies to Kubernetes-based software architectures in general and to your software, too. If you don’t understand how other people have failed, you’re more likely to fail at some point yourself.
In this whitepaper, we’ll walk you through five real-life Kubernetes failures that cost software teams sleepless nights. The problems we’ll cover range from resource throttling to network issues. For each one, we’ll give the background, root cause, impact, and ways to avoid that type of failure.