Building a Chaos Platform for Virtual Machines with OpenSource Tools

About the Talk and Speaker(s)

Safeer CM

Senior Staff SRE

Flipkart
Rohini Choudhary

Senior SRE

Flipkart

Jan 24, 2024 4:15 PM(GMT)

Flipkart is the largest e-commerce platform in India. The Flipkart infrastructure follows a hybrid cloud strategy in which our internal cloud platform powers a significant part of our workloads alongside public clouds. This cloud platform runs on our multi-DC infrastructure, which provides virtual machines and bare-metal servers. While a large part of our serverless workloads is on Kubernetes, large fleets of VMs and bare-metal servers power our stateful data platforms and clusters. As part of our drive to achieve better resilience through chaos experiments, we explored several chaos tools. We found that most of them, while working well with cloud-native workloads, were not suitable for running chaos experiments against the stateful workloads on our servers. This led us to invest in building a Chaos Platform that can run chaos experiments against our server fleets. Our strategy was to rely entirely on existing open-source tools, including a few existing chaos products, and to mix, match, and integrate them to build the features we needed for our chaos drills. The talk will open with an introduction to chaos practices and requirements, and how we evaluated various open-source products for our chaos needs. We will then move on to our VM chaos solutions and how we brought together different open-source tools to build our own Chaos Platform.

Chaos Carnival
Jan 24 - 25, 2024, Virtual