Developers are under increasing pressure to create real-time products that make the most of a wide range of digital resources.
This means that DevOps teams have to cope with information drawn from all sorts of different sources. But how can they ensure they are getting an accurate picture?
We spoke to Iain Chidgey, VP EMEA at Sumo Logic, to discuss how improving security and observability can help teams get that accurate picture.
BN: What are the challenges currently presented by modern application stacks?
IC: Developers have expanded their application stacks in recent years because customers expect real-time, always-on services that make the most of digital channels. This stack expansion has helped developers build some great applications, but it has also added complexity.
For site reliability engineers (SREs), this can mean struggling to understand the connections between the various applications within that stack. When things are as complex as they are today, it can be incredibly difficult to assess an issue, understand its total impact and prioritize what to do next to fix it. Alongside this, the longer it takes to diagnose and fix each application issue, the higher the risk.
DevOps and SRE teams have to monitor and manage data from across the entire technology ecosystem, not just individual applications. Using observability data can help in this, but you have to be able to really connect the dots and get an accurate picture of what is taking place.
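Connecting the dots across sources usually means correlating events on a shared identifier. As a minimal sketch (the field names `request_id`, `ts` and the sample events are illustrative, not any particular vendor's schema), telemetry from logs and traces can be merged into a single per-request timeline:

```python
from collections import defaultdict

def correlate(*sources):
    """Group events from multiple telemetry sources (logs, traces,
    metrics) by a shared request ID, then order each group by
    timestamp so a failure can be followed end to end."""
    timeline = defaultdict(list)
    for source in sources:
        for event in source:
            timeline[event["request_id"]].append(event)
    for events in timeline.values():
        events.sort(key=lambda e: e["ts"])
    return dict(timeline)

# Hypothetical events from two different sources for the same request.
app_logs = [{"request_id": "r1", "ts": 2, "msg": "500 from payments"}]
traces   = [{"request_id": "r1", "ts": 1, "span": "checkout -> payments"}]

merged = correlate(app_logs, traces)
print(merged["r1"][0]["span"])  # earliest event in the chain comes first
```

The same idea scales up in real observability platforms, where the join key is typically a trace or correlation ID propagated between services.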
BN: How can observability tools help to get a full picture of the technology stack? What result does better observability deliver?
IC: Taking an end-to-end service approach involves bringing all your data together: collecting your monitoring, diagnosis and troubleshooting data in one place and then correlating it to give you that full picture of what is taking place. This is a vital step on the path to true, real-time observability.
Using that underlying application telemetry data, you should be able to quickly detect anomalous events and carry out rapid root cause analysis. Whether this is due to a security problem or a failure in one of your application components, you want to be able to find out what is taking place and how to fix it quickly. Ultimately, we all want to know what is going on and why.
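Detecting anomalous events in telemetry often starts with simple statistics. The sketch below (a toy z-score check, not Sumo Logic's actual detection logic; the latency figures and the 2.0 threshold are illustrative) flags samples that sit far from the mean of a metric series:

```python
import statistics

def detect_anomalies(samples, threshold=2.0):
    """Flag values more than `threshold` standard deviations from
    the mean -- a simple stand-in for the statistical anomaly
    detection an observability platform runs on telemetry."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    if stdev == 0:
        return []  # perfectly flat series: nothing anomalous
    return [(i, v) for i, v in enumerate(samples)
            if abs(v - mean) / stdev > threshold]

# Steady response latencies (ms) with one spike: only the spike is flagged.
latencies = [120, 118, 125, 122, 119, 900, 121, 123]
print(detect_anomalies(latencies))  # → [(5, 900)]
```

Production systems use more robust techniques (seasonality-aware baselines, outlier-resistant statistics), but the goal is the same: surface the event quickly so root cause analysis can start.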
This real-time holistic view should provide you with continuous intelligence and help you improve your applications, making them more reliable and resilient to issues, as well as keeping them secure. This kind of intelligence helps you understand any incident, walk the full failure chain and take fast, accurate steps to restore services. It can also provide you with automated recommendations and suggestions for next steps.
BN: What role can benchmarking play to help organizations with their Kubernetes implementations?
IC: It seems like everyone is moving to Kubernetes for container management and orchestration. Containers are set to be part of many companies' multi-cloud strategies, and Kubernetes helps manage that approach. However, implementing Kubernetes itself is hard.
Benchmarking around Kubernetes can be valuable when it's based on real-world data. For example, you can now get insight into CPU and memory sizing recommendations for your deployment, based on statistical analysis of the world's leading Kubernetes users, so you can see how your approach compares. The ability to see what is going on at other organizations and how they are performing over time benefits everyone.
DevOps teams can use this data to compare their container image sizing and performance levels, then use that information to see whether they are over-provisioning and missing out on savings, or not giving container instances enough resources and hurting performance. This data is all anonymized to ensure privacy and security, so everyone can benefit over time from the wisdom of the crowd.
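The over- versus under-provisioning check described above can be sketched very simply: compare each container's observed peak usage against what it requested. The workload names, figures and the 40%/90% thresholds below are illustrative assumptions, not Sumo Logic's actual benchmarks:

```python
def provisioning_report(containers, low=0.4, high=0.9):
    """Compare each container's observed peak memory against its
    request. Usage far below the request suggests over-provisioning
    (wasted spend); usage near or above it risks throttling or OOM
    kills. Thresholds here are illustrative, not vendor benchmarks."""
    report = {}
    for name, c in containers.items():
        ratio = c["peak_mib"] / c["request_mib"]
        if ratio < low:
            report[name] = "over-provisioned"
        elif ratio > high:
            report[name] = "under-provisioned"
        else:
            report[name] = "ok"
    return report

# Hypothetical workloads: memory requested vs. peak actually used.
workloads = {
    "api":    {"request_mib": 512,  "peak_mib": 130},  # ~25% used
    "worker": {"request_mib": 256,  "peak_mib": 250},  # ~98% used
    "cache":  {"request_mib": 1024, "peak_mib": 700},  # ~68% used
}
print(provisioning_report(workloads))
```

A benchmarking service effectively runs this comparison against anonymized peer data rather than fixed thresholds, which is what makes the "wisdom of the crowd" angle useful.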