Elinor Sapir - Tackling Customer Issues in Cloud Native Environments - WTF is SRE 2023
Tackling customer issues is often one of the most important responsibilities of developer teams. It can be hard, slow work, and it is often viewed as less interesting because it seldom draws on the creativity and innovation associated with developing new features. There are further challenges in gaining insight into what took place in an environment that you are not familiar with and that is hard to reproduce locally, while working with users who won’t necessarily be there to provide every detail that is needed. Most critically, the business often expects you to have fixed customer issues yesterday. All of this makes a strong case for putting a robust framework in place to deal with future issues.
This talk will explore methods and strategies that engineering managers can employ to empower their teams to manage customer issues in cloud environments more effectively. These include process and behavioral methods, such as prioritization, managing client communications and expectations, and building the right team mindset around this theme. Equally important are the technical elements that provide the right foundation and infrastructure, including leveraging automated test cases, ensuring that you have a robust CI/CD infrastructure, and capturing and using logs to get the insight you need. We will then explore proactive measures that can be taken to minimize these cases in the first place.