A resilience maturity journey
Executive Summary
- Stepping into a role in an already high-performing team on a maturity journey including a framework shift
- Dealing with incidents with the right mindset - making everyone feel safe to speak up and making sure people know their roles in an incident management team (IMT)
- Tech language vs Management language - the definition of an IT incident vs a strategic incident
- More incidents being spotted isn’t an indication of bad tech; it’s a sign of a maturing workforce that can better identify issues
- Learning from the incident to ensure the right things are prioritised, the right people are in the IMT, and the timing is right
- Post-incident review that focuses on management, not technical fixes
- Ongoing follow-up of long-term actions
- Changes to the Playbook to account for lessons learned, e.g. definitions and escalations for specific types of incidents
Introduction
The Energy Collective is an online electricity retailer in New Zealand, and its technology stack is mostly original IP. The electricity industry is complex. Add a retailer with an ethos of innovation and transparency, and the risk appetite is less conservative than you might expect. What the company does best is drive innovation and empower staff to fail fast - everyone is engaged in risk-based decisions and not scared to take a leap.
However, no matter how engaged people are, when a small company grows quickly, the risks grow with it. When I stepped into the role, one of my first tasks was to review the Incident Management Playbook, and while this was in progress we had an incident to manage!
- Incident overview
- IMT stood up
- Playbook worked through
- Incident closed with short and long-term actions in place
- Review of incident management / response, rather than technical solutions
- Actions to enhance response
Challenges
- Similar identification through the Vulnerability Disclosure Programme
- Tech language vs management language
Actions taken
- Definitions to ensure everyone understands scope
- Escalation processes for each type of issue/incident, based on identification and urgency
- Ongoing education around identification and escalation
- Scenarios including operational incidents
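As a purely illustrative sketch, and not the Playbook itself, shared definitions and escalation paths like those above could be written down as a small lookup that technical and management staff read the same way. Every name, category, role, and threshold below is a hypothetical assumption made for the example; the real definitions would come from the Playbook.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    """How the issue was identified (hypothetical categories)."""
    INTERNAL_MONITORING = "internal_monitoring"
    STAFF_REPORT = "staff_report"
    VULNERABILITY_DISCLOSURE = "vulnerability_disclosure"


class Urgency(Enum):
    """Assumed three-level urgency scale."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Escalation:
    label: str          # shared definition used by tech and management
    notify: tuple       # roles to involve (hypothetical role names)
    stand_up_imt: bool  # whether an incident management team is stood up


def escalate(source: Source, urgency: Urgency) -> Escalation:
    """Map identification source and urgency to an escalation path.

    The rules are assumptions for illustration only.
    """
    if urgency is Urgency.HIGH:
        return Escalation(
            "strategic incident",
            ("incident lead", "executive sponsor", "communications"),
            True,
        )
    if urgency is Urgency.MEDIUM or source is Source.VULNERABILITY_DISCLOSURE:
        return Escalation("IT incident", ("technology lead", "risk lead"), False)
    return Escalation("operational issue", ("team lead",), False)


if __name__ == "__main__":
    print(escalate(Source.VULNERABILITY_DISCLOSURE, Urgency.LOW))
```

Writing the rules in one place means "who needs to be involved, and when" has a single agreed answer, which can then be revised as post-incident reviews surface new lessons.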
Results
- There is still work underway in terms of education and communication to the wider business to engage everyone in the resilience journey and give them a really good ‘why’ to be engaged with it
- We have a better understanding of our vulnerabilities and how to effectively manage them
Lessons learned
- Better understanding of our weak spots and how to monitor them as we prioritise long-term work to plug the gaps, while maintaining the innovative business ethos
- Better understanding of the response types - who needs to be involved at what time for an effective response
- Better understanding of the opportunity costs involved in incident response and how more mature resilience counters this
Conclusion
The balance of risk and innovation in a fast-paced, highly regulated business means the standard risk management tools need to be less “stodgy”. Working with high-performing and highly engaged teams opens up a more conversational and practical approach, making sure language - across technology, risk, and energy lexicons - is clear and the response is quick and effective without hindering BAU processes.