In March 2021, the Bank of England’s Prudential Regulation Authority issued Supervisory Statement 2/21 (SS 2/21). The document was intended as a shake-up of regulated financial services industries, making operational resiliency a priority. With cyber crime on the rise, the regulator has concluded the threat is great enough to step in on behalf of the companies it regulates, writes James Hughes, Enterprise CTO, Rubrik.
In Europe, the Digital Operational Resilience Act (DORA) is due to come into force early next year, seeking to shore up financial services against ever-increasing cyber risk. Domestic and international companies alike are going to have to plan for and then deliver operational resilience at scale and pace, and where financial services lead, other sectors less mature in their operational resilience journey will surely follow.
But this new legislation isn’t simply a tightening up of security regulations, nor simply a case of formalising requirements. DORA represents a distinct shift in the way we approach resiliency: a move which puts security at the fore, requiring companies to steer away from detect and protect (the traditional castle-and-moat security model) towards something that places a much greater emphasis on resilience and recovery.
Looking at the seven principles of operational resiliency, it becomes clear why this way of thinking is the key to navigating the ever-changing world of risk. These principles arise from the financial services industry and have been developed to protect against incidents of all kinds: natural disasters, pandemics, technology failures and cyber threats. So, to start, here are the seven principles, of which we’ll offer a brief overview:
- Governance
- Operational risk management
- Business continuity planning
- Mapping of critical operations
- Third-party dependency management
- Incident management
- ICT resilience
Governance
Change starts at the top, and the same is true for good governance. Organisational governance structures should carry out effective operational resilience simulations and have an in-built ability to recover from and, most importantly, learn from disruptive events. The focus should be on reducing the impact such events have on operations. It’s important that boards of directors and senior leadership participate heavily in these ‘fire drills’, building personal as well as organisational readiness. As a bonus, preparing the governance structure for these kinds of challenges will improve an organisation's ability to work through a disruptive event should one arise.
Operational risk management
Less risk means greater resilience, which makes this principle central to any operational resiliency planning. The primary goal of operational risk management is the diligent auditing of risk and the identification and remediation of internal and external threats to the organisation, together with mitigating potential human error and forecasting possible technological failures. Control and change management capabilities that identify and assess vulnerabilities quickly are what make operational risk management effective.
Business continuity planning
Building on governance simulations, business continuity planning frameworks should include regular testing against drastic, but plausible, disruptions. Findings from these tests should inform future planning - lots of organisations hold regular simulations but fail to incorporate what they learn. Where possible, external as well as internal operations should be included in simulations, as third parties are likely to be involved in a real-life scenario - again, many organisations fail to plan for scenarios involving external operations.
Mapping interconnections and interdependencies
Mapping critical operations can reveal sometimes surprising interconnections and interdependencies. Tracing operations across the organisation shows everything that goes into keeping critical operations running: information, people, technologies, processes and facilities. By mapping these business-critical operations, the surface area of risk becomes evident, and many organisations are surprised at how finely balanced their organisational ecosystem is - one disruption can send a ripple across the whole business.
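To make the ripple effect concrete, here is a minimal sketch of how a dependency map can be walked to find everything affected when one component fails. The service names and the `DEPENDS_ON` map are purely hypothetical, invented for illustration; a real map would come from your own mapping exercise.

```python
from collections import deque

# Hypothetical dependency map: each key depends on the items in its list.
DEPENDS_ON = {
    "payments": ["core-ledger", "fraud-screening"],
    "online-banking": ["core-ledger", "identity-service"],
    "fraud-screening": ["third-party-scoring-api"],
    "core-ledger": ["primary-database"],
    "identity-service": ["primary-database"],
}


def impacted_by(failed, depends_on):
    """Return every item that directly or indirectly depends on `failed`."""
    # Invert the map so we can walk from a component to its dependants.
    dependants = {}
    for item, deps in depends_on.items():
        for dep in deps:
            dependants.setdefault(dep, set()).add(item)

    # Breadth-first walk outwards from the failed component.
    impacted = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for item in dependants.get(current, ()):
            if item not in impacted:
                impacted.add(item)
                queue.append(item)
    return impacted


if __name__ == "__main__":
    print(sorted(impacted_by("primary-database", DEPENDS_ON)))
    print(sorted(impacted_by("third-party-scoring-api", DEPENDS_ON)))
```

In this made-up example, a single database outage reaches every customer-facing service, while a failure at an external scoring provider still reaches payments. That is the kind of hidden dependency a mapping exercise is meant to surface.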
Third-party dependency management
During my first lesson behind the wheel, my driving instructor told me to “assume every other car on the road is being driven by a pedestrian”, by which he meant someone who can’t drive.
This is good advice too when managing relationships with third parties. Establish their credentials before building a relationship; ensure their operational resilience approach is at least as strong as your own before you open your business up to potential vulnerabilities.
Incident management
All the training and readiness counts for little if the team doesn’t perform on match day. Incidents are going to happen, so it’s important they’re managed properly - effective response and recovery plans are the key.
A good plan needs to include:
- An inventory of incident response and recovery resources
- Classification of incident severity
- Robust recovery procedure
- Clear communication plans
ICT resilience
ICT and cybersecurity measures need to be iron-clad.
The ability to recover from a large outage is a vital component of resiliency, but can you do it quickly and safely? Whatever caused the outage in the first place may still be lurking in your system, and restoring it along with everything else would create a vicious cycle of outages. So what’s the answer? Knowing the posture of the data you’re protecting and continually assessing whether it’s in a good state, especially when it comes to malware hiding deep inside a file system. If you have absolute confidence in your data recovery, your operational risk is considerably reduced.
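As a rough illustration of recovering from a known-good state, the sketch below picks the newest backup that passed its checks and, if an estimated compromise time is known, predates it. The `Snapshot` type, the `scan_clean` flag and the dates are all assumptions made up for this example, not the API of any real backup product.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class Snapshot:
    taken_at: datetime
    scan_clean: bool  # hypothetical flag: result of a malware/anomaly scan


def latest_clean_snapshot(
    snapshots: Iterable[Snapshot],
    compromised_at: Optional[datetime] = None,
) -> Optional[Snapshot]:
    """Return the newest snapshot that passed its scan.

    If an estimated compromise time is supplied, only snapshots taken
    strictly before it are considered. Returns None if nothing qualifies.
    """
    candidates = [
        s
        for s in snapshots
        if s.scan_clean
        and (compromised_at is None or s.taken_at < compromised_at)
    ]
    return max(candidates, key=lambda s: s.taken_at, default=None)


if __name__ == "__main__":
    # Illustrative data only.
    history = [
        Snapshot(datetime(2024, 1, 1), scan_clean=True),
        Snapshot(datetime(2024, 1, 2), scan_clean=True),
        Snapshot(datetime(2024, 1, 3), scan_clean=False),
        Snapshot(datetime(2024, 1, 4), scan_clean=False),
    ]
    print(latest_clean_snapshot(history))
```

The point of the design is that "most recent" and "safe to restore" are separate questions: the newest backup is skipped if it is not known to be clean.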
“Prepare in peacetime, so you know how to operate in wartime”.
Simulate large ‘restore at scale’ operations with pre-made orchestration activities. Allow your people to easily manage a large outage by making the technology side of bringing your systems back up simpler.
But what about the cloud? Yes, that’s covered by these regulations too: third-party risk management is increasingly in the spotlight. Many financial services organisations are the point of contact for the customer but, in reality, they rely on multiple services from many external organisations to run their business.
This is the real crux of third-party management: am I confident that my providers are taking operational resiliency as seriously as we are? When it comes to the cloud, the providers are responsible for the systems, but not the data. Is that protected in the same way as your more traditional on-prem systems? If not, why not? Shouldn’t you be able to view your entire data posture in one single place? I suggest that if you could, it would make the lives of your operators, risk managers and auditors considerably easier.
Operational resilience has been a cornerstone of doing business in the financial services industry for a while now, and regulation like SS 2/21 and DORA is applying the principles to ICT and cyber security in a thorough way. The result will surely be more robust businesses, and by applying the same operational resilience model, enterprises in all sectors can add safeguards of their own and mitigate the risks they face.