Posted on Wednesday, August 17th, 2016 in Use cases

Lessons learned from a Multi-Network & Multi-Region Incident

When it all comes to a grinding halt

On the morning of Friday, the 27th of March 2015, a fault in a 380-kV substation in Diemen triggered a large-scale power outage. Within seconds the power in large parts of Noord-Holland and Flevoland was cut off, leaving over one million households and businesses in the dark. Trains came to a standstill at the peak of rush hour. At Schiphol Airport planes were not allowed to take off or were advised to re-route to other airports. Hospitals and businesses switched to backup generators or closed their operations entirely. The incident caused widespread malfunctions in other vital networks, including a near-total breakdown of telecommunication services that disconnected the emergency services in affected areas. Governmental bodies and the companies involved scaled up their crisis teams and Incident Response Teams. All parties were simultaneously looking for the same intel: Which areas were affected? What was the impact of the outage? How much time would it take to get things back online? Although the original malfunction was relatively easy to fix, the aftermath of the power outage was felt until late in the evening and beyond.

Power outage: NS, Schiphol
Image by JPstock / Shutterstock.com

Lessons learned

Such a widespread outage, affecting several security regions and (telecom) infrastructures, is rare. For the Dutch Telecom Authority it presented a unique opportunity to assess the cooperation between the regions, network operators and emergency services involved. The lessons learned have been recorded in a report commissioned by the Dutch Parliament.

Download:
Stroomstoring Noord Holland, 27 maart 2015 (PDF)

Lessen uit de crisisbeheersing en telecommunicatie
Inspectie Veiligheid & Justitie / Agentschap Telecom

The report outlines several important lessons:

  • The inter-regional coordination and implementation of GRIP 5 (Coordinated Regional Incident Management) always requires a high level of customization;
  • Expectations about what information would be shared, and how, were not clear;
  • Safety regions had not sufficiently organized the continuity of their main communication facilities and were unfamiliar with possible alternatives;
  • Radio networks are not prepared for a long-term regional power outage;
  • Communication channels with citizens are not adequately in place when a power failure occurs;
  • Continuous access to the 112 emergency number is not guaranteed.

Based on these findings, it is reasonable to conclude that several factors hampered the resolution of the power outage. First, key information was not available to all parties involved. Second, parties were limited in their ability to communicate pertinent updates when required. Finally, although a swift response was required without risking safety, missing critical data meant key stakeholders took longer than anticipated to assess the current and forecast impact of the outage and to take action in support of public safety.

Gathering intel and assessing impact across networks

Gathering and sharing valuable information, especially for incidents that affect multiple regions, was a major concern. The report concluded that expectations about what information would be shared, and how, were not clear to all parties involved.

During a power outage such as the Diemen case, network operators are the center of gravity: they receive firsthand intel on network availability, and their efforts are crucial in re-establishing the power grid and the other networks affected. The network operators function as the linchpin between response and repair teams, the emergency services, other Critical Infrastructure operators and the governmental bodies involved. Sharing reliable and up-to-date data, insights and impact assessments is key in fighting the crisis at hand.

Without power it’s difficult to communicate

Some of the issues identified are inherent to the nature of the incident: telecommunication nowadays depends on the availability of power and data lines. VoIP, mobile and internet services, like all modern means of communication, stop working without powered modems and servers. Telecom operators maintain backup power and take other measures to extend the endurance of their core communication systems. Nevertheless, during the power outage a growing number of power backups failed. As a result, analog lines became the only reliable means of communication.

Grasping the essence of CI interactions

We have to understand how critical infrastructures interact, where and why their operations and security hinge upon each other, and what interdependencies and risks have to be taken into account. The SIM-CI suite, a cloud-based platform for Critical Infrastructure simulation and management, might just be the answer.
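
To make the idea of interdependencies concrete, here is a minimal sketch in Python of how a single failure can cascade through dependent assets. It is only an illustration under simple assumptions and not a description of how the SIM-CI suite works: the asset names, the dependency map and the backup flags are all hypothetical, and an asset is assumed to go down as soon as any asset it depends on is down, unless it has its own backup.

```python
from collections import deque

# Hypothetical dependency map: each asset lists the assets it needs to operate.
DEPENDS_ON = {
    "substation": [],
    "rail_signalling": ["substation"],
    "mobile_base_station": ["substation"],
    "emergency_call_routing": ["mobile_base_station"],
    "hospital": ["substation"],
}

# Illustrative assumption: assets with their own backup power keep running
# when an asset they depend on fails.
HAS_BACKUP = {"hospital"}


def affected_assets(failed, depends_on, has_backup):
    """Return every asset that goes down when `failed` fails (breadth-first cascade)."""
    # Invert the map: for each asset, which assets rely on it?
    dependents = {name: [] for name in depends_on}
    for asset, requirements in depends_on.items():
        for requirement in requirements:
            dependents[requirement].append(asset)

    down = {failed}
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for asset in dependents[current]:
            # Skip assets already counted and assets protected by backup power.
            if asset not in down and asset not in has_backup:
                down.add(asset)
                queue.append(asset)
    return down


print(sorted(affected_assets("substation", DEPENDS_ON, HAS_BACKUP)))
# ['emergency_call_routing', 'mobile_base_station', 'rail_signalling', 'substation']
```

Even in this toy model the substation fault reaches emergency call routing indirectly, through the mobile base station. Indirect impact of this kind is hard to spot without an explicit view of the dependencies.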

360° view on Critical Infrastructure

By offering a 360° view on Critical Infrastructures that takes network and asset interdependencies into account, together with the ability to run various near-real-time simulation scenarios, the SIM-CI platform enables network operators to assess impact and mitigate damage. It offers a complete set of tools for Integrated Asset Management, Workflow Management, Incident Management and secure communications.

Header image: Anton Havelaar / Shutterstock.com