Whether used for professional or leisure purposes, for safety-critical applications or e-commerce, the Internet in particular has become an integral part of our everyday lives, affecting the way societies operate.  However, the Internet was not intended to serve all these roles, and, as such, is vulnerable to a wide range of challenges.  Malicious attacks, software and hardware faults, human mistakes (eg software and hardware misconfigurations) and large-scale natural disasters threaten its normal operation.

Resilience, the ability of a network to defend against and maintain an acceptable level of service in the presence of such challenges, is viewed today, more than ever before, as a major requirement and design objective.  These concerns are reflected, among other ways, in the Cyber Storm III exercise carried out in the USA in September 2010 and the "cyber stress tests" conducted in Europe by the European Network and Information Security Agency (ENISA) in November 2010, both aimed at assessing the resilience of the Internet.

.....

The EU-funded ResumeNet project argues for resilience as a critical and integral property of networks.   It advances the state of the art by adopting a systematic approach to resilience, which takes into account the wide variety of challenges that may occur.  At the core of [their] approach is a coherent resilience framework, which includes implementation guidelines, processes and toolsets that can be used to underpin the design of resilience mechanisms at various levels in the network.  In this article [the authors] propose a framework, which forms the basis of a systematic approach to resilience.  Central to the framework is a control loop, which defines necessary conceptual components to ensure network resilience.  The other elements - a risk assessment process, metrics definitions, policy-based network management and information sensing mechanisms - emerge from the control loop as necessary elements to realise the systematic approach.

Framework for Resilience

[Figure: resilience control loop]
 


The resilience framework builds on work by Sterbenz et al [1], whereby a number of resilience principles are defined, including a resilience strategy, called D2R2 + DR: Defend, Detect, Remediate, Recover and Diagnose and Refine.  The strategy describes a real-time control loop to allow dynamic adaptation of networks in response to challenges, and a non-real-time control loop that aims to improve the design of the network, including the real-time loop operation, reflecting on past operational experience.
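
As a rough illustration of how the two loops relate, the strategy can be sketched in a few lines of Python.  This is not code from the article or from ResumeNet: the class name, the load threshold, the action labels and the 10% refinement rule are all assumptions made only for the example.  The inner method stands in for the real-time loop (Detect, Remediate, Recover) and the outer method for the non-real-time loop (Diagnose, Refine).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ResilienceLoop:
    """Toy model of D2R2 + DR; every name and number here is illustrative."""

    threshold: float = 0.8    # Defend: assumed static limit on link load
    mitigating: bool = False  # True while a remediation is active
    history: List[str] = field(default_factory=list)  # input to the outer loop

    def step(self, load: float) -> str:
        """Real-time loop: Detect a challenge, then Remediate or Recover."""
        if load > self.threshold:      # Detect
            self.mitigating = True     # Remediate (eg rate-limit traffic)
            action = "remediate"
        elif self.mitigating:          # the challenge has ended
            self.mitigating = False    # Recover normal operation
            action = "recover"
        else:
            action = "normal"
        self.history.append(action)
        return action

    def diagnose_and_refine(self) -> float:
        """Non-real-time loop: reflect on past operation, adjust the design."""
        if "remediate" in self.history:  # Diagnose: challenges were observed
            # Refine: assumed rule, trigger detection 10% earlier next time
            self.threshold = round(self.threshold * 0.9, 3)
        self.history.clear()
        return self.threshold


# Hypothetical load samples: a quiet period, a surge, then a return to normal.
loop = ResilienceLoop()
actions = [loop.step(load) for load in (0.2, 0.9, 0.95, 0.3, 0.1)]
new_threshold = loop.diagnose_and_refine()
```

The point of the sketch is the separation of concerns: the real-time loop only reacts to the current measurement, while the slower loop uses the recorded history to change how the real-time loop behaves in future.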

 

Smith et al, "Network Resilience: A Systematic Approach", IEEE Communications Magazine, July 2011, pp 88-97


[1 - JPG Sterbenz et al, "Resilience and Survivability in Communication Networks: Strategies, Principles, and Survey of Disciplines", Elsevier Computer Networks, Special Issue on Resilient and Survivable Networks, vol 54, no 8, June 2010, pp 1243-42.]