Carnegie Mellon University

Disaster Recovery Services

Disaster Recovery lays out the resources and activities needed to re-establish information technology (IT) services at an alternate site following a disaster or significant business disruption.

The provision of Disaster Recovery Services begins with training and awareness, in which end users learn to use the Fusion Framework for Application Record Management, Disaster Recovery Planning, and Disaster Recovery Exercising. The Fusion Framework is a third-party cloud solution from Fusion Risk Management that resides on a Salesforce platform.

Disaster Recovery Services coordinates, guides, and assists with the creation and ongoing management of Application Records, Disaster Recovery Planning, and Disaster Recovery Exercises, which occur on an annual basis.

Application Records 


Application Records is a data collection activity that draws on Technology System Descriptions and one-on-one interviews with the technology owners and administrators of an Application/IT Service to understand:

  • Description of the Application/IT Service and its service availability requirements
  • Dependencies that the Application/IT Service has in order to operate (e.g., other Applications/IT Services, Components – Servers, Network Connectivity, Vendor)
  • Potential risk impact to the University if the Application/IT Service cannot be recovered and restored within its recovery objective(s)
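The three items collected above can be pictured as one record per Application/IT Service. The following is only a minimal sketch of such a record; the class names, field names, and sample values are hypothetical and do not reflect the actual Fusion Framework schema.

```python
from dataclasses import dataclass, field


@dataclass
class Dependency:
    """Something the Application/IT Service needs in order to operate."""
    name: str
    kind: str  # assumed categories, e.g. "application", "server", "network", "vendor"


@dataclass
class ApplicationRecord:
    """Hypothetical shape of one record; not the Fusion Framework schema."""
    name: str
    description: str
    availability_requirement: str
    dependencies: list[Dependency] = field(default_factory=list)
    risk_impact: str = ""

    def dependencies_of_kind(self, kind: str) -> list[str]:
        """Return the names of all dependencies in the given category."""
        return [d.name for d in self.dependencies if d.kind == kind]


# Placeholder values for illustration only.
record = ApplicationRecord(
    name="Example Service",
    description="Illustrative placeholder service",
    availability_requirement="business hours",
    dependencies=[
        Dependency("Example Database", "application"),
        Dependency("example-host-01", "server"),
    ],
    risk_impact="Illustrative only",
)
print(record.dependencies_of_kind("server"))
```

Grouping dependencies by category in this way makes it easy to see which servers, vendors, or other services must be available before the Application/IT Service itself can be recovered.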

  

Planning

Disaster Recovery Planning documents the actions and activities that technology teams will execute to:

  • Assess and Respond to the impact of a disruption
  • Recover and Restore the Application/IT Service within the required recovery objective
  • Resume and Validate the Application/IT Service for users, mitigating the business impact of a disruption on those who depend on it

Solutions


Three Disaster Recovery Solutions are available:

  • High Availability – dedicated machines at both Primary and Secondary Data Centers with load balancing to enable automatic failover
  • Active/Passive – dedicated machines at both Primary and Secondary Data Centers with manual intervention required to enable failover
  • Bridged Net Image Restore – dedicated machines at the Primary Data Center and virtual machines on reserve at the Secondary Data Center, with manual intervention required to enable failover
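The differences between the three solutions come down to what is provisioned at the Secondary Data Center and whether failover is automatic. The sketch below restates the list above as data; the type and field names are assumptions made for illustration and are not part of any Carnegie Mellon tooling.

```python
from dataclasses import dataclass
from enum import Enum


class Failover(Enum):
    AUTOMATIC = "automatic"
    MANUAL = "manual"


@dataclass(frozen=True)
class Solution:
    name: str
    secondary_site: str  # what is provisioned at the Secondary Data Center
    failover: Failover


# Values restate the three solutions described in the list above.
SOLUTIONS = [
    Solution("High Availability", "dedicated machines", Failover.AUTOMATIC),
    Solution("Active/Passive", "dedicated machines", Failover.MANUAL),
    Solution("Bridged Net Image Restore", "virtual machines on reserve", Failover.MANUAL),
]


def needs_manual_intervention(solution: Solution) -> bool:
    """True when staff must act to enable failover."""
    return solution.failover is Failover.MANUAL


manual = [s.name for s in SOLUTIONS if needs_manual_intervention(s)]
print(manual)
```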

  

Exercises

Disaster Recovery Exercising tests a Disaster Recovery Plan by having technology teams and their business partners execute the actions and activities described in the plan. Exercise information and results (e.g., Recovery Time Capability, or RTC) are maintained within the Fusion Framework.

The Recovery Time Capability (RTC) of an Application/IT Service is based on how long it takes to execute the Disaster Recovery Plan. The RTC is recorded in the Application Record of the Application/IT Service so that dependent users can see whether the Application/IT Service meets their recovery requirements.
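Checking whether an Application/IT Service meets a recovery requirement amounts to comparing its exercised RTC with the time the dependent user can tolerate. A minimal sketch of that comparison follows; the function name and the sample durations are hypothetical and are not taken from any real exercise or from the Fusion Framework.

```python
from datetime import timedelta


def meets_recovery_requirement(rtc: timedelta, required: timedelta) -> bool:
    """True when the exercised recovery time is within the required time."""
    return rtc <= required


# Illustrative numbers only.
rtc = timedelta(hours=6)        # how long the exercise took to recover the service
required = timedelta(hours=8)   # recovery time a dependent user requires
print(meets_recovery_requirement(rtc, required))
```

An RTC that exceeds the requirement signals that the Disaster Recovery Plan or the chosen solution may need to be revisited before the next exercise.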