WEBVTT

00:00:01.040 --> 00:00:02.430
Writing the plan.

00:00:03.030 --> 00:00:03.710
Now,

00:00:03.880 --> 00:00:07.070
we usually say writing the plan and we use the singular

00:00:07.070 --> 00:00:09.430
often a business continuity plan,

00:00:09.430 --> 00:00:14.260
but a large organization quite often has 100 different plans.

00:00:14.310 --> 00:00:17.740
Our recovery in the case, for example, of a fire,

00:00:17.740 --> 00:00:20.510
is very different than it is in the case of malware,

00:00:20.520 --> 00:00:21.270
for example.

00:00:21.690 --> 00:00:24.730
But we write plans to deal with the various types of

00:00:24.730 --> 00:00:27.450
situations we could expect to face.

00:00:27.990 --> 00:00:33.420
A plan should be thorough, it should address all types of situations.

00:00:33.570 --> 00:00:37.640
We say yes, but there can always be things that happen that we didn't expect,

00:00:38.000 --> 00:00:39.790
but if I've written good plans,

00:00:39.790 --> 00:00:43.280
those could be adjusted to whatever type of incident this is.

00:00:44.070 --> 00:00:48.660
We get the team together because we want the business continuity plan

00:00:48.660 --> 00:00:52.680
and disaster recovery plans to address all areas,

00:00:53.340 --> 00:00:56.390
not just IT or not just the business,

00:00:56.550 --> 00:01:01.590
but we have to look at everything from finance to operations and logistics.

00:01:02.000 --> 00:01:05.340
A plan should be a series of steps and actions.

00:01:05.590 --> 00:01:07.850
We should try to minimize verbiage.

00:01:08.110 --> 00:01:11.150
We don't want a person to have to read pages of

00:01:11.150 --> 00:01:14.110
documentation in the middle of a crisis.

00:01:14.290 --> 00:01:16.810
Instead, we want them to read and say do this,

00:01:17.010 --> 00:01:19.630
then do this, check, do this, check off,

00:01:19.630 --> 00:01:24.910
and all these things mean we move towards the actual

00:01:25.000 --> 00:01:27.790
resumption of business processes.

00:01:28.620 --> 00:01:33.320
We should write the plan for what we often call a worst-case scenario,

00:01:33.500 --> 00:01:38.050
the most resource-intensive situation, because then we can always use a

00:01:38.050 --> 00:01:41.050
part of the plan if it's not a worst-case scenario,

00:01:41.550 --> 00:01:44.450
and that means that if we plan for a worst-case scenario,

00:01:44.460 --> 00:01:50.220
any type of lesser incident or situation would still be addressed in that plan.

00:01:52.020 --> 00:01:56.030
One of the problems we have is that during a crisis,

00:01:56.040 --> 00:01:58.410
we have an elevated level of risk.

00:01:58.680 --> 00:02:00.320
We know that, for example,

00:02:00.320 --> 00:02:03.450
many of the normal controls we would have had in place,

00:02:03.760 --> 00:02:06.770
separation of duties, for example, are missing.

00:02:06.940 --> 00:02:10.539
We have people making decisions that go beyond what was

00:02:10.550 --> 00:02:13.010
their normal budgetary authority.

00:02:13.530 --> 00:02:18.900
So this is an elevated security risk as well, one we have to watch for.

00:02:19.820 --> 00:02:22.640
We want to have teams that are ready to go.

00:02:22.890 --> 00:02:26.240
We assign roles and responsibilities, as well as,

00:02:26.240 --> 00:02:29.230
of course, the leaders, but for every leader,

00:02:29.230 --> 00:02:30.600
there should be a deputy,

00:02:30.770 --> 00:02:34.390
a person who can fill in if that leader is not available.

00:02:35.120 --> 00:02:35.990
Ideally,

00:02:35.990 --> 00:02:39.460
we want to have people on the teams that understand more than

00:02:39.460 --> 00:02:44.160
just their area so that if another team is in some way

00:02:44.160 --> 00:02:46.740
impaired from being able to do their job,

00:02:46.750 --> 00:02:50.180
there is cross‑training and there is support that can be provided.

00:02:51.300 --> 00:02:54.880
An important thing in a crisis is to have clear

00:02:54.880 --> 00:02:57.500
leadership and lines of reporting.

00:02:57.810 --> 00:03:01.480
We define who's in charge, who makes the decisions,

00:03:01.490 --> 00:03:05.940
who talks to the media so that we have good and clearly

00:03:05.940 --> 00:03:09.670
understood reporting relationships and it's not such that

00:03:09.670 --> 00:03:12.790
everybody's just doing whatever they think is best.

00:03:13.740 --> 00:03:17.670
We need to ensure that the people on our teams have the appropriate

00:03:17.670 --> 00:03:21.520
training so they can execute their responsibilities,

00:03:21.650 --> 00:03:25.740
as well as the tools they would need in order to do their job.

00:03:27.930 --> 00:03:31.640
An important part in any crisis is communication,

00:03:32.080 --> 00:03:35.080
communication with our employees, managers,

00:03:35.090 --> 00:03:38.900
our customers, all of the stakeholders, or in other words,

00:03:38.910 --> 00:03:42.950
all of the people who could be affected by this crisis.

00:03:43.230 --> 00:03:47.320
We want management to know what's going on so they can provide

00:03:47.320 --> 00:03:50.790
direction and certainly answer questions from the media.

00:03:51.290 --> 00:03:55.960
We quite often have to report to government and regulatory agencies,

00:03:56.230 --> 00:03:59.820
let's say if we had a spill of diesel fuel or some other type of

00:03:59.820 --> 00:04:04.270
environmental issue, or even an injury to an employee or a customer.

00:04:05.060 --> 00:04:09.200
We want to communicate with our customers so they have confidence that we are

00:04:09.200 --> 00:04:12.670
there to support and help them and it's not such that we are going to

00:04:12.670 --> 00:04:15.480
disappear and their warranties are now worth nothing.

00:04:16.730 --> 00:04:21.130
This is especially important when we're dealing with a privacy breach.

00:04:21.450 --> 00:04:25.890
We want all of our customers to be confident that we have done

00:04:25.890 --> 00:04:28.460
everything we can to protect their information,

00:04:28.550 --> 00:04:33.390
but also that we're being upfront about what has happened and how we

00:04:33.390 --> 00:04:35.890
will prevent that from happening in the future.

00:04:36.890 --> 00:04:39.190
We need to communicate with our suppliers.

00:04:39.470 --> 00:04:44.380
We quite often rely on them for some of the raw materials we'll need

00:04:44.470 --> 00:04:47.850
and we don't want them to stop shipping those products because

00:04:47.850 --> 00:04:49.380
they're afraid they'll never get paid.

00:04:50.260 --> 00:04:52.290
And of course, our shareholders.

00:04:52.560 --> 00:04:54.430
By law, in many cases,

00:04:54.430 --> 00:04:57.650
we have to communicate with our shareholders all at the same

00:04:57.650 --> 00:05:01.720
time so they're all aware of what's going on if this is

00:05:01.720 --> 00:05:04.030
something that could affect share price.

00:05:05.480 --> 00:05:07.210
When we talk about reporting,

00:05:07.400 --> 00:05:11.890
we want to do regular reports on the status of the crisis to management,

00:05:12.380 --> 00:05:16.210
and this quite often can be done through an emergency operations center,

00:05:16.630 --> 00:05:21.030
the heartbeat or control point where we'll actually manage

00:05:21.040 --> 00:05:25.030
all the various teams and activities, and from this point,

00:05:25.040 --> 00:05:28.690
we can communicate to management what's going on.

00:05:29.530 --> 00:05:33.300
We should have checklists; our plans are action‑oriented,

00:05:33.520 --> 00:05:37.840
so we can show milestones and progress we've made towards

00:05:37.840 --> 00:05:40.930
addressing various types of systems or issues.

00:05:42.240 --> 00:05:44.960
And then, of course, we want to get back to normal.

00:05:45.230 --> 00:05:48.440
We will call this the process of restoration.

00:05:48.850 --> 00:05:53.440
To restore to normal means I will recover the business functions

00:05:53.510 --> 00:05:56.890
at whatever is now going to be my primary site.

00:05:57.420 --> 00:06:01.490
Now, normally, when we recovered after the incident,

00:06:01.550 --> 00:06:05.630
we recovered our most critical business processes first,

00:06:05.810 --> 00:06:12.010
but when I restore, I'm going to recover the less important areas first.

00:06:12.300 --> 00:06:16.355
That will allow me to test my migration plan,

00:06:16.355 --> 00:06:17.220
my networks,

00:06:17.225 --> 00:06:22.675
and my systems before I jeopardize my most critical business processes by

00:06:22.675 --> 00:06:27.600
trying to move them into whatever the new normal is going to be.

00:06:28.920 --> 00:06:32.570
No plan can be trusted unless it's been tested,

00:06:32.880 --> 00:06:38.270
and we do tests of the plan with the intention of finding any deficiencies.

00:06:38.630 --> 00:06:41.790
The point of the test is to find something that could go

00:06:41.790 --> 00:06:45.170
wrong so we can fix it before the incident.

00:06:45.790 --> 00:06:50.030
The testing also helps us to train our staff so they develop

00:06:50.030 --> 00:06:53.110
skills and know how to respond effectively.

00:06:53.950 --> 00:06:55.560
The tests should be thorough,

00:06:55.820 --> 00:07:01.390
they should be as accurate and realistic as possible so we know that this

00:07:01.510 --> 00:07:05.100
is how things would work in a real-world situation.

00:07:06.060 --> 00:07:08.720
When we test, it's always good to start small.

00:07:09.080 --> 00:07:13.360
Do some little tests of just individual processes before we

00:07:13.360 --> 00:07:16.030
move on to more complex types of tests.

00:07:16.370 --> 00:07:22.540
One of the problems is that very often from any incident and from any,

00:07:22.540 --> 00:07:27.430
we could say, test, there have been lessons that have been identified.

00:07:27.930 --> 00:07:31.020
It is important that those become lessons learned.

00:07:31.340 --> 00:07:36.560
We apply what we learned to improve the plan so the same problem doesn't just happen again.

00:07:38.080 --> 00:07:41.000
In summary, in this module,

00:07:41.010 --> 00:07:44.530
we set out the requirements for disaster recovery planning.

00:07:45.090 --> 00:07:48.780
This is for the most serious types of incidents that

00:07:48.780 --> 00:07:52.130
would require relocation of operations.
