WEBVTT

00:07.370 --> 00:12.560
Preparation inside of our networks is a key determinant of effective incident response.

00:12.560 --> 00:18.500
We need to understand that the key concept of preparation is preparing for the worst case scenario.

00:18.530 --> 00:24.410
We often see this in different aspects of policies and procedures, but the technical and non-technical

00:24.410 --> 00:31.100
steps we take now could save us a lot of time, effort and energy, as well as costs for future problems.

00:31.100 --> 00:34.790
Preparation is truly key in our security industry.

00:34.820 --> 00:41.210
An Incident Response Plan, or IRP, serves as a central document for managing the entire incident response

00:41.210 --> 00:43.430
process within an organization.

00:43.430 --> 00:49.040
Its policy outlines the requirements for the plan, assigns roles and responsibilities, and typically

00:49.040 --> 00:50.060
remains concise.

00:50.060 --> 00:55.070
It establishes the internal governance necessary for building the incident response capabilities within

00:55.070 --> 00:56.210
our organization.

00:56.210 --> 01:01.470
An incident response plan should be broad enough that it covers the entire organization as a whole,

01:01.470 --> 01:04.980
identifying key roles without identifying people.

01:04.980 --> 01:12.390
We want our IRP to say the director of blank or this technical department does this, or you get the

01:12.390 --> 01:12.960
picture.

01:12.960 --> 01:14.820
We don't want it to say that

01:14.820 --> 01:18.480
John Smith, the director of engineering, is going to do this.

01:18.480 --> 01:25.170
That sets us up for failure. The plan should be broad in scope yet singular in notion, so we can quickly

01:25.170 --> 01:31.530
identify the person or persons responsible within the incident response plan and accomplish what

01:31.530 --> 01:32.910
we're trying to achieve.

01:32.940 --> 01:38.190
It's important that the process developed during incident response planning is communicated properly;

01:38.190 --> 01:43.110
it's vital to executing effectively. The incident response teams have a variety of team

01:43.110 --> 01:46.650
members and stakeholders involved to ensure that we're adequately

01:46.680 --> 01:52.050
providing for the specific needs of the industry or of the enterprise environment.

01:52.050 --> 01:58.140
This makes it more manageable, it has more effective communication processes, and it's best to have

01:58.170 --> 02:02.000
internal and external stakeholders involved throughout the entirety of the process.

02:02.030 --> 02:07.100
Effective internal communication is critical and often facilitated through a command center.

02:07.130 --> 02:12.260
This serves as a hub for decision makers and stakeholders and provides updates and feedback on response

02:12.260 --> 02:13.100
activities.

02:13.100 --> 02:18.080
If you're in the middle of executing a disaster recovery plan and your workers don't know how to communicate,

02:18.080 --> 02:24.590
or how to facilitate communication transparently, that can lead to some real problems during a hurricane.

02:24.590 --> 02:30.170
One time, I was out in the middle of the field fixing telecommunication equipment, only to find out

02:30.170 --> 02:34.220
that there was no command and control infrastructure in place that I was aware of.

02:34.220 --> 02:37.850
I contacted my manager, but obviously his cell phone didn't work.

02:37.880 --> 02:43.100
I had to call all the way to Georgia, only to find that Georgia had no idea what I was talking about.

02:43.100 --> 02:47.690
All they knew was that half our systems were offline and they had no idea what was going on.

02:47.720 --> 02:53.870
Having an effective communication and command and control outline for internal communications is paramount

02:53.900 --> 03:00.920
to making sure that we have an incident response plan focused on precise communication.

03:00.920 --> 03:07.160
We also have to have alternate communication methods because of problems like the one we faced.

03:07.280 --> 03:12.770
During that telecommunications issue, we were all assigned satellite telephones with known command

03:12.770 --> 03:18.650
and control lines that we could use to communicate up the line if one of the channels of communication

03:18.680 --> 03:20.030
was no longer available.

03:20.030 --> 03:25.640
We need to have external communication both during the incident response and after the incident response.

03:25.640 --> 03:31.460
And this needs to be managed by trained professionals to avoid misleading information and ensure compliance

03:31.460 --> 03:33.080
with regulatory bodies.

03:33.110 --> 03:39.020
Again, while I was in the telecommunications industry, we had a problem where telecommunications

03:39.020 --> 03:45.440
and 911 services weren't operating properly. That needed to be properly communicated externally,

03:45.470 --> 03:49.730
up the line, so that the FCC was aware of our issues.

03:49.730 --> 03:56.360
We had a fiber line cut one time between Phoenix and Flagstaff, which served all the communication

03:56.360 --> 03:58.070
needs for northern Arizona.

03:58.100 --> 04:03.690
The FBI needed to be involved as well as some other agencies to investigate the problem.

04:03.690 --> 04:09.150
However, as members of the telecommunications industry, we were responsible for actually fixing the

04:09.150 --> 04:09.690
issue.

04:09.720 --> 04:15.000
Those external communications need to go through the right offices at the right time to report properly

04:15.000 --> 04:17.190
for compliance and regulatory needs.

04:17.310 --> 04:21.120
Training is often overlooked, yet it is the groundwork for effective response,

04:21.120 --> 04:22.980
regardless of where it's coming from.

04:22.980 --> 04:29.130
You need to understand that it's not enough just to have an incident response plan on a bookshelf somewhere

04:29.130 --> 04:30.180
collecting dust.

04:30.180 --> 04:35.010
I can't tell you how many organizations I've been with where I go in through the onboarding process,

04:35.010 --> 04:39.660
and I'm looking around and they're like, oh, read all this documentation about our different policies,

04:39.660 --> 04:44.790
but there's no incident response plan on file. When you finally ask for one, it's pulled off some

04:44.790 --> 04:48.000
shelf in the middle of nowhere, and no one even knows what you're talking about.

04:48.000 --> 04:50.160
It hasn't been updated in over a decade.

04:50.160 --> 04:54.510
This is quite often a problem, especially with large enterprise environments that don't have a lot

04:54.540 --> 04:56.670
of incidents occurring throughout the years.

04:56.670 --> 05:02.500
And in many cases, non-technical staff make up the majority of an organization, and invariably these

05:02.500 --> 05:06.640
users will be the ones who are exposed to the signs of potential security incidents.

05:06.640 --> 05:12.070
We need them to be aware of security incidents and how to evaluate and report those incidents

05:12.070 --> 05:12.550
up the chain.

05:12.550 --> 05:18.790
It's not uncommon for a user to click on a phishing link, and as soon as they click on it to realize

05:18.790 --> 05:21.340
they made a mistake, it happens all the time.

05:21.340 --> 05:26.050
We need to be aware of those issues and train them properly so that they can report up the chain and

05:26.050 --> 05:29.980
effectively communicate that they actually screwed up, and we need to fix it.

05:30.010 --> 05:35.350
It's far better for an employee to report it right away than for us to get it after the fact 3 or 4

05:35.350 --> 05:40.030
days later because our systems weren't picking it up properly. An incident response plan must be

05:40.030 --> 05:44.020
tested, and it must be tested at least annually throughout your entire organization.

05:44.020 --> 05:48.910
This is critical for developing proficiency and dusting off the book to make sure that everything is

05:48.910 --> 05:53.740
up to date, that the organization hasn't changed since the incident response plan was put into motion.

05:53.740 --> 05:58.750
We need to provide exercises to evaluate the plan, the procedures and the personnel readiness, and

05:58.750 --> 06:03.110
these exercises should include document reviews to make sure they're set up properly.

06:03.140 --> 06:08.300
We should be able to do tabletop exercises where the team gets together and they simulate responses

06:08.300 --> 06:10.070
to hypothetical situations.

06:10.070 --> 06:15.770
We can do walkthroughs where participants familiarize themselves with response steps and roles without

06:15.770 --> 06:17.750
involving the actual equipment or data.

06:17.780 --> 06:23.330
And finally, we can do full scale exercises where we literally shut down machines to see what happens.

06:23.330 --> 06:29.180
Again, when I was in telecommunications, we would often shut off power to a

06:29.180 --> 06:31.340
site to see if the generator would kick in.

06:31.340 --> 06:34.790
And we were required to do this for some sites at least every 30 days.

06:34.820 --> 06:40.370
If the generator failed to kick on, our battery backup systems should take over; if they didn't, the site

06:40.400 --> 06:41.450
would just go dead.

06:41.480 --> 06:42.890
But you get my point.

06:42.890 --> 06:47.930
We sometimes need to do a full scale exercise to ensure that everything is working properly, in the

06:47.930 --> 06:50.120
way that we designed and documented it.

06:50.150 --> 06:56.660
A Disaster Recovery Plan, or DRP, is a structured approach aimed at swiftly restoring IT systems,

06:56.660 --> 06:58.670
data, and infrastructure

06:58.670 --> 07:03.330
after a disruptive event, to minimize downtime and maintain business continuity.

07:03.360 --> 07:07.590
We're going to talk about a business continuity plan in a second, but you need to understand that a

07:07.590 --> 07:11.130
disaster recovery plan is not the same as a business continuity plan.

07:11.130 --> 07:16.080
Well, yes, a disaster recovery plan can be utilized in a business continuity plan.

07:16.110 --> 07:20.100
But a disaster recovery plan is a document in its own right.

07:20.100 --> 07:25.740
This identifies the different recovery strategies that we can utilize, how we can utilize different

07:25.770 --> 07:30.120
resources, and acquisition of parts and even labor if needed.

07:30.120 --> 07:35.040
Once we've drafted a disaster recovery plan, it goes through a risk assessment identifying the different

07:35.040 --> 07:38.340
risks that we expect to see within our own environment.

07:38.340 --> 07:41.970
When I was in Phoenix, Arizona, we weren't worried about hurricanes or tornadoes.

07:41.970 --> 07:45.510
It wasn't something that our disaster recovery plan really planned for.

07:45.540 --> 07:51.900
Yes, there was a side note in there that basically said we would follow the overarching plan that

07:51.900 --> 07:56.490
the organization had set forth, but I can't tell you the last time we actually looked at that plan

07:56.520 --> 08:00.480
because we had never had a tornado or a hurricane in Phoenix, Arizona.

08:00.510 --> 08:04.380
However, one thing that we did have was a heat disaster recovery plan.

08:04.410 --> 08:09.660
We identified very strategically the exact details of what we were supposed to do if the temperature

08:09.660 --> 08:14.940
inside of our internal server room reached a certain point and the systems did not work properly,

08:14.940 --> 08:20.280
and we went through the entire process on a day to day basis, especially during the

08:20.310 --> 08:23.550
summer, to make sure we were aware of those plans.

08:23.580 --> 08:28.860
I remember every May, before the summer started to get really hot, we would make sure that our

08:28.890 --> 08:35.400
AC repair guy was on call and understood that our SLA was one hour for specific sites.

08:35.970 --> 08:38.100
We need a business impact analysis.

08:38.130 --> 08:42.030
What is the impact of this critical function going down?

08:42.030 --> 08:45.750
If I have an internal server room going down, the impact could be very high.

08:45.780 --> 08:52.080
We need to understand the business impact and identify which priorities take precedence in a real

08:52.080 --> 08:52.860
disaster.

08:52.860 --> 08:57.450
If you have a hurricane hitting Florida, there are going to be certain points that are more important

08:57.450 --> 09:00.530
and have a higher priority because of business impact than others.

09:00.530 --> 09:02.450
It's just a matter of doing business.

09:02.450 --> 09:04.640
We need to have recovery time and recovery point objectives.

09:04.670 --> 09:06.080
What if this happens?

09:06.080 --> 09:10.700
How soon should we recover from this issue, and how much data can we afford to lose?

09:10.730 --> 09:14.720
If I have a hurricane in Florida, we expect the server room to go down.

09:14.720 --> 09:19.070
How long should it stay down before we go over and fix the issue?

09:19.100 --> 09:21.200
That's a recovery time objective.

09:21.200 --> 09:24.410
We need to have data backups and recovery in place.

09:24.410 --> 09:28.640
When I say data backups, what I'm talking about is how often we back up our systems.

09:28.640 --> 09:32.750
Sometimes those backups are nightly, sometimes they're weekly, sometimes they're monthly.

09:32.780 --> 09:37.460
It really depends on what data we're storing and how important that data is to the system overall.

09:37.460 --> 09:39.800
And we need to know how to recover that data.

09:39.800 --> 09:44.870
We've gone over this in Security+ many, many times, so I'm really not going to go into much depth

09:44.870 --> 09:45.200
with it.

09:45.200 --> 09:49.940
Just understand that data backup and recovery is a real thing when it comes to disaster recovery.

09:49.940 --> 09:56.330
We need to have response procedures and resource allocation in place if something occurs

09:56.330 --> 09:57.410
within our systems.

09:57.410 --> 10:00.560
If I have a hurricane that's expected to hit the Florida coastline,

10:00.560 --> 10:05.480
I need to have resources on standby to fly into Florida if flooding takes place.

10:05.510 --> 10:07.610
How are those resources coming into play?

10:07.640 --> 10:10.850
How am I going to get those resources if the roads are closed down?

10:10.880 --> 10:13.220
What happens if air flight isn't available?

10:13.250 --> 10:17.330
The overall resource allocation needs to include not just assets, but people.

10:17.330 --> 10:19.640
They need to be identified ahead of time.

10:19.640 --> 10:24.320
If Florida is going to get hit by a major hurricane, it behooves us as an organization to say,

10:24.320 --> 10:30.590
hey, let's pull from Alabama and Georgia, maybe even New York or Phoenix to facilitate having more

10:30.590 --> 10:36.140
resources on the ground to bring those systems back online sooner rather than later if something were

10:36.140 --> 10:36.860
to occur.

10:36.890 --> 10:38.480
And finally, training.

10:38.660 --> 10:41.930
We need to understand that people need to be trained.

10:41.960 --> 10:46.610
Nothing's worse than having a bunch of professionals that have no idea what they're doing, even though

10:46.610 --> 10:50.900
in a normal day to day situation, they would have no problem operating at peak performance.

10:50.930 --> 10:55.430
Training is vital to make sure that people know what they're doing, who they're communicating with,

10:55.430 --> 10:57.500
and what their expected actions are.

10:57.500 --> 10:59.000
In a disaster recovery,

10:59.030 --> 11:02.820
are we expecting people in the Florida area to come into the office the next day?

11:02.850 --> 11:08.880
I was with an organization where, if an environmental disaster took place, we were

11:08.880 --> 11:14.850
required within two hours to call into an 800 line to let the company know that we were available and

11:14.850 --> 11:15.930
that we were safe.

11:15.960 --> 11:20.280
Once they identified that, they'd maybe hit us up and say, are you able to come to work?

11:20.310 --> 11:23.400
That was part of our training and our response plan.

11:23.430 --> 11:29.010
I was further responsible for many disaster recovery aspects, in which case, if something occurred,

11:29.010 --> 11:34.350
I was expected to go fix the issue and be able to operate by myself, without direct communication with

11:34.350 --> 11:38.190
the department, and fix any issues that I might identify.

11:38.220 --> 11:42.570
This is part of our disaster recovery plan, and you need to be trained properly to be able to take

11:42.570 --> 11:43.410
those actions.

11:43.410 --> 11:44.670
What are you allowed to do?

11:44.700 --> 11:47.670
What aren't you allowed to do, and what are your expectations?

11:47.700 --> 11:53.280
A business continuity plan refers to an organization's ability to continue its essential job functions

11:53.280 --> 11:58.410
and operations during and after any disruptive incident or crisis.

11:58.440 --> 12:00.160
This isn't just about disasters.

12:00.160 --> 12:02.260
We're not talking about hurricanes or tornadoes.

12:02.290 --> 12:04.840
This could mean a terrorist attack.

12:04.870 --> 12:08.950
It could mean the servers drop offline because of a patching problem.

12:08.980 --> 12:15.460
It can mean any number of things. A business continuity plan is very broad in its scale, but details

12:15.460 --> 12:17.860
responses to very specific incidents that may occur.

12:17.860 --> 12:24.580
Within it, we could see technology failures, cyber attacks, or even disasters, because

12:24.610 --> 12:27.460
disaster recovery is part of business continuity.

12:27.460 --> 12:30.550
While a disaster recovery plan is specific to disasters,

12:30.580 --> 12:34.270
a business continuity plan is designed around a more overarching structure.

12:34.270 --> 12:39.520
We can use a lot of the same resources that have been allocated for a DRP within a business continuity plan,

12:39.520 --> 12:44.620
with the expectation that both aren't going to occur simultaneously, and this happens quite a bit.

12:44.650 --> 12:51.010
However, the main goal of a BCP is to minimize downtime and protect the organization's reputation and

12:51.010 --> 12:52.420
thereby its finances.

12:52.420 --> 12:57.640
We need to maintain consumer confidence and ultimately ensure the organization's survival

12:57.640 --> 13:00.100
if it's faced with adversity.
