WEBVTT

00:00.420 --> 00:01.890
This is module one, lesson six,

00:02.640 --> 00:03.990
hedging your biases.

00:06.900 --> 00:08.279
This lesson, we have two main

00:08.280 --> 00:09.280
objectives.

00:09.900 --> 00:11.669
The first is to go over the

00:11.670 --> 00:13.049
importance of collaboratively

00:13.050 --> 00:14.809
assessing ATT&CK mappings.

00:15.630 --> 00:17.399
This is going to cover a bit of why some

00:17.400 --> 00:19.229
of that step five is so important

00:19.230 --> 00:21.000
from the ATT&CK mapping process.

00:21.780 --> 00:23.299
Then I'm going to get into a bit

00:23.760 --> 00:26.220
about analyst and source biases

00:26.550 --> 00:28.139
and some ways to hedge against them.

00:32.700 --> 00:34.799
Step five of the process we gave,

00:34.800 --> 00:36.149
which is to compare with other

00:36.150 --> 00:38.309
analysts, can be really important in

00:38.310 --> 00:40.409
doing ATT&CK mapping or, frankly,

00:40.410 --> 00:42.270
any other intelligence analysis.

00:43.470 --> 00:44.793
This can help hedge against

00:45.450 --> 00:46.819
analyst biases.

00:47.910 --> 00:49.649
So I'll be getting into a number of

00:49.650 --> 00:51.450
different reasons why

00:51.690 --> 00:53.129
different analysts might come up

00:53.130 --> 00:54.299
with different answers.

00:55.890 --> 00:57.599
But everyone has a different set of

00:57.600 --> 00:59.119
experiences and background that

00:59.340 --> 01:00.779
they're drawing upon as they do

01:00.780 --> 01:02.100
intelligence analysis.

01:03.270 --> 01:04.446
So where one analyst may

01:05.129 --> 01:07.049
see Non-Application Layer

01:07.050 --> 01:08.969
protocol and another may see

01:08.970 --> 01:10.829
Custom Command and Control Protocol,

01:11.820 --> 01:13.709
it's important to figure out why

01:13.710 --> 01:15.180
these differences exist,

01:16.140 --> 01:17.757
and be consistent in how you map and

01:17.970 --> 01:19.200
apply techniques.

01:20.190 --> 01:22.019
If other analysts can't review

01:22.020 --> 01:23.489
your mappings, try to make sure that

01:23.490 --> 01:24.989
you're at least doing it the same

01:24.990 --> 01:27.059
way as you go across different

01:27.060 --> 01:28.829
reports and how you're using a given

01:28.830 --> 01:29.830
technique.

01:33.450 --> 01:35.459
So it can also be tempting and

01:35.460 --> 01:36.989
once you're experienced with ATT&CK,

01:36.990 --> 01:39.209
you'll probably occasionally skip

01:39.210 --> 01:40.889
steps in the mapping process,

01:42.040 --> 01:43.412
you may be going straight to

01:43.920 --> 01:45.635
identifying an applicable technique

01:45.840 --> 01:46.919
or sub-technique.

01:47.520 --> 01:48.839
You know, maybe you don't need to go

01:48.840 --> 01:51.120
through a long process to identify

01:51.690 --> 01:53.258
phishing, Spearphishing Attachment,

01:53.790 --> 01:54.790
in a report.

01:55.590 --> 01:58.050
But it is important to remember

01:58.080 --> 02:00.419
that this does increase your bias.

02:01.230 --> 02:03.329
It's drawing from availability

02:03.330 --> 02:05.309
bias, the techniques that you

02:05.310 --> 02:06.420
have in your head that you're

02:06.450 --> 02:08.580
already familiar with versus

02:08.610 --> 02:10.109
the full range of techniques.

02:10.590 --> 02:12.449
So it's probably something you're

02:12.450 --> 02:13.649
eventually going to work up to

02:13.650 --> 02:15.509
doing, but it needs

02:15.510 --> 02:16.979
to always be done with a little bit

02:16.980 --> 02:17.980
of caution.

02:21.240 --> 02:22.612
So I'm going to get into one

02:23.100 --> 02:24.864
of my favorite areas of cyber threat

02:24.900 --> 02:26.759
intelligence, which is

02:26.760 --> 02:29.099
biases in intelligence reporting.

02:30.060 --> 02:31.049
I'm going to talk about this

02:31.050 --> 02:32.569
specifically in terms of ATT&CK-mapped

02:33.120 --> 02:35.159
data. But these exist

02:35.160 --> 02:36.483
all throughout cyber threat

02:36.630 --> 02:37.630
intelligence.

02:38.550 --> 02:40.290
The first thing about biases

02:40.650 --> 02:41.973
is that it's important for us to

02:42.090 --> 02:44.129
recognize that they exist

02:44.550 --> 02:46.216
and to understand some of what our

02:46.500 --> 02:48.389
biases are in

02:48.390 --> 02:49.649
cyber threat intelligence.

02:50.730 --> 02:52.445
I'm going to get into two key types

02:52.620 --> 02:54.479
of bias in

02:54.600 --> 02:56.266
areas like the technique examples

02:56.550 --> 02:57.710
that are in ATT&CK.

02:57.720 --> 02:59.789
And so these biases exist

03:00.180 --> 03:01.919
in the data that the ATT&CK team

03:01.920 --> 03:03.145
puts out in ATT&CK Groups

03:03.900 --> 03:05.468
and Software mappings, as well as

03:05.550 --> 03:07.229
work you do yourself in mapping

03:07.230 --> 03:08.230
to ATT&CK.

03:08.550 --> 03:09.689
And so the two types I'm going to

03:09.690 --> 03:11.909
get into are bias

03:11.910 --> 03:14.190
introduced by us as consumers

03:14.880 --> 03:16.399
and bias that's inherent in the

03:16.680 --> 03:18.180
types of sources we use.

03:19.320 --> 03:20.741
Understanding these biases is

03:21.270 --> 03:22.544
the critical first step in

03:23.190 --> 03:24.900
effectively leveraging the data.

03:28.420 --> 03:29.547
So the first bias is in

03:30.280 --> 03:32.289
the set of sources that you use.

03:32.890 --> 03:33.879
So these are actually the

03:33.880 --> 03:35.499
percentages for

03:35.950 --> 03:38.020
reports that we have in

03:38.620 --> 03:40.719
ATT&CK's Groups and Software pages

03:40.720 --> 03:42.250
on attack.mitre.org.

03:43.060 --> 03:44.949
The vast majority of

03:44.950 --> 03:46.509
the material that we are able to

03:46.510 --> 03:48.078
leverage is coming from security

03:48.610 --> 03:50.139
vendors. It's mostly coming from

03:50.140 --> 03:51.189
incident response.

03:51.970 --> 03:53.949
In some cases, governments have put

03:53.950 --> 03:56.470
out reports, things like indictments,

03:57.220 --> 03:58.659
and in some cases there's high

03:58.660 --> 04:00.277
quality press reporting that gets

04:00.400 --> 04:01.968
into activities that adversaries

04:02.110 --> 04:03.110
have done.

04:03.700 --> 04:05.366
And it's not that there's anything

04:05.620 --> 04:07.449
bad with this, but it's important to

04:07.450 --> 04:08.822
understand the biases in the

04:09.160 --> 04:10.189
specific sources that

04:11.020 --> 04:12.640
are making up our intelligence.

04:13.090 --> 04:14.805
And your set of sources is probably

04:14.860 --> 04:16.119
going to look a little bit different

04:16.120 --> 04:18.429
than this. ATT&CK has some specific

04:18.430 --> 04:20.096
constraints in using free and open

04:20.500 --> 04:22.209
source threat intelligence

04:22.210 --> 04:23.210
reporting.

04:26.960 --> 04:28.234
Another set of biases that

04:28.820 --> 04:30.649
we're going to have as

04:30.650 --> 04:31.699
we go through data

04:32.540 --> 04:33.863
is novelty and availability

04:34.790 --> 04:35.790
bias.

04:36.680 --> 04:38.101
And as the ATT&CK team, we're

04:38.570 --> 04:39.570
absolutely

04:40.880 --> 04:42.649
hit with novelty bias.

04:43.490 --> 04:45.189
So somebody is putting out a report.

04:45.200 --> 04:46.670
It's talking about

04:47.360 --> 04:48.781
a technique we've seen a bunch

04:48.800 --> 04:49.800
before, a group

04:50.000 --> 04:51.139
we've seen a bunch before.

04:51.140 --> 04:52.999
So Fuzzy Duck is

04:53.000 --> 04:54.709
using PowerShell again.

04:55.940 --> 04:57.649
It might not be as interesting for

04:57.650 --> 04:59.509
us and it might not come up to the

04:59.510 --> 05:01.029
top of the queue as fast for us

05:01.160 --> 05:03.350
adding it to ATT&CK, whereas

05:03.560 --> 05:05.809
some brand new technique we have out

05:05.810 --> 05:07.670
Transmitted Data Manipulation,

05:07.910 --> 05:09.679
very few actors do that.

05:10.590 --> 05:12.679
We don't have a report in on APT

05:12.680 --> 05:13.459
Leet.

05:13.460 --> 05:15.126
So actively using transmitted data

05:15.410 --> 05:17.027
manipulation is going to be a lot

05:17.060 --> 05:18.060
more novel to us, and

05:18.920 --> 05:20.149
we just need to be aware of these

05:20.150 --> 05:21.150
biases.

05:21.350 --> 05:22.673
And are they blinding us to

05:23.630 --> 05:24.859
important activity?

05:26.300 --> 05:27.829
As we map ourselves, we can have

05:27.830 --> 05:29.240
that availability bias.

05:30.140 --> 05:31.365
ATT&CK is big, with hundreds

05:32.180 --> 05:33.846
of techniques, and there are going

05:33.980 --> 05:35.597
to be a subset of techniques that

05:35.930 --> 05:37.879
you remember in your head that you

05:37.880 --> 05:39.379
know that you're familiar with.

05:39.770 --> 05:41.779
You can immediately go and map to

05:42.470 --> 05:44.149
and you're more likely to find those

05:44.150 --> 05:45.410
techniques in reporting.

05:45.440 --> 05:47.149
So it's important to recognize

05:47.900 --> 05:49.468
if you found a certain frequency

05:49.790 --> 05:51.407
of a technique to understand what

05:51.470 --> 05:53.389
that may mean in terms of

05:53.510 --> 05:54.949
how many times it occurs.

05:58.410 --> 06:00.509
There are also biases in the sources

06:00.510 --> 06:01.510
we use.

06:02.020 --> 06:03.020
So to get into two:

06:03.970 --> 06:05.619
availability and visibility,

06:06.850 --> 06:08.379
the people that are creating the

06:08.380 --> 06:10.329
reports we're using have their own

06:10.330 --> 06:11.589
availability bias.

06:12.460 --> 06:14.028
They have behaviors they've seen

06:14.170 --> 06:15.339
adversaries do before.

06:15.340 --> 06:16.809
They understand as they're looking

06:16.810 --> 06:18.879
at their data and are able

06:18.880 --> 06:20.799
to go in and analyze

06:20.800 --> 06:21.909
those behaviors a lot more

06:21.910 --> 06:22.910
effectively.

06:23.110 --> 06:25.059
Whereas there is a wider range

06:25.060 --> 06:26.619
of things that adversaries might

06:26.620 --> 06:27.620
actually be doing.

06:29.120 --> 06:30.459
Also, visibility bias.

06:30.790 --> 06:32.439
So a lot of the data that we're

06:32.440 --> 06:33.669
talking about is coming from

06:33.670 --> 06:34.670
incident response.

06:35.080 --> 06:36.795
So it's likely that it's only types

06:37.000 --> 06:38.666
of data that can be gathered after

06:38.980 --> 06:40.779
an intrusion that are being included

06:40.780 --> 06:41.780
in the reports.

06:42.580 --> 06:44.001
So there are certain types of

06:44.410 --> 06:46.329
behaviors that appear more

06:46.720 --> 06:48.420
depending on the type of sensing and

06:48.430 --> 06:49.509
the type of data that you're

06:49.510 --> 06:50.882
actually able to use in your

06:50.980 --> 06:51.980
reporting.

06:57.190 --> 06:58.709
There are also victim and novelty biases

06:59.650 --> 07:00.880
in the reporting we use.

07:01.840 --> 07:03.699
Victim bias is

07:03.700 --> 07:05.649
that some victims are going to be

07:05.650 --> 07:07.180
potentially more interesting

07:07.600 --> 07:09.429
to be reported upon and

07:09.430 --> 07:11.370
so more likely to generate a report.

07:11.830 --> 07:13.269
So it could be some particular

07:13.270 --> 07:15.309
industry, a really big name

07:15.310 --> 07:16.780
company, an incident happening

07:17.260 --> 07:19.089
to them. The other

07:19.090 --> 07:20.919
way that who the victim is impacts

07:20.920 --> 07:22.390
the reporting is that in a lot

07:22.750 --> 07:23.979
of cases, companies are getting

07:23.980 --> 07:25.569
permission from victims before

07:25.570 --> 07:26.991
reporting on them, even if it

07:27.400 --> 07:28.420
is anonymously.

07:29.380 --> 07:30.703
So it can have a big impact

07:31.600 --> 07:33.069
which type of industry, what kind of

07:33.070 --> 07:34.749
reporting requirements they have

07:35.050 --> 07:37.029
on whether the report even comes out at all

07:37.060 --> 07:38.410
based on who the victim is.

07:40.380 --> 07:42.095
And given that we're working off of

07:42.420 --> 07:43.547
in a lot of cases, free

07:44.250 --> 07:45.328
reports, there is some

07:46.320 --> 07:48.269
novelty bias that we need to watch

07:48.270 --> 07:49.270
out for, too.

07:49.770 --> 07:51.044
In a lot of cases, threat

07:51.480 --> 07:53.309
intelligence reports are coming out

07:53.310 --> 07:54.500
of a marketing budget.

07:55.170 --> 07:57.029
So there is some pressure where if

07:57.390 --> 07:58.979
there's a group that's had frequent

07:58.980 --> 07:59.980
past reporting,

08:00.810 --> 08:02.339
maybe it's a little less likely

08:02.340 --> 08:03.929
we're going to see a report on this.

08:04.170 --> 08:05.940
And we've seen this out in the wild

08:06.060 --> 08:07.949
where groups like

08:07.950 --> 08:09.469
APT10, there was no reporting

08:10.350 --> 08:12.089
on them for a number of years, even

08:12.090 --> 08:13.349
though people were seeing them out

08:13.350 --> 08:15.209
in the wild, but they just weren't

08:15.210 --> 08:16.529
doing anything particularly new in

08:16.530 --> 08:17.530
industry.

08:17.970 --> 08:18.970
Interesting.

08:19.350 --> 08:20.918
Lo and behold, they break into a

08:21.030 --> 08:22.500
bunch of service providers and

08:23.040 --> 08:24.719
suddenly there's reporting out there

08:24.720 --> 08:25.847
again, whereas it might

08:26.670 --> 08:28.349
be more interesting to get a report

08:28.350 --> 08:30.239
out on the new group on the block,

08:30.600 --> 08:32.519
APT1338,

08:32.520 --> 08:34.199
instead of APT Leet.

08:36.110 --> 08:37.789
So these aren't bad,

08:37.940 --> 08:39.459
they exist, they're things that

08:39.830 --> 08:41.779
we need to recognize and

08:41.900 --> 08:43.158
there are some strategies we can

08:43.159 --> 08:45.020
take for hedging these biases.

08:45.890 --> 08:47.719
And the first is that step

08:47.720 --> 08:49.289
five that we gave:

08:49.820 --> 08:50.820
collaborate.

08:51.410 --> 08:53.059
If you're collaborating with others,

08:53.060 --> 08:54.824
it can help mitigate especially your

08:55.010 --> 08:56.210
own biases.

08:57.170 --> 08:59.539
Diversity of thought, and diversity,

08:59.540 --> 09:01.129
period, on your teams makes for

09:01.130 --> 09:02.149
stronger teams.

09:03.560 --> 09:04.939
Adjust and calibrate your data

09:04.940 --> 09:06.919
sources. Understand how your data

09:06.920 --> 09:08.330
is potentially skewed

09:08.750 --> 09:10.639
and adjust for that as you work with

09:10.640 --> 09:11.640
it.

09:11.990 --> 09:13.754
Try to work with as diverse a set of

09:13.820 --> 09:15.019
sources as you can.

09:16.190 --> 09:17.513
And oftentimes the absolute

09:18.020 --> 09:19.637
best data is going to be what you

09:19.880 --> 09:21.349
gather yourself where you've got

09:21.350 --> 09:23.016
full access to all the information

09:23.090 --> 09:24.315
around it and have a much

09:25.070 --> 09:26.539
better idea of how the data is

09:26.540 --> 09:27.540
shaded.

09:28.620 --> 09:30.359
And finally, in working with all of

09:30.360 --> 09:32.549
this, we're talking about gaps,

09:32.550 --> 09:33.779
we're talking about places we might

09:33.780 --> 09:35.639
not see, but what we do

09:35.640 --> 09:38.009
have is an opportunity to prioritize

09:38.010 --> 09:39.010
the known.

09:40.110 --> 09:41.609
Hopefully everything we're talking

09:41.610 --> 09:43.619
about are things that we do know

09:44.760 --> 09:46.559
as opposed to worrying about the

09:46.560 --> 09:47.560
unknown.

09:48.550 --> 09:50.079
It does mean that we may not be able

09:50.080 --> 09:51.919
to say that, you know,

09:51.940 --> 09:53.459
Spearphishing Attachment is more

09:53.950 --> 09:55.539
popular than

09:55.870 --> 09:57.760
a supply chain compromise,

09:58.510 --> 10:00.069
but it does mean that we can say,

10:00.070 --> 10:01.899
OK, we've seen both of these

10:01.900 --> 10:03.220
existing out in the wild.

10:08.070 --> 10:09.269
So we've gone over a couple of

10:09.270 --> 10:10.838
things in here, talked about why

10:11.250 --> 10:13.169
it's so important to work with

10:13.170 --> 10:14.787
other analysts to collaboratively

10:14.970 --> 10:16.769
assess ATT&CK mappings, as well as

10:16.770 --> 10:17.970
other threat intelligence.

10:18.510 --> 10:20.369
And I've gotten into some key types

10:20.370 --> 10:22.259
of bias that you're trying

10:22.260 --> 10:24.119
to hedge as you do that

10:24.120 --> 10:25.492
collaboration and other work

10:26.040 --> 10:27.240
with threat intelligence.

10:28.840 --> 10:30.369
So I've just gotten into the

10:30.760 --> 10:31.760
second part of

10:32.890 --> 10:34.809
this ATT&CK for CTI journey,

10:35.380 --> 10:36.997
getting into a number of steps to

10:37.210 --> 10:39.129
help you map narrative data

10:39.130 --> 10:40.130
to ATT&CK.

10:41.800 --> 10:43.299
Next up, we're going to be going

10:43.300 --> 10:44.868
into how you can apply that same

10:45.340 --> 10:47.380
process to work with raw data.

