WEBVTT

00:00.769 --> 00:04.215
>> Hello everyone and
welcome back to the course,

00:04.215 --> 00:06.735
Identifying Web
Attacks Through Logs.

00:06.735 --> 00:08.550
In the last video,
we talked about

00:08.550 --> 00:10.665
log analysis and its challenges.

00:10.665 --> 00:13.050
We also finished our reviews.

00:13.050 --> 00:14.580
In the previous module,

00:14.580 --> 00:16.140
we talked about
important things

00:16.140 --> 00:19.035
related to web applications
and log analysis.

00:19.035 --> 00:21.765
This module will
be more hands-on.

00:21.765 --> 00:24.510
We'll perform some web
application attacks,

00:24.510 --> 00:27.030
and then do the log analysis.

00:27.030 --> 00:30.345
To start, we'll talk about
web application attacks.

00:30.345 --> 00:33.620
As such, our learning
objectives for this video are

00:33.620 --> 00:35.180
understanding the
differences between

00:35.180 --> 00:37.235
infrastructure and
application attacks,

00:37.235 --> 00:40.430
introducing the OWASP
top 10 project,

00:40.430 --> 00:42.725
reviewing some common
web application attacks

00:42.725 --> 00:44.950
and understanding
the URL components.

00:44.950 --> 00:47.265
Let's begin. First,

00:47.265 --> 00:50.290
let's remember some web
application components.

00:50.290 --> 00:54.050
Do you remember when we
talked about TCP/IP and HTTP,

00:54.050 --> 00:57.845
and I said that HTTP uses
TCP/IP to communicate?

00:57.845 --> 00:59.570
For web applications to work,

00:59.570 --> 01:01.114
they need a lot of components.

01:01.114 --> 01:02.150
Let's check some of them.

01:02.150 --> 01:03.844
In the top layer,

01:03.844 --> 01:05.615
we have the web application.

01:05.615 --> 01:08.885
PHP and HTTP are
related to this layer.

01:08.885 --> 01:10.430
Next, we have

01:10.430 --> 01:12.780
the web server that
hosts the application.

01:12.780 --> 01:15.840
Web server software
like Apache and Nginx

01:15.840 --> 01:17.104
is in this layer.

01:17.104 --> 01:18.910
These two
components need

01:18.910 --> 01:20.130
to run somewhere.

01:20.130 --> 01:21.505
This place is a server,

01:21.505 --> 01:23.530
but it could also be
a personal system

01:23.530 --> 01:25.945
like Microsoft Windows or Linux.

01:25.945 --> 01:29.050
This is also true for
virtual machines.

01:29.050 --> 01:31.780
We can also add database servers

01:31.780 --> 01:33.935
and application
servers to this layer.

01:33.935 --> 01:36.010
In the last layer, we have

01:36.010 --> 01:37.180
the network hardware and

01:37.180 --> 01:39.790
services that make the
communications possible.

01:39.790 --> 01:42.430
However, this is only one way to

01:42.430 --> 01:45.170
understand the web applications
and their components.

01:45.170 --> 01:46.705
It's important to know

01:46.705 --> 01:48.700
that each component
can be attacked,

01:48.700 --> 01:50.470
and since web applications

01:50.470 --> 01:52.150
depend on all of
these components,

01:52.150 --> 01:53.620
an attack on any layer

01:53.620 --> 01:55.895
can affect the entire
web application.

01:55.895 --> 01:57.330
The three layers under

01:57.330 --> 01:59.505
the web application
are infrastructure.

01:59.505 --> 02:02.105
In this course, we'll
focus on the top layer,

02:02.105 --> 02:04.275
the web application attacks.

02:04.275 --> 02:05.980
This is a typical infrastructure

02:05.980 --> 02:07.975
which would support
a web application.

02:07.975 --> 02:09.760
Another design is possible,

02:09.760 --> 02:12.305
but it won't be so
different from this.

02:12.305 --> 02:13.945
To access our page,

02:13.945 --> 02:16.105
the user will send a
request to the web server,

02:16.105 --> 02:19.390
and the web server will
access the other components.

02:19.390 --> 02:21.100
That means that

02:21.100 --> 02:23.215
all this infrastructure
can help with logs.

02:23.215 --> 02:25.240
Again, if we have more logs,

02:25.240 --> 02:28.405
we have more information
during the investigation.

02:28.405 --> 02:30.370
You have the same
web application

02:30.370 --> 02:31.810
and the same infrastructure.

02:31.810 --> 02:33.625
How do you think
you can identify

02:33.625 --> 02:35.875
a malicious user and an attack?

02:35.875 --> 02:37.935
To identify an attack,

02:37.935 --> 02:39.620
you need to know
about the attack and

02:39.620 --> 02:40.910
the web server
logs will help you

02:40.910 --> 02:42.870
to identify the attack.

02:43.489 --> 02:46.835
As we said before,
web applications

02:46.835 --> 02:48.655
are client-server oriented.

02:48.655 --> 02:50.840
Based on this model,
we can classify

02:50.840 --> 02:53.600
web application
attacks into two types.

02:53.600 --> 02:56.315
Client-side attacks
usually exploit

02:56.315 --> 02:58.160
a vulnerability on
the user's endpoint,

02:58.160 --> 03:00.865
that is, in
the web client.

03:00.865 --> 03:04.265
The second classification
is server-side attacks.

03:04.265 --> 03:06.740
In this case, the
target is the server.

03:06.740 --> 03:10.100
In this course, we'll focus
on server-side attacks.

03:10.100 --> 03:12.125
Since the web
server is a target,

03:12.125 --> 03:15.840
we can use its logs to
identify the attack.

03:15.940 --> 03:18.230
To talk about attacks,

03:18.230 --> 03:20.830
we need to talk about
vulnerabilities.

03:20.830 --> 03:23.960
One definition of a
vulnerability is from NIST,

03:23.960 --> 03:25.985
which says that a vulnerability

03:25.985 --> 03:28.205
is a weakness in an
information system,

03:28.205 --> 03:31.385
system security procedures,
internal controls,

03:31.385 --> 03:32.720
or implementation that could be

03:32.720 --> 03:35.515
exploited or triggered
by a threat source.

03:35.515 --> 03:37.670
In our course, we'll substitute

03:37.670 --> 03:39.800
web applications
for information systems.

03:39.800 --> 03:41.210
The attacker is someone who

03:41.210 --> 03:42.785
tries to exploit
the vulnerability

03:42.785 --> 03:46.710
and all the vulnerable things
are the attack surface.

03:46.760 --> 03:49.380
Here are some more definitions.

03:49.380 --> 03:52.615
Risk, the possibility of
something bad happening.

03:52.615 --> 03:53.400
Target,

03:53.400 --> 03:56.990
for us, web servers
and web applications.

03:56.990 --> 03:57.829
Attacks,

03:57.829 --> 03:59.570
which are basically
any action that

03:59.570 --> 04:01.205
someone performs,
whether exploiting

04:01.205 --> 04:02.810
a vulnerability or not, to

04:02.810 --> 04:05.580
cause an impact on
the web application.

04:06.800 --> 04:09.110
We are talking about attacks,

04:09.110 --> 04:10.280
but do you know what

04:10.280 --> 04:13.190
the most common web
application attacks are?

04:13.190 --> 04:14.915
To answer this question,

04:14.915 --> 04:17.605
we'll use our definition
from the last few slides.

04:17.605 --> 04:19.235
Based on that definition,

04:19.235 --> 04:22.370
we need a vulnerability
to have an attack.

04:22.370 --> 04:24.860
It's actually better to ask,

04:24.860 --> 04:27.845
what are the most
common vulnerabilities?

04:27.845 --> 04:29.300
To answer this question,

04:29.300 --> 04:32.440
we'll use the OWASP
Top 10 project.

04:32.440 --> 04:33.965
OWASP, which means

04:33.965 --> 04:36.320
Open Web Application
Security Project,

04:36.320 --> 04:39.535
is a project that
catalogs

04:39.535 --> 04:42.770
the Top 10 vulnerabilities
for web applications.

04:42.770 --> 04:47.465
In this course, we'll use the
version released in 2017.

04:47.465 --> 04:50.285
The first version is from 2003.

04:50.285 --> 04:54.145
Check the OWASP website if
you want more information.

04:54.145 --> 04:56.150
Here we have the comparison

04:56.150 --> 04:59.155
between the 2013 and 2017 versions.

04:59.155 --> 05:00.800
In this course, we'll use

05:00.800 --> 05:02.960
examples of some
attacks like injection,

05:02.960 --> 05:06.170
broken authentication,
security misconfiguration,

05:06.170 --> 05:07.505
cross-site scripting,

05:07.505 --> 05:09.310
using components with
known vulnerabilities,

05:09.310 --> 05:11.375
and the last one, which
is not an attack,

05:11.375 --> 05:14.100
but it's still related
to our course.

05:15.380 --> 05:17.845
To talk about web attacks,

05:17.845 --> 05:20.695
we need to understand
the URL components.

05:20.695 --> 05:25.135
URL stands for Uniform
Resource Locator.

05:25.135 --> 05:28.495
It's a type of Uniform
Resource Identifier.

05:28.495 --> 05:30.910
User agents use the URL

05:30.910 --> 05:33.205
to request information
from the web server.

05:33.205 --> 05:36.205
Each web application has
one resource locator,

05:36.205 --> 05:37.990
which makes it
possible for a web

05:37.990 --> 05:40.235
server to host
many applications.

05:40.235 --> 05:43.380
A URL is also known
as a web address,

05:43.380 --> 05:45.195
and has multiple parts.

05:45.195 --> 05:48.460
Now, let's understand
its components.

05:49.310 --> 05:52.495
A scheme that identifies
the protocol; a host

05:52.495 --> 05:54.460
or domain, which may or
may not be followed

05:54.460 --> 05:56.740
by a port; a path that
identifies the resource we

05:56.740 --> 05:58.595
want to access; and

05:58.595 --> 06:01.535
a query that's used to
pass some information.

06:01.535 --> 06:03.905
If we look at the
Cybrary login page,

06:03.905 --> 06:05.525
we can find the components.

06:05.525 --> 06:09.620
The scheme or protocol
in this case is https,

06:09.620 --> 06:13.910
and www.cybrary.it is
the host or domain.

06:13.910 --> 06:15.740
You can see here
that we don't have

06:15.740 --> 06:17.004
the port information.

06:17.004 --> 06:21.555
It'll use port 443 because
of the HTTPS scheme.

06:21.555 --> 06:23.720
After the slash is the path

06:23.720 --> 06:26.000
and after the question
mark is the query.

06:26.000 --> 06:28.700
It's important to know that
most of the attacks are

06:28.700 --> 06:31.340
performed in the path or
in the query components.
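
NOTE
The components described above can be sketched with Python's standard urllib.parse module.
This is a minimal illustration, not part of the video: the URL below is a made-up example,
and the default_ports table is an assumption added here to mirror the "no port means 443 for HTTPS" remark.
```python
from urllib.parse import urlsplit, parse_qs
# Hypothetical URL for illustration; it is not the URL shown in the video.
url = "https://www.example.com/login?user=alice&lang=en"
parts = urlsplit(url)
# No explicit port in the URL, so fall back to the scheme's usual default.
default_ports = {"http": 80, "https": 443}
port = parts.port or default_ports[parts.scheme]
# parse_qs turns the query string into a dict of lists of values.
query = parse_qs(parts.query)
print(parts.scheme, parts.hostname, port, parts.path, query)
```
When reading logs, the path and query values are usually the parts worth inspecting first.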

06:31.340 --> 06:33.710
If you want to know
more about this,

06:33.710 --> 06:36.240
check these two websites.

06:37.250 --> 06:40.470
Another important
thing is encoding.

06:40.470 --> 06:42.470
URLs can only be sent over

06:42.470 --> 06:45.005
the network using the
ASCII character set.

06:45.005 --> 06:46.685
To respect this rule,

06:46.685 --> 06:49.400
some of the characters need
to be encoded in ASCII.

06:49.400 --> 06:51.080
The encoding works by replacing

06:51.080 --> 06:53.180
the unsupported
character with a percent sign,

06:53.180 --> 06:54.830
followed by two digits.

06:54.830 --> 06:56.900
The two digits are
the hexadecimal value

06:56.900 --> 06:58.220
of the encoded character.

06:58.220 --> 07:00.920
For example, the space
is converted to

07:00.920 --> 07:04.120
%20, like
in this example.

07:04.120 --> 07:07.040
Another use is to convert
different writing systems that

07:07.040 --> 07:09.970
don't use Latin characters,
like Arabic or Chinese.

07:09.970 --> 07:13.354
Also, encoding is used
to perform attacks,

07:13.354 --> 07:15.080
although a percent
in the request

07:15.080 --> 07:17.765
doesn't mean that this
is a malicious request.

07:17.765 --> 07:20.830
Percent is used in both
good and bad actions.

07:20.830 --> 07:23.790
For example, this
cybrary request

07:23.790 --> 07:27.555
has multiple percent
signs, but it's safe.

07:27.555 --> 07:29.685
To make things clearer,

07:29.685 --> 07:31.575
let's look at this request.

07:31.575 --> 07:35.100
We have this big request here
with many percent signs.

07:35.100 --> 07:36.725
If you know about SQL,

07:36.725 --> 07:38.480
you'll also notice
some SQL words

07:38.480 --> 07:40.730
like SELECT, WHERE and others.

07:40.730 --> 07:44.595
Could you find those
words? They're hard to find.

07:44.595 --> 07:46.520
There are many percent signs

07:46.520 --> 07:48.530
and to help with
finding those words,

07:48.530 --> 07:49.710
we can decode it.

07:49.710 --> 07:52.390
There are many sites that
can help with decoding.

07:52.390 --> 07:54.500
After the decode, we'll be

07:54.500 --> 07:56.825
able to find out what
it really means.

07:56.825 --> 08:01.130
Now, it's easy to see the
SQL words in a real request.

08:01.130 --> 08:03.230
In this course,
you'll learn that

08:03.230 --> 08:06.665
this particular request is
an SQL injection attack.
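
NOTE
The decoding step can also be done locally instead of on a website. A minimal sketch, assuming
a made-up request string (it is not the request shown in the video); the keyword search is a
naive illustration, not a real detection rule.
```python
import re
from urllib.parse import unquote
# Hypothetical percent-encoded request; invented for this example.
raw = "/item.php?id=1%27%20UNION%20SELECT%20user%2Cpassword%20FROM%20users%20WHERE%20%271%27%3D%271"
decoded = unquote(raw)
# Naive keyword search for illustration only; real analysis needs more context.
keywords = re.findall(r"\b(SELECT|UNION|FROM|WHERE)\b", decoded, flags=re.IGNORECASE)
print(decoded)
print(keywords)
```
After decoding, the SQL words stand out, which is the point made in the video.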

08:06.665 --> 08:09.530
One more thing: a
typical user will make

08:09.530 --> 08:12.700
many requests for
a single page.

08:12.700 --> 08:14.630
This means that the user will

08:14.630 --> 08:17.390
request different paths
and different queries.

08:17.390 --> 08:20.690
Here's an example of one
user requesting one website.

08:20.690 --> 08:24.295
One access generated
three lines of logs.

08:24.295 --> 08:27.365
All the requests are from
the same IP address,

08:27.365 --> 08:30.920
the same date and time, but
all are different requests.

08:30.920 --> 08:33.440
It's common behavior
in modern webpages to

08:33.440 --> 08:36.275
have many requests
for a single webpage.

08:36.275 --> 08:37.925
Knowing your web application

08:37.925 --> 08:39.965
will help you to
identify this behavior.
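
NOTE
One way to see this behavior in a log is to group requests by client IP and timestamp.
A minimal sketch, assuming Apache common log format; the three log lines are invented
for illustration and are not the ones shown in the video.
```python
import re
from collections import defaultdict
# Hypothetical access-log lines (made up); one page load, three requests.
lines = [
    '203.0.113.5 - - [10/Oct/2020:13:55:36 +0000] "GET /index.php HTTP/1.1" 200 2326',
    '203.0.113.5 - - [10/Oct/2020:13:55:36 +0000] "GET /css/style.css HTTP/1.1" 200 512',
    '203.0.113.5 - - [10/Oct/2020:13:55:36 +0000] "GET /img/logo.png HTTP/1.1" 200 4096',
]
# Capture client IP, timestamp, HTTP method, and requested path.
pattern = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)')
# Group requested paths by (client IP, timestamp).
grouped = defaultdict(list)
for line in lines:
    match = pattern.match(line)
    if match:
        ip, timestamp, _method, path = match.groups()
        grouped[(ip, timestamp)].append(path)
for key, paths in grouped.items():
    print(key, paths)
```
Several paths under one IP and one timestamp is normal for a page that loads styles and images.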

08:39.965 --> 08:41.990
Post assessment question, is

08:41.990 --> 08:43.975
this information true or false?

08:43.975 --> 08:46.310
Considering a basic
web infrastructure,

08:46.310 --> 08:48.965
only web servers are
susceptible to attacks.

08:48.965 --> 08:50.715
This information is false.

08:50.715 --> 08:52.940
Remember, web
applications depend on

08:52.940 --> 08:54.440
many components and all of them

08:54.440 --> 08:55.220
can be targets.

08:55.220 --> 08:56.989
Next question.

08:56.989 --> 08:59.120
Which of these
vulnerabilities are present in

08:59.120 --> 09:02.630
the 2017 OWASP Top 10 project?

09:02.630 --> 09:06.755
The answer is injection and
security misconfiguration.

09:06.755 --> 09:08.360
The other options are related to

09:08.360 --> 09:10.129
infrastructure attacks.

09:10.129 --> 09:13.345
For the last question,
consider this statement.

09:13.345 --> 09:17.540
Web requests with a percent
sign in them are malicious.

09:17.540 --> 09:20.245
Is it true or false?

09:20.245 --> 09:22.985
This information is false.

09:22.985 --> 09:25.160
A percent sign is not
always malicious.

09:25.160 --> 09:26.540
It can be used to represent

09:26.540 --> 09:28.025
a different writing
system or to encode

09:28.025 --> 09:29.180
unsupported characters.

09:29.180 --> 09:31.949
Video summary.

09:31.949 --> 09:34.340
In this lesson, we talked
about the differences between

09:34.340 --> 09:36.230
web application and
infrastructure attacks

09:36.230 --> 09:37.755
based on a layered approach.

09:37.755 --> 09:40.340
The definition of an attack
and a vulnerability.

09:40.340 --> 09:42.895
The OWASP Top 10 project.

09:42.895 --> 09:46.990
We also reviewed URL
components and URL encoding.

09:46.990 --> 09:50.270
In the next video, we'll
begin our log analysis,

09:50.270 --> 09:53.460
starting with
vulnerability scans.

