WEBVTT

00:06.980 --> 00:12.650
When I first started in cybersecurity, we went through this entire process

00:12.650 --> 00:18.410
about learning through logs and going through that process at the very first time as a brand new person,

00:18.410 --> 00:20.630
it looked incredibly confusing to me.

00:20.660 --> 00:22.940
I was looking at all these little numbers in Wireshark.

00:22.970 --> 00:26.900
It was talking about numbers over here for IP addresses and port numbers.

00:26.900 --> 00:31.370
And then there's this little section talking about, well, this is the rudimentary of what it's supposed

00:31.370 --> 00:31.880
to be.

00:31.910 --> 00:35.330
And it's very easy to get very confused very, very quickly.

00:35.360 --> 00:39.260
Cybersecurity is more than just talking about technology.

00:39.260 --> 00:41.150
It's about the people involved.

00:41.150 --> 00:46.100
It's about interfacing that technology with the different aspects of human life.

00:46.100 --> 00:49.190
And it's about interfacing human life with technology.

00:49.220 --> 00:53.420
Now, I know I just went three different routes about technology and human beings, but I really want

00:53.420 --> 00:58.250
you to understand the perplexities that are involved with human beings and cybersecurity.

00:58.280 --> 01:04.520
Cybersecurity has been blowing up from the get go from different avenues of job growth, from technology,

01:04.550 --> 01:11.660
from all aspects of the life cycle that we as human beings not only record technology, but how we interface

01:11.660 --> 01:13.100
with technology as a whole.

01:13.130 --> 01:19.040
From the days when I actually had hair still on my head and we started talking about DSL, where we

01:19.040 --> 01:24.920
literally sacrificed a computer to get online before we could even interface with America Online, a

01:24.920 --> 01:27.260
tool that most of you probably never even used.

01:27.350 --> 01:33.560
You have to understand that logging, in its simplistic nature, is where every cybersecurity analyst

01:33.560 --> 01:34.310
begins.

01:34.340 --> 01:39.350
We need to understand how those logs interface with technology, where we're getting this information,

01:39.350 --> 01:45.350
and how that interface is transitioned from the computer standpoint all the way to the pain point where

01:45.350 --> 01:46.550
it goes to the SIEM.

01:46.940 --> 01:50.750
Cybersecurity is an intermix of understanding human beings.

01:50.750 --> 01:53.300
How are human beings interfacing with that technology?

01:53.300 --> 01:54.740
Are they clicking on a link?

01:54.740 --> 01:56.750
Are they using it for gaming?

01:56.750 --> 01:58.790
Are they using it for different programs?

01:58.790 --> 02:03.660
How do we make technology useful not only in an ethical but also a secure manner?

02:03.660 --> 02:06.420
And that really encompasses cybersecurity as a whole.

02:06.420 --> 02:12.240
And your job as an analyst is to make sure that people are using it in a secure, ethical form.

02:12.240 --> 02:16.140
But when it's not used in that outline, how do we correct it?

02:16.170 --> 02:22.260
How do we identify the problems that are associated with technology and how people are using that technology,

02:22.260 --> 02:26.790
whether from an evil standpoint, from a mistaken standpoint, or from the way they're supposed to be

02:26.790 --> 02:27.570
utilizing it?

02:27.600 --> 02:32.970
Not only that, but as a cybersecurity expert, you need to understand what does baseline traffic look

02:32.970 --> 02:34.560
like on a day-to-day basis.

02:34.560 --> 02:39.930
But how is that information utilized when something's not being used the way it's supposed to?

02:39.960 --> 02:45.480
Oftentimes, brand new analysts come into logging and they go, oh, this looks malicious

02:45.510 --> 02:46.770
when it's really not.

02:46.770 --> 02:51.060
And then we look at some traffic and we identify it as malicious when again, it's really not.

02:51.060 --> 02:57.210
And so your job with cybersecurity as an analyst really kind of begins here at that logging standpoint.

02:57.210 --> 03:02.280
And how we utilize logs to identify malicious use versus real use.

03:02.280 --> 03:08.250
As we go through, we're going to discuss logging, software-defined networks, operating system hardening,

03:08.250 --> 03:11.100
public key encryption, and how encryption utilizes it.

03:11.130 --> 03:13.020
We're going to go over data protection.

03:13.050 --> 03:19.410
We're going to go over different aspects of technology and networking throughout this entire

03:19.410 --> 03:20.070
course.

03:20.070 --> 03:22.080
And it really begins here.

03:23.070 --> 03:26.550
Logging really begins with detection and monitoring of systems.

03:26.550 --> 03:31.050
How do we detect information that's going from one system to another?

03:31.080 --> 03:37.350
Every system, whether it's an IoT device, a PC, a Macintosh, or a Linux machine, it doesn't matter what

03:37.350 --> 03:37.620
it is.

03:37.620 --> 03:40.080
It could even be your smart fridge that's online.

03:40.080 --> 03:46.620
Any computer device, regardless if it's online or not, produces a log of some kind that can be utilized

03:46.620 --> 03:48.210
by a cybersecurity analyst.

03:48.240 --> 03:53.670
Your job as an analyst is to identify and monitor those different logs going into the system.

03:53.700 --> 03:57.930
Now, that sounds a lot more perplexing than what it has to be, because if you think about it, and

03:57.930 --> 04:02.550
I just told you that every network device on your entire system is producing a log, and that you have

04:02.550 --> 04:07.160
to utilize that log to find out what's going on in your network, it can get overwhelming very, very

04:07.160 --> 04:07.760
easily.

04:07.790 --> 04:11.750
We're going to get into SIEMs and how SIEMs interact with our network a little bit later.

04:11.750 --> 04:17.090
But for today, I just want you to understand that we can use logs to detect and monitor different outputs

04:17.090 --> 04:20.600
within our system to define what's going on.

04:20.630 --> 04:23.480
To do that, we can use those logs for incident investigation.

04:23.480 --> 04:27.500
If something did happen, how do I find out what it was interfacing with?

04:27.530 --> 04:29.900
How was it acting on our network as a whole?

04:29.900 --> 04:33.590
We can use it for compliance and auditing standards to make sure that we're doing what

04:33.620 --> 04:35.510
third-party audits require of us.

04:35.540 --> 04:38.210
Are there legal requirements that we need to interface with?

04:38.210 --> 04:42.110
Are there auditing requirements that are internal to our own company that we have to utilize?

04:42.110 --> 04:44.120
Logging provides that aspect as well.

04:44.120 --> 04:46.640
We can use logging for alerts and notifications.

04:46.640 --> 04:48.800
If something's going wrong with our system,

04:48.800 --> 04:50.120
it produces alerts.

04:50.120 --> 04:52.190
That alert is then recorded in a log.

04:52.190 --> 04:57.500
That log is then read by different aspects of our network that then produce alerts to us, and notifications

04:57.500 --> 04:58.880
to tell us what's going on.

04:58.880 --> 05:02.540
We can also use logs for preventive measures once something is stopped.

05:02.540 --> 05:05.990
If I have an antivirus program that stopped a virus, I want to know about it.

05:05.990 --> 05:09.130
I don't want it to just go, yeah, I stopped it, and not tell me about it.

05:09.160 --> 05:12.670
Think about a denial of service attack or distributed denial of service attack.

05:12.700 --> 05:14.770
It's going through and reading our systems.

05:14.770 --> 05:15.940
We need to know about that.

05:15.940 --> 05:21.040
We don't want it to just correct it and then just not tell us about it, because it could be providing

05:21.040 --> 05:23.110
a precursor to a bigger attack.

05:23.140 --> 05:26.980
This all provides us with a baseline establishment of logs.

05:27.010 --> 05:28.240
What do I mean by that?

05:28.270 --> 05:33.520
We have logs in day-to-day use of our computer systems, and we often talk about malware infecting

05:33.520 --> 05:36.790
those logs or how those logs are used after an incident.

05:36.790 --> 05:39.190
But what does normal traffic look like?

05:39.190 --> 05:45.190
In order to see malicious traffic, we need to first understand what a normal system

05:45.190 --> 05:48.190
is operating on during a normal work day.

05:48.220 --> 05:54.070
Now, baseline logging or baseline establishment should be on a day to day basis, a month to month

05:54.070 --> 05:56.500
basis, as well as a yearly basis.

05:56.500 --> 06:02.470
And what I mean by that is that Christmas traffic on December 25th, which is a Wednesday, is going

06:02.470 --> 06:04.870
to be different than Christmas traffic

06:04.990 --> 06:10.940
on December 25th on a Thursday. It could differ between those two days just because

06:10.940 --> 06:12.980
one is on Wednesday versus Thursday.

06:12.980 --> 06:17.840
But it could also be different because one is on Christmas Day versus Christmas Day, and how it falls

06:17.840 --> 06:20.480
into the different aspects of what we're looking at.

06:20.510 --> 06:25.760
What I mean is, what is Christmas Day on a Thursday in comparison to Valentine's Day on a Tuesday?

06:25.790 --> 06:28.010
Does it have some interjections in there?

06:28.040 --> 06:29.450
Does it have some differences?

06:29.450 --> 06:33.620
We need to have a baseline assessment of our network on a given day.

06:33.620 --> 06:35.420
We also need to be able to look at trends.

06:35.450 --> 06:40.400
2023 could be vastly different from 2020, depending on the different network and what's going on in

06:40.400 --> 06:42.800
our systems, even if the days are the same.

06:42.800 --> 06:45.080
And that's where an annual baseline comes into play.

06:45.080 --> 06:49.730
We want to continually update our baseline, but we also want to be able to use it for what's going

06:49.730 --> 06:50.390
on in our systems.

06:50.390 --> 06:55.670
We don't want to have this knee-jerk reaction of, hey, my traffic just went up by one gigabyte moving

06:55.670 --> 06:58.940
out of our network on a Tuesday from week to week.

06:58.970 --> 07:02.150
If that occurs, that could be the establishment of malware.

07:02.150 --> 07:07.730
But it could also be a case where we just developed a new product that's being sold online.

07:07.730 --> 07:11.740
And now all of a sudden people are reading into our network a lot more than they were even a week ago.

07:11.770 --> 07:17.320
This all provides that baseline of establishment and logging, and provides us cues in how we utilize

07:17.320 --> 07:20.950
logging to prevent and detect different attacks throughout our network.

07:21.880 --> 07:26.140
We often look at log ingestion as the collection of different logs.

07:26.140 --> 07:31.390
We can do this through collecting different sources like we talked about before, from IoT devices to

07:31.420 --> 07:32.620
smart refrigerators.

07:32.620 --> 07:37.840
But more likely it's going to be from firewalls, IPS, IDS, and different clients on our network as a

07:37.840 --> 07:38.440
whole.

07:38.440 --> 07:44.320
We can look at the collection methodology based on different aspects of technical infrastructure used

07:44.320 --> 07:45.310
inside of our network.

07:45.310 --> 07:47.770
But we can also import logs as well.

07:47.770 --> 07:52.180
When we talk about importing logs, we're really referring to this: I've got logs on other systems.

07:52.180 --> 07:57.460
How do I import them into the major system that I'm using to read all these logs?

07:57.460 --> 08:02.410
I want you to picture yourself inside of a network infrastructure with a thousand different clients,

08:02.440 --> 08:08.440
5000 different IoT devices, not counting the firewalls, the servers, the switches, the routers.

08:08.440 --> 08:10.420
That's way too much for one person.

08:10.420 --> 08:16.130
And so we can use something called a SIEM, which is going to take all those logs and import them

08:16.130 --> 08:21.800
into a central repository that allows us to not only automate them through AI and machine learning,

08:21.800 --> 08:29.630
but allow you to go through and look at those logs in a more easily readable format to appreciate,

08:29.660 --> 08:33.860
Hey, this system is connected to this system, which is connected to this system, and follow the logs

08:33.860 --> 08:34.820
where you're supposed to.

08:34.850 --> 08:38.150
This is where we import logs into a central repository.

08:38.180 --> 08:40.220
We also are going to process those logs.

08:40.220 --> 08:45.320
We don't want to do that manually because as I just said, going through five different logs is going

08:45.350 --> 08:47.330
to make you pull your hair out and be bald like me.

08:47.330 --> 08:48.650
That's not something we want.

08:48.680 --> 08:54.110
We want to use analysis to go through those logs using automated processes that make it easier.

08:54.110 --> 08:58.550
And then when we're done using analysis to go through those logs, we want to store them both in the

08:58.550 --> 09:04.190
raw format, meaning how the logs were initially provided to us, but also in the format that has been

09:04.190 --> 09:10.730
reframed for our SIEM architecture, meaning that certain things have been moved or rerouted or even

09:10.730 --> 09:16.540
removed, or configured in a different manner so that it's readable by that one central repository.

09:16.540 --> 09:22.150
So in effect we actually have two logs, the raw format and the format that's been configured for

09:22.150 --> 09:23.950
our SIEM to read through.

09:26.170 --> 09:28.450
There are also different logging levels.

09:28.480 --> 09:32.200
Different logging levels are attributed to where I'm getting the logging from.

09:32.200 --> 09:34.210
Is it a different source IP address?

09:34.210 --> 09:38.710
A Windows machine may be different from a Mac machine, which would be different from a Linux machine,

09:38.710 --> 09:40.660
which is also different from an IoT device or a...

09:40.870 --> 09:41.710
You get the point.

09:41.740 --> 09:45.310
Different source IPs provide us with different logging levels.

09:45.310 --> 09:50.950
And those operating systems, or those machines operating at those IP addresses may provide a bigger

09:50.950 --> 09:56.980
log, or a less intrinsic, less accurate, or less detailed log than a different machine.

09:57.010 --> 10:03.130
Windows machines often provide a very detailed log of what's going on, whereas an IoT device may not

10:03.130 --> 10:07.810
provide us all that much detailed information, so our IP addresses matter in that

10:07.840 --> 10:11.170
perplexity. We also have to look at destination IP addresses.

10:11.170 --> 10:14.640
Where is the machine interacting with another machine?

10:14.640 --> 10:16.860
Is it going from machine to machine or peer to peer?

10:16.890 --> 10:21.390
Where I've got one client talking to another Windows client, or is it going through a switch first?

10:21.420 --> 10:26.100
Those matter, and getting those logs from those different devices paints a bigger picture.

10:26.100 --> 10:29.940
And we need to be able to track those different pictures throughout our logging sphere.

10:29.970 --> 10:35.400
We also need to understand different ports, protocols and services that are operating from those different

10:35.400 --> 10:36.030
clients.

10:36.060 --> 10:40.530
We want to be able to track usernames and domain names as they're going through, and where the logs

10:40.530 --> 10:45.420
are going from to where they are going to, and the different domains in which they interact.

10:45.450 --> 10:46.830
The nature of the event.

10:47.010 --> 10:48.150
Do we have an alert?

10:48.180 --> 10:53.130
Is it just a remedial log that's saying, hey, I did this, I just opened up a communication protocol,

10:53.160 --> 10:54.750
or is there malware involved?

10:54.750 --> 10:57.750
Whether I'm an IPS, the nature of the event matters.

10:57.750 --> 11:03.210
We also want to be able to provide the event ID, and an event ID on a Windows machine is going to be

11:03.210 --> 11:07.500
different than an event ID on an IPS, which is going to be different from a different machine, and

11:07.500 --> 11:08.670
so on and so forth.

11:08.670 --> 11:10.560
And then we need to look at timestamps.

11:10.590 --> 11:16.400
Timestamps are critical in identifying different logs because they let those logs interface properly.

11:16.400 --> 11:21.110
I want you to imagine if you've got a log that's not timestamped, and it happened at 12:01 a.m. last

11:21.110 --> 11:24.560
night, and then I've got another log at 12:01 p.m. today.

11:24.560 --> 11:32.390
We may identify those different logs in the same complex algorithm and inadvertently combine them to

11:32.420 --> 11:34.970
match each other, which just screws everything up.

11:34.970 --> 11:41.330
So those timestamps need to be very detailed and very accurate, not only from AM to PM, but also from

11:41.330 --> 11:43.970
day to day, year, month, date.

11:43.970 --> 11:47.570
Those all need to be properly tracked in our logging levels.

11:48.350 --> 11:52.850
When we talk about time synchronization, which I kind of hit on a little bit before, we have specific

11:52.850 --> 11:54.980
protocols that introduce those.

11:54.980 --> 11:57.020
The first one is the Network Time Protocol, or NTP.

11:57.020 --> 12:02.840
This one is usually used within our logging algorithms to identify when something occurred across the

12:02.840 --> 12:03.440
board.

12:03.440 --> 12:06.410
We use this in almost every network that we utilize today.

12:06.410 --> 12:10.220
But then we have something that's called Precision Time Protocol or PTP.

12:10.670 --> 12:16.900
PTP is more utilized on satellite networks, and I don't mean satellite as in dishes, but satellite

12:17.080 --> 12:19.270
as in different from our home network.

12:19.300 --> 12:25.990
Let's say that I've got a piece of machinery or an IoT device that's off site operating somewhere else,

12:25.990 --> 12:28.600
and I need very precise timing on it.

12:28.600 --> 12:33.040
Maybe it's a temperature gauge or a SCADA that's associated with it.

12:33.130 --> 12:37.690
With Precision Time Protocol, I want it down to a tenth of a tenth of a second.

12:37.690 --> 12:42.340
And so it needs to be very precise in nature, and it needs to be fluid within the entire network.

12:42.340 --> 12:44.110
That's where PTP comes into play.

12:44.110 --> 12:47.560
And then we have something like GPS synchronization.

12:47.710 --> 12:52.480
When I worked in telecommunications in the cellular network, we would use GPS time synchronization

12:52.480 --> 12:57.730
because we had a lot of different satellite sites or sites that were off from the main server network,

12:57.760 --> 13:01.300
those different cellular towers all over God's green creation.

13:01.300 --> 13:07.060
Those all use GPS synchronization so that they could talk to one another and provide synchronization

13:07.060 --> 13:10.210
in order to report back to the main server architecture.

13:10.210 --> 13:14.440
And so we use GPS synchronization in comparison to NTP or PTP.

13:14.590 --> 13:18.270
Throughout this entire episode we talked about the different logging mechanisms.

13:18.270 --> 13:19.920
We talked about the importance of timing.

13:19.920 --> 13:25.380
We talked about how the different logs interface with one another, and how we can use a SIEM to digest

13:25.380 --> 13:28.170
and import those logs to give us a bigger picture.

13:28.170 --> 13:34.680
We really went through and identified logging at a baseline level versus logging at a level that may

13:34.710 --> 13:37.980
indicate different compromises or indicators of compromise.

13:37.980 --> 13:43.710
As we're going through the entire structure, logging really provides that forefront of where we are

13:43.740 --> 13:45.360
versus where we are

13:45.360 --> 13:49.380
when an incident occurs or when increased traffic occurs.

13:49.380 --> 13:55.500
It's important to realize that logging serves as a baseline indicator for our system as a whole, but

13:55.500 --> 14:01.590
it takes human intelligence and you as a SOC analyst to identify the different logging levels and not

14:01.590 --> 14:04.980
have a knee-jerk reaction where, hey, I just went up by one megabyte.

14:05.010 --> 14:06.540
Obviously, that's an incident.

14:06.600 --> 14:11.910
Human intelligence interacting with logging is what cybersecurity is all about, and how you can be

14:11.910 --> 14:17.640
a better analyst by understanding those different complexities of logging in your enterprise environment

14:17.640 --> 14:18.420
as a whole.
