WEBVTT

00:00:01.210 --> 00:00:05.690
So far, we've been looking at cryptography as a way to protect data,

00:00:05.690 --> 00:00:09.180
either as we transmit it or as we store it.

00:00:09.490 --> 00:00:12.820
And we looked at two main types of processes,

00:00:12.820 --> 00:00:17.390
protecting it using a symmetric algorithm or protecting its

00:00:17.390 --> 00:00:21.070
confidentiality using an asymmetric algorithm.

00:00:21.530 --> 00:00:24.690
The problem with asymmetric is it tends to be slow.

00:00:24.700 --> 00:00:28.420
So you don't want to use it for a large message, for example.

00:00:29.140 --> 00:00:32.479
But there's another benefit we have with asymmetric that we

00:00:32.479 --> 00:00:34.570
didn't have with symmetric algorithms,

00:00:34.570 --> 00:00:38.630
and that is the ability to provide for proof of origin.

00:00:38.900 --> 00:00:41.130
Who did this message come from?

00:00:41.800 --> 00:00:47.080
So let's say Alice is going to send a copy of a signed contract to Bob.

00:00:47.940 --> 00:00:50.620
She can take that message,

00:00:50.670 --> 00:00:54.860
run it through the same asymmetric algorithm, but this time,

00:00:54.860 --> 00:00:57.860
she's going to encrypt it with her private key,

00:00:58.090 --> 00:01:02.440
the key she keeps secret and nobody else would have. That

00:01:02.440 --> 00:01:06.330
will generate ciphertext from that message that can now be

00:01:06.330 --> 00:01:08.450
sent through that insecure channel.

00:01:08.800 --> 00:01:14.000
And when Bob receives that, he can run that through the same crypto system

00:01:14.000 --> 00:01:19.170
with the same asymmetric algorithm, but the only key in the world that will

00:01:19.170 --> 00:01:22.790
decrypt this message would be Alice's public key.

00:01:23.180 --> 00:01:28.280
He knows that Alice's public key is linked to Alice's private key.

00:01:28.680 --> 00:01:32.680
So therefore, if he can decrypt it and read the message, he

00:01:32.680 --> 00:01:35.280
knows it had to have come from Alice.

00:01:35.840 --> 00:01:38.950
Now the problem with this, of course, is that anyone with

00:01:38.950 --> 00:01:41.530
Alice's public key could actually read it.

00:01:41.930 --> 00:01:45.630
But in the case here, we're not talking about confidentiality.

00:01:45.810 --> 00:01:49.210
We're talking about who sent it, proof of origin.

00:01:49.700 --> 00:01:53.620
This allows us to establish something we talked about way back in

00:01:53.620 --> 00:01:59.780
the first course, Nonrepudiation. Alice cannot deny having sent it

00:01:59.790 --> 00:02:02.280
because it was opened with her public key.

00:02:02.290 --> 00:02:05.510
It had to have been encrypted with her private key.

00:02:07.670 --> 00:02:11.100
This is very much linked to another process we're going to

00:02:11.100 --> 00:02:16.580
look at here now called message integrity. Message integrity

00:02:16.590 --> 00:02:19.280
uses message authentication codes.

00:02:19.280 --> 00:02:22.250
In other words, the message is authentic.

00:02:22.250 --> 00:02:24.510
It has not been changed or altered.

00:02:26.320 --> 00:02:29.940
That means that we can store data such as a log.

00:02:29.950 --> 00:02:34.710
We have a MAC for it. And we know when we go back to look at it again,

00:02:34.930 --> 00:02:38.630
there has not been a bit that's been changed or something

00:02:38.630 --> 00:02:40.660
which has been altered from the original.

00:02:41.130 --> 00:02:45.270
We can also use this in transmission where we know now that the

00:02:45.270 --> 00:02:49.830
message was not affected by, say, noise or any interruption on the

00:02:49.830 --> 00:02:52.030
network as it was being transmitted.

00:02:53.220 --> 00:02:55.410
There's a number of different types of message

00:02:55.410 --> 00:02:57.770
authentication codes we've used over the years.

00:02:58.190 --> 00:03:01.750
You're familiar with many of them. If you ever used an old dial-up modem,

00:03:01.750 --> 00:03:04.350
you'd set things up for parity bits.

00:03:05.180 --> 00:03:07.280
We know that there have been checksums.

00:03:07.280 --> 00:03:10.880
We often saw this when we send a large file, and we'd have trailer

00:03:10.880 --> 00:03:15.180
records that would have a total of the number of transactions and the

00:03:15.190 --> 00:03:18.320
dollar value of those transactions in that record.

00:03:18.950 --> 00:03:24.030
And we had things like cyclic redundancy checks built in. It used to be that

00:03:24.030 --> 00:03:28.690
when you loaded a game off of a disk, it would check whether there were any

00:03:28.690 --> 00:03:32.440
errors using what we call the CRC-32 check.

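NOTE
A minimal Python sketch of the older error checks just mentioned: a parity bit and a CRC-32.
The helper names and sample data are assumptions for illustration, and zlib.crc32 stands in
for whatever CRC routine a particular disk or modem actually used.
```python
import zlib
def even_parity_bit(data: bytes) -> int:
    """Return the extra bit that makes the total count of 1 bits even."""
    ones = sum(bin(byte).count("1") for byte in data)
    return ones % 2
def crc32_of(data: bytes) -> int:
    """Return the CRC-32 of data as an unsigned 32-bit integer."""
    return zlib.crc32(data) & 0xFFFFFFFF
original = b"game data block"  # hypothetical sample data
corrupted = bytes([original[0] ^ 0x01]) + original[1:]  # flip a single bit
# Both checks change when one bit flips, so the error is detected.
print(even_parity_bit(original), even_parity_bit(corrupted))
print(hex(crc32_of(original)), hex(crc32_of(corrupted)))
```
These checks catch accidental errors such as line noise. They are not designed to resist
someone deliberately altering the data, which is where hash functions come in.
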
00:03:33.360 --> 00:03:38.400
But the main way we do this today is using a hash function, a hash

00:03:38.400 --> 00:03:44.080
function such as Message Digest 5, Secure Hash Algorithm 1, 2, or

00:03:44.080 --> 00:03:48.280
3, or the European Standard RIPEMD‑160.

00:03:49.190 --> 00:03:52.260
We also have something called HMAC, a hash-based MAC, which is

00:03:52.260 --> 00:03:54.430
sometimes also called keyed hashing.

00:03:54.640 --> 00:03:58.330
So these are the main hash functions that we're using today.

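NOTE
A short sketch of keyed hashing using Python's standard hmac and hashlib modules.
The shared key, the message, and the verify helper are assumptions made up for this example.
The choice of SHA-256 as the underlying hash is also an assumption.
```python
import hashlib
import hmac
shared_key = b"hypothetical-shared-secret"  # assumption: sender and receiver both hold this
message = b"log entry: user alice logged in"
# Sender computes a tag over the message using the shared key.
tag = hmac.new(shared_key, message, hashlib.sha256).hexdigest()
def verify(key: bytes, msg: bytes, received_tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = hmac.new(key, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, received_tag)
print(verify(shared_key, message, tag))         # unchanged message, right key
print(verify(shared_key, message + b"!", tag))  # altered message
print(verify(b"wrong-key", message, tag))       # wrong key
```
Only someone holding the key can produce a matching tag, so a changed message or a
wrong key makes verification fail.
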
00:03:59.230 --> 00:04:04.470
We use these very often either to just prove that a message

00:04:04.470 --> 00:04:09.700
hasn't been changed or to also prove who the message came from

00:04:09.800 --> 00:04:11.850
and that the message was not changed.

00:04:12.290 --> 00:04:15.200
This is something we call a digital signature. We'll

00:04:15.200 --> 00:04:16.950
look at that in just a second.

00:04:18.149 --> 00:04:20.540
How does a hash function work?

00:04:21.000 --> 00:04:27.000
We take our message, and we run it through a system that has within it a hashing

00:04:27.010 --> 00:04:32.250
algorithm. That generates something we will call a digest.

00:04:32.470 --> 00:04:36.080
It's sometimes called a thumbprint or a fingerprint.

00:04:36.300 --> 00:04:41.040
But anyway, it's a value that was calculated from that message.

00:04:41.960 --> 00:04:46.840
We can append that digest or attach it to the message itself.

00:04:47.800 --> 00:04:51.670
Then, we send the message through our transmission channel.

00:04:52.010 --> 00:04:53.800
And when it's received,

00:04:54.270 --> 00:04:59.550
the receiver will now have the message that they will hash. They will make

00:04:59.550 --> 00:05:04.140
sure that the digest from the message they received is the same as the

00:05:04.140 --> 00:05:06.900
digest that was sent along with the message.

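NOTE
The hash, append, and compare flow described here could be sketched as follows.
The send and receive helpers and the packet layout (message followed by a 32-byte
SHA-256 digest) are assumptions for illustration, not a real protocol.
```python
import hashlib
def send(message: bytes) -> bytes:
    """Sender: append the SHA-256 digest (32 bytes) to the message."""
    return message + hashlib.sha256(message).digest()
def receive(packet: bytes) -> bytes:
    """Receiver: split off the digest, re-hash the message, and compare."""
    message, sent_digest = packet[:-32], packet[-32:]
    if hashlib.sha256(message).digest() != sent_digest:
        raise ValueError("digest mismatch: message was changed")
    return message
packet = send(b"Pay Bob 100 dollars")
print(receive(packet))  # digests match, so the message is returned
```
A plain digest like this detects accidental changes. Anyone who can alter the message
can also recompute the digest, which is why a key or a signature is added when the
sender must be proven as well.
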
00:05:07.450 --> 00:05:12.470
This is very good because these hashing algorithms are very sensitive to any

00:05:12.470 --> 00:05:19.650
change. Even if one bit was flipped in transmission, about half the bits of

00:05:19.650 --> 00:05:24.500
the digest would, on average, be different from what they had been originally.

00:05:24.760 --> 00:05:28.480
So a massive change in the digest would be noticed.

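NOTE
One way to see how much of the digest changes when a single input bit flips is to count
the differing bits. This is a rough sketch: the sample message and the bit_difference
helper are assumptions, and SHA-256 is chosen only as an example hash.
```python
import hashlib
def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions where two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))
message = b"The quick brown fox"
flipped = bytes([message[0] ^ 0x01]) + message[1:]  # flip one input bit
d1 = hashlib.sha256(message).digest()
d2 = hashlib.sha256(flipped).digest()
changed = bit_difference(d1, d2)
print(f"{changed} of 256 digest bits differ ({changed / 256:.0%})")
```
For a well-designed hash the count should land near half of the digest bits, though
the exact figure varies from message to message.
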
00:05:28.910 --> 00:05:34.580
And so this is one good way just to verify that there was not an error in,

00:05:34.580 --> 00:05:39.940
say, communications. But we talked about a digital signature.

00:05:40.510 --> 00:05:45.480
This is certainly a form of, not only message integrity, but

00:05:45.480 --> 00:05:48.380
what we looked at before, nonrepudiation.

00:05:49.360 --> 00:05:53.200
We can take that message and run it through a hashing algorithm.

00:05:53.760 --> 00:05:55.800
We create the digest.

00:05:55.810 --> 00:06:07.120
Maybe we use, for example, SHA‑2 for this. SHA‑2 generates a digest of 256,

00:06:07.120 --> 00:06:10.980
384, or 512 bits depending on which setting we're using.

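NOTE
The digest lengths mentioned here can be checked with Python's hashlib. A small sketch,
assuming the hashlib names sha256, sha384, and sha512 for the three settings; the sample
input is made up.
```python
import hashlib
data = b"contract text"  # hypothetical input; the digest length does not depend on it
sizes = {}
for name in ("sha256", "sha384", "sha512"):
    digest = hashlib.new(name, data).digest()
    sizes[name] = len(digest) * 8  # bytes to bits
    print(name, sizes[name], "bits")
```
The output length is fixed by the algorithm, no matter how long the message is.
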
00:06:12.130 --> 00:06:15.350
We then take that digest and encrypt it.

00:06:15.940 --> 00:06:20.280
This time, we encrypt it with the sender's private key.

00:06:20.740 --> 00:06:24.600
Just like we did when Alice was sending the message to Bob,

00:06:24.610 --> 00:06:28.520
she encrypted that message with her private key.

00:06:29.220 --> 00:06:33.690
And when I encrypt that digest with a private key,

00:06:33.890 --> 00:06:38.470
I then generate something very special, a digital signature.

00:06:38.900 --> 00:06:42.860
A digital signature is a combination of a hash

00:06:42.860 --> 00:06:45.810
function and asymmetric cryptography.

00:06:46.540 --> 00:06:51.690
And that means that I can now send this message through a completely

00:06:51.700 --> 00:06:57.930
untrusted channel. When it's received at the far end, the person who

00:06:57.930 --> 00:07:01.640
receives it has this message and the digital signature.

00:07:02.410 --> 00:07:06.740
They can decrypt the digital signature, and we know the only key in the world

00:07:06.740 --> 00:07:09.840
that would decrypt it would be the sender's public key.

00:07:11.090 --> 00:07:15.130
They can take the message they received, and they can hash it using

00:07:15.130 --> 00:07:19.720
the same hashing algorithm as the sender used. That will generate

00:07:19.730 --> 00:07:22.500
the digest of the message received.

00:07:23.140 --> 00:07:27.470
They can compare that with the digest that had been

00:07:27.470 --> 00:07:29.880
decrypted from the digital signature.

00:07:30.300 --> 00:07:35.410
And then we would know two things, one, that this message truly

00:07:35.410 --> 00:07:41.350
came from that sender and, two, that the message was not altered as

00:07:41.350 --> 00:07:43.690
it was transmitted through the network.

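NOTE
The sign and verify steps can be sketched with textbook RSA on a toy key. Everything
here is an assumption for illustration: the tiny key (built from the primes 61 and 53)
is far too small to be secure, and the helper names are made up. Real systems use large
keys and a padding scheme through a vetted library.
```python
import hashlib
# Toy key for illustration only: modulus, public exponent, private exponent.
N, E, D = 3233, 17, 2753
def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce it so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N
def sign(message: bytes) -> int:
    """Sender: transform the digest with the private exponent."""
    return pow(digest_as_int(message), D, N)
def verify(message: bytes, signature: int) -> bool:
    """Receiver: recover the digest with the public exponent and compare."""
    return pow(signature, E, N) == digest_as_int(message)
contract = b"Alice agrees to sell Bob one widget."
signature = sign(contract)
print(verify(contract, signature))  # the unaltered message verifies
```
With this toy modulus a changed message would still slip through about once in 3233
tries, which is one reason real keys are thousands of bits long.
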
00:07:44.670 --> 00:07:49.200
Now, you can notice that this does not provide confidentiality.

00:07:49.200 --> 00:07:51.410
The message itself was not encrypted.

00:07:52.190 --> 00:07:56.790
This is intended as a digital signature just to provide

00:07:56.910 --> 00:08:00.790
proof of origin and message integrity.

00:08:00.860 --> 00:08:06.280
So we have nonrepudiation. That message came from that person.

00:08:06.620 --> 00:08:09.560
Nobody could have altered or changed that.

00:08:10.230 --> 00:08:15.150
And this is a very useful tool, of course, in today's world.

00:08:15.220 --> 00:08:20.750
And just to stress one thing here, a digital signature is created by

00:08:20.750 --> 00:08:24.870
encrypting a hash of the message with a private key.

00:08:25.570 --> 00:08:27.950
It has nothing to do with, say, for example,

00:08:27.950 --> 00:08:31.470
encrypting the message itself, nor does it have anything to

00:08:31.470 --> 00:08:35.510
do with a digitized signature, which is just basically a

00:08:35.510 --> 00:08:37.620
scan of a person's autograph.

00:08:38.500 --> 00:08:40.880
It's a very useful tool we use here.

00:08:42.409 --> 00:08:49.090
Let's review the key points. Hashing is primarily used to ensure integrity of data.

00:08:49.620 --> 00:08:51.670
It is a one‑way function as well.

00:08:52.050 --> 00:08:58.170
So if I have a hash of a message, it would be computationally infeasible to

00:08:58.170 --> 00:09:02.910
figure out what the actual message was just from the hash.

00:09:03.720 --> 00:09:09.570
It's very sensitive to even the smallest changes to the original message.
