WEBVTT

0
00:02.320 --> 00:10.990
In this lecture we'll talk about the properties and the use cases of hash algorithms. Though all cryptographic

1
00:10.990 --> 00:14.980
hash functions have different mathematical foundations,

2
00:14.980 --> 00:17.310
they do have the same properties.

3
00:18.440 --> 00:23.740
Let's consider a cryptographic hash function and see what its properties are.

4
00:23.750 --> 00:27.320
The first property is called determinism.

5
00:27.320 --> 00:36.020
This means that the output of a hash function doesn't change between executions if it runs on the same

6
00:36.080 --> 00:37.110
input.

7
00:37.160 --> 00:43.330
It doesn't matter when or on what machine you calculate the hash, it will be the same

8
00:43.430 --> 00:54.000
if you use the same algorithm and input. In these examples I've calculated the hash of the word Linux in two

9
00:54.050 --> 01:02.930
different ways: using the sha512sum command and using the openssl command; it's the

10
01:02.930 --> 01:10.460
same hash. If I calculate the hash using an online tool, I get the same result.
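
NOTE
A minimal sketch of the determinism check described above, assuming Python's standard hashlib module in place of the sha512sum and openssl commands shown in the lecture. The input is the word Linux without a trailing newline, which is an assumption about how the demo was run.
```python
import hashlib
# Hash the same input twice; a deterministic function must agree with itself.
data = b"Linux"  # assumed: no trailing newline
first = hashlib.sha512(data).hexdigest()
second = hashlib.sha512(data).hexdigest()
print(first)
print(first == second)
```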

11
01:13.100 --> 01:20.660
Another important property of hash functions is that two different inputs have, in practice, different hashes

12
01:21.200 --> 01:28.940
or we can say that if two inputs, like two files, have the same hash, they can be treated as equal.

13
01:29.270 --> 01:38.370
If a single bit changes in the input, it will produce an entirely different hash. Let's calculate the

14
01:38.370 --> 01:42.870
hash of the word computer using sha256

15
01:46.730 --> 01:53.430
computer and the algorithm is sha256.

16
01:53.460 --> 02:03.310
This is the hash; now let's calculate the hash of the word computer1 using the same algorithm.

17
02:03.310 --> 02:07.630
We notice that two different words have two different hashes.

18
02:07.870 --> 02:11.140
Each input has its own unique hash.

19
02:11.140 --> 02:19.320
Keep this in mind! Now, theoretically speaking, it's possible to have collisions and that means different

20
02:19.350 --> 02:25.580
inputs with the same hash; but with a good cryptographic hash function

21
02:25.590 --> 02:33.510
there is no known way of finding a collision, even with all the computing resources in the world.

22
02:33.510 --> 02:42.610
We'll discuss hash collisions in a dedicated lecture later in the course. A hash function is one-way,

23
02:42.880 --> 02:50.920
meaning that it’s computationally infeasible to determine the original message from its hash value. Hashes

24
02:50.920 --> 02:55.720
work based on an irreversible mathematical transformation.

25
02:55.720 --> 03:03.310
In other words, given a hash, it would be impossible or extremely difficult to determine the deterministic

26
03:03.400 --> 03:11.090
steps taken to reproduce the input of that hash. Look at this hash.

27
03:11.160 --> 03:18.780
There is no feasible way, even with enormous computational power, to work backwards to the file or the text that

28
03:18.780 --> 03:20.430
has produced this hash.

29
03:23.090 --> 03:31.150
Hash functions always produce a fixed-length output regardless of the size of the input.

30
03:31.160 --> 03:43.050
For example, SHA-256 always produces a hash of 256 bits and SHA-512 a hash of 512 bits.

31
03:43.100 --> 03:54.600
In these names, the number after the word SHA gives the length of the output hash in bits. Let's see some examples. I'll

32
03:54.600 --> 04:06.140
calculate the hash of two characters, a and b, using SHA256.

33
04:06.230 --> 04:14.490
This is the hash, which has a length of 256 bits or 64 hexadecimal characters.

34
04:14.540 --> 04:17.420
Now I'll calculate the hash of a phrase.

35
04:25.780 --> 04:35.610
And the output hash has the same length: 64 symbols in base 16. Let's

36
04:35.620 --> 04:44.690
see now the hash of a file. It doesn't matter how big the file is; its hash will have a length of

37
04:44.780 --> 04:49.920
256 bits or 64 hexadecimal characters.
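
NOTE
A small sketch of the fixed-length property described above, assuming Python's standard hashlib module rather than the sha256sum command used in the lecture. The sample phrase and the one-megabyte buffer are hypothetical stand-ins for the phrase and the file hashed in the demo.
```python
import hashlib
# Inputs of very different sizes; the phrase and the large buffer are illustrative only.
inputs = [b"ab", b"a much longer phrase than two characters", b"x" * 1_000_000]
for data in inputs:
    digest = hashlib.sha256(data).hexdigest()
    # Each hexadecimal character encodes 4 bits, so 64 characters = 256 bits.
    print(len(data), len(digest), len(digest) * 4)
# Collect the distinct digest lengths seen across all inputs.
lengths = {len(hashlib.sha256(d).hexdigest()) for d in inputs}
print(lengths)
```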

38
04:57.440 --> 05:00.520
Sorry, the command is sha256sum.

39
05:00.590 --> 05:03.450
Okay.

40
05:03.460 --> 05:04.390
Permission denied.

41
05:04.990 --> 05:08.630
Okay, let's choose another file, /etc/passwd

42
05:08.650 --> 05:09.280
or

43
05:15.010 --> 05:16.960
etc group/password.

44
05:19.560 --> 05:20.400
Or 

45
05:20.430 --> 05:21.740
/bin/ls

46
05:25.920 --> 05:27.670
and so on.

47
05:27.990 --> 05:31.050
We see that the hash is the same length.

48
05:31.110 --> 05:41.350
If I had a one-gigabyte file, its hash would be the same length. Another property of hash algorithms

49
05:41.410 --> 05:49.270
is the avalanche effect. Changing one bit in the input should create an avalanche effect and its

50
05:49.270 --> 05:54.420
result is an entirely different hash. Let's see an example!

51
05:56.790 --> 06:05.320
Let's take a bigger file, for example a dictionary file that comes with Kali Linux. Let's see how many

52
06:05.320 --> 06:08.470
characters and words are in that file.

53
06:11.350 --> 06:24.560
/usr/share/dict/american-english; there are almost 1 million characters. Let's calculate the hash

54
06:24.860 --> 06:25.800
of that file.

55
06:33.370 --> 06:37.750
This is the hash and I'll change a single character.

56
06:37.850 --> 06:40.330
I am opening the file using vim

57
06:44.070 --> 06:49.380
and at the end of the file I'll add a whitespace character; okay.

58
06:49.650 --> 06:50.840
And I'm saving the file.

59
06:53.600 --> 06:57.610
Okay, I must be root in order to be able to edit the file.

60
07:00.140 --> 07:00.820
No problem.

61
07:00.830 --> 07:01.660
It's very easy.

62
07:08.770 --> 07:13.900
So at the end I'm adding a whitespace character and I'm saving the file.

63
07:14.080 --> 07:15.430
Let's see its hash!

64
07:22.870 --> 07:24.650
And we notice that

65
07:24.670 --> 07:27.180
its hash is entirely different.

66
07:27.530 --> 07:29.410
It's in fact a new hash.
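
NOTE
A rough sketch of the change shown above, assuming Python's standard hashlib module and a short in-memory string instead of the dictionary file edited in the lecture. The helper differing_bits is a hypothetical name introduced only for this illustration. For a well-designed hash, roughly half of the output bits are expected to differ.
```python
import hashlib
def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))
original = b"computer"
modified = original + b" "  # append a single space, as in the demo
d1 = hashlib.sha256(original).digest()
d2 = hashlib.sha256(modified).digest()
print(d1.hex())
print(d2.hex())
print(differing_bits(d1, d2), "of 256 bits differ")
```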

67
07:29.410 --> 07:38.950
This is the avalanche effect. These five properties we've just discussed are the most important ones

68
07:39.070 --> 07:40.840
of any hash function.

69
07:41.260 --> 07:50.710
A good cryptographic hash function must also provide compression, efficiency and collision and preimage

70
07:50.830 --> 07:58.580
resistance. Compression means that the length of the output should be small enough.

71
07:58.620 --> 08:06.150
Efficiency means that the output hash should be computed easily enough in terms of time and hardware

72
08:06.300 --> 08:12.940
resources, and collision and preimage resistance mean that the function must be secure enough.

73
08:12.990 --> 08:20.310
I'll go deeper into these two properties in a dedicated lecture where we'll talk about the attacks on

74
08:20.310 --> 08:21.540
hash algorithms.