1 00:00:00,150 --> 00:00:07,170 Now, this process of determining how frequently a letter appears in a plaintext or a ciphertext 2 00:00:07,170 --> 00:00:13,590 is called frequency analysis. Now, understanding frequency analysis is an important step in hacking 3 00:00:13,590 --> 00:00:14,760 the Vigenère cipher. 4 00:00:15,210 --> 00:00:21,480 So we will use letter frequency analysis to break the Vigenère cipher in the next session. 5 00:00:21,810 --> 00:00:27,360 So in this particular session, we are going to cover letter frequency, the sort method's 6 00:00:27,360 --> 00:00:30,600 key and reverse keyword arguments, 7 00:00:30,930 --> 00:00:32,670 passing functions as values 8 00:00:32,670 --> 00:00:39,180 instead of calling the functions, and converting dictionaries to lists using the keys, values and items 9 00:00:39,180 --> 00:00:39,680 methods. 10 00:00:40,230 --> 00:00:44,100 So let's understand analyzing the frequency of letters in a text. 11 00:00:44,410 --> 00:00:49,370 Now, when you flip a coin, about half the time it comes up heads and half the time it comes up 12 00:00:49,380 --> 00:00:49,860 tails. 13 00:00:50,310 --> 00:00:54,350 That is, the frequency of heads and tails should be about the same. 14 00:00:54,840 --> 00:01:02,370 We can represent the frequency as a percentage by dividing the total number of times an event occurs by the total 15 00:01:02,370 --> 00:01:08,670 number of attempts at that particular event and then multiplying the quotient by one hundred. So we can 16 00:01:08,670 --> 00:01:15,390 learn much about a coin from its frequency of heads and tails: whether the coin is fair or unfairly 17 00:01:15,390 --> 00:01:18,080 weighted, or even if it has two heads. 18 00:01:18,510 --> 00:01:23,430 We can also learn much about a ciphertext from the frequency of its letters. 19 00:01:23,760 --> 00:01:27,870 Some letters in the English alphabet are used more often than others.
20 00:01:28,140 --> 00:01:37,140 For example, the letters E, T, A and O appear most frequently in English words, whereas the letters J, X, 21 00:01:37,140 --> 00:01:37,380 Q and Z 22 00:01:37,380 --> 00:01:39,440 appear less frequently in English. 23 00:01:39,840 --> 00:01:45,720 So we'll use these differences in the letter frequencies of the English language to crack Vigenère- 24 00:01:45,720 --> 00:01:46,920 encrypted messages. 25 00:01:47,400 --> 00:01:49,800 Now we will see a graph. 26 00:01:50,130 --> 00:01:57,750 Basically, you can build one by compiling text from various sources for the frequency analysis, and then 27 00:01:57,960 --> 00:02:03,690 you can sort those letter frequencies in order from the greatest frequency to the least frequency, 28 00:02:03,690 --> 00:02:04,380 for example. 29 00:02:04,500 --> 00:02:12,060 OK, now see, likewise, the letters that appear most often in a Caesar ciphertext and a simple substitution 30 00:02:12,060 --> 00:02:18,120 ciphertext are more likely to have been encrypted from the most commonly found English letters, like 31 00:02:18,120 --> 00:02:19,800 E, T or A. 32 00:02:19,800 --> 00:02:25,230 Similarly, the letters that appear least often in the ciphertext are more likely to have been encrypted 33 00:02:25,230 --> 00:02:28,760 from X, Q or Z, for example, in the plaintext. 34 00:02:29,080 --> 00:02:35,730 So now we come to matching letter frequencies. To find the letter frequencies in a message, we will 35 00:02:35,730 --> 00:02:43,020 use an algorithm that simply orders the letters in a string from the highest frequency to the lowest frequency. 36 00:02:43,380 --> 00:02:50,400 Then the algorithm uses this ordered string to calculate what in this particular section is called the frequency 37 00:02:50,400 --> 00:02:57,150 match score, which we will use to determine how similar the string's letter frequency is to that of 38 00:02:57,150 --> 00:02:58,140 standard English.
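To make the ordering step a little more concrete, here is a minimal Python sketch of counting letters and ordering them from most to least frequent. The function names get_letter_count and get_frequency_order, and the choice to break ties alphabetically, are assumptions made for illustration only; they are not taken from the course's source file.

```python
import string
from collections import Counter


def get_letter_count(message):
    """Return a dict mapping each letter A-Z to how often it occurs in message."""
    counts = Counter(ch for ch in message.upper() if ch in string.ascii_uppercase)
    # Include letters that never occur, so the dict always has 26 entries.
    return {letter: counts.get(letter, 0) for letter in string.ascii_uppercase}


def get_frequency_order(message):
    """Return the 26 letters ordered from most frequent to least frequent."""
    counts = get_letter_count(message)
    # Assumption: ties are broken alphabetically. The key function is passed
    # as a value to sorted(); it is not called here.
    ordered = sorted(counts, key=lambda letter: (-counts[letter], letter))
    return "".join(ordered)


if __name__ == "__main__":
    print(get_frequency_order("zzzyyx"))
```

With a real ciphertext the tie-breaking rule matters less, because most letters will have distinct counts.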
39 00:02:58,410 --> 00:03:04,470 So to calculate the frequency match score for the ciphertext, we start with zero and then add a point 40 00:03:04,470 --> 00:03:05,220 each time 41 00:03:05,520 --> 00:03:12,810 one of the most frequent English letters, E, T, A, O, I or N, appears among the six most 42 00:03:12,810 --> 00:03:19,650 frequent letters of the ciphertext. We will also add a point to the score each time one of the least frequent 43 00:03:19,650 --> 00:03:23,070 letters, V, K, J, X, Q or Z, appears 44 00:03:23,310 --> 00:03:30,480 among the six least frequent letters of the ciphertext. The frequency match score of a string 45 00:03:30,480 --> 00:03:32,610 can range from zero to twelve. 46 00:03:32,940 --> 00:03:38,370 Knowing the frequency match score of a ciphertext can reveal important information about the original 47 00:03:38,370 --> 00:03:39,100 plaintext. 48 00:03:39,510 --> 00:03:48,570 So, for example, take using frequency analysis on a Vigenère cipher. To hack the Vigenère cipher, 49 00:03:48,570 --> 00:03:51,570 we need to decrypt the subkeys individually. 50 00:03:51,930 --> 00:03:59,670 That means we can't rely on using English word detection, because we won't be able to decrypt enough 51 00:03:59,670 --> 00:04:01,890 of the message using just one subkey. 52 00:04:02,080 --> 00:04:08,970 Instead, we will decrypt the letters encrypted with one subkey and perform frequency analysis to 53 00:04:08,970 --> 00:04:15,950 determine which decrypted ciphertext produces a letter frequency that most closely matches that of 54 00:04:15,960 --> 00:04:16,860 regular English. 55 00:04:17,160 --> 00:04:21,720 In other words, we will need to find which decryption has the highest frequency 56 00:04:21,730 --> 00:04:26,510 match score, which is a good indication that we have found the correct subkey. 57 00:04:26,940 --> 00:04:31,040 We repeat this process for the second, third, fourth and fifth subkeys as well.
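The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name english_freq_match_score is hypothetical, the input is assumed to be a 26-letter string already ordered from most to least frequent, and the two reference groups are the commonly cited ETAOIN and VKJXQZ.

```python
MOST_FREQUENT = "ETAOIN"   # six most frequent letters in standard English
LEAST_FREQUENT = "VKJXQZ"  # six least frequent letters in standard English


def english_freq_match_score(freq_order):
    """Return a score from 0 to 12 for a 26-letter frequency ordering.

    freq_order is assumed to list the letters of the text from most
    frequent to least frequent.
    """
    freq_order = freq_order.upper()
    if len(freq_order) != 26:
        raise ValueError("freq_order must contain exactly 26 letters")

    top_six = freq_order[:6]
    bottom_six = freq_order[-6:]

    score = 0
    # One point for each common English letter found among the top six.
    score += sum(1 for letter in MOST_FREQUENT if letter in top_six)
    # One point for each rare English letter found among the bottom six.
    score += sum(1 for letter in LEAST_FREQUENT if letter in bottom_six)
    return score


if __name__ == "__main__":
    print(english_freq_match_score("ETAOINSHRDLCUMWFGYPBVKJXQZ"))
```

A decryption whose ordering scores close to 12 looks a lot like English, while a score near 0 suggests the wrong subkey was tried.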
58 00:04:31,650 --> 00:04:37,590 So just for now, we are just guessing that the key length is five letters. Because there are twenty-six 59 00:04:37,590 --> 00:04:39,520 decryptions for each subkey, 60 00:04:39,540 --> 00:04:46,960 the computer only has to perform twenty-six plus twenty-six plus twenty-six plus twenty-six plus 61 00:04:46,960 --> 00:04:51,960 twenty-six, and that is one hundred thirty decryptions for the five-letter 62 00:04:51,960 --> 00:04:58,890 key. So this is much easier than performing decryptions for every possible combination, which would be 63 00:04:58,890 --> 00:04:59,370 exactly 64 00:04:59,910 --> 00:05:08,160 11,881,376. So there are some more steps to hack the Vigenère cipher, 65 00:05:08,790 --> 00:05:10,890 which we will learn in the next session, obviously, 66 00:05:10,890 --> 00:05:17,220 when we write the hacking program. But for now, let's write a module that will perform frequency analysis 67 00:05:17,220 --> 00:05:19,560 using the following helper functions. 68 00:05:19,560 --> 00:05:25,350 First is get letter count, which will take a string parameter and return a dictionary that 69 00:05:25,350 --> 00:05:29,560 has a count of how often each letter appears in the string. 70 00:05:29,880 --> 00:05:35,640 Second is get frequency order, which takes a string parameter and returns a string of twenty-six 71 00:05:35,640 --> 00:05:40,890 letters ordered from most frequent to least frequent in the string parameter. 72 00:05:40,920 --> 00:05:46,260 And finally, English frequency match score, which takes a string parameter and returns an integer from 73 00:05:46,260 --> 00:05:50,610 zero to 12 indicating the letters' frequency match score. 74 00:05:50,820 --> 00:05:56,630 So that is the background of what frequency analysis will do in the next session. 75 00:05:56,650 --> 00:06:01,050 Now we would start by creating the source code for matching letter frequencies.
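The decryption counts for a guessed five-letter key can be checked with a short Python sketch. The function names here are made up for illustration. Testing each subkey on its own costs 26 decryptions per key position, which works out to 130 for five letters, while trying every full key costs 26 multiplied by itself five times.

```python
ALPHABET_SIZE = 26


def decryptions_per_subkey(key_length):
    """Decryptions needed when each subkey is tested independently."""
    return ALPHABET_SIZE * key_length


def decryptions_brute_force(key_length):
    """Decryptions needed when every possible full key is tried."""
    return ALPHABET_SIZE ** key_length


if __name__ == "__main__":
    key_length = 5  # the guessed key length used in this session
    print(decryptions_per_subkey(key_length))   # 130
    print(decryptions_brute_force(key_length))  # 11881376
```

The gap between the two numbers grows quickly with the key length, which is why scoring one subkey at a time is so much cheaper.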
76 00:06:01,320 --> 00:06:03,550 We would create a separate file for it. 77 00:06:03,600 --> 00:06:10,020 And after creating the file, we will also understand how the file has been created, which 78 00:06:10,020 --> 00:06:16,080 will be helpful for us in the next session to know what the other techniques of hacking 79 00:06:16,080 --> 00:06:17,690 the Vigenère cipher are. 80 00:06:17,880 --> 00:06:23,760 So we will see how to write a particular frequency analysis program in the next session. 81 00:06:23,920 --> 00:06:25,860 That's it from this session for now. 82 00:06:26,220 --> 00:06:28,150 We will see you in the next one. 83 00:06:28,320 --> 00:06:29,310 Thank you very much.