1 00:00:00,490 --> 00:00:07,720 Although we have not yet formally defined malware analysis, it is basically the process 2 00:00:07,720 --> 00:00:10,120 of understanding what a piece of malware does. 3 00:00:11,430 --> 00:00:14,820 Malware is developed by a hacker using some programming language. 4 00:00:15,540 --> 00:00:21,390 The easiest way to understand what an application does is to read its code line by line. 5 00:00:22,390 --> 00:00:28,690 But as we discussed earlier, it is not the actual file with the set of instructions that will be downloaded 6 00:00:28,690 --> 00:00:29,620 by the end user. 7 00:00:30,230 --> 00:00:34,940 Instead, it will be a compiled version, that is, a binary file. 8 00:00:36,280 --> 00:00:42,700 So common sense says, if you want to understand what the program does, then read the instructions 9 00:00:42,700 --> 00:00:43,970 in the binary file. 10 00:00:44,800 --> 00:00:46,610 However, it is not so easy. 11 00:00:47,290 --> 00:00:54,700 The process of compilation changes the set of line-by-line instructions into a machine-readable format. 12 00:00:56,040 --> 00:01:03,870 If we open a binary file in a text editor, this is what we get: most of it looks like gibberish, except 13 00:01:03,870 --> 00:01:07,980 for a few traces of readable words, which we call strings. 14 00:01:09,840 --> 00:01:11,500 Compare it to a real-life analogy. 15 00:01:12,240 --> 00:01:14,200 Let's bring back our pizza example. 16 00:01:15,150 --> 00:01:21,350 Before the pizza is cooked, we can easily identify all the ingredients that go into making the pizza. 17 00:01:22,230 --> 00:01:27,210 But once it undergoes the process of cooking, how much of the ingredients can we get back? 18 00:01:27,990 --> 00:01:30,210 The raw materials would have changed their form. 19 00:01:30,540 --> 00:01:33,470 Ingredients like salt will not be visible. 20 00:01:33,930 --> 00:01:38,580 That is, it is almost impossible to get the ingredients back once the pizza is cooked.
21 00:01:40,220 --> 00:01:46,550 Similarly, once the program is written and compiled, it is very difficult, if not impossible, to 22 00:01:46,550 --> 00:01:49,430 get back the original set of line-by-line instructions. 23 00:01:50,150 --> 00:01:53,120 This is why malware analysis seems difficult. 24 00:01:53,690 --> 00:01:59,750 The process involves the use of various tools and skills that will help in understanding the function 25 00:01:59,750 --> 00:02:02,650 of a piece of malware from such binary files.
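
As a side note to the point about strings: the few readable words that survive compilation can be pulled out of a binary automatically instead of by scrolling through a text editor. The snippet below is only a minimal sketch of that idea and is not a tool shown in the lecture. The function name `extract_strings`, the minimum length of 4, and the sample bytes are all assumptions made up for illustration, and only plain ASCII text is considered.

```python
import re


def extract_strings(data: bytes, min_length: int = 4) -> list[str]:
    """Return runs of printable ASCII characters found in raw bytes.

    Hypothetical helper for illustration: a run counts as a "string"
    only if it has at least min_length consecutive printable characters.
    """
    # \x20-\x7e is the printable ASCII range (space through tilde).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]


# Made-up bytes standing in for a compiled file: mostly unreadable,
# with a couple of readable traces embedded in it.
sample = (
    b"\x00\x01MZ\x90\x00"
    b"http://example.test/payload\x00"
    b"\xff\xfeab\x00"
    b"CreateFileA\x00"
)

found = extract_strings(sample)
for text in found:
    print(text)
```

To try this on a real file, the bytes could be read with `open(path, "rb").read()` and passed to the same function. The short runs `MZ` and `ab` are skipped because they fall below the assumed minimum length, which is how this kind of filter keeps most of the gibberish out.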