1 00:00:00,420 --> 00:00:06,630 Before we head into analysing a sample, let us discuss the scenarios that bring us to the point 2 00:00:06,630 --> 00:00:09,570 of malware analysis. Scenario one. 3 00:00:10,550 --> 00:00:15,380 A user suspects that she has received an email which appears to be a phishing email. 4 00:00:16,460 --> 00:00:23,510 The user has reported this email as phishing, and it has reached the common mailbox used by the SOC team 5 00:00:23,510 --> 00:00:27,660 to capture all phishing emails. During analysis, 6 00:00:27,680 --> 00:00:33,920 you notice that this particular email has a file attachment named Invoice.xlsx. 7 00:00:35,390 --> 00:00:37,430 This will lead to malware analysis. 8 00:00:38,710 --> 00:00:39,640 Scenario two. 9 00:00:40,670 --> 00:00:47,390 The SOC team has detected that one of the systems is communicating with a blacklisted IP address, and upon 10 00:00:47,390 --> 00:00:56,440 detailed inspection, they suspect that it is caused by a process which is running as 11 00:00:57,170 --> 00:00:57,560 invoice.xls.exe. 12 00:00:59,620 --> 00:01:04,960 This sample is collected in a password-protected archive and sent to you for malware analysis. 13 00:01:06,100 --> 00:01:13,060 We have already built our malware analysis lab, or sandbox. We have used VirtualBox 14 00:01:13,060 --> 00:01:19,420 as the hypervisor with the Windows 7 operating system, and downloaded all the tools necessary for basic static 15 00:01:19,420 --> 00:01:27,520 analysis, tools like HashCalc, Exeinfo PE, UPX, BinText and PEStudio. 16 00:01:28,510 --> 00:01:30,850 Now let's put these tools into action. 17 00:01:32,110 --> 00:01:36,770 As a best practice, we should always collect the sample as a password-protected zip. 18 00:01:38,220 --> 00:01:43,050 In the previous demo, I have already downloaded the sample onto the sandbox. 19 00:01:44,600 --> 00:01:49,010 I will double check that there is no Internet connectivity by running ipconfig. 
20 00:01:50,290 --> 00:01:57,490 If you recall, we have actually disabled the network interface. I will now extract the sample. The 21 00:01:57,490 --> 00:02:02,830 password most people use to compress a malware file is "infected". 22 00:02:03,750 --> 00:02:06,480 I will enter the password to extract the sample. 23 00:02:07,420 --> 00:02:11,050 And here we see the invoice.xlsx file, 24 00:02:13,060 --> 00:02:15,940 as we discussed in scenarios one and two. 25 00:02:17,740 --> 00:02:19,630 For a common user, it looks 26 00:02:20,860 --> 00:02:22,750 to be a simple spreadsheet file. 27 00:02:24,090 --> 00:02:31,440 And when files with such names are sent to people in finance, procurement or logistics teams, they 28 00:02:31,440 --> 00:02:37,950 are bound to double click and open the file, as they would frequently be receiving such files: invoices, 29 00:02:37,950 --> 00:02:42,350 financial statements, order details, purchase orders, etc. 30 00:02:44,010 --> 00:02:48,600 Because we are doing a static analysis, we have to work through this file without running it. 31 00:02:49,850 --> 00:02:54,960 So first, I will tell the operating system to show file extensions. 32 00:02:57,250 --> 00:03:04,250 Immediately, we notice that it is not an xls file, but an exe file. 33 00:03:05,470 --> 00:03:11,410 And if I change the folder view to details, we get to know it is an application, 34 00:03:12,880 --> 00:03:17,140 which is obviously an executable and definitely not a spreadsheet. 35 00:03:18,680 --> 00:03:22,190 This file is already showing the signs of malware. 36 00:03:24,240 --> 00:03:27,780 But let's proceed further and get the file hash of this malware. 37 00:03:29,160 --> 00:03:31,440 We will use the HashCalc tool 38 00:03:32,520 --> 00:03:33,900 that we have already installed. 39 00:03:35,380 --> 00:03:39,520 I have enabled the MD5, SHA1 and SHA256 algorithms. 
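The hashing step that HashCalc performs in this demo can also be scripted. The following is a minimal sketch using Python's standard hashlib module; the function name file_hashes and the chunk size are assumptions made for illustration and are not part of the course material.

```python
import hashlib


def file_hashes(path, chunk_size=65536):
    """Return MD5, SHA-1 and SHA-256 hex digests for the file at `path`.

    The file is only read as bytes; it is never executed.
    """
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as handle:
        # Read in chunks so large samples do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}
```

Calling file_hashes on the extracted sample inside the sandbox would give the same three values that HashCalc displays, and any one of them can be used for a reputation lookup from the host machine.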
40 00:03:41,840 --> 00:03:47,960 We just need to drag and drop the sample onto the tool and instantly we have the hash values for each 41 00:03:47,960 --> 00:03:48,520 algorithm. 42 00:03:49,940 --> 00:03:52,430 Let's take this hash and check it in VirusTotal. 43 00:03:53,810 --> 00:03:59,590 Remember, we do not have Internet in the sandbox, so I have to use my host machine to do so. 44 00:04:00,780 --> 00:04:02,550 Let's visit virustotal.com. 45 00:04:04,350 --> 00:04:08,940 Now, I'll paste the MD5 hash of the sample in the text box here. 46 00:04:10,810 --> 00:04:16,630 VirusTotal will tell us if this is a known malicious file and also tell us how many vendors detect 47 00:04:16,640 --> 00:04:17,240 this file as malware. 48 00:04:19,010 --> 00:04:24,380 In this case, 58 out of 72 scanners detect this as malware. 49 00:04:25,500 --> 00:04:28,860 Looking at the result, the file is obviously malware. 50 00:04:30,560 --> 00:04:34,780 However, let's head back to our sandbox and continue working on the file. 51 00:04:41,020 --> 00:04:49,270 Now we will submit the file to Exeinfo PE, and it clearly says that it is a 32-bit executable. 52 00:04:51,000 --> 00:04:57,630 It also provides some additional details, like the sample being packed with UPX, and down here, it 53 00:04:57,630 --> 00:05:00,120 gives us the unpacking instructions. 54 00:05:02,850 --> 00:05:05,490 Now, let us use UPX to unpack it. 55 00:05:06,670 --> 00:05:10,090 Before that, I want you to take note of the file size here. 56 00:05:11,130 --> 00:05:13,350 Right now, it is 43 KB. 57 00:05:15,320 --> 00:05:22,970 Now, I will move to the directory where the UPX file is stored and launch a command prompt by typing 58 00:05:23,570 --> 00:05:24,410 cmd in the address bar. 59 00:05:25,490 --> 00:05:28,790 This opens the command prompt directly in the folder where I will run the command. 
60 00:05:29,780 --> 00:05:37,940 As instructed by the Exeinfo PE tool, to unpack the file, we need to use the command upx.exe -d 61 00:05:38,180 --> 00:05:39,380 and supply the file name. 62 00:05:44,560 --> 00:05:50,530 The extraction is successful, and if you notice, the file size is now 56 KB. 63 00:05:52,070 --> 00:05:58,420 I extract the original sample again so that we have both the packed and unpacked versions. 64 00:06:01,790 --> 00:06:08,030 Now that we have both packed and unpacked files, let us see what difference it makes in string analysis. 65 00:06:09,140 --> 00:06:13,730 As we have learned earlier, we will use BinText to do string analysis. 66 00:06:14,970 --> 00:06:18,660 I will first load the packed file and observe the readable strings. 67 00:06:22,310 --> 00:06:29,810 So we start seeing some strings here. It starts with the contents of the DOS stub, which holds the message 68 00:06:30,230 --> 00:06:32,570 This program cannot be run in DOS mode. 69 00:06:34,250 --> 00:06:38,560 There is a word, TLOSS; at this point, we are not sure what it is. 70 00:06:40,360 --> 00:06:44,560 Here it says MessageBoxA, which appears to be a function. 71 00:06:46,400 --> 00:06:49,870 As you see, most of these strings do not make any sense. 72 00:06:52,010 --> 00:06:56,870 Here we see an HTTP request, but the URL is not complete. 73 00:06:57,820 --> 00:07:02,350 We also see a broken or partial piece of what looks like an IP address. 74 00:07:04,250 --> 00:07:09,860 Then we see a list of libraries being used, and down here we see a few functions. 75 00:07:10,850 --> 00:07:13,820 Right now, we see a total of nine functions. 76 00:07:15,520 --> 00:07:20,470 BinText lists the whole file twice; here is the end of the file. 77 00:07:22,700 --> 00:07:25,850 From this point onwards, it is a repetition of the same strings. 78 00:07:27,750 --> 00:07:32,460 Throughout this packed binary file, we could hardly find any meaningful strings. 
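The idea behind string analysis can be sketched in a few lines of Python. This is an illustration only and not how BinText itself is implemented: it looks solely for runs of printable ASCII bytes, and both the function name extract_strings and the default minimum length of five are assumptions.

```python
import re


def extract_strings(data, min_length=5):
    """Return printable ASCII runs of at least `min_length` bytes found in `data`."""
    # Bytes 0x20 to 0x7e are the printable ASCII characters, including space.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]
```

Reading the packed and the unpacked file with open(path, "rb").read() and comparing the two resulting lists would show the same effect seen in the demo: the packed file yields far fewer meaningful strings.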
79 00:07:34,050 --> 00:07:38,520 At this point, there is not much from which to draw a conclusion about the sample. 80 00:07:40,980 --> 00:07:43,910 Now, let's load the unpacked sample. 81 00:07:45,510 --> 00:07:49,410 I will load the 56 KB unpacked binary file. 82 00:07:51,970 --> 00:07:57,310 Right from the start, we are seeing more strings than what we saw in the packed version. 83 00:07:58,550 --> 00:08:07,130 Like here, we see some error messages, plus we see the Microsoft Visual C++ runtime libraries being imported. 84 00:08:08,620 --> 00:08:18,920 We also notice a few functions like GetLastActivePopup, GetActiveWindow and MessageBoxA. 85 00:08:20,250 --> 00:08:27,660 Here, all the libraries are listed, but if you look at the functions, there are way more functions 86 00:08:27,660 --> 00:08:31,250 here than what we saw in the packed file. 87 00:08:33,380 --> 00:08:37,520 And now we have some HTML code embedded in the malware. 88 00:08:39,130 --> 00:08:44,120 Looks like its job is to print the message: Your computer is in danger. 89 00:08:44,710 --> 00:08:48,870 Windows Security Centre has detected spyware or adware infection. 90 00:08:49,480 --> 00:08:52,720 It is strongly recommended to use special anti spyware. 91 00:08:55,010 --> 00:08:56,390 Here is something more interesting. 92 00:08:57,370 --> 00:09:02,250 There appears to be something like a hostname, download bravesentry.com. 93 00:09:03,730 --> 00:09:06,160 And here is an IP address too. 94 00:09:08,670 --> 00:09:15,330 And if you see down here, it looks like our simple spreadsheet file is connecting to a URL and running an 95 00:09:15,330 --> 00:09:18,000 HTTP GET request. 96 00:09:19,590 --> 00:09:28,050 It is accessing some registry settings in Software\Microsoft\Windows\CurrentVersion\Internet Settings. 97 00:09:30,870 --> 00:09:34,380 Then the same popup message appears again. 
98 00:09:36,490 --> 00:09:44,560 Here we see the mention of a file in a specific path, that is C program files bravesentry, and the 99 00:09:44,570 --> 00:09:45,070 file is 100 00:09:45,480 --> 00:09:46,120 bravesentry.exe. 101 00:09:48,440 --> 00:09:51,260 Further, it is accessing a few more registry paths. 102 00:09:53,410 --> 00:10:00,600 Down here, we notice that it is accessing another registry path, 103 00:10:00,610 --> 00:10:01,510 Software\Microsoft\Windows\CurrentVersion\Run. 104 00:10:02,630 --> 00:10:07,490 And it looks like it is adding a value there called install.dat, 105 00:10:08,770 --> 00:10:10,590 which is possibly another file. 106 00:10:12,550 --> 00:10:22,290 This registry path, Software\Microsoft\Windows\CurrentVersion\Run, has a special significance, which 107 00:10:22,300 --> 00:10:24,390 will be discussed in later modules. 108 00:10:26,990 --> 00:10:33,680 We see the same registry path again, Software\Microsoft\Windows\CurrentVersion\Run, but this time 109 00:10:34,190 --> 00:10:38,300 with the value of C Windows Update dot exe. 110 00:10:41,520 --> 00:10:45,840 Based on the string analysis, we can highlight the following details: 111 00:10:47,240 --> 00:10:53,060 whether the file contains any user interaction, such as inputs from or outputs to the user, 112 00:10:54,260 --> 00:10:59,570 the list of all the libraries used, the functions used, network activity, 113 00:11:00,730 --> 00:11:07,330 registry values being accessed, modified, created or deleted, and files being accessed, modified, 114 00:11:07,330 --> 00:11:09,010 created or deleted. 115 00:11:10,600 --> 00:11:16,210 By now, we already have a pretty good picture of what this malware is possibly doing. 116 00:11:17,880 --> 00:11:22,260 Now, let us see what PEStudio has to reveal about this sample. 117 00:11:23,330 --> 00:11:28,550 We will launch PEStudio first and then load the unpacked sample. 
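The kinds of findings highlighted from the string analysis (network activity, registry paths, file names) can be pulled out of a list of extracted strings with a few regular expressions. The sketch below is only an illustration of that triage idea; the pattern set, the category names and the function classify_strings are all assumptions, and real samples will need broader patterns.

```python
import re

# Hypothetical indicator patterns chosen for this illustration only.
PATTERNS = {
    "url": re.compile(r"https?://\S+", re.IGNORECASE),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "run_key": re.compile(
        r"Software\\Microsoft\\Windows\\CurrentVersion\\Run", re.IGNORECASE
    ),
    "executable": re.compile(r"\b[\w-]+\.exe\b", re.IGNORECASE),
}


def classify_strings(strings):
    """Group strings by the indicator patterns they match."""
    hits = {name: [] for name in PATTERNS}
    for text in strings:
        for name, pattern in PATTERNS.items():
            if pattern.search(text):
                hits[name].append(text)
    return hits
```

Strings that land in the run_key or executable groups are good candidates for the persistence and dropped-file sections of the malware report.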
118 00:11:30,060 --> 00:11:33,620 It'll take some time to extract all the information about the file. 119 00:11:35,360 --> 00:11:41,990 Here we see the properties of the file, like the file hash in various algorithms, 120 00:11:43,350 --> 00:11:44,730 and the first few bytes in hex, 121 00:11:45,540 --> 00:11:51,900 that is 4D 5A, which is the file signature for an executable file. 122 00:11:53,250 --> 00:11:57,180 The same is confirmed again by the first few bytes in ASCII. 123 00:11:58,300 --> 00:12:01,540 If it is MZ, it is an executable. 124 00:12:02,850 --> 00:12:10,200 It also gives the overall file size; since we have loaded the unpacked version, it is 125 00:12:10,200 --> 00:12:12,180 57,344 bytes. 126 00:12:13,860 --> 00:12:21,990 We notice the entropy value is 5.272, not very high for a malware file, but we 127 00:12:21,990 --> 00:12:28,470 should remember that we have loaded an unpacked version, so the level of randomness is lower, 128 00:12:29,100 --> 00:12:31,710 thereby reducing the entropy value. 129 00:12:34,080 --> 00:12:44,700 It says the file is an executable, the architecture is 32-bit, and the compile time is May 7th, 2007. 130 00:12:46,810 --> 00:12:55,300 Down here, we see all the different parts of the PE file, that is the DOS header, DOS stub, PE file header, 131 00:12:55,300 --> 00:12:58,360 optional header and sections. 132 00:13:00,170 --> 00:13:06,380 The DOS stub gives the expected message, that is, This program cannot be run in DOS mode. 133 00:13:07,810 --> 00:13:14,730 The file header says this file has three sections in total, which we will verify in a few moments. 134 00:13:16,040 --> 00:13:18,830 It also confirms it is an executable. 135 00:13:20,640 --> 00:13:28,230 As expected, the optional header gives the entry point, that is, the point in memory where the code 136 00:13:28,230 --> 00:13:28,740 starts, 137 00:13:29,680 --> 00:13:31,930 which is in the .text section. 
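Two of the properties shown here, the MZ signature and the entropy value, are simple to compute by hand. The sketch below assumes the entropy figure is a Shannon entropy over the file's bytes, on a scale of 0 to 8 bits per byte; whether PEStudio calculates it in exactly this way is an assumption, and the function names are made up for illustration.

```python
import math
from collections import Counter


def has_mz_signature(data):
    """Return True when the data starts with the bytes 4D 5A ("MZ")."""
    return data[:2] == b"MZ"


def shannon_entropy(data):
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how often each byte value occurs
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Packed or encrypted data tends towards the top of the 0 to 8 range, which is why unpacking the sample lowers the reported value.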
138 00:13:33,740 --> 00:13:41,270 Checking the details of the sections, we see that the file has three sections: .text, .data 139 00:13:41,540 --> 00:13:43,040 and the resource section. 140 00:13:44,780 --> 00:13:50,540 We see the hash of each section and the entropy of each section. 141 00:13:52,060 --> 00:13:59,440 If you notice the permissions on the .text section, it has write permission, which is a telltale 142 00:13:59,440 --> 00:14:04,750 sign of malware, that is, it is a self-modifying executable. 143 00:14:07,190 --> 00:14:10,610 We can also see that it is using these seven libraries 144 00:14:11,690 --> 00:14:14,600 and these 85 different functions. 145 00:14:16,690 --> 00:14:18,670 Let's take a look at the resources section. 146 00:14:19,750 --> 00:14:26,050 There are three resources supporting this file; for each of the resources, its MD5 hash and 147 00:14:26,050 --> 00:14:28,090 entropy value are shown. 148 00:14:29,350 --> 00:14:33,160 Also, we notice that the dialog appears to be in Russian. 149 00:14:35,690 --> 00:14:43,400 Finally, we see all the strings here, the same information that we got from BinText. We will 150 00:14:43,400 --> 00:14:45,940 add all these details to our malware report. 151 00:14:46,430 --> 00:14:49,630 The report is attached in the resource section of this lesson. 152 00:14:52,800 --> 00:14:58,530 By now, we have a fair idea about the sample, and the various static analysis methods have confirmed 153 00:14:58,530 --> 00:15:00,300 that it is a malware file. 154 00:15:02,360 --> 00:15:07,220 We can further confirm the same during the dynamic analysis of the sample.