1 00:00:00,600 --> 00:00:09,030 Hello and welcome. In this video, I'm going to talk about stream and data encoding and how it is 2 00:00:09,030 --> 00:00:16,320 used to obfuscate hidden data or hidden secrets inside a PDF document. 3 00:00:19,060 --> 00:00:23,240 A PDF can include data in multiple ways. 4 00:00:23,950 --> 00:00:30,180 Let us take a look at one example of a PDF file which contains the following object. 5 00:00:31,060 --> 00:00:37,840 This object has got the stream keyword and inside it you have the string 6 00:00:37,840 --> 00:00:38,420 Hello. 7 00:00:39,430 --> 00:00:45,370 And it also has the opening obj and the closing endobj. 8 00:00:45,580 --> 00:00:50,300 So this whole thing is actually an object which contains a string. 9 00:00:50,770 --> 00:00:53,740 So how are we going to encode this, 10 00:00:53,740 --> 00:00:55,210 to obfuscate this? 11 00:00:57,040 --> 00:01:07,570 This is one way: the malicious PDF author could use hex encoding. For example, using the hex encoding 12 00:01:07,570 --> 00:01:11,500 for Hello, this is what Hello will look like. 13 00:01:12,850 --> 00:01:23,590 So H e l l o, space and World are converted into the hex representation, which is pound 48 for the H, 14 00:01:23,800 --> 00:01:28,220 65 for E and so on. 15 00:01:29,470 --> 00:01:36,760 So this is one way in which the string can be obfuscated using hex encoding. 16 00:01:38,630 --> 00:01:47,360 So coming back to this example, if we were to use hex encoding for Hello World, we will 17 00:01:47,360 --> 00:01:48,530 get something like this. 18 00:01:51,580 --> 00:01:58,610 So now the Hello World has been encoded to become a different obfuscated string. 19 00:02:00,520 --> 00:02:06,490 This is a list of some of the other available encoding methods. 20 00:02:07,480 --> 00:02:11,710 As we have seen, Hello World can be encoded into this string.
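The hex encoding described in this part of the lesson can be sketched in a few lines of Python. This is a minimal sketch, assuming the notation the speaker reads out (a pound sign followed by two hex digits for each character); the function name `hex_obfuscate` is made up for illustration and is not taken from the video.

```python
def hex_obfuscate(text: str) -> str:
    """Replace every character with '#' followed by its two-digit hex code.

    Assumption: the '#XX' notation mirrors what is read out in the lesson.
    """
    return "".join(f"#{byte:02X}" for byte in text.encode("ascii"))


encoded = hex_obfuscate("Hello World")
print(encoded)  # -> #48#65#6C#6C#6F#20#57#6F#72#6C#64
```

Note that the readable text "Hello" no longer appears anywhere in the encoded form, which is the point of the obfuscation.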
21 00:02:12,780 --> 00:02:17,100 We can also encode the string Hello World using octal encoding. 22 00:02:18,210 --> 00:02:27,150 So with octal encoding, you use a backslash followed by the octal representation for the H, 23 00:02:27,960 --> 00:02:32,700 octal 145 for E, octal 154 for L and so on. 24 00:02:34,960 --> 00:02:44,740 The malicious author can also create a mix between octal and hex encoding. For example, here we 25 00:02:44,740 --> 00:02:50,440 see the character H, which is encoded with hex encoding, 26 00:02:51,430 --> 00:02:59,940 followed by a character being encoded with octal, which is backslash 145, and then the character 27 00:02:59,950 --> 00:03:10,600 L, encoded with hex encoding, which is pound 6C, and the other L is encoded using octal encoding, 28 00:03:10,600 --> 00:03:13,240 which is backslash one, five, four and so on. 29 00:03:14,980 --> 00:03:23,050 And to add some variety and further obfuscation, the malicious author can also add whitespace. For example, 30 00:03:23,500 --> 00:03:32,950 H and some white space, followed by the octal encoding for E, followed by the hex encoding for L, 31 00:03:33,460 --> 00:03:36,750 and then after that a space again and so on. 32 00:03:37,300 --> 00:03:43,390 So when this string is being decoded in the document, the white space will be ignored. 33 00:03:43,750 --> 00:03:51,850 But this is definitely one way in which the author can defeat static pattern analysis. 34 00:03:51,850 --> 00:03:57,040 If you want to use YARA rules, for example, to detect this, YARA will fail to detect it. 35 00:03:57,640 --> 00:04:02,670 So these are some things you should be aware of when you are doing analysis as well. 36 00:04:03,880 --> 00:04:09,880 Another way in which the data or malicious script can be obfuscated 37 00:04:09,880 --> 00:04:22,510 is by using filters to decode data. So the malicious author can take a string or data and encode it using 38 00:04:22,510 --> 00:04:22,930 hex.
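To undo the mixed hex, octal and whitespace obfuscation described here, an analyst can normalise the string before running any pattern matching. The following is a minimal sketch, assuming the notation read out in the lesson (`#XX` for hex, a backslash plus three octal digits for octal, whitespace ignored); it is not a full PDF parser, and the name `deobfuscate` is hypothetical.

```python
import re

# One token per match: '#XX' hex escape, '\ddd' octal escape,
# a run of whitespace, or any other single literal character.
_TOKEN = re.compile(r"#([0-9A-Fa-f]{2})|\\([0-7]{3})|(\s+)|(.)", re.DOTALL)


def deobfuscate(text: str) -> str:
    """Decode hex and octal escapes and drop whitespace (assumed notation)."""
    out = []
    for hex_code, octal_code, space, literal in _TOKEN.findall(text):
        if hex_code:
            out.append(chr(int(hex_code, 16)))
        elif octal_code:
            out.append(chr(int(octal_code, 8)))
        elif space:
            continue  # whitespace is ignored when the string is decoded
        else:
            out.append(literal)
    return "".join(out)


mixed = r"#48 \145#6C \154#6F"
print(deobfuscate(mixed))  # -> Hello
```

A plain search for "Hello" in `mixed` finds nothing, which illustrates why a static rule written against the cleartext misses the obfuscated form unless the data is normalised first.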
39 00:04:24,450 --> 00:04:26,430 For example, you get this. 40 00:04:27,560 --> 00:04:35,780 So this is actually the same as what we saw here, 48, 65, 6C, 6C and so on, except minus 41 00:04:35,790 --> 00:04:36,920 the pound symbol. 42 00:04:38,760 --> 00:04:49,230 So the object could use something called a filter. A filter would tell a PDF reader that this 43 00:04:49,230 --> 00:04:56,010 string here is supposed to be decoded using this method, hex decode. 44 00:04:56,760 --> 00:05:05,630 So when the object is being read, the PDF reader will then use this method to decode 45 00:05:05,640 --> 00:05:08,430 this back into a normal string. 46 00:05:08,910 --> 00:05:10,710 So this is the meaning of a filter. 47 00:05:11,190 --> 00:05:19,800 So the syntax is the Filter keyword, followed by the type of filter to be applied to the object. 48 00:05:20,880 --> 00:05:22,620 Let's take a look at another example. 49 00:05:24,290 --> 00:05:31,480 It is also possible to combine multiple filters to decode an obfuscated string. 50 00:05:31,950 --> 00:05:38,820 So, for example, in this case here, this string here has gone through two processes, two levels 51 00:05:38,820 --> 00:05:39,810 of obfuscation. 52 00:05:41,040 --> 00:05:49,290 The first obfuscation is to encode it in hex, and the second obfuscation is to compress it. 53 00:05:50,220 --> 00:06:01,010 So when the reader sees this filter, it would decode it: uncompress it first and only then decode 54 00:06:01,110 --> 00:06:05,970 it back from hex to ASCII, so the decoding is done in reverse. 55 00:06:06,450 --> 00:06:15,680 So this gives the malware author a lot more choice to further obfuscate the string by combining various hex 56 00:06:15,690 --> 00:06:18,810 encoding plus compression as well.
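The idea of chained filters can be sketched with Python's standard library. In the PDF format, the filter names in a stream's Filter entry are listed in the order the reader applies them when decoding, which is the reverse of the order in which the author encoded the data. The sketch below is a minimal illustration under that rule; the `DECODERS` table, the `decode_stream` function and the sample data are assumptions made for the example, not code from the lesson.

```python
import binascii
import zlib


def ascii_hex_decode(data: bytes) -> bytes:
    """Decode hex digits up to the '>' end marker, ignoring whitespace."""
    digits = b"".join(data.split(b">")[0].split())
    return binascii.unhexlify(digits)


# Hypothetical lookup table covering only the two filters discussed here.
DECODERS = {
    "ASCIIHexDecode": ascii_hex_decode,
    "FlateDecode": zlib.decompress,
}


def decode_stream(data: bytes, filters):
    """Apply each named filter in the listed order."""
    for name in filters:
        data = DECODERS[name](data)
    return data


# The author compresses first and hex-encodes second ...
original = b"Hello World"
obfuscated = binascii.hexlify(zlib.compress(original)) + b">"

# ... so the reader hex-decodes first and decompresses second.
print(decode_stream(obfuscated, ["ASCIIHexDecode", "FlateDecode"]))
```

If the author had instead hex-encoded first and compressed second, the filter list would simply be written the other way round.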
57 00:06:19,440 --> 00:06:28,290 So remember, when you are trying to decode this, trying to deobfuscate this, always look for the filter 58 00:06:28,590 --> 00:06:38,310 and the type of the encoding or obfuscation being used, and then undo the obfuscation by going in reverse. 59 00:06:38,640 --> 00:06:46,710 That means decode it in reverse order: uncompress in this case, followed by hex decoding in this case. 60 00:06:48,280 --> 00:06:51,880 There are multiple ways in which obfuscation can be done. 61 00:06:51,910 --> 00:06:54,880 For example, there is hex decode. 62 00:06:55,330 --> 00:06:57,880 You also have LZW compression. 63 00:06:58,210 --> 00:07:03,420 You also have zlib compression, Base85 and also encryption itself. 64 00:07:04,000 --> 00:07:11,150 So if you come across something which is not listed here, always use Google to search for some more information. 65 00:07:12,640 --> 00:07:17,660 So this is an example of a malicious PDF document. 66 00:07:18,340 --> 00:07:24,500 So in here you can see that this PDF document has got several objects. On top here, 67 00:07:25,000 --> 00:07:32,020 you see this object number one, based on the ID and the version number, and the opening obj and the 68 00:07:32,020 --> 00:07:33,010 closing endobj. 69 00:07:33,520 --> 00:07:36,410 And this particular object is of type Catalog. 70 00:07:37,480 --> 00:07:44,740 And then here you have object number seven, based on the ID seven, version zero, the opening obj and the closing 71 00:07:44,740 --> 00:07:45,070 endobj. 72 00:07:45,170 --> 00:07:49,120 And the type of this object is JavaScript. 73 00:07:50,080 --> 00:07:53,830 And the object here is denoted by the ID eight, 74 00:07:54,040 --> 00:07:58,480 version zero, the opening obj and the closing endobj. 75 00:07:59,320 --> 00:08:05,720 And then there's something within it, which is a stream, with the opening stream keyword and the closing endstream keyword. 76 00:08:06,280 --> 00:08:08,890 And inside here you see the obfuscated string.
77 00:08:09,910 --> 00:08:16,530 And in here you can see that this string has been put through two processes of obfuscation. 78 00:08:17,050 --> 00:08:20,780 The first is to compress it and the second is to encode it. 79 00:08:22,150 --> 00:08:29,890 How do you read this sequence of objects? Looking at the first object, which is object number one of 80 00:08:29,890 --> 00:08:30,560 type Catalog, 81 00:08:31,360 --> 00:08:34,690 you can see that there is an OpenAction specified here. 82 00:08:34,700 --> 00:08:41,310 This directive's parameter is seven, and seven refers to this object. 83 00:08:41,650 --> 00:08:50,440 That means when this malicious document is being opened, it will execute this directive to open object 84 00:08:50,440 --> 00:08:51,210 number seven. 85 00:08:51,550 --> 00:08:53,650 So it would jump to object number seven. 86 00:08:53,980 --> 00:09:02,170 At number seven, you will see another directive saying that it is JavaScript, a JavaScript object, and 87 00:09:02,170 --> 00:09:04,990 this JavaScript object's parameter is eight. 88 00:09:05,380 --> 00:09:09,940 And this eight here would then refer down to another object, eight here. 89 00:09:10,210 --> 00:09:15,920 And this object here is the stream, the obfuscated string that we are seeing here. 90 00:09:16,690 --> 00:09:24,340 So what the reader will do then is look at the Filter directive and the parameters of this filter, and 91 00:09:24,340 --> 00:09:35,010 then the reader will know that it is supposed to decode this using the parameters here. 92 00:09:35,410 --> 00:09:44,260 So it will decode it first, hex decode first, and then it would uncompress it in order to get back the 93 00:09:44,260 --> 00:09:46,420 original JavaScript code. 94 00:09:47,080 --> 00:09:54,670 And after this operation, the PDF document will then execute whatever JavaScript 95 00:09:54,670 --> 00:09:57,480 has been decoded into plain text.
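The walk from the Catalog to the obfuscated script can be reproduced on a toy document. The sketch below is illustrative only: the miniature PDF, the harmless payload, the key names in the hypothetical action object and the helper functions `get_object` and `follow` are all assumptions made for the example, and the regular expressions are far simpler than what a real PDF parser needs.

```python
import binascii
import re
import zlib

# Hypothetical harmless payload: compressed first, then hex-encoded.
payload = b"app.alert('demo');"
stream = binascii.hexlify(zlib.compress(payload)) + b">"

# Toy document with the same 1 -> 7 -> 8 chain described in the lesson.
pdf = (
    b"1 0 obj\n<< /Type /Catalog /OpenAction 7 0 R >>\nendobj\n"
    b"7 0 obj\n<< /Type /Action /S /JavaScript /JS 8 0 R >>\nendobj\n"
    b"8 0 obj\n<< /Filter [/ASCIIHexDecode /FlateDecode] >>\nstream\n"
    + stream + b"\nendstream\nendobj\n"
)


def get_object(data: bytes, number: int) -> bytes:
    """Return the body between 'N 0 obj' and 'endobj'."""
    pattern = rb"(?:^|\n)%d 0 obj\n(.*?)\nendobj" % number
    return re.search(pattern, data, re.DOTALL).group(1)


def follow(body: bytes, key: bytes) -> int:
    """Return the object number referenced by '/key N 0 R'."""
    return int(re.search(rb"/%s (\d+) 0 R" % key, body).group(1))


catalog = get_object(pdf, 1)
action = get_object(pdf, follow(catalog, b"OpenAction"))
stream_obj = get_object(pdf, follow(action, b"JS"))

# Apply the filters in the order they are listed in the Filter entry.
filters = re.findall(rb"/(\w+Decode)", stream_obj)
data = re.search(rb"stream\n(.*?)\nendstream", stream_obj, re.DOTALL).group(1)
for name in filters:
    if name == b"ASCIIHexDecode":
        data = binascii.unhexlify(data.split(b">")[0])
    elif name == b"FlateDecode":
        data = zlib.decompress(data)

print(data.decode("ascii"))  # the recovered script text
```

The decoded script is only printed here, never executed, which is how an analyst would normally inspect a suspicious document.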
96 00:09:57,910 --> 00:10:06,700 So this is one of the methods by which a PDF document is able to execute malicious code using 97 00:10:06,700 --> 00:10:09,310 JavaScript, which has been obfuscated. 98 00:10:10,540 --> 00:10:15,370 So that brings us to the end of this video lesson. 99 00:10:15,950 --> 00:10:17,110 Thank you for watching.