1 00:00:00,820 --> 00:00:01,960 Hello, my name is Ivan. 2 00:00:01,960 --> 00:00:07,660 In this lecture, let's shift our focus to the program header table, which 3 00:00:07,690 --> 00:00:14,860 offers a segment view of the binary, in contrast to the section header table we discussed earlier, 4 00:00:14,860 --> 00:00:20,050 which provides a section view, primarily for static linking purposes. 5 00:00:20,080 --> 00:00:25,510 The program header table, which you will learn about in this lecture, serves a different purpose. 6 00:00:25,780 --> 00:00:33,010 It is used by the operating system and the dynamic linker when loading an ELF binary 7 00:00:33,010 --> 00:00:35,650 into a process for execution. 8 00:00:35,890 --> 00:00:43,000 The program header table enables them to locate the relevant code and data and decide 9 00:00:43,000 --> 00:00:46,420 what to load into virtual memory. 10 00:00:46,420 --> 00:00:51,760 In an ELF binary, a segment encapsulates zero or more sections. 11 00:00:51,760 --> 00:00:56,230 So let's go back to the Kali machine here. 12 00:00:57,520 --> 00:00:59,350 Here we have it. 13 00:00:59,530 --> 00:01:03,550 Let's open the elf.h header file from the include directory. 14 00:01:03,550 --> 00:01:10,930 As you can see here, a segment encapsulates zero or more sections, essentially bundling 15 00:01:10,930 --> 00:01:13,660 them together into a cohesive unit. 16 00:01:13,660 --> 00:01:22,360 Segments provide an execution view, so they are essential for executable ELF files, but 17 00:01:22,360 --> 00:01:26,410 non-executable files such as relocatable objects do not require them. 18 00:01:26,410 --> 00:01:34,210 To represent this segment view, the program header table uses program headers of the struct type 19 00:01:34,210 --> 00:01:36,670 Elf64_Phdr here. 20 00:01:40,340 --> 00:01:46,910 As you can see here, Elf64, and specifically Elf64_Phdr. 
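To make the Elf64_Phdr structure more concrete, here is a minimal Python sketch of its layout, following the field order in elf.h. The names PHDR64_FORMAT, Phdr64, and parse_phdr64 are made up for this illustration, the example values are invented, and the "<" in the format string assumes a little-endian binary.

```python
import struct
from collections import namedtuple

# Field order follows Elf64_Phdr in elf.h.
# "<" assumes a little-endian ELF file (an assumption of this sketch).
# I = 32-bit word (p_type, p_flags); Q = 64-bit offset, address, or size.
PHDR64_FORMAT = "<IIQQQQQQ"

# Hypothetical container type used only in this example.
Phdr64 = namedtuple(
    "Phdr64",
    "p_type p_flags p_offset p_vaddr p_paddr p_filesz p_memsz p_align",
)


def parse_phdr64(raw: bytes) -> Phdr64:
    """Decode one raw Elf64_Phdr entry into named fields."""
    return Phdr64(*struct.unpack(PHDR64_FORMAT, raw))


# Invented example entry, not taken from a real binary.
example_raw = struct.pack(
    PHDR64_FORMAT, 1, 5, 0x1000, 0x401000, 0x401000, 0x1A5, 0x1A5, 0x1000
)
example = parse_phdr64(example_raw)

print(struct.calcsize(PHDR64_FORMAT))  # size of one entry in bytes
print(example)
```

In a real file, the entries would be read starting at the program header table offset recorded in the executable header, one entry after another.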
21 00:01:49,100 --> 00:01:49,520 Here. 22 00:01:51,330 --> 00:01:59,790 As you can see, the header file also has a comment 23 00:01:59,790 --> 00:02:03,600 that says this is a program segment header. 24 00:02:03,600 --> 00:02:10,740 Each program header contains various fields that provide essential information. 25 00:02:10,740 --> 00:02:16,920 As you can see here: the segment type, flags, file offset, virtual address, physical address, 26 00:02:16,920 --> 00:02:21,630 size in file, size in memory, and segment alignment. 27 00:02:21,630 --> 00:02:28,950 Understanding the program header table is crucial, as it gives us insight into how the operating 28 00:02:28,950 --> 00:02:34,380 system and dynamic linker organize and load binaries into memory. 29 00:02:35,380 --> 00:02:41,500 By examining the program headers, we can decipher the layout and composition of the binary, enabling 30 00:02:41,500 --> 00:02:48,040 us to understand the components that contribute to the binary's functionality. 31 00:02:48,040 --> 00:02:54,160 In the upcoming sections, we will delve deeper into the inner workings of program headers and their 32 00:02:54,160 --> 00:02:59,050 role in the loading and execution of ELF binaries. 33 00:02:59,110 --> 00:03:04,840 As we continue our exploration of the ELF format, we gain a comprehensive understanding of its 34 00:03:04,840 --> 00:03:06,880 intricate structure. 35 00:03:07,060 --> 00:03:16,090 This knowledge equips us with the tools to dissect and analyze binaries effectively, to reverse 36 00:03:16,090 --> 00:03:24,070 engineer malware, unraveling their secrets and uncovering the world of binary analysis. 37 00:03:24,100 --> 00:03:25,390 Now let's 38 00:03:26,360 --> 00:03:31,490 analyze this information in more depth. 39 00:03:31,490 --> 00:03:35,870 I will describe each of these fields next. 
40 00:03:37,060 --> 00:03:41,050 Some of them in this lecture, some of them in the next lecture. 41 00:03:41,050 --> 00:03:42,820 And now we will 42 00:03:44,630 --> 00:03:45,290 again 43 00:03:45,560 --> 00:03:46,910 use readelf here. 44 00:03:46,910 --> 00:03:47,720 We can actually. 45 00:03:47,720 --> 00:03:54,830 No, let's actually open a new tab and go to the desktop, where our Hello World program lives. 46 00:03:55,670 --> 00:03:59,090 And here we have a.out. 47 00:03:59,100 --> 00:04:01,040 Let's actually run this. 48 00:04:01,960 --> 00:04:04,960 And you will see that it prints "Hello, world". 49 00:04:06,390 --> 00:04:16,710 So here we will clear the screen and again use readelf, with --wide and --segments, on a.out. 50 00:04:16,950 --> 00:04:20,010 This is our compiled application. 51 00:04:20,340 --> 00:04:22,800 Just a regular Hello World application. 52 00:04:26,960 --> 00:04:28,640 And here we have this 53 00:04:29,440 --> 00:04:30,460 output. 54 00:04:31,030 --> 00:04:37,600 You can see that we have the program headers and the section to segment mapping. 55 00:04:39,290 --> 00:04:43,660 In this section we are interested in these fields. 56 00:04:43,670 --> 00:04:52,190 Also keep in mind the section to segment mapping at the bottom of the readelf output, which clearly 57 00:04:52,220 --> 00:04:58,970 illustrates that segments are simply a bunch of sections bundled together. 58 00:04:59,300 --> 00:05:00,560 As you can see here. 59 00:05:00,560 --> 00:05:07,160 This specific section to segment mapping is typical for most ELF binaries you will encounter. 60 00:05:07,160 --> 00:05:08,660 And in 61 00:05:10,100 --> 00:05:10,550 the 62 00:05:10,590 --> 00:05:16,460 rest of this section, we will learn the program header fields. 63 00:05:17,120 --> 00:05:23,150 As you can see here, we will start with p_type specifically. 64 00:05:23,300 --> 00:05:26,750 And we will also learn p_flags and 
65 00:05:27,850 --> 00:05:28,660 so on. 66 00:05:28,660 --> 00:05:33,190 So let's go back to the header file. 67 00:05:35,080 --> 00:05:36,520 p_type, p_flags. 68 00:05:36,520 --> 00:05:38,170 We have these fields, right? 69 00:05:38,290 --> 00:05:46,060 So p_type, p_flags, p_offset, p_vaddr, which is the segment virtual address, then the segment physical address, 70 00:05:46,060 --> 00:05:49,720 segment size in file, segment size in memory, and segment alignment. 71 00:05:49,720 --> 00:05:53,140 So let's start with the segment type here, and we will also need 72 00:05:56,130 --> 00:05:59,250 a marker here. Perfect. 73 00:05:59,280 --> 00:06:02,790 We also have this marker, which is not great, but 74 00:06:04,350 --> 00:06:05,600 acceptable, I think. 75 00:06:05,610 --> 00:06:11,700 And let's also increase the font size a little so you can see better. 76 00:06:11,700 --> 00:06:17,880 This p_type field, as you can see, appears twice. 77 00:06:18,960 --> 00:06:24,840 One is for Elf64 and one is for Elf32. 78 00:06:25,050 --> 00:06:31,680 As you can see here, the two structures declare their fields with different type names. 79 00:06:31,680 --> 00:06:34,860 As you can see here, in Elf64 we have 80 00:06:36,490 --> 00:06:38,110 Xword here. 81 00:06:41,080 --> 00:06:43,150 As you can see, Elf64_Xword. 82 00:06:43,150 --> 00:06:47,290 In Elf32 we have just the regular Word. 83 00:06:47,290 --> 00:06:50,980 And in Elf64 we have the Xword. 84 00:06:52,210 --> 00:06:57,130 So the type names may vary depending on the structure, but 85 00:07:00,730 --> 00:07:06,910 if you look here, the fields all serve the same purpose and have the same descriptions. 86 00:07:07,030 --> 00:07:15,310 The p_type field, the segment type, identifies the type of the segment, and important values 87 00:07:15,310 --> 00:07:22,810 for this field include PT_LOAD, PT_DYNAMIC, and PT_INTERP. 
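As a small illustration of the two points above, the following Python sketch shows how the 32-bit and 64-bit program headers differ in field widths and field order, and how a p_type value maps to its name. The numeric values are the ones defined in elf.h. The names PHDR32_FIELDS, PHDR64_FIELDS, SEGMENT_TYPES, and segment_type_name are hypothetical helpers for this example, and little-endian encoding is assumed.

```python
import struct

# Elf32_Phdr: eight 32-bit fields, with p_flags near the end.
PHDR32_FORMAT = "<IIIIIIII"
PHDR32_FIELDS = ("p_type", "p_offset", "p_vaddr", "p_paddr",
                 "p_filesz", "p_memsz", "p_flags", "p_align")

# Elf64_Phdr: two 32-bit words, then six 64-bit fields; p_flags comes second.
PHDR64_FORMAT = "<IIQQQQQQ"
PHDR64_FIELDS = ("p_type", "p_flags", "p_offset", "p_vaddr",
                 "p_paddr", "p_filesz", "p_memsz", "p_align")

# A few segment types and their values from elf.h (not a complete list).
SEGMENT_TYPES = {
    0: "PT_NULL",
    1: "PT_LOAD",
    2: "PT_DYNAMIC",
    3: "PT_INTERP",
    4: "PT_NOTE",
    6: "PT_PHDR",
}


def segment_type_name(p_type: int) -> str:
    """Return the symbolic name for a p_type value, if this sketch knows it."""
    return SEGMENT_TYPES.get(p_type, f"unknown (0x{p_type:x})")


print(struct.calcsize(PHDR32_FORMAT), struct.calcsize(PHDR64_FORMAT))
print(segment_type_name(1), segment_type_name(3))
```

The same set of fields exists in both structures, so the meaning of each field is identical. Only the integer sizes and the position of p_flags change.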
88 00:07:23,820 --> 00:07:28,020 As you can see here, for INTERP we have the offset, virtual 89 00:07:29,760 --> 00:07:30,630 address. 90 00:07:30,850 --> 00:07:31,280 Yes. 91 00:07:31,320 --> 00:07:35,880 Virtual address, physical address, file size, memory size, flags, and so on. 92 00:07:38,470 --> 00:07:47,320 Segments of type PT_LOAD, as the name implies, are intended to be loaded into memory when 93 00:07:47,320 --> 00:07:55,510 setting up the process, and the size of the loadable chunk and the address to load it at are described 94 00:07:55,720 --> 00:07:58,240 in the rest of the program header. 95 00:07:58,270 --> 00:08:07,120 As you can see in this output, there are usually at least two LOAD segments, here and here. 96 00:08:07,120 --> 00:08:07,720 Let's actually. 97 00:08:08,790 --> 00:08:12,060 We have INTERP here, and DYNAMIC. 98 00:08:13,910 --> 00:08:20,810 And we have the LOADs, LOAD after LOAD here, then DYNAMIC, NOTE, NOTE, NOTE, GNU_PROPERTY, and so 99 00:08:20,810 --> 00:08:22,340 on, which you will learn about. 100 00:08:22,340 --> 00:08:23,690 And we also 101 00:08:24,960 --> 00:08:32,250 have the flags field, the second field. 102 00:08:33,640 --> 00:08:34,810 p_flags. 103 00:08:35,110 --> 00:08:36,000 p_flags. 104 00:08:36,010 --> 00:08:41,890 The flags specify the runtime access permissions for the segment. 105 00:08:41,890 --> 00:08:51,070 Three important flags exist: PF_X, PF_W, and PF_R. 106 00:08:52,470 --> 00:08:53,940 The PF_X flag. 107 00:08:54,480 --> 00:08:56,250 Let's actually write it down here. 108 00:08:57,540 --> 00:08:58,200 PF 109 00:09:00,950 --> 00:09:09,350 _X indicates that the segment is executable, and it is set for code 110 00:09:09,350 --> 00:09:16,040 segments. readelf displays it as an E rather than an X in the flags column here. 111 00:09:18,070 --> 00:09:18,640 Let's actually. 112 00:09:21,530 --> 00:09:24,710 You can see here, in the flags column, 
113 00:09:24,710 --> 00:09:27,950 we sometimes have R, W, and E. 114 00:09:27,950 --> 00:09:30,710 You can read this E as X. 115 00:09:32,130 --> 00:09:39,930 They are the same in reality, but readelf writes it as E because it is the first 116 00:09:40,170 --> 00:09:42,270 letter of "executable". 117 00:09:42,270 --> 00:09:45,060 So it makes sense. 118 00:09:45,060 --> 00:09:55,890 But X would be just as acceptable here. The R here means readable, 119 00:09:55,890 --> 00:09:56,670 and 120 00:09:57,900 --> 00:10:04,710 we have W, which means the segment is writable; it is normally set only for writable data 121 00:10:04,710 --> 00:10:07,590 segments and never for code segments. 122 00:10:07,590 --> 00:10:11,160 And we have this R here. 123 00:10:12,620 --> 00:10:22,070 It means the segment is readable, as is normally the case for both code and data segments. 124 00:10:22,070 --> 00:10:25,340 Next we have these. 125 00:10:26,210 --> 00:10:30,530 After p_flags we have p_offset, p_vaddr, 126 00:10:30,560 --> 00:10:31,430 p_paddr, 127 00:10:31,460 --> 00:10:38,360 and after that p_filesz and p_memsz. 128 00:10:38,450 --> 00:10:46,160 The p_offset, p_vaddr, and p_filesz fields are analogous to sh_offset, sh_addr, and sh_size, which you saw previously. 129 00:10:47,620 --> 00:10:52,140 We have this here: section file offset, section size in bytes, and so on. 130 00:10:52,150 --> 00:10:53,080 So. 131 00:10:55,090 --> 00:10:58,710 Let's go back to p_filesz here. 132 00:10:59,920 --> 00:11:00,400 Yes. 133 00:11:00,520 --> 00:11:07,630 So they specify the file offset at which the segment starts, the virtual address at which it is to 134 00:11:07,630 --> 00:11:10,090 be loaded, and the file size 135 00:11:11,230 --> 00:11:13,900 of the segment, respectively. For loadable segments, note the 136 00:11:13,930 --> 00:11:14,270 p_vaddr 137 00:11:15,010 --> 00:11:16,270 field here. 
138 00:11:16,300 --> 00:11:16,720 That is, 139 00:11:16,720 --> 00:11:17,080 p_vaddr, not 140 00:11:17,110 --> 00:11:17,800 p_paddr. 141 00:11:18,600 --> 00:11:28,740 For loadable segments, p_vaddr must be equal to p_offset modulo the page size, which is typically 4096 bytes. 142 00:11:28,740 --> 00:11:36,030 On some systems it is possible to use the p_paddr field to specify at which address in physical memory 143 00:11:36,030 --> 00:11:37,650 to load the segment. 144 00:11:37,680 --> 00:11:45,540 On modern operating systems such as Linux, this field is ignored when loading, since they execute 145 00:11:45,540 --> 00:11:47,790 all binaries in virtual memory. 146 00:11:47,790 --> 00:11:54,600 At first glance it may not be obvious why there are distinct fields for the file size of the 147 00:11:54,600 --> 00:11:55,080 segment, 148 00:11:55,080 --> 00:11:56,520 that is, 149 00:11:57,350 --> 00:12:01,920 p_filesz, and the size in memory, p_memsz. 150 00:12:02,930 --> 00:12:09,800 To understand this, recall that some sections only indicate the need to allocate some bytes 151 00:12:09,800 --> 00:12:13,880 in memory, but don't actually occupy these bytes in the binary file. 152 00:12:13,880 --> 00:12:17,210 For instance, the .bss section, 153 00:12:18,080 --> 00:12:22,700 which you can see in the readelf output here, contains zero-initialized data. 154 00:12:22,700 --> 00:12:26,870 Since all data in this section is known to be zero anyway, 155 00:12:27,670 --> 00:12:34,330 there is no need to actually include all these zeros in the binary. 156 00:12:34,370 --> 00:12:34,750 Right? 157 00:12:34,750 --> 00:12:44,440 However, when loading the segment containing .bss into virtual memory, all the bytes in .bss 158 00:12:44,560 --> 00:12:46,300 should be allocated. 159 00:12:46,300 --> 00:12:52,150 So it is possible for p_memsz to be larger than p_filesz. 
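The flag bits and the difference between p_filesz and p_memsz can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not how a real loader is written: the function names flags_to_string and build_memory_image are invented for this example, and the segment contents are made-up bytes. The PF_X, PF_W, and PF_R values are the ones from elf.h.

```python
# Flag bit values as defined in elf.h.
PF_X = 0x1  # executable
PF_W = 0x2  # writable
PF_R = 0x4  # readable


def flags_to_string(p_flags: int) -> str:
    """Render p_flags in readelf's style: R, W, and E (E stands for PF_X)."""
    return "".join(
        letter if p_flags & bit else " "
        for bit, letter in ((PF_R, "R"), (PF_W, "W"), (PF_X, "E"))
    )


def build_memory_image(file_bytes: bytes, p_offset: int,
                       p_filesz: int, p_memsz: int) -> bytes:
    """Return the bytes a loadable segment would occupy in memory.

    The first p_filesz bytes come from the file; the remaining
    p_memsz - p_filesz bytes (for example .bss) are filled with zeros.
    """
    if p_memsz < p_filesz:
        raise ValueError("p_memsz must not be smaller than p_filesz")
    from_file = file_bytes[p_offset:p_offset + p_filesz]
    if len(from_file) != p_filesz:
        raise ValueError("segment extends past the end of the file")
    return from_file + bytes(p_memsz - p_filesz)


# Made-up file contents: 4 bytes of padding, then 4 bytes of segment data.
fake_file = b"\x7fELFdata"
image = build_memory_image(fake_file, p_offset=4, p_filesz=4, p_memsz=8)

print(repr(flags_to_string(PF_R | PF_X)))
print(repr(flags_to_string(PF_R | PF_W)))
print(image)
```

Here the segment is 4 bytes long in the file but 8 bytes long in memory, so the last 4 bytes of the image are zeros.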
160 00:12:52,180 --> 00:12:59,110 When this happens, the loader adds the extra bytes at the end of the segment when loading the binary 161 00:12:59,110 --> 00:13:02,710 and initializes them to zero. 162 00:13:02,710 --> 00:13:07,120 And lastly, before ending this 163 00:13:08,040 --> 00:13:08,880 section, 164 00:13:08,910 --> 00:13:10,200 this field. 165 00:13:11,560 --> 00:13:16,950 We have just one field left to explain. 166 00:13:16,960 --> 00:13:26,620 The p_align field is analogous to the sh_addralign field in the section header. 167 00:13:26,620 --> 00:13:35,020 It indicates the required memory alignment, in bytes, for the segment. Just as with sh_addralign, 168 00:13:35,290 --> 00:13:42,400 an alignment value of 0 or 1 indicates that no particular alignment is required. If p_align is not set 169 00:13:42,400 --> 00:13:52,750 to 0 or 1, then its value must be a power of two, and p_vaddr must be equal to p_offset modulo p_align. 170 00:13:52,750 --> 00:13:58,030 In these lectures you learned the intricacies of the ELF format. 171 00:13:58,890 --> 00:14:02,460 We have covered the format of the executable header, 172 00:14:02,490 --> 00:14:07,140 the section header and program header tables, and the contents of sections. 173 00:14:07,140 --> 00:14:14,490 That was quite an endeavor, but it was worth it, because now that you are familiar with the innards 174 00:14:14,490 --> 00:14:22,770 of ELF binaries, you have a great foundation for learning more about binary analysis and reverse engineering. 175 00:14:23,680 --> 00:14:31,870 Stay tuned for more insights into the ELF format, reverse engineering, and binary 176 00:14:31,870 --> 00:14:32,560 analysis. 177 00:14:32,560 --> 00:14:34,270 I'll be waiting for you in the next lecture.
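The p_align rule from this lecture can be written down as a short check. The following Python sketch is only an illustration: the function names is_power_of_two and alignment_is_valid are made up for this example, and the sample values are invented rather than taken from the a.out shown in the lecture.

```python
def is_power_of_two(value: int) -> bool:
    """True for 1, 2, 4, 8, ...; False for zero and everything else."""
    return value > 0 and (value & (value - 1)) == 0


def alignment_is_valid(p_offset: int, p_vaddr: int, p_align: int) -> bool:
    """Check the p_align constraints described in the lecture.

    0 or 1 means no particular alignment is required. Otherwise p_align
    must be a power of two and p_vaddr must equal p_offset modulo p_align.
    """
    if p_align in (0, 1):
        return True
    if not is_power_of_two(p_align):
        return False
    return p_vaddr % p_align == p_offset % p_align


# Invented example values.
print(alignment_is_valid(p_offset=0x1000, p_vaddr=0x401000, p_align=0x1000))
print(alignment_is_valid(p_offset=0x1234, p_vaddr=0x401000, p_align=0x1000))
```

The first call satisfies the rule because both values leave the same remainder when divided by 0x1000. The second does not, because the file offset and the virtual address disagree within the page.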