1 00:00:00,530 --> 00:00:06,800 Now that you have gained a high-level understanding of the inner workings of binaries, it's time to 2 00:00:06,800 --> 00:00:09,200 delve into a specific binary format. 3 00:00:09,230 --> 00:00:15,980 In this section, we will explore the Executable and Linkable Format, ELF, which serves as the default 4 00:00:15,980 --> 00:00:19,190 binary format for Linux-based systems. 5 00:00:19,460 --> 00:00:27,050 ELF, the Executable and Linkable Format, is used for various types of files, including executables, 6 00:00:27,080 --> 00:00:30,950 object files, shared libraries and core dumps. 7 00:00:30,980 --> 00:00:38,930 While our primary focus will be on ELF executables in this section, it's important to note that the 8 00:00:38,930 --> 00:00:44,660 concepts we discuss apply to other types of ELF files as well. 9 00:00:45,400 --> 00:00:53,350 We will primarily work with 64-bit binaries throughout this section, so our discussion will 10 00:00:53,350 --> 00:00:57,370 revolve around the intricacies of 64-bit ELF files. 11 00:00:57,370 --> 00:01:04,930 However, it's worth mentioning that the 32-bit format is similar, differing mainly in the 12 00:01:04,930 --> 00:01:12,040 size and order of certain header fields and other data structures. Otherwise the two are basically the same, 13 00:01:12,070 --> 00:01:22,150 and therefore you will have no trouble extrapolating the concepts discussed here to 32-bit binaries. 14 00:01:22,360 --> 00:01:30,880 In this diagram I created an illustration of the format and contents typically found in a 64-bit 15 00:01:30,910 --> 00:01:33,550 ELF executable file. 16 00:01:33,550 --> 00:01:39,280 At first glance, the complexity of analyzing ELF binaries may appear overwhelming. 17 00:01:39,370 --> 00:01:48,410 However, in essence, ELF binaries consist of four primary components: an executable header, 18 00:01:50,620 --> 00:01:52,810 a series of program headers,
19 00:01:53,650 --> 00:01:58,540 a number of sections, and a series of section headers. 20 00:01:58,570 --> 00:02:02,560 Now let's explore each of these components in detail. 21 00:02:03,650 --> 00:02:10,820 As we see in this diagram, standard ELF binaries begin with an executable header, 22 00:02:12,040 --> 00:02:18,370 followed by the program headers, and conclude with the sections and section headers. 23 00:02:19,040 --> 00:02:21,950 To facilitate a more coherent discussion, 24 00:02:21,950 --> 00:02:29,360 I will deviate slightly from this order and first delve into the sections and section headers before 25 00:02:29,360 --> 00:02:32,660 addressing the program headers. 26 00:02:34,030 --> 00:02:37,960 So let's begin with the executable header for now. 27 00:02:37,960 --> 00:02:44,050 The executable header marks the beginning of every ELF file. 28 00:02:44,050 --> 00:02:49,600 It consists of a structured sequence of bytes that provides essential information about the file, 29 00:02:49,630 --> 00:02:57,700 such as its status as an ELF file, the specific type of ELF file it represents, and the locations within 30 00:02:57,700 --> 00:03:04,000 the file where you can find the remaining contents. To gain a comprehensive understanding of the executable 31 00:03:04,000 --> 00:03:10,570 header format, you can refer to its type definition and the definitions of other ELF-related 32 00:03:10,600 --> 00:03:17,050 types and constants, which can be found in our Linux distro. 33 00:03:17,050 --> 00:03:20,320 So we will jump back to Linux here. 34 00:03:22,280 --> 00:03:23,780 Open this here. 35 00:03:24,140 --> 00:03:27,290 And what we're going to do here is. 36 00:03:29,220 --> 00:03:29,700 Sorry. 37 00:03:33,860 --> 00:03:34,820 We will now 38 00:03:36,100 --> 00:03:36,760 open the terminal. 39 00:03:36,760 --> 00:03:48,370 And what we're going to do is read the header file with Mousepad: mousepad /usr/include/elf.h.
40 00:03:59,910 --> 00:04:04,800 That's elf.h, and here, as you can see, 41 00:04:24,530 --> 00:04:26,240 this is our file. 42 00:04:26,780 --> 00:04:30,950 And here we have e_type, 43 00:04:31,130 --> 00:04:32,650 and the machine type, e_machine. 44 00:04:32,660 --> 00:04:35,150 As you can see, we also have comments for each of them. 45 00:04:35,800 --> 00:04:37,750 Let's increase the font size a little bit. 46 00:04:37,750 --> 00:04:45,190 So the executable header is represented here as a C struct 47 00:04:46,260 --> 00:04:52,110 called Elf64_Ehdr. 48 00:04:54,000 --> 00:04:59,910 And if you look it up as we did here, you will get the same results. 49 00:05:00,270 --> 00:05:01,500 And here. 50 00:05:02,720 --> 00:05:11,660 You may notice that the struct definition given there contains types such as Elf64_Half and 51 00:05:11,660 --> 00:05:12,830 Elf64_Word. 52 00:05:12,860 --> 00:05:22,750 These are just typedefs for integer types such as uint16_t and uint32_t. 53 00:05:23,000 --> 00:05:27,020 So for simplicity here you can see 54 00:05:28,200 --> 00:05:29,070 the 55 00:05:30,160 --> 00:05:33,880 comments for all those definitions. 56 00:05:34,740 --> 00:05:37,680 And now let's start with the 57 00:05:38,600 --> 00:05:41,180 e_ident array. 58 00:05:41,720 --> 00:05:42,170 Right. 59 00:05:42,200 --> 00:05:43,130 So. 60 00:05:44,220 --> 00:05:45,430 This is an array. 61 00:05:45,640 --> 00:05:47,380 The executable header. 62 00:05:47,710 --> 00:05:48,880 The ELF file. 63 00:05:48,950 --> 00:05:49,300 Oops. 64 00:05:49,600 --> 00:05:50,220 Sorry. 65 00:05:50,230 --> 00:05:53,320 Let's actually get the pen here. 66 00:05:53,320 --> 00:05:54,910 So I will draw this. 67 00:05:57,030 --> 00:05:57,210 It. 68 00:06:00,050 --> 00:06:01,100 And here. 69 00:06:01,190 --> 00:06:01,970 So we will. 70 00:06:01,970 --> 00:06:06,050 First, let's start with e_ident here.
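As a rough sketch of the struct shown in elf.h, the layout of Elf64_Ehdr can be mirrored with Python's ctypes module. The field names follow elf.h. The Python aliases for the typedefs and the short comments are illustrative assumptions added here, and the snippet only demonstrates the layout, not a full parser.

```python
import ctypes

# Python stand-ins for the elf.h typedefs (assumed widths: Half = 16 bits,
# Word = 32 bits, Addr and Off = 64 bits).
Elf64_Half = ctypes.c_uint16
Elf64_Word = ctypes.c_uint32
Elf64_Addr = ctypes.c_uint64
Elf64_Off = ctypes.c_uint64

EI_NIDENT = 16  # length of the e_ident array


class Elf64_Ehdr(ctypes.LittleEndianStructure):
    """Illustrative mirror of the 64-bit ELF executable header."""

    _fields_ = [
        ("e_ident", ctypes.c_ubyte * EI_NIDENT),  # magic number and other info
        ("e_type", Elf64_Half),        # object file type
        ("e_machine", Elf64_Half),     # architecture
        ("e_version", Elf64_Word),     # object file version
        ("e_entry", Elf64_Addr),       # entry point virtual address
        ("e_phoff", Elf64_Off),        # program header table file offset
        ("e_shoff", Elf64_Off),        # section header table file offset
        ("e_flags", Elf64_Word),       # processor-specific flags
        ("e_ehsize", Elf64_Half),      # ELF header size in bytes
        ("e_phentsize", Elf64_Half),   # program header table entry size
        ("e_phnum", Elf64_Half),       # program header table entry count
        ("e_shentsize", Elf64_Half),   # section header table entry size
        ("e_shnum", Elf64_Half),       # section header table entry count
        ("e_shstrndx", Elf64_Half),    # section header string table index
    ]


if __name__ == "__main__":
    # Print the total header size and where each field starts.
    print("header size:", ctypes.sizeof(Elf64_Ehdr))
    for name, _ctype in Elf64_Ehdr._fields_:
        print(name, "at offset", getattr(Elf64_Ehdr, name).offset)
```

Running it lists each field with its byte offset, which makes it easy to see that e_ident occupies the first 16 bytes of the header.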
71 00:06:08,060 --> 00:06:18,650 So the executable header, and with it the ELF file, starts with a 16-byte array called e_ident. 72 00:06:19,580 --> 00:06:25,940 And the array always starts with a four-byte, 73 00:06:27,370 --> 00:06:29,520 four-byte magic value. 74 00:06:29,530 --> 00:06:31,500 That's the magic value here, 75 00:06:31,510 --> 00:06:35,740 identifying the file as an ELF binary. 76 00:06:37,500 --> 00:06:38,190 And. 77 00:06:39,070 --> 00:06:41,610 Actually, let's write it again here. 78 00:06:42,640 --> 00:06:49,480 The magic value consists of the hexadecimal number 0x7f 79 00:06:49,840 --> 00:06:50,890 here, 80 00:06:51,610 --> 00:06:54,940 followed by the ASCII character 81 00:06:55,760 --> 00:07:02,000 codes for the letters E, L and F. 82 00:07:03,310 --> 00:07:11,620 Having these bytes right at the start is convenient for tools such as file, which 83 00:07:12,770 --> 00:07:14,540 we used in the previous lecture. 84 00:07:14,540 --> 00:07:17,810 We can get information about files with the 85 00:07:19,040 --> 00:07:19,440 file 86 00:07:19,580 --> 00:07:27,440 command in Linux. For example, let's go to a new terminal and the Desktop, and we will list the files 87 00:07:27,470 --> 00:07:28,010 again 88 00:07:30,160 --> 00:07:32,050 to see the Desktop here. 89 00:07:33,770 --> 00:07:35,630 And here we have several files. 90 00:07:35,630 --> 00:07:38,360 So let's try with myapp.c here. 91 00:07:38,360 --> 00:07:42,080 And as you can see here, it's C source, ASCII text. 92 00:07:42,110 --> 00:07:51,200 Now the myapp.o file, and we can see ELF 64-bit LSB relocatable. 93 00:07:51,200 --> 00:07:54,740 We discussed this in the previous lecture, so 94 00:07:55,720 --> 00:07:57,400 we will skip this for now. 95 00:07:57,940 --> 00:08:01,960 And here we have the. 96 00:08:05,000 --> 00:08:10,510 So such tools can quickly discover that they are dealing with an ELF file. Following the magic value,
97 00:08:10,520 --> 00:08:18,830 there are a number of bytes that give more detailed information about the specifics of the type of ELF 98 00:08:18,830 --> 00:08:30,650 file. In the elf.h header file, the indexes for these bytes, that is, indexes 99 00:08:30,680 --> 00:08:39,740 4 through 15 in the e_ident array, are referred to symbolically, starting with EI_CLASS. 100 00:08:39,740 --> 00:08:41,750 Here I will write them out. 101 00:08:41,840 --> 00:08:46,460 So EI_CLASS is the 102 00:08:47,890 --> 00:08:49,630 EI_CLASS, 103 00:08:50,810 --> 00:08:51,560 uppercase. 104 00:08:53,050 --> 00:08:53,860 Also 105 00:08:55,170 --> 00:08:58,050 EI_DATA. 106 00:09:04,230 --> 00:09:07,150 Also 107 00:09:08,360 --> 00:09:09,110 EI_VERSION. 108 00:09:14,980 --> 00:09:15,790 Also 109 00:09:19,910 --> 00:09:21,680 EI_OSABI here. 110 00:09:22,220 --> 00:09:23,690 These are the underscores. 111 00:09:30,510 --> 00:09:33,540 And also 112 00:09:36,460 --> 00:09:37,540 EI_ABIVERSION, 113 00:09:37,870 --> 00:09:38,620 ABI version. 114 00:09:38,620 --> 00:09:40,720 And 115 00:09:41,450 --> 00:09:44,240 lastly here, EI_PAD. 116 00:09:44,300 --> 00:09:46,760 Sorry for my handwriting. 117 00:09:47,240 --> 00:09:48,830 This is actually not handwriting. 118 00:09:48,830 --> 00:09:50,330 This is mouse writing here. 119 00:09:50,780 --> 00:09:53,540 I'm struggling with this, so. 120 00:09:55,290 --> 00:09:58,650 The EI_PAD field actually contains multiple bytes, 121 00:09:58,680 --> 00:10:06,420 namely indexes 9 through 15 in e_ident. 122 00:10:10,670 --> 00:10:16,100 All of these bytes are currently designated as padding, so they are reserved for possible future use 123 00:10:16,100 --> 00:10:18,080 but currently set to zero. 124 00:10:18,080 --> 00:10:24,800 The EI_CLASS byte denotes what the specification refers to as the binary's class.
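The magic value and the e_ident indexes written out above can be sketched in a few lines of Python. The constant names and values follow elf.h. The helper names is_elf and split_ident, and the sample bytes, are hypothetical and exist only for this illustration.

```python
ELF_MAGIC = b"\x7fELF"  # 0x7f followed by the ASCII codes for 'E', 'L', 'F'

# Index constants into e_ident, named as in elf.h.
EI_CLASS = 4
EI_DATA = 5
EI_VERSION = 6
EI_OSABI = 7
EI_ABIVERSION = 8
EI_PAD = 9       # first padding byte; padding runs through index 15
EI_NIDENT = 16   # total size of e_ident


def is_elf(data: bytes) -> bool:
    """Return True if data starts with the four-byte ELF magic value."""
    return data[:4] == ELF_MAGIC


def split_ident(e_ident: bytes) -> dict:
    """Split a 16-byte e_ident array into its named fields."""
    if len(e_ident) < EI_NIDENT:
        raise ValueError("e_ident needs at least 16 bytes")
    if not is_elf(e_ident):
        raise ValueError("missing ELF magic value")
    return {
        "EI_CLASS": e_ident[EI_CLASS],
        "EI_DATA": e_ident[EI_DATA],
        "EI_VERSION": e_ident[EI_VERSION],
        "EI_OSABI": e_ident[EI_OSABI],
        "EI_ABIVERSION": e_ident[EI_ABIVERSION],
        "EI_PAD": e_ident[EI_PAD:EI_NIDENT],
    }


if __name__ == "__main__":
    # Hypothetical sample: magic, three descriptive bytes, then nine zero bytes.
    sample = ELF_MAGIC + bytes([2, 1, 1]) + bytes(9)
    print(is_elf(sample))
    print(split_ident(sample))
```

To try it on a real file, the first 16 bytes read from the file in binary mode can be passed to split_ident in place of the sample.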
125 00:10:24,800 --> 00:10:32,720 This is a bit of a misnomer, since the word class is so generic that it could mean almost anything. 126 00:10:32,720 --> 00:10:36,680 You will learn more about this in the next lectures. 127 00:10:36,680 --> 00:10:38,390 But firstly 128 00:10:39,830 --> 00:10:40,520 we will 129 00:10:42,100 --> 00:10:43,390 use readelf here. 130 00:10:43,390 --> 00:10:44,860 We can close this now. 131 00:10:45,250 --> 00:10:46,210 We will 132 00:10:46,210 --> 00:10:46,800 run readelf on 133 00:10:47,140 --> 00:10:50,230 our a.out file here. 134 00:10:50,970 --> 00:10:51,460 Let's see here. 135 00:10:51,470 --> 00:10:54,800 As you can see here, we should have a.out. 136 00:10:55,280 --> 00:10:59,120 If we try to run this app, you will see that 137 00:10:59,120 --> 00:11:06,010 this is just a regular Hello World application written in C. What we're going to do is run readelf 138 00:11:06,020 --> 00:11:12,320 here: readelf -h a.out, and that's it. 139 00:11:12,320 --> 00:11:16,460 We get several pieces of information here, which I will explain right now. 140 00:11:16,550 --> 00:11:26,150 And here, e_ident is shown on the line marked Magic. 141 00:11:26,750 --> 00:11:27,770 This was the 142 00:11:28,980 --> 00:11:32,460 e_ident which we discussed previously, 143 00:11:33,200 --> 00:11:34,670 when we started this lecture. 144 00:11:34,670 --> 00:11:35,750 Let's try this. 145 00:11:35,750 --> 00:11:37,280 Open this up here again. 146 00:11:38,970 --> 00:11:39,630 And. 147 00:11:46,330 --> 00:11:48,700 We're going to use cat or Mousepad. 148 00:11:48,700 --> 00:11:49,660 Mousepad is okay here. 149 00:11:49,660 --> 00:11:53,950 So Mousepad it is: mousepad /usr 150 00:11:54,920 --> 00:11:55,850 /include 151 00:11:58,540 --> 00:11:59,470 /elf.h. 152 00:12:03,740 --> 00:12:04,550 Let's write include 153 00:12:05,420 --> 00:12:05,870 here. 154 00:12:08,560 --> 00:12:09,670 And here.
155 00:12:11,670 --> 00:12:12,660 We will now 156 00:12:15,910 --> 00:12:17,080 go to that 157 00:12:18,110 --> 00:12:19,090 right here. 158 00:12:19,600 --> 00:12:25,030 That's actually the struct, and in the struct we have this array with the magic number and other information, 159 00:12:25,060 --> 00:12:26,200 e_ident. 160 00:12:27,130 --> 00:12:28,510 And here. 161 00:12:31,200 --> 00:12:33,570 The thing you can see here in Magic, 162 00:12:34,870 --> 00:12:42,040 as I said, is the e_ident array, and it starts with the familiar four magic bytes, 163 00:12:42,040 --> 00:12:49,840 7f 45 4c 46, followed by a value of 2, 164 00:12:51,990 --> 00:12:55,920 indicating ELFCLASS64, then a 1, 165 00:12:56,860 --> 00:12:59,290 which is ELFDATA2LSB, 166 00:12:59,320 --> 00:13:03,730 and finally another 1, which is EV_CURRENT. 167 00:13:03,730 --> 00:13:16,150 The remaining bytes are all zeroed out, since the EI_OSABI and EI_ABIVERSION bytes are at their default 168 00:13:16,150 --> 00:13:16,630 values. 169 00:13:16,630 --> 00:13:20,860 The padding bytes are all set to zero as well. 170 00:13:20,860 --> 00:13:30,220 The information contained in some of these bytes is explicitly repeated on dedicated lines marked 171 00:13:30,220 --> 00:13:32,830 Class, 172 00:13:33,880 --> 00:13:35,140 Data and Version: 173 00:13:35,470 --> 00:13:40,180 two's complement, little endian, and version 1, current.
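The reading of the Magic line described above can be sketched in Python as follows. The constant names ELFCLASS64, ELFDATA2LSB and EV_CURRENT are the ones from elf.h. The function name describe_magic_line and the returned dictionary layout are assumptions made for this illustration, and the hex string is typed in by hand to match the bytes discussed, not captured from a real run.

```python
# Value-to-name tables for the three descriptive bytes that follow the magic.
EI_CLASS_NAMES = {1: "ELFCLASS32", 2: "ELFCLASS64"}
EI_DATA_NAMES = {1: "ELFDATA2LSB", 2: "ELFDATA2MSB"}
EI_VERSION_NAMES = {1: "EV_CURRENT"}


def describe_magic_line(magic_line: str) -> dict:
    """Decode the hex bytes of a readelf 'Magic:' line into constant names."""
    ident = bytes.fromhex(magic_line)  # spaces between byte pairs are allowed
    if len(ident) != 16 or ident[:4] != b"\x7fELF":
        raise ValueError("not a 16-byte ELF e_ident array")
    return {
        "class": EI_CLASS_NAMES.get(ident[4], "unknown"),
        "data": EI_DATA_NAMES.get(ident[5], "unknown"),
        "version": EI_VERSION_NAMES.get(ident[6], "unknown"),
        # True when EI_OSABI, EI_ABIVERSION and the padding are all zero.
        "rest_is_zero": not any(ident[7:]),
    }


if __name__ == "__main__":
    # Bytes as discussed above: magic, 02, 01, 01, then nine zero bytes.
    line = "7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00"
    for key, value in describe_magic_line(line).items():
        print(f"{key}: {value}")
```

The same function can be fed the hex bytes copied from the Magic line of any readelf -h output to see which class, data encoding and version the file declares.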