1 00:00:00,450 --> 00:00:04,590 Hello, my name is Stephan, and in this lecture we will learn about the instruction types and we will 2 00:00:04,590 --> 00:00:05,940 also analyze this code. 3 00:00:05,940 --> 00:00:11,220 In computer programming, we use instructions to tell the computer what to do. 4 00:00:11,580 --> 00:00:15,090 Two important instructions mentioned here are LEA, 5 00:00:16,390 --> 00:00:22,540 load effective address, and MOV, which is move. 6 00:00:23,020 --> 00:00:28,960 These instructions help us work with data stored in memory. 7 00:00:28,960 --> 00:00:37,690 So when we use the LEA instruction with a memory location like bNum, we are 8 00:00:37,690 --> 00:00:45,730 essentially loading the address of the memory location into a register called RAX. Think of RAX as 9 00:00:45,730 --> 00:00:50,530 a temporary storage container that the computer's processor can use. 10 00:00:50,530 --> 00:00:59,530 So if we want to get the memory address into RAX, we can use LEA, but if we want the 11 00:00:59,530 --> 00:01:05,890 actual data value stored at the memory address, we can use MOV. When we use MOV, 12 00:01:05,920 --> 00:01:15,100 if we put square brackets here around bNum, we get the data value itself and not the address. 13 00:01:15,100 --> 00:01:17,990 And as you can see here, I added comments here. 14 00:01:17,990 --> 00:01:24,230 When we use the square brackets with MOV and RAX, we load the value stored at the memory address 15 00:01:24,350 --> 00:01:28,040 of bNum into RAX. 16 00:01:29,880 --> 00:01:30,630 So. 17 00:01:31,770 --> 00:01:35,250 Uh, let's actually, in this context here. 18 00:01:35,490 --> 00:01:38,730 RAX is a 64-bit register. 19 00:01:38,730 --> 00:01:43,500 So this means it can hold eight bytes of data. 20 00:01:43,500 --> 00:01:48,690 The data is stored in a specific way called little endian. 
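As a rough illustration of the difference between LEA and MOV with square brackets described above, the following Python sketch models memory as a byte array. Only bNum and its value 123 come from the lecture; the base address 0x404020, the neighbouring variable wNum and the helper names `lea` and `mov_load64` are assumptions made for this example.

```python
import struct

# Hypothetical model of a data section: a flat byte array plus a symbol table.
BASE = 0x404020                      # assumed load address, for illustration only
memory = bytearray(16)
symbols = {"bNum": BASE, "wNum": BASE + 1}

memory[0] = 123                              # bNum: one byte holding 123
memory[1:3] = struct.pack("<H", 12345)       # wNum: assumed 2-byte neighbour


def lea(name):
    """Model of `lea rax, [name]`: return the address, read no memory."""
    return symbols[name]


def mov_load64(name):
    """Model of `mov rax, [name]`: read 8 bytes starting at name, little endian."""
    offset = symbols[name] - BASE
    return struct.unpack_from("<Q", memory, offset)[0]


rax = lea("bNum")
print(hex(rax))          # 0x404020, an address

rax = mov_load64("bNum")
print(hex(rax))          # 0x30397b, bNum plus whatever bytes follow it
print(rax & 0xFF)        # 123, the low byte (AL) is bNum itself
```

The second load shows why a 64-bit MOV from a one-byte variable picks up neighbouring data: the register always takes eight bytes, and only its lowest byte corresponds to bNum.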
21 00:01:48,690 --> 00:01:54,660 That is where the rightmost byte, the smallest part of the value, is stored at the lowest memory address. 22 00:01:54,660 --> 00:02:02,460 In this case, we are mainly interested in the smallest part of RAX, which is called AL. 23 00:02:02,490 --> 00:02:11,640 This holds just one byte of data, and that's the part we want to use when we need to store or manipulate 24 00:02:11,670 --> 00:02:12,870 small values. 25 00:02:12,870 --> 00:02:16,140 Sometimes we want to start with a clean slate 26 00:02:16,140 --> 00:02:17,610 in a register. 27 00:02:17,610 --> 00:02:28,950 To do this, we use the XOR instruction to set a register to zero, and in this case it is used like here: 28 00:02:28,950 --> 00:02:32,560 XOR RAX, RAX here. 29 00:02:33,490 --> 00:02:34,240 And. 30 00:02:38,960 --> 00:02:40,940 And after that we use the. 31 00:02:41,270 --> 00:02:48,830 We also restore the value of RSP from RBP, which cleans up the stack frame. 32 00:02:50,190 --> 00:02:51,750 And to avoid. 33 00:02:52,730 --> 00:02:58,040 And also when we move the data between memory and registers, we need to be careful about the size of 34 00:02:58,040 --> 00:02:58,430 the data. 35 00:02:58,430 --> 00:03:02,660 For example, MOV [bVar] here. 36 00:03:02,840 --> 00:03:04,160 Yeah, it's actually. 37 00:03:04,160 --> 00:03:04,600 Yes. 38 00:03:04,610 --> 00:03:05,420 MOV. 39 00:03:06,410 --> 00:03:07,070 bVar. 40 00:03:10,140 --> 00:03:10,680 Oh, yeah. 41 00:03:10,710 --> 00:03:13,980 MOV [bVar], RAX moves 42 00:03:14,010 --> 00:03:18,150 eight bytes from RAX to the memory location 43 00:03:18,180 --> 00:03:18,900 bVar. 44 00:03:18,930 --> 00:03:21,030 So this might be more than we. 45 00:03:21,600 --> 00:03:22,680 More than you intend, 46 00:03:22,680 --> 00:03:27,570 potentially overwriting other important data in memory. 47 00:03:27,570 --> 00:03:37,500 To avoid overwriting unintended data, you can use MOV [bVar], AL instead. 
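The effect of clearing a register with XOR and of storing eight bytes versus one byte can be modelled in a few lines of Python. This is only a sketch: the 12-byte layout, the 0xAA filler bytes and the function name `fresh_memory` are assumptions for illustration, with offset 0 standing in for the one-byte variable the lecture stores to.

```python
import struct


def fresh_memory():
    """Return 12 bytes: offset 0 plays the role of bVar, the rest are neighbours."""
    mem = bytearray([0xAA] * 12)   # 0xAA marks untouched neighbouring bytes
    mem[0] = 0x00                  # bVar starts out as zero
    return mem


rax = 0x1122334455667788           # stale value left in the register
rax ^= rax                         # xor rax, rax: clears all 64 bits
rax = (rax & ~0xFF) | 123          # loading 123 into AL changes only the low byte

wide = fresh_memory()
struct.pack_into("<Q", wide, 0, rax)    # 8-byte store, like storing RAX

narrow = fresh_memory()
narrow[0] = rax & 0xFF                  # 1-byte store, like storing AL

print(wide.hex())     # 7b00000000000000aaaaaaaa: seven neighbours overwritten
print(narrow.hex())   # 7b followed by eleven untouched aa bytes
```

Both stores put 123 (0x7b) in the first byte, but only the one-byte store leaves the seven following bytes as they were.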
48 00:03:37,500 --> 00:03:45,540 This moves just the one-byte value in AL to the memory location bVar, leaving the rest of the data 49 00:03:45,720 --> 00:03:50,490 untouched. Now, endianness and the value representation here. 50 00:03:50,490 --> 00:04:00,330 When reading data from memory, like for example text1, the computer stores the data in little-endian 51 00:04:00,480 --> 00:04:01,080 notation. 52 00:04:01,080 --> 00:04:06,420 This means the smallest part of the value is stored at the lowest memory address. 53 00:04:07,020 --> 00:04:11,950 And you can use LEA to load memory addresses. 54 00:04:11,950 --> 00:04:17,680 It makes your code more understandable since it's clear that you are dealing with addresses, and it 55 00:04:17,680 --> 00:04:19,900 can also be faster for some calculations. 56 00:04:19,900 --> 00:04:26,680 However, in this context, LEA won't be used for calculations. 57 00:04:26,680 --> 00:04:29,200 So now what we're going to do is. 58 00:04:30,640 --> 00:04:32,800 And we will make this program. 59 00:04:32,800 --> 00:04:35,140 There's no output for this program. 60 00:04:35,140 --> 00:04:38,790 We will use the debugger to step through each instruction. 61 00:04:39,310 --> 00:04:48,190 SASM is helpful here. We define some variables of different sizes, including an array of five double 62 00:04:48,190 --> 00:04:51,370 words filled with zeros. 63 00:04:52,370 --> 00:04:53,150 As you can see here. 64 00:04:53,150 --> 00:04:58,190 We also define some items in section .bss. 65 00:04:58,670 --> 00:05:05,510 Now look in your debugger for RSP, the stack pointer. It's a very high value. 66 00:05:06,820 --> 00:05:07,570 And. 67 00:05:10,530 --> 00:05:16,080 And here the stack pointer refers to an address in high memory. 
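The little-endian reading of a string variable mentioned earlier in this part can be checked with Python's integer conversion functions. The contents of text1 and the four zero bytes after it are assumptions for this sketch; the lecture does not spell them out.

```python
text1 = b"abc\x00"               # assumed contents of text1 (zero-terminated)
window = text1 + bytes(4)        # assume four zero bytes follow it in memory

# An 8-byte load reads the whole window; the byte at the lowest address
# ends up in the least significant position of the register.
rax = int.from_bytes(window, byteorder="little")

print(hex(rax))                  # 0x636261
print(chr(rax & 0xFF))           # a, which is what AL would hold

# Writing the register back in little-endian order restores the same bytes.
print(rax.to_bytes(8, byteorder="little") == window)   # True
```

Read as a hexadecimal number the string appears reversed (63 62 61 is "cba"), which is exactly what a debugger's register view shows for such a load.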
68 00:05:16,080 --> 00:05:24,780 The stack is an area in memory used for temporarily storing data, and the stack will grow as more 69 00:05:24,780 --> 00:05:34,170 data is stored in it, and it will grow in the downward direction, from higher addresses to lower addresses. 70 00:05:34,170 --> 00:05:41,970 So the stack pointer RSP here will decrease every time you put data on the stack. 71 00:05:41,970 --> 00:05:45,960 We will discuss the stack in a separate section of our course. 72 00:05:45,960 --> 00:05:51,630 But remember for now that the stack is placed somewhere in high memory. 73 00:05:52,380 --> 00:05:59,070 Now we will debug this program here by clicking on Debug, and as you can see, it has started. 74 00:05:59,070 --> 00:06:08,520 If you can't see this tab on the right, in order to see it, you need to go to Debug and click 75 00:06:08,520 --> 00:06:09,420 on Show Register. 76 00:06:09,420 --> 00:06:18,220 You can also use Ctrl+R to show this register tab on the right side of your screen. 77 00:06:19,240 --> 00:06:20,590 That's it here. 78 00:06:20,590 --> 00:06:33,490 So here we use this LEA instruction, which means, as I explained in this lecture, load 79 00:06:33,490 --> 00:06:38,890 effective address, to load the memory address of bNum here. 80 00:06:39,540 --> 00:06:42,630 To load the memory address of bNum into RAX, 81 00:06:42,630 --> 00:06:49,650 we can obtain the same result with MOV without the square brackets around bNum. 82 00:06:49,710 --> 00:06:56,790 If we use square brackets like this with the MOV instruction, like we used here, 83 00:06:57,580 --> 00:06:58,150 right, 84 00:06:58,990 --> 00:07:04,150 we are loading the value at bNum, and not the address, into RAX. 85 00:07:04,150 --> 00:07:05,920 But we are not loading 86 00:07:05,940 --> 00:07:07,000 only bNum into RAX. 
87 00:07:07,000 --> 00:07:14,920 Because RAX is a 64-bit, or eight-byte, register, more bytes are loaded into RAX, and our bNum 88 00:07:14,920 --> 00:07:22,810 is the rightmost byte in RAX here. As I explained, this is little endian, and here 89 00:07:22,810 --> 00:07:32,870 we are only interested in the register AL. So when you require RAX to contain only the value 90 00:07:32,870 --> 00:07:39,010 123, you will first have to clear RAX, like for example, 91 00:07:42,450 --> 00:07:43,110 here. 92 00:07:44,550 --> 00:07:46,740 Like XOR, 93 00:07:46,740 --> 00:07:47,760 with XOR. 94 00:07:48,210 --> 00:07:52,670 Actually, we need to stop debugging the program here, and we will continue again. 95 00:07:52,680 --> 00:08:02,520 So as I said, if you want RAX to contain only the value 123, you need to 96 00:08:03,970 --> 00:08:08,740 clear RAX with XOR: XOR 97 00:08:09,470 --> 00:08:10,360 RAX, 98 00:08:10,430 --> 00:08:11,160 RAX. 99 00:08:11,170 --> 00:08:12,820 And that's it. 100 00:08:13,800 --> 00:08:16,710 And you can also use, instead of. 101 00:08:17,770 --> 00:08:19,690 Like using the. 102 00:08:23,020 --> 00:08:31,000 Instead of using MOV RAX, [bNum] like this, you can use. 103 00:08:33,010 --> 00:08:34,630 Oh, yeah. 104 00:08:34,900 --> 00:08:37,950 Instead of using this, actually, instead of writing this. 105 00:08:37,990 --> 00:08:38,350 Yeah. 106 00:08:38,350 --> 00:08:41,710 Instead of using this, you can use MOV 107 00:08:43,150 --> 00:08:44,020 AL, 108 00:08:46,470 --> 00:08:47,160 and 109 00:08:48,710 --> 00:08:51,170 bNum in brackets. 110 00:08:52,270 --> 00:08:57,910 And be careful about the sizes of data you are moving to and from memory. 111 00:08:58,690 --> 00:09:03,100 Look, for instance, let's actually turn this. 112 00:09:05,980 --> 00:09:06,370 Here. 113 00:09:06,400 --> 00:09:10,450 Look, for instance, at MOV [bVar], RAX. 
114 00:09:10,450 --> 00:09:17,090 With this instruction, we are moving the eight bytes in RAX to the address bVar. 115 00:09:17,110 --> 00:09:24,220 If you only intended to write 123 to bVar, you can check with your debugger that you 116 00:09:24,220 --> 00:09:29,410 overwrite another seven bytes in memory. You can 117 00:09:30,070 --> 00:09:37,660 choose type d for bVar in the SASM memory window. This can introduce nasty bugs in your program. 118 00:09:37,660 --> 00:09:48,640 To avoid that, replace the instruction with MOV [bVar], AL, as I explained here. 119 00:09:50,300 --> 00:09:53,740 Let's actually try this and debug it here. 120 00:09:53,750 --> 00:10:00,890 And as you can see here, we are seeing the registers, hex values and info in our register table. 121 00:10:02,080 --> 00:10:09,370 And now what we're going to do is use the disassemble command, disas main. 122 00:10:11,730 --> 00:10:14,310 As you can see, we are also seeing the symbol here. 123 00:10:14,700 --> 00:10:18,780 So now we will use readelf at the command line. 124 00:10:18,810 --> 00:10:27,420 Remember that we asked NASM to assemble using the ELF format, with these debugging settings here. 125 00:10:27,540 --> 00:10:31,260 But let's actually go to the terminal here 126 00:10:32,520 --> 00:10:35,790 and compile this with the makefile here. 127 00:10:35,790 --> 00:10:40,110 And we will also generate the ELF 128 00:10:41,800 --> 00:10:43,510 here, on the list as well. 129 00:10:43,810 --> 00:10:47,800 So we will go here. We are in exabytes. 130 00:10:50,310 --> 00:10:51,750 Actually, we had 131 00:10:53,160 --> 00:10:54,180 written the 132 00:10:55,390 --> 00:10:58,480 makefile in another project, so we can copy it. 133 00:10:59,050 --> 00:10:59,530 Just. 134 00:11:00,920 --> 00:11:01,350 Paste it. 135 00:11:02,180 --> 00:11:08,990 So we will just change kicking to exabytes. 136 00:11:10,400 --> 00:11:12,080 It will be easy here. 
137 00:11:14,350 --> 00:11:17,260 Search and replace. 138 00:11:18,050 --> 00:11:19,670 kicking to 139 00:11:21,220 --> 00:11:23,280 exabytes, and replace. 140 00:11:23,290 --> 00:11:24,430 Now save it. 141 00:11:26,360 --> 00:11:27,290 Don't save. 142 00:11:27,290 --> 00:11:30,260 Now open the terminal here. 143 00:11:30,290 --> 00:11:31,160 Make. 144 00:11:31,190 --> 00:11:32,690 Make here. 145 00:11:32,690 --> 00:11:34,730 And as you can see here, our program is created. 146 00:11:34,760 --> 00:11:41,150 Now, if we run this, we will not get any output, but we will not get any error either, because our program 147 00:11:41,150 --> 00:11:42,710 works smoothly. 148 00:11:42,920 --> 00:11:49,550 And what we're going to do is use readelf. 149 00:11:49,670 --> 00:11:55,310 readelf --file-header and memory. 150 00:11:57,020 --> 00:11:59,150 No, not memory, exabytes here. 151 00:11:59,420 --> 00:12:00,590 exabytes. 152 00:12:01,040 --> 00:12:02,820 And that's it here. 153 00:12:02,840 --> 00:12:08,660 And you will get some general information about your executable memory here. 154 00:12:09,300 --> 00:12:10,970 The executable exabytes here. 155 00:12:10,970 --> 00:12:14,810 And look at the entry point address here. 156 00:12:14,960 --> 00:12:17,540 401020. 157 00:12:17,570 --> 00:12:21,350 That is the memory location of the start of our program. 158 00:12:21,350 --> 00:12:30,260 So between the program entry and the start of the code, as we will see in GDB, 159 00:12:34,570 --> 00:12:36,100 401190, 160 00:12:37,640 --> 00:12:41,630 there is some overhead right here. 161 00:12:41,630 --> 00:12:46,370 We had 401190 and 40 162 00:12:46,460 --> 00:12:49,640 1020, and. 163 00:12:52,560 --> 00:12:56,820 Let's actually also use disas main. 164 00:12:57,090 --> 00:12:57,510 Yeah. 165 00:12:57,510 --> 00:12:59,280 This is here, GDB. 166 00:13:01,250 --> 00:13:01,900 Let's see. 167 00:13:02,030 --> 00:13:02,380 Yeah. 
168 00:13:03,000 --> 00:13:04,580 Uh, GDB. 169 00:13:05,040 --> 00:13:06,170 exabytes. 170 00:13:10,230 --> 00:13:11,880 disas main. 171 00:13:15,290 --> 00:13:17,960 And here, 401110. 172 00:13:20,700 --> 00:13:22,350 Is it true for us? 173 00:13:26,210 --> 00:13:27,410 Um, let's actually. 174 00:13:54,730 --> 00:14:02,650 Now here, the starting point of our program is 4011 175 00:14:03,570 --> 00:14:04,750 10, right? 176 00:14:07,520 --> 00:14:08,330 And. 177 00:14:08,980 --> 00:14:10,900 We're going to try this with readelf. 178 00:14:11,890 --> 00:14:21,490 Our entry point is 401020, and there's some overhead in our program, and the header provides 179 00:14:21,490 --> 00:14:28,810 us with additional information about the operating system and the executable code as well here, operating 180 00:14:28,810 --> 00:14:29,650 system. 181 00:14:29,800 --> 00:14:33,370 And we have the size of the section headers and so on. 182 00:14:33,880 --> 00:14:42,670 And readelf is convenient for exploring a binary executable as well. You can use the --symbols 183 00:14:42,670 --> 00:14:46,540 parameter, and we will grep main from it. 184 00:14:46,540 --> 00:14:49,990 So readelf --symbols 185 00:14:51,780 --> 00:14:56,340 exabytes, and we will use a pipe here, 186 00:14:56,370 --> 00:14:57,270 grep 187 00:14:58,290 --> 00:14:58,890 main. 188 00:15:01,380 --> 00:15:03,600 And there we have it. 189 00:15:06,620 --> 00:15:07,820 For the symbol main. 190 00:15:07,820 --> 00:15:15,150 So with grep we specify that we are looking for all lines with the word main in them. 191 00:15:15,170 --> 00:15:20,060 So here you can see the main function starts at 401110. 192 00:15:21,220 --> 00:15:21,670 Um. 193 00:15:21,670 --> 00:15:22,570 And. 194 00:15:24,300 --> 00:15:26,490 As we saw in GDB, 195 00:15:28,620 --> 00:15:28,950 40 196 00:15:28,950 --> 00:15:30,060 1110. 197 00:15:30,240 --> 00:15:30,810 Right. 198 00:15:32,450 --> 00:15:35,330 And in this example. 
199 00:15:37,800 --> 00:15:44,220 We need to look in the symbol table for every occurrence of the label start. 200 00:15:44,220 --> 00:15:52,350 So we see the start of sections .data and .bss, 201 00:15:53,120 --> 00:15:55,010 and the start of the program. 202 00:15:55,010 --> 00:16:04,340 We can see the start of the program as well with our readelf --symbols, but with the grep command. In order 203 00:16:04,340 --> 00:16:10,040 to see that, we will use start. 204 00:16:12,870 --> 00:16:14,040 Yeah, that's it here. 205 00:16:14,040 --> 00:16:19,110 As you can see here, we have the data start and start here. 206 00:16:19,230 --> 00:16:20,760 These are the start points. 207 00:16:20,790 --> 00:16:23,970 Let's actually compare them with our. 208 00:16:27,000 --> 00:16:27,750 1020. 209 00:16:27,780 --> 00:16:30,180 We are still not on the entry point. 210 00:16:31,280 --> 00:16:31,820 Yes. 211 00:16:31,820 --> 00:16:36,110 Here, and start, our entry point, actually corresponds here. 212 00:16:36,170 --> 00:16:42,800 Now we also need to see what we have in memory with this command here. 213 00:16:44,020 --> 00:16:46,120 Now in order to again. 214 00:16:47,890 --> 00:16:57,850 We will use tail +10 and we will use sort -k 2 -r here. 215 00:16:59,230 --> 00:17:05,860 And this command ignores some lines that are not interesting to us right now. 216 00:17:05,860 --> 00:17:10,960 So we sort on the second column, the memory address, in reverse order. 217 00:17:10,960 --> 00:17:20,740 As you can see, some basic knowledge of Linux commands comes in handy, and the start of the program is 218 00:17:20,740 --> 00:17:25,600 at some low address, and the start of main 219 00:17:26,390 --> 00:17:27,420 is at? 220 00:17:29,080 --> 00:17:31,150 The start of main is. 221 00:17:36,530 --> 00:17:38,180 Let's actually see it here. 222 00:17:39,710 --> 00:17:40,490 We actually. 223 00:17:42,030 --> 00:17:43,530 So it is here. 224 00:17:43,530 --> 00:17:44,030 Yeah. 
225 00:17:44,070 --> 00:17:46,800 The start of main is 4011 226 00:17:46,800 --> 00:17:47,550 10. 227 00:17:48,090 --> 00:17:48,720 Right. 228 00:17:50,590 --> 00:18:02,590 And look for the start of section .data here, in this for ten, 20, 40. The .data section starts with for 229 00:18:02,590 --> 00:18:09,190 1020, with the addresses of all its variables, and the start of section .bss. 230 00:18:09,700 --> 00:18:13,270 We also saw that here. 231 00:18:13,960 --> 00:18:14,980 Note here. 232 00:18:16,290 --> 00:18:23,880 It starts at 404039, with the addresses reserved for its variables as well. 233 00:18:23,880 --> 00:18:27,060 So now let's summarize our findings here. 234 00:18:27,060 --> 00:18:37,380 We found at the beginning of this lecture that the stack is in high memory, which we saw 235 00:18:37,380 --> 00:18:39,960 with RSP here. 236 00:18:40,590 --> 00:18:41,760 RSP here. 237 00:18:43,040 --> 00:18:51,950 And with readelf, we found that the executable code is at the lower side of memory, and on top of the 238 00:18:51,950 --> 00:18:54,740 executable code we have section .data. 239 00:18:54,740 --> 00:19:01,810 And on top of section .data, we have section .bss 240 00:19:01,820 --> 00:19:03,380 here. 241 00:19:06,160 --> 00:19:12,610 But here, as you know, we are using the reverse order. 242 00:19:12,820 --> 00:19:14,080 And. 243 00:19:16,050 --> 00:19:27,150 And the stack in high memory can grow, and it grows in the downward direction toward section .bss, and the available 244 00:19:27,150 --> 00:19:38,880 free memory between the stack and the other sections is called the heap, H-E-A-P, and the memory in 245 00:19:38,880 --> 00:19:41,910 section .bss is assigned at runtime. 246 00:19:41,910 --> 00:19:43,950 So you can easily check that. 247 00:19:44,820 --> 00:19:46,320 And now. 248 00:19:48,270 --> 00:19:50,760 Well, let's actually go here. 
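The layout summarized in this part (code lowest, then the data section, then the uninitialized section, with the stack far above) can be sketched by sorting a few addresses in descending order, much like the reverse sort on the address column. Only 0x401020 and 0x401110 are values read out in the lecture; the other addresses are placeholders assumed for illustration, and the dictionary name `regions` is made up for this example.

```python
# Illustrative addresses only. 0x401020 (entry point) and 0x401110 (main)
# are the values read out in the lecture; the rest are assumptions.
regions = {
    "_start (entry point)": 0x401020,
    "main": 0x401110,
    "section .data": 0x404020,        # assumed placeholder
    "section .bss": 0x404039,         # as heard in the lecture, unverified
    "stack (RSP)": 0x7FFFFFFDE000,    # assumed "very high" value
}

# Highest address first, so the stack comes out on top and the code at the bottom.
ordered = sorted(regions.items(), key=lambda item: item[1], reverse=True)

for name, address in ordered:
    print(f"{address:#018x}  {name}")

# The gap between the stack and the last section is where the heap lives.
gap = regions["stack (RSP)"] - regions["section .bss"]
print(f"free space between .bss and the stack: {gap:#x} bytes")
```

The absolute numbers will differ from system to system; the point of the sketch is only the relative order of the regions.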
249 00:19:50,760 --> 00:19:57,150 And in this case, as you can see here, we have the qVar resq reserved variables here. 250 00:19:57,150 --> 00:19:59,400 And we. 251 00:20:00,700 --> 00:20:02,260 Let's actually change this to. 252 00:20:03,600 --> 00:20:06,480 Uh, 3, oops. 253 00:20:06,480 --> 00:20:07,220 Not this one. 254 00:20:09,760 --> 00:20:14,270 Let's change 3 to 3000 and you will. 255 00:20:14,290 --> 00:20:15,970 Actually, we need to stop this first. 256 00:20:15,970 --> 00:20:17,560 Sorry about that. 257 00:20:18,960 --> 00:20:25,860 So we change 3 to 3000, save the program, and make again. 258 00:20:26,280 --> 00:20:31,340 And, uh, sorry, we had some error here. 259 00:20:31,350 --> 00:20:33,150 I think we messed up the code here. 260 00:20:33,150 --> 00:20:33,810 Yes. 261 00:20:36,470 --> 00:20:43,550 And why do we have zero here? We will change 3 to 3000. 262 00:20:44,400 --> 00:20:45,900 And we will run make again. 263 00:20:45,900 --> 00:20:48,630 And as you can see, it's compiled now. 264 00:20:49,710 --> 00:20:54,240 Now we have rebuilt the program with make. 265 00:20:54,250 --> 00:20:58,080 And now let's look at the size again here, LA. 266 00:20:59,070 --> 00:21:00,630 And as you can see, it's. 267 00:21:02,140 --> 00:21:02,680 Here. 268 00:21:05,800 --> 00:21:11,560 And we will try this with readelf again, the same command. 269 00:21:19,060 --> 00:21:26,980 And as you can see here, the size here has changed significantly.
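To put a number on the change from 3 to 3000, a small calculation follows. It assumes the reserved variable is declared with NASM's `resq` directive (the transcript is unclear on this point), which reserves uninitialized 8-byte quadwords; the helper name `resq_bytes` is made up for this sketch.

```python
QWORD_BYTES = 8   # resq reserves space in units of 8-byte quadwords


def resq_bytes(count):
    """Bytes reserved by a hypothetical `resq count` declaration."""
    return count * QWORD_BYTES


before = resq_bytes(3)
after = resq_bytes(3000)

print(before)            # 24
print(after)             # 24000
print(after - before)    # 23976 additional bytes reserved
```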