1 00:00:00,840 --> 00:00:07,110 Hello, in this section we will implement processes. As we discussed before, 2 00:00:07,110 --> 00:00:09,420 a process is basically a program in execution. 3 00:00:10,090 --> 00:00:11,910 It contains the program instructions, 4 00:00:12,240 --> 00:00:16,770 the data the program needs, and the heap and stack within the address space. 5 00:00:17,610 --> 00:00:19,860 Each process has its own address space. 6 00:00:20,640 --> 00:00:23,820 As you can see, we have two processes in this example. 7 00:00:24,300 --> 00:00:28,230 The operating system kernel resides in the kernel space of each process. 8 00:00:28,980 --> 00:00:33,570 So we map the kernel space to the same physical pages where the kernel is located. 9 00:00:34,670 --> 00:00:41,900 However, the user spaces of these two processes map to different physical pages. 10 00:00:41,900 --> 00:00:44,550 So each process's user space holds its own program instructions and data. 11 00:00:45,200 --> 00:00:48,110 The kernel space is the same among all the processes. 12 00:00:49,130 --> 00:00:51,560 Ok, let’s create our first process. 13 00:00:53,490 --> 00:00:59,760 We add two new files, process.c and process.h. In the header file, 14 00:01:02,280 --> 00:01:08,520 we define a structure process, which acts like a process control block. The structure is used to 15 00:01:08,520 --> 00:01:10,980 store the essential data of the process in the system. 16 00:01:11,430 --> 00:01:16,440 So it is saved in the kernel space and user programs are not allowed to access it. 17 00:01:17,460 --> 00:01:20,580 The pid is the identification number of a process. 18 00:01:21,790 --> 00:01:27,130 The state indicates the status of the process, such as unused, initialized, etc. 19 00:01:28,360 --> 00:01:33,700 Page map saves the address of the page map level 4 table. When we run the process, we will switch to this process's vm 20 00:01:33,700 --> 00:01:34,600 . 21 00:01:35,710 --> 00:01:42,540 The stack is used for the kernel code. 
A process has two stacks: one is for user mode 22 00:01:42,550 --> 00:01:47,920 and the other is for kernel mode. The stack here is used when we enter kernel mode. 23 00:01:48,990 --> 00:01:52,410 The last one is the trap frame, which we will discuss in a moment. 24 00:01:53,430 --> 00:01:58,530 The structure tss is used only for setting up the stack pointer for ring0. 25 00:01:59,530 --> 00:02:04,800 We also add the attribute packed so that the items in the structure are stored without padding. 26 00:02:06,680 --> 00:02:12,770 Then we define some constants. The stack size specifies the size of the kernel stack, which is 2m in this example. 27 00:02:12,770 --> 00:02:16,480 The number of processes is set to 10. 28 00:02:16,490 --> 00:02:19,820 So we can have a total of 10 processes running in the system. 29 00:02:20,770 --> 00:02:26,080 The next two are the process states. As we move along, we will add more states to the file. 30 00:02:27,210 --> 00:02:29,460 OK, let's see the process.c file. 31 00:02:34,190 --> 00:02:40,460 The variable process table stores the important info about all the processes, and most of the operations in this module 32 00:02:40,460 --> 00:02:46,400 are related to this structure. Since we allow 10 processes to run in the system, we define 33 00:02:46,400 --> 00:02:48,680 the process table array with 10 items. 34 00:02:49,730 --> 00:02:54,770 The pid num is used to assign an identification number to a new process. 35 00:02:56,070 --> 00:02:58,800 Alright, the first function we are going to talk about 36 00:03:02,070 --> 00:03:08,760 is function initialize process. In this function, we find an unused process slot in the process table 37 00:03:08,760 --> 00:03:10,810 by calling function find unused process 38 00:03:10,830 --> 00:03:12,000 . 39 00:03:12,960 --> 00:03:18,690 We use assert to make sure that the process is the first item in the table, because we are only just initializing 40 00:03:18,740 --> 00:03:19,950 the process table right now. 
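The header described above could look roughly like the following sketch. The identifier names (`struct Process`, `STACK_SIZE`, `NUM_PROC`, `PROC_UNUSED`, `PROC_INIT`, `process_table`, `pid_num`) and the starting value of `pid_num` are assumptions made for illustration; the lecture only names the members and the values. The `struct TSS` layout follows the 64-bit task state segment format, and the size check shows what the packed attribute is for: without it the compiler would pad after the first 4-byte field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct TrapFrame;                        /* defined alongside the trap code */

/* Names are hypothetical; the values follow the lecture. */
#define STACK_SIZE  (2 * 1024 * 1024)    /* kernel stack size: 2 MB */
#define NUM_PROC    10                   /* slots in the process table */
#define PROC_UNUSED 0
#define PROC_INIT   1

/* A minimal process control block, kept in kernel space. */
struct Process {
    int pid;                 /* identification number */
    int state;               /* PROC_UNUSED, PROC_INIT, ... */
    uint64_t page_map;       /* address of the PML4 table */
    uint64_t stack;          /* kernel stack of this process */
    struct TrapFrame *tf;    /* trap frame at the top of the kernel stack */
};

/* 64-bit TSS; only rsp0 is used here (stack pointer for ring0). */
struct TSS {
    uint32_t res0;
    uint64_t rsp0;
    uint64_t rsp1;
    uint64_t rsp2;
    uint64_t res1;
    uint64_t ist1, ist2, ist3, ist4, ist5, ist6, ist7;
    uint64_t res2;
    uint16_t res3;
    uint16_t iopb;
} __attribute__((packed));

/* With packed, the structure is exactly 104 bytes and rsp0 sits at offset 4. */
static_assert(sizeof(struct TSS) == 104, "TSS must not contain padding");

static struct Process process_table[NUM_PROC];
static int pid_num = 1;      /* assumed starting pid */
```

Because the table is a static array, every slot starts zeroed, which is why `PROC_UNUSED` is given the value 0 in this sketch.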
41 00:03:21,100 --> 00:03:24,900 Ok, let’s see the implementation of the find unused process function. 42 00:03:27,750 --> 00:03:33,360 This function is simple. All it does is loop through the process table and check the state of each process. 43 00:03:33,360 --> 00:03:39,810 If it is unused, we copy the address of the process to the variable process 44 00:03:39,810 --> 00:03:40,710 and exit the for loop. 45 00:03:41,700 --> 00:03:47,670 Because we initialized process with NULL, if there is no process available to use, 46 00:03:47,670 --> 00:03:52,080 the value of process remains NULL. Return the process and we are done. 47 00:03:55,640 --> 00:04:01,370 After we find a process, the next thing we are going to do is set up the process structure 48 00:04:01,650 --> 00:04:03,680 with the function set process entry. 49 00:04:04,660 --> 00:04:08,670 The argument is the unused process we just retrieved. 50 00:04:09,970 --> 00:04:17,470 This function sets each member of the process structure. The process state is set to proc initialized 51 00:04:17,860 --> 00:04:24,970 and the pid is set to the value of pid num. Then we increment the variable pid num for later use. 52 00:04:26,200 --> 00:04:32,980 Now we get to the member stack. We allocate a new page for the kernel stack. So you can see each process 53 00:04:32,980 --> 00:04:34,360 has its own kernel stack. 54 00:04:35,630 --> 00:04:40,960 After we check the return value, we zero the page and add the page size to the base of the new page. 55 00:04:41,900 --> 00:04:44,330 This is because the stack grows downwards. 56 00:04:44,630 --> 00:04:47,990 So we set the top of the stack to the base of the next page. 57 00:04:48,950 --> 00:04:52,580 The stack pointer is decremented when data is pushed on the stack. 58 00:04:53,740 --> 00:04:59,680 The next few statements are for the trap frame. Remember, we used this structure in the trap.c file. 59 00:04:59,680 --> 00:05:03,280 Let's open the trap.c file. 
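The slot search and the first part of set process entry described above can be sketched as follows. This is a simplified, user-space-friendly version: the names are assumptions, the page allocator is replaced by a caller-supplied base address, and `claim_process` is a hypothetical helper that stands in for the relevant lines of set process entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_PROC    10
#define PROC_UNUSED 0
#define PROC_INIT   1
#define STACK_SIZE  (2u * 1024 * 1024)   /* one 2 MB page per kernel stack */

struct Process {
    int pid;
    int state;
    uint64_t stack;          /* base of the kernel stack page */
};

static struct Process process_table[NUM_PROC];
static int pid_num = 1;      /* assumed starting pid */

/* Return the first unused slot, or NULL when the table is full. */
static struct Process *find_unused_process(void)
{
    struct Process *process = NULL;

    for (int i = 0; i < NUM_PROC; i++) {
        if (process_table[i].state == PROC_UNUSED) {
            process = &process_table[i];
            break;
        }
    }
    return process;
}

/* The stack grows downwards, so its top is the base of the next page. */
static uint64_t stack_top(uint64_t stack_base)
{
    return stack_base + STACK_SIZE;
}

/* Hypothetical helper: mark a slot as initialized and give it a pid. */
static struct Process *claim_process(uint64_t stack_base)
{
    struct Process *process = find_unused_process();

    if (process == NULL) {
        return NULL;
    }
    process->state = PROC_INIT;
    process->pid = pid_num++;
    process->stack = stack_base;
    return process;
}
```

The first call returns the first item of the table, which is the property the assert in the initialize function relies on.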
60 00:05:06,670 --> 00:05:13,300 The reason we have the trap frame structure in the process is that in our system, we have two entry points 61 00:05:13,300 --> 00:05:17,530 when we switch from ring3 to ring0. One is through interrupts, 62 00:05:17,710 --> 00:05:19,390 the other is through exceptions. 63 00:05:20,700 --> 00:05:27,210 Since we handle both in the same handler function, this is actually the only entry point, 64 00:05:27,210 --> 00:05:30,660 which means the function will be called whenever we jump from ring3 to ring0. 65 00:05:31,940 --> 00:05:38,330 In the section on interrupt and exception handling, we tested the code for getting to ring3 66 00:05:38,330 --> 00:05:42,490 and set up the tss with the kernel stack pointer, which is used when we jump to ring0. 67 00:05:43,570 --> 00:05:51,070 In our system, rsp0 in the tss is set to the top of the kernel stack, meaning that 68 00:05:51,070 --> 00:05:52,660 when the interrupt or exception handler is called, 69 00:05:52,960 --> 00:05:57,390 the stack used is actually the kernel stack we set up in the process. 70 00:05:58,510 --> 00:06:04,030 In order to reference the data easily, we define the trap frame structure just as we did 71 00:06:04,300 --> 00:06:09,460 in the handler function, such as the trap number we referenced here. 72 00:06:10,680 --> 00:06:12,720 Ok, let’s go back to the process file. 73 00:06:14,650 --> 00:06:20,200 Let's move on. Taking the top of the stack minus the size of the trap frame, we get the address of the trap frame. 74 00:06:20,200 --> 00:06:25,270 As you see, the trap frame is located at the top of the kernel stack. 75 00:06:26,410 --> 00:06:30,310 What we are going to do next is set the data in the trap frame. 76 00:06:30,850 --> 00:06:32,180 We have done it before. 77 00:06:32,590 --> 00:06:36,330 Remember, when we went from ring0 to ring3, we pushed the cs, 78 00:06:36,370 --> 00:06:44,080 rip, ss, etc., to fabricate a scenario where we return from ring0 to ring3. 
79 00:06:45,180 --> 00:06:51,410 The difference is that we did that using assembly language, and now we implement it in C. 80 00:06:52,490 --> 00:07:01,760 The rip is set to 400000 and rsp is 400000 plus the page size. So the code and stack of the program 81 00:07:01,760 --> 00:07:03,560 are in the same page. 82 00:07:04,220 --> 00:07:08,570 Next we call setup kvm, which creates a new kvm for this process. 83 00:07:09,440 --> 00:07:11,000 This is the first process, 84 00:07:11,000 --> 00:07:13,220 so we assume that it doesn’t fail. 85 00:07:14,210 --> 00:07:22,230 The page map stores the address of the page map level 4 table. Then we call setup uvm. The arguments we pass to it 86 00:07:22,760 --> 00:07:27,360 are the pml4 table, the start address of the program instructions we want to copy, 87 00:07:27,380 --> 00:07:32,560 and the size of the program. 88 00:07:32,600 --> 00:07:35,240 In this example, the program is actually this simple main function. 89 00:07:38,650 --> 00:07:43,690 Ok, at this point, we have one process ready to run. The function launch 90 00:07:45,370 --> 00:07:46,630 will start the process. 91 00:07:48,030 --> 00:07:54,660 As you see, it calls set tss. The argument is the address of the process. So let's take a look. 92 00:07:57,230 --> 00:08:04,580 You see set tss just assigns the top of the kernel stack to rsp0 in the tss. So when we jump from ring3 to ring0, 93 00:08:04,580 --> 00:08:11,080 the kernel stack is used. The tss is defined in the kernel file. 94 00:08:11,600 --> 00:08:12,650 So let's open 95 00:08:13,010 --> 00:08:13,910 the kernel file. 96 00:08:17,740 --> 00:08:24,850 You can see it is defined here. We declare it global so that we can reference it in the process file. 97 00:08:29,190 --> 00:08:36,030 In the c file, we declare tss as extern. The structure tss is what we defined in the header file. 98 00:08:36,870 --> 00:08:38,150 Alright, let’s move on. 
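The trap frame setup described above can be sketched in C as follows. The structure has 15 general-purpose registers, the trap number, the error code, and the 5 values consumed by the interrupt return. Several details are assumptions for illustration and are not given in this part of the lecture: the order of the general-purpose registers, the selector values (`0x10 | 3` for cs and `0x18 | 3` for ss, where the low 2 bits request ring3), the rflags value `0x202` (interrupts enabled), and the function name `build_trap_frame`. The 2 MB page size is inferred from the user stack top of 600000 mentioned later.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 0x200000ULL     /* 2 MB page */
#define USER_BASE 0x400000ULL     /* where the program is copied */

struct TrapFrame {
    /* 15 general-purpose registers (order is an assumption) */
    int64_t r15, r14, r13, r12, r11, r10, r9, r8;
    int64_t rbp, rdi, rsi, rdx, rcx, rbx, rax;
    int64_t trapno;
    int64_t errorcode;
    /* the 5 values popped by the interrupt return */
    int64_t rip;
    int64_t cs;
    int64_t rflags;
    int64_t rsp;
    int64_t ss;
};

/* Place a trap frame at the top of the kernel stack and fill it so that
 * an interrupt return lands at USER_BASE in ring3. */
static struct TrapFrame *build_trap_frame(uintptr_t kernel_stack_top)
{
    struct TrapFrame *tf =
        (struct TrapFrame *)(kernel_stack_top - sizeof(struct TrapFrame));

    assert(kernel_stack_top % 8 == 0);     /* keep the frame aligned */
    memset(tf, 0, sizeof(*tf));

    tf->cs = 0x10 | 3;                     /* assumed user code selector */
    tf->rip = USER_BASE;
    tf->ss = 0x18 | 3;                     /* assumed user data selector */
    tf->rsp = USER_BASE + PAGE_SIZE;       /* code and stack share one page */
    tf->rflags = 0x202;                    /* assumed: interrupts enabled */
    return tf;
}
```

Setting rsp0 in the tss to the same `kernel_stack_top` value is what makes the next ring3 to ring0 transition push its state into this per-process kernel stack.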
99 00:08:41,250 --> 00:08:46,010 After we set the tss, the next thing we are going to do is switch vm. 100 00:08:46,020 --> 00:08:51,750 The pml4 table is the table we just set up in the process structure. 101 00:08:53,800 --> 00:09:00,390 At this point, we are in the process's virtual address space and we have copied the main function to address 400000 102 00:09:00,460 --> 00:09:01,540 . 103 00:09:02,680 --> 00:09:09,270 The only thing left to do is jump to trap return to get to ring3 and run the main function. 104 00:09:09,270 --> 00:09:10,870 Before we implement the pstart function, 105 00:09:10,920 --> 00:09:12,960 let’s check the current state of the stack. 106 00:09:14,640 --> 00:09:21,420 As you see, in the set process entry function, the member tf is set to the start of the trap frame structure. 107 00:09:21,930 --> 00:09:27,390 And we also set these fields. As their names imply, they will be popped into the segment registers 108 00:09:27,390 --> 00:09:32,910 or general-purpose registers when we execute the instruction iretq. 109 00:09:34,520 --> 00:09:36,530 So in the trap assembly file, 110 00:09:40,340 --> 00:09:42,020 let’s focus on trap return. 111 00:09:43,330 --> 00:09:50,020 As we have seen before, we pop 15 general-purpose registers and add 16 to rsp to skip the trap number and error code, 112 00:09:50,020 --> 00:09:55,450 and then the stack pointer points to these 5 values. 113 00:09:56,050 --> 00:10:01,000 We have set them to the desired values, which means when we execute the interrupt return, 114 00:10:01,000 --> 00:10:08,620 we will jump to address 400000 and run in ring3. The top of the stack we use in the process is set to 115 00:10:08,620 --> 00:10:10,240 600000, 116 00:10:10,240 --> 00:10:16,180 so if we push data on the stack, the first item will be pushed at the top address of the same page, and so on 117 00:10:16,180 --> 00:10:16,350 . 
118 00:10:17,170 --> 00:10:22,630 Ok, with this in mind, all we need to do is change the rsp register to point to 119 00:10:22,630 --> 00:10:25,030 the start of the trap frame when we are at trap return. 120 00:10:26,460 --> 00:10:28,200 Alright, we define pstart 121 00:10:32,120 --> 00:10:34,490 and copy the value of rdi to rsp. 122 00:10:37,060 --> 00:10:38,440 Then jump to trap return. 123 00:10:41,560 --> 00:10:46,300 That’s it. Because we use it in the c file, we declare it global. 124 00:10:51,380 --> 00:10:53,000 OK, back to the process file. 125 00:10:55,850 --> 00:11:03,340 We call pstart and pass tf to the function. Tf is set to the address of the trap frame 126 00:11:03,350 --> 00:11:04,730 and now we are good to go. 127 00:11:05,720 --> 00:11:09,620 Before we build the project, let’s talk a little about the main function. 128 00:11:11,020 --> 00:11:16,570 Normally we would print a message on the screen using a print function. But in this example, 129 00:11:16,570 --> 00:11:17,570 it is not an option. 130 00:11:18,100 --> 00:11:21,570 We know that we are running in ring3 when we get to the main function. 131 00:11:22,060 --> 00:11:27,130 The printk function is mapped in the kernel space, which cannot be accessed in ring3. 132 00:11:27,670 --> 00:11:33,700 So here we just do a test: we pick a memory address which is located in the kernel space, and write 1 to it. 133 00:11:33,700 --> 00:11:38,440 This operation will fail and generate a cpu exception. 134 00:11:38,950 --> 00:11:40,960 So let’s open the trap.c file. 135 00:11:42,940 --> 00:11:47,290 In the function handler, the exception will be processed in the default case here. 136 00:11:49,420 --> 00:11:52,630 We print the message on screen. 
So we print 137 00:11:55,760 --> 00:11:57,140 the word error, 138 00:11:58,510 --> 00:11:59,470 the trap number, 139 00:12:04,550 --> 00:12:06,290 and at ring followed by the ring number. 140 00:12:10,200 --> 00:12:15,960 Remember, the lower 2 bits of the cs register store the current privilege level, 141 00:12:16,230 --> 00:12:19,940 so here we AND the cs value with 3, which preserves the lower 2 bits of the value. 142 00:12:20,920 --> 00:12:24,950 Then we print the error code, which gives us information about the error. 143 00:12:29,680 --> 00:12:35,450 Next is the virtual address we tried to access which caused the exception. This virtual address is stored 144 00:12:35,450 --> 00:12:36,780 in register cr2. 145 00:12:38,350 --> 00:12:39,220 So we write 146 00:12:41,200 --> 00:12:42,490 read cr2. 147 00:12:47,390 --> 00:12:50,960 The last one is the address of the instruction which caused the exception. 148 00:12:52,530 --> 00:12:54,510 So we print the rip register. 149 00:12:57,440 --> 00:13:00,770 Since we use the printk function to print the message on screen, 150 00:13:02,990 --> 00:13:05,160 we include print.h. 151 00:13:09,010 --> 00:13:11,530 Now, let's write the read cr2 function. 152 00:13:12,810 --> 00:13:16,980 We add it in the trap assembly file. This function is simple, 153 00:13:22,240 --> 00:13:26,740 all we need to do is mov rax,cr2 and return. 154 00:13:28,910 --> 00:13:32,660 So the return value is what we want. Don’t forget to 155 00:13:37,390 --> 00:13:40,090 declare read cr2 global. 156 00:13:41,120 --> 00:13:42,920 In the trap.h file, 157 00:13:46,400 --> 00:13:49,310 we also add the declaration of read cr2. 158 00:13:54,290 --> 00:13:59,860 The last thing we need to do is call the init process function in the c file. 159 00:14:03,070 --> 00:14:08,610 So we open the main.c file, include process.h, 160 00:14:11,550 --> 00:14:14,460 and call function init process. 161 00:14:16,950 --> 00:14:20,100 Then we start the process by calling function launch. 
162 00:14:22,460 --> 00:14:24,680 Since we added a new module process, 163 00:14:25,600 --> 00:14:27,480 we add it to the build script. 164 00:14:33,820 --> 00:14:35,170 And add process.o 165 00:14:36,380 --> 00:14:38,150 to the linker command. 166 00:14:38,970 --> 00:14:41,360 Ok, let’s build the project and test it out. 167 00:14:48,900 --> 00:14:55,460 As you see, exception 14 is generated at ring3. This is the page fault exception. 168 00:14:55,890 --> 00:14:57,690 The error code is 7, 169 00:14:58,120 --> 00:15:05,010 that is, the lower three bits are all set. Bit 1 (counting from bit 0) means that the exception was caused by a write operation. 170 00:15:05,340 --> 00:15:10,200 And bit 2 (counting from bit 0) indicates that the exception was generated while we were in ring3. 171 00:15:11,140 --> 00:15:14,260 The value in the cr2 register is this virtual address, 172 00:15:17,600 --> 00:15:21,920 you can see this is the exact address we tried to access in the main function. 173 00:15:25,640 --> 00:15:34,010 The address of the instruction is 40002e, which is in the page we allocated for the main function. 174 00:15:34,730 --> 00:15:37,260 The base address is 400000, 175 00:15:37,700 --> 00:15:40,760 so the offset is 2e in the main function. 176 00:15:41,830 --> 00:15:47,050 If you want to see the instruction which caused the exception, you can disassemble the kernel file 177 00:15:47,050 --> 00:15:50,530 in the terminal by typing objdump 178 00:15:51,560 --> 00:15:55,910 -d kernel, then press enter. 179 00:15:58,740 --> 00:16:00,480 So let's check the main function. 180 00:16:03,520 --> 00:16:08,830 As you see, this is the start address of the main function. Remember, the offset of the instruction 181 00:16:08,830 --> 00:16:10,410 is 2e in the main function. 182 00:16:10,960 --> 00:16:18,450 So we add the offset 2e, which produces the result here. The assembly code is in AT&T format. 183 00:16:18,850 --> 00:16:24,370 What it does is copy the value 1 to the address saved in rax, which is this value. 
184 00:16:24,940 --> 00:16:27,430 So this is what we did in the c file. 185 00:16:28,830 --> 00:16:30,600 OK, that's it for this lecture. 186 00:16:31,230 --> 00:16:32,460 See you in the next video.
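The values read off the screen in this lecture can be decoded with a few lines of C. The sketch below shows how the page fault error code 7, the ring number taken from cs, and the instruction offset 2e are derived. The type and function names are hypothetical, and the cs value `0x13` used in the usage note is only an example of a selector whose lower 2 bits are 3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The three lowest bits of a page fault error code. */
struct PageFaultInfo {
    bool protection;   /* bit 0: page was present, access violated protection */
    bool write;        /* bit 1: fault caused by a write */
    bool user;         /* bit 2: fault happened while running in ring3 */
};

static struct PageFaultInfo decode_page_fault(uint64_t error_code)
{
    struct PageFaultInfo info = {
        .protection = (error_code & 1) != 0,
        .write      = (error_code & 2) != 0,
        .user       = (error_code & 4) != 0,
    };
    return info;
}

/* The lower 2 bits of cs hold the current privilege level. */
static unsigned privilege_level(uint64_t cs)
{
    return (unsigned)(cs & 3);
}

/* Offset of the faulting instruction inside the copied program. */
static uint64_t offset_in_program(uint64_t rip, uint64_t base)
{
    assert(rip >= base);
    return rip - base;
}
```

For the run shown in the lecture, `decode_page_fault(7)` sets all three flags, `privilege_level(0x13)` gives 3, and `offset_in_program(0x40002e, 0x400000)` gives 0x2e, the offset that is added to the start address of main in the objdump listing.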