1 00:00:00,120 --> 00:00:01,230 Welcome, guys. 2 00:00:01,470 --> 00:00:08,670 This is a video that I took from my show on my YouTube channel, which is called the Backend Engineering 3 00:00:08,670 --> 00:00:15,450 Show, where I discuss different backend technologies, from networking to databases to proxies to 4 00:00:15,750 --> 00:00:16,830 low-level 5 00:00:19,190 --> 00:00:22,640 things related to the backend: web servers, protocols. 6 00:00:22,820 --> 00:00:26,780 And this particular episode discussed in detail 7 00:00:26,780 --> 00:00:33,110 the write-ahead log and the redo logs, and I thought I'd put it up in this course as well. 8 00:00:33,110 --> 00:00:38,600 But because it's available on my YouTube channel, obviously I'm going to make it preview only here. 9 00:00:38,690 --> 00:00:43,850 So you can effectively see this as a preview and I hope you really enjoy it. 10 00:00:43,850 --> 00:00:45,920 I think it's a good addition to this course. 11 00:00:45,950 --> 00:00:47,780 It kind of glues things together. 12 00:00:47,780 --> 00:00:53,240 We have dedicated lectures for the WAL, obviously, but this is almost like, you 13 00:00:53,240 --> 00:01:00,590 know, tying different experiences together into one roughly 30-minute video. 14 00:01:00,860 --> 00:01:01,910 I hope you enjoy it. 15 00:01:04,780 --> 00:01:17,140 Database logs are a critical component of any DBMS in order to ensure durability and crash recovery, 16 00:01:17,140 --> 00:01:17,890 if you will. 17 00:01:18,550 --> 00:01:29,260 I'm referring to logs such as the write-ahead log, or WAL for short, 18 00:01:29,260 --> 00:01:35,470 the redo logs and the undo logs. 19 00:01:36,400 --> 00:01:38,170 These are the only three that I know.
20 00:01:38,320 --> 00:01:43,630 There must be other types of logs for specific implementations, but I think these are 21 00:01:43,630 --> 00:01:45,220 the three most popular. 22 00:01:46,000 --> 00:01:52,060 In fact, you can argue that the WAL is actually identical to the redo logs. 23 00:01:52,420 --> 00:01:56,650 You know, and I'm going to talk about all of this in this episode of the Backend Engineering Show. 24 00:01:56,680 --> 00:01:57,550 It's been a while. 25 00:01:57,580 --> 00:02:01,900 Welcome to the Backend Engineering Show with your host, Hussein Nasser. 26 00:02:01,900 --> 00:02:02,250 And 27 00:02:04,120 --> 00:02:07,360 if we really think about durability 28 00:02:08,980 --> 00:02:10,750 in a database system, 29 00:02:11,770 --> 00:02:12,070 you, 30 00:02:12,320 --> 00:02:14,530 you really need to think about 31 00:02:16,350 --> 00:02:17,760 how do you persist 32 00:02:17,760 --> 00:02:18,480 data. 33 00:02:18,510 --> 00:02:19,200 Right? 34 00:02:20,660 --> 00:02:26,090 Well, you might say, okay, you have tables, you have data structures on those tables such 35 00:02:26,090 --> 00:02:30,830 as indexes, sequences, constraints, stuff like that. 36 00:02:31,760 --> 00:02:36,830 And these live in files, because that's what we have today: file systems. 37 00:02:36,830 --> 00:02:38,570 So we have to work with files. 38 00:02:38,570 --> 00:02:45,920 You might argue that no, we can work with block storage directly, but most generic implementations 39 00:02:45,920 --> 00:02:49,640 work with files and there are pros and cons to that. 40 00:02:50,580 --> 00:02:51,080 Right. 41 00:02:51,090 --> 00:02:54,240 And I like to always think about this, you know. 42 00:02:55,350 --> 00:02:56,130 Think about it: 43 00:02:56,130 --> 00:03:03,720 there is not only one way to do things, and never pigeonhole your thinking into one thing.
44 00:03:03,720 --> 00:03:10,260 You know, always look outside the box, as cheesy as it may sound, you know, try to look outside 45 00:03:10,260 --> 00:03:17,610 the box and then understand why you 46 00:03:19,850 --> 00:03:22,610 say the things you say. 47 00:03:22,760 --> 00:03:28,380 You know, so things live in files and we're working with data files. 48 00:03:28,380 --> 00:03:31,610 So tables live in a data file, indexes live in a data file. 49 00:03:31,610 --> 00:03:38,570 And it is up to you as the database implementer whether to put these two into one file or put them 50 00:03:38,570 --> 00:03:39,650 in separate files. 51 00:03:39,650 --> 00:03:48,140 There are pros and cons for both cases, and whether you put all indexes in one file or put each index 52 00:03:48,140 --> 00:03:53,270 in its own file, it all really depends on the implementation. 53 00:03:53,920 --> 00:04:01,020 And this is something we never think about as backend engineers, because hey, it's behind 54 00:04:01,830 --> 00:04:07,040 a black box and the database does its thing. But it's good to understand how a database is implemented, because 55 00:04:07,040 --> 00:04:10,260 the database is just a program at the end of the day, right? 56 00:04:10,830 --> 00:04:11,970 It's not really... 57 00:04:12,960 --> 00:04:16,560 Well, I was about to say rocket science, but 58 00:04:17,690 --> 00:04:20,690 you can argue it is as complex as rocket science, 59 00:04:21,330 --> 00:04:22,430 database engineering. 60 00:04:24,850 --> 00:04:25,570 So. 61 00:04:26,650 --> 00:04:32,380 When I make a change as a transaction, I begin my transaction, I start changing my table. 62 00:04:32,800 --> 00:04:41,410 The changes that I make to my table will trigger side behavior to update indexes, and these indexes 63 00:04:41,410 --> 00:04:43,150 need to be updated as a result. 64 00:04:44,660 --> 00:04:47,150 And the tables need to be updated as a result.
65 00:04:48,410 --> 00:04:52,130 The tables consist of pages, and the pages are touched and get dirty. 66 00:04:52,130 --> 00:04:58,250 And at the end of the day, we're going to say, commit my changes. 67 00:04:58,250 --> 00:04:59,450 I just made a bunch of changes. 68 00:04:59,450 --> 00:05:00,590 Go ahead and commit them. 69 00:05:03,900 --> 00:05:08,130 Now if I say commit my changes, 70 00:05:10,290 --> 00:05:11,760 what does that mean? 71 00:05:14,830 --> 00:05:21,070 It means, logically, in high-level speak, 72 00:05:21,430 --> 00:05:25,270 everything that I wrote, I want it to be there forever. 73 00:05:26,160 --> 00:05:29,340 I want to see it next time I read. 74 00:05:32,240 --> 00:05:35,050 And this means I want it to be durable. 75 00:05:35,060 --> 00:05:39,770 That means if the database shuts down after I successfully committed... 76 00:05:41,270 --> 00:05:42,300 And that's another thing. 77 00:05:42,320 --> 00:05:44,040 What does it mean to successfully commit? 78 00:05:44,060 --> 00:05:50,540 It means I receive a successful synchronous response from my commit operation. 79 00:05:51,410 --> 00:05:53,330 So you say, yup. 80 00:05:53,330 --> 00:05:55,150 All done. When you tell me, 81 00:05:55,160 --> 00:05:55,520 yup, 82 00:05:55,520 --> 00:05:56,360 all done, 83 00:05:56,480 --> 00:06:05,450 I assume that even if I shut you down in that second, if I unplug you, when I come back, 84 00:06:05,630 --> 00:06:07,910 everything that I wrote should be there. 85 00:06:08,770 --> 00:06:15,730 So let's go back to this commit operation for a minute and ask ourselves, how is this actually implemented? 86 00:06:15,730 --> 00:06:17,560 How can I implement it? 87 00:06:17,800 --> 00:06:21,010 Let's not get pigeonholed again with implementation.
88 00:06:21,610 --> 00:06:29,740 One way you would implement it, you say, okay, any time I write, if I begin my transaction, I start 89 00:06:29,860 --> 00:06:35,440 updating a row and inserting a new row and I'm doing all these changes and I update the indexes, 90 00:06:35,680 --> 00:06:43,570 I am going to make all these changes in memory, because you see, if I want to update row number seven, 91 00:06:43,570 --> 00:06:50,050 I need to fetch the page where row number seven lives and update the page in memory. 92 00:06:50,050 --> 00:06:51,040 That's what I'm going to do. 93 00:06:51,370 --> 00:06:52,540 That's my implementation. 94 00:06:52,690 --> 00:06:53,860 I'm going to write in memory. 95 00:06:55,210 --> 00:07:00,880 Then I'm going to insert a new row on the same page in memory, and then I'm going to insert another 96 00:07:00,880 --> 00:07:01,120 row. 97 00:07:01,120 --> 00:07:02,740 But this table is clustered, 98 00:07:02,740 --> 00:07:06,370 so the page is at the end, at the tail of the table. 99 00:07:06,370 --> 00:07:12,310 So I need to fetch that page, put it in the pool, the buffer pool, the memory, and then write 100 00:07:12,460 --> 00:07:13,030 to it. 101 00:07:13,030 --> 00:07:19,570 And I keep writing to memory, writing to memory, and that's fine because these changes are 102 00:07:19,570 --> 00:07:29,860 dirty and I didn't commit. Right? Now if the client tells me to commit, I'll take everything 103 00:07:29,860 --> 00:07:34,800 that I have in memory and I literally just flush it to disk. 104 00:07:34,810 --> 00:07:44,230 That means I am overwriting the same location where the page existed in the file with my changes. 105 00:07:46,430 --> 00:07:51,980 And you can see if you make a lot of changes, the commit operation will be slow.
106 00:07:52,870 --> 00:07:58,870 Right? With this implementation, naturally, because all of this stuff is in memory and you have to take 107 00:07:58,870 --> 00:08:02,350 time to persist these changes to disk. 108 00:08:02,380 --> 00:08:06,790 Write, write, write, as you write them to disk. 109 00:08:07,270 --> 00:08:14,500 You're taking a hit, you're going to the disk and writing these different pages and that might take 110 00:08:14,500 --> 00:08:16,390 a while to write all this. 111 00:08:17,110 --> 00:08:21,940 And the problem here is not the time, per se, of this implementation. 112 00:08:21,940 --> 00:08:23,200 The problem is what happens 113 00:08:23,200 --> 00:08:28,330 if you wrote half of these pages and the database crashes in the middle of your commit. 114 00:08:33,140 --> 00:08:34,460 That is dangerous. 115 00:08:37,220 --> 00:08:38,030 Why? 116 00:08:38,720 --> 00:08:48,830 Because you just flushed something and you changed the representation of the table for a transaction that 117 00:08:48,860 --> 00:08:50,450 was half --. 118 00:08:51,230 --> 00:08:52,970 It was half committed. 119 00:08:53,240 --> 00:08:55,070 And what does it mean to be half committed? 120 00:08:55,100 --> 00:08:56,540 Well, it doesn't mean anything. 121 00:08:56,720 --> 00:09:02,660 A half-committed transaction is a rolled-back transaction, is a bad transaction, and should not be 122 00:09:02,660 --> 00:09:03,680 considered. 123 00:09:04,280 --> 00:09:08,630 That's rule number one in ACID transactions: atomicity, etc. 124 00:09:10,630 --> 00:09:11,620 So that's a problem. 125 00:09:11,620 --> 00:09:14,240 That implementation sucks, right? 126 00:09:15,970 --> 00:09:19,930 But it was so good because my writes were so fast, right? 127 00:09:19,930 --> 00:09:21,280 Because I am writing, 128 00:09:21,280 --> 00:09:22,990 all my changes are in memory. 129 00:09:23,350 --> 00:09:24,580 But that sucks.
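
To make the half-committed problem concrete, here is a minimal sketch of the flush-everything-at-commit idea described above. It is a toy model, not how any real engine is written: the "disk" and the dirty pages are plain dictionaries, and `naive_commit`, `CrashDuringCommit` and `crash_after` are hypothetical names invented for this illustration.

```python
# Toy model: "disk" and the set of dirty pages are dicts of page id -> bytes.
# Every name here is hypothetical and exists only for this illustration.

class CrashDuringCommit(Exception):
    """Stands in for the database process dying in the middle of a commit."""


def naive_commit(disk, dirty_pages, crash_after=None):
    """Overwrite each dirty page in place; optionally 'crash' part-way through."""
    for written, (page_id, data) in enumerate(sorted(dirty_pages.items())):
        if crash_after is not None and written == crash_after:
            raise CrashDuringCommit(f"crashed after {written} page(s)")
        disk[page_id] = data  # in-place overwrite, there is no journal to fall back on


disk = {1: b"old-1", 2: b"old-2", 3: b"old-3"}
dirty = {1: b"new-1", 2: b"new-2", 3: b"new-3"}

try:
    naive_commit(disk, dirty, crash_after=2)
except CrashDuringCommit:
    pass

# Two pages carry the new data and one still carries the old data:
# the on-disk table now reflects a half-committed transaction.
print(disk)
```

Nothing on the toy disk records which pages belonged to the interrupted commit, so after a restart there is no way to tell the new pages from the old ones.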
130 00:09:25,780 --> 00:09:29,020 So what people did, what computer scientists did, 131 00:09:29,350 --> 00:09:31,990 we said, okay, computer scientists said this: 132 00:09:32,080 --> 00:09:33,220 what if, 133 00:09:34,330 --> 00:09:44,620 as I am writing these dirty changes, I keep a log, an actual journal, 134 00:09:45,550 --> 00:09:48,140 of what exactly changed? 135 00:09:48,160 --> 00:09:49,630 Not the whole thing. 136 00:09:50,170 --> 00:09:56,950 Because remember, when you pull a page of 16 kilobytes, or an 8-kilobyte page, depending on the SSD page size 137 00:09:56,950 --> 00:10:00,460 and the database page size and how they agree on that, 138 00:10:01,030 --> 00:10:02,230 and you change 139 00:10:03,250 --> 00:10:04,600 one byte, 140 00:10:05,700 --> 00:10:13,590 or you change 32 bits or whatever the size of the row you changed, and you said flush, 141 00:10:14,010 --> 00:10:16,050 there is no flushing one 142 00:10:16,050 --> 00:10:18,060 byte when it comes to disk. 143 00:10:18,870 --> 00:10:24,190 Not until we get byte addressability on SSDs and hard drives. 144 00:10:24,210 --> 00:10:31,050 It doesn't exist today. The minimum size is called, and people argue about this, 145 00:10:31,080 --> 00:10:32,160 it's called a page, 146 00:10:32,160 --> 00:10:33,120 I believe, in SSDs. 147 00:10:33,150 --> 00:10:39,480 Sometimes it's called a block, sometimes it's called an erasable unit, and in a hard disk 148 00:10:39,480 --> 00:10:40,590 it's called a sector. 149 00:10:41,040 --> 00:10:45,710 There is a specific size and we don't care what it's called, because names don't go anywhere. 150 00:10:45,720 --> 00:10:50,940 To be honest, I've been in this business for a long time and everybody invents a name from 151 00:10:50,940 --> 00:10:51,810 their --. 152 00:10:52,440 --> 00:10:57,750 So you can invent your own name from your own -- and be satisfied with that.
153 00:10:58,080 --> 00:11:05,190 But that technology does not allow us to write one byte, because of a physical limitation of 154 00:11:05,190 --> 00:11:10,500 the disk, because it's not worth it to write one byte. I believe that's the reason. 155 00:11:10,680 --> 00:11:12,810 So we write a bunch of them. 156 00:11:12,810 --> 00:11:19,660 So if I change something, I have to write 8K, even if I change one thing, right? 157 00:11:19,740 --> 00:11:21,300 And that's costly. 158 00:11:21,870 --> 00:11:28,230 So back to what the computer scientists said: okay, because we have this limitation, and 159 00:11:28,380 --> 00:11:34,500 I changed one thing, let's keep a log of that thing that changed. 160 00:11:36,340 --> 00:11:37,600 That's a good idea. 161 00:11:38,620 --> 00:11:45,580 And let's make sure that that log is immediately persisted to disk, because that log, 162 00:11:46,590 --> 00:11:47,490 that 163 00:11:50,360 --> 00:11:53,720 journal, is the source of truth. 164 00:11:54,350 --> 00:11:59,260 You changed this row: A became B, and 165 00:11:59,270 --> 00:12:04,990 column seven in row six became this value, and the string 166 00:12:05,000 --> 00:12:06,410 "hello" becomes "world". 167 00:12:06,620 --> 00:12:14,720 You make changes, you just journal the changes, and this is called the write-ahead log, and 168 00:12:14,720 --> 00:12:20,090 it's called write-ahead because you're writing ahead of time. You're almost predicting 169 00:12:21,720 --> 00:12:26,190 what's going to change, because that's what's going to change. 170 00:12:32,380 --> 00:12:33,190 So. 171 00:12:38,100 --> 00:12:47,010 If I write this now and I have this log and I make sure every write I do to the WAL, to the write-ahead 172 00:12:47,010 --> 00:12:50,100 log, is persisted to disk, 173 00:12:52,020 --> 00:12:57,360 then that's nice because I can keep my dirty pages in memory. 174 00:12:59,730 --> 00:13:00,540 Now.
175 00:13:02,100 --> 00:13:03,720 If I say commit, 176 00:13:07,540 --> 00:13:11,620 I've already written all the changes to the log. 177 00:13:11,980 --> 00:13:19,900 I have a journal of every possible thing that my transaction did, and all other transactions as well. 178 00:13:20,110 --> 00:13:25,390 So I can replay this if I want to. 179 00:13:27,180 --> 00:13:30,040 So now let's take this into consideration again. 180 00:13:30,060 --> 00:13:36,660 You have a representation of what the page looks like, or the table looks like, on disk. 181 00:13:37,050 --> 00:13:40,830 You pulled it into memory and you start making the changes in memory. 182 00:13:40,830 --> 00:13:49,170 You still make the changes to that page in memory, but you also take note of these tiny changes, 183 00:13:49,170 --> 00:13:51,800 just what changed, in this WAL. 184 00:13:51,810 --> 00:13:58,310 So the WAL will be very compact, and it's also compressed and has all sorts of things done to it. 185 00:13:58,560 --> 00:14:02,950 And you flush the WAL changes to disk, and you keep the pages in memory. 186 00:14:02,950 --> 00:14:07,830 You might say, hey, you're not flushing the pages. It's okay not to flush them. 187 00:14:07,830 --> 00:14:08,910 We're going to talk about it. 188 00:14:08,910 --> 00:14:13,980 We're going to need to flush them eventually, but not immediately, because these are expensive. 189 00:14:14,010 --> 00:14:15,330 These are large pages. 190 00:14:16,170 --> 00:14:18,900 So now you keep all these changes. 191 00:14:20,840 --> 00:14:22,520 And now you say commit. 192 00:14:23,890 --> 00:14:25,120 Commit the transaction. 193 00:14:25,690 --> 00:14:30,460 What is the minimum amount of work that the database needs to do to commit? 194 00:14:32,060 --> 00:14:32,600 Well. 195 00:14:33,520 --> 00:14:35,290 All you have to do is just commit. 196 00:14:35,290 --> 00:14:37,300 Make sure all of the WAL is committed. 197 00:14:38,150 --> 00:14:41,060 And if the WAL is committed, we can persist
198 00:14:41,060 --> 00:14:46,580 the fact that this transaction is successfully committed, and we can crash if we want. 199 00:14:47,060 --> 00:14:48,920 You might say, hey, the pages are old. 200 00:14:50,360 --> 00:14:51,170 That's true. 201 00:14:51,470 --> 00:15:00,010 The page on disk is still in its original representation, but the one in memory was the final one. 202 00:15:00,020 --> 00:15:00,890 It was... 203 00:15:00,890 --> 00:15:01,760 Everything we made, 204 00:15:01,760 --> 00:15:03,410 the changes, was in memory. 205 00:15:05,220 --> 00:15:06,870 We don't have to commit that. 206 00:15:07,380 --> 00:15:10,830 We can try later, but we don't have to. 207 00:15:11,280 --> 00:15:17,610 So let's take into consideration that I committed, and I committed all the changes in the WAL, or the 208 00:15:17,610 --> 00:15:20,730 write-ahead log, and boom, 209 00:15:21,270 --> 00:15:25,890 my database crashed. But I committed. It comes back up. 210 00:15:27,750 --> 00:15:29,160 It detects that, 211 00:15:29,940 --> 00:15:33,940 well, the page now is completely out of sync with the WAL. 212 00:15:34,000 --> 00:15:34,560 Right? 213 00:15:34,680 --> 00:15:36,510 Because the WAL is ahead. 214 00:15:37,260 --> 00:15:42,810 In this case, the write-ahead log is literally ahead of the data files. 215 00:15:42,840 --> 00:15:45,840 The data files are the old, original ones. 216 00:15:45,840 --> 00:15:47,490 So the database knows this. 217 00:15:48,230 --> 00:15:48,980 How does it know? 218 00:15:48,980 --> 00:15:52,850 Because there are records of this. And it says, oh, wait a minute. 219 00:15:53,480 --> 00:15:55,670 It starts up and says, oh, wait a minute. 220 00:15:57,430 --> 00:15:58,670 This is old. 221 00:15:58,690 --> 00:16:02,950 I can't let people read this, because nobody reads from the WAL. 222 00:16:03,340 --> 00:16:05,560 That's another question I got from the database course. 223 00:16:06,040 --> 00:16:07,120 What if... can
224 00:16:07,120 --> 00:16:08,170 I read from the WAL? 225 00:16:08,440 --> 00:16:10,390 Can my transaction go to the WAL and read it? 226 00:16:10,420 --> 00:16:11,830 That's a bad idea. 227 00:16:12,130 --> 00:16:13,510 Well, you can. 228 00:16:14,200 --> 00:16:15,160 Let's not say anything 229 00:16:15,160 --> 00:16:15,800 is a bad idea. 230 00:16:15,820 --> 00:16:17,380 It's just, you can. 231 00:16:17,420 --> 00:16:18,830 You can do anything you want. 232 00:16:18,850 --> 00:16:19,930 You own the software. 233 00:16:19,930 --> 00:16:21,630 If you're building it, you own the software. 234 00:16:21,640 --> 00:16:25,330 You can write anything you want, you own it. 235 00:16:25,840 --> 00:16:31,600 You can decide, hey, let's read the changes from the WAL. 236 00:16:32,260 --> 00:16:35,530 But the work that you have to do is enormous. 237 00:16:35,980 --> 00:16:41,860 And clients don't necessarily want to do that. They want to read a nice, tucked-in page, and they want 238 00:16:41,860 --> 00:16:44,860 to just crack the rows and read them. 239 00:16:45,010 --> 00:16:49,550 They don't want to figure things out, because remember, the WAL only has the changes. 240 00:16:49,570 --> 00:16:53,530 You still need more stuff, so you'll end up reading multiple places. 241 00:16:54,010 --> 00:16:54,730 So it can be done. 242 00:16:54,730 --> 00:16:55,390 But it's hard. 243 00:16:56,610 --> 00:17:03,510 So the database crashed and started back up, and then it detected that there was a crash, 244 00:17:04,410 --> 00:17:08,060 and the page that is on disk is out of sync. 245 00:17:08,070 --> 00:17:10,380 The WAL is ahead of us. 246 00:17:10,830 --> 00:17:11,320 Right? 247 00:17:12,360 --> 00:17:12,990 The write-ahead 248 00:17:12,990 --> 00:17:16,830 log is way ahead of us, so the WAL is ahead. 249 00:17:18,920 --> 00:17:20,510 So what does the database do? 250 00:17:20,540 --> 00:17:21,310 It fetches the page.
251 00:17:23,230 --> 00:17:26,650 And then reads it into memory and says, wait a minute. 252 00:17:28,650 --> 00:17:40,290 Let me apply the changes in the WAL to this page, and then it starts redoing the changes, because at one 253 00:17:40,290 --> 00:17:42,510 point we did those changes. 254 00:17:42,510 --> 00:17:45,240 This is a redo of these changes. 255 00:17:45,240 --> 00:17:52,080 So you start redoing these changes, applying these changes again, because this has been done at one 256 00:17:52,080 --> 00:17:52,380 point. 257 00:17:52,380 --> 00:17:52,830 Right? 258 00:17:52,830 --> 00:17:53,880 But we lost it. 259 00:17:53,880 --> 00:17:56,010 Now we're redoing it. 260 00:17:56,220 --> 00:17:59,160 That's why the WAL is also called the redo log. 261 00:18:01,190 --> 00:18:02,930 So we're redoing these changes. 262 00:18:02,930 --> 00:18:08,480 So take that beautiful old stale page, which is consistent, by the way, as of that moment in 263 00:18:08,480 --> 00:18:14,840 time, and then apply all the changes that that transaction or other 264 00:18:14,840 --> 00:18:18,260 transactions made, until we are done with the WAL. 265 00:18:20,190 --> 00:18:25,710 And now that page is freshly dirty with the latest changes. 266 00:18:25,710 --> 00:18:35,310 So I can have clients read from this dirty page and I can safely flush this page to disk. 267 00:18:36,010 --> 00:18:36,570 All right. 268 00:18:39,920 --> 00:18:44,090 And now when you flush this page to disk, 269 00:18:45,100 --> 00:18:48,280 you're now consistent with the WAL, effectively. 270 00:18:48,580 --> 00:18:49,510 You might say now... 271 00:18:50,450 --> 00:18:52,860 Now, this is a very good point, right? 272 00:18:52,880 --> 00:19:00,410 We talked about the data files and the indexes and we talked about the WAL, and how often should I flush 273 00:19:00,410 --> 00:19:02,160 data files to disk? 274 00:19:02,180 --> 00:19:03,290 It's up to you, really.
275 00:19:03,290 --> 00:19:08,420 All of this is configurable for you as a DBA. 276 00:19:08,930 --> 00:19:11,180 You know, there is something called the WAL size: 277 00:19:11,180 --> 00:19:12,980 how big the WAL can get. 278 00:19:13,640 --> 00:19:17,960 The WAL, if you think about it, shouldn't grow to infinity, right? 279 00:19:19,270 --> 00:19:21,730 Because once I flush 280 00:19:22,680 --> 00:19:25,440 those dirty changes, 281 00:19:26,390 --> 00:19:29,810 the WAL can be purged. 282 00:19:30,670 --> 00:19:31,420 Right? 283 00:19:32,350 --> 00:19:36,620 Because those changes are already synced with the data files. 284 00:19:36,640 --> 00:19:39,700 There is no point for me to keep these old WAL 285 00:19:40,150 --> 00:19:46,660 changes, because those changes have been completely... The only purpose of the WAL was, in case of 286 00:19:46,660 --> 00:19:47,440 a crash, 287 00:19:47,590 --> 00:19:54,760 I can recover and redo the changes. And if you did them and all the data files are in sync, why the 288 00:19:54,760 --> 00:19:56,150 heck do you keep them around? 289 00:19:56,170 --> 00:19:58,240 It's just wasted space. 290 00:19:58,840 --> 00:20:02,560 That's why the WAL is also represented sometimes as a circle. 291 00:20:03,140 --> 00:20:04,570 And you, 292 00:20:04,600 --> 00:20:05,040 you 293 00:20:05,050 --> 00:20:05,830 write, write, write, write. 294 00:20:05,870 --> 00:20:06,190 You write. 295 00:20:06,250 --> 00:20:07,810 Write, change and change. 296 00:20:08,050 --> 00:20:09,100 Changes, changes. 297 00:20:09,100 --> 00:20:16,090 And once you reach the end of the circle, that means it's time to flush. The WAL is almost full. 298 00:20:16,120 --> 00:20:24,250 Go and flush every data file page in memory down to disk. 299 00:20:24,250 --> 00:20:27,760 And if the flushing was successful, go
300 00:20:27,760 --> 00:20:32,290 and purge the WAL. You might say, Hussein, isn't this the same problem that you originally started with 301 00:20:32,290 --> 00:20:33,520 when you started this show? 302 00:20:33,550 --> 00:20:35,140 You talked about that. 303 00:20:35,350 --> 00:20:41,080 What happens if, while you're flushing that page, the database crashes? 304 00:20:41,260 --> 00:20:43,660 That's fine, because we know. 305 00:20:43,810 --> 00:20:44,470 Right? 306 00:20:44,470 --> 00:20:53,470 We know the moment where the pages were consistent, and we know that the WAL... whatever's in 307 00:20:53,470 --> 00:20:55,060 the WAL is the truth. 308 00:20:55,390 --> 00:21:00,970 So you go back and you say, okay, from this point, that's when things were good. 309 00:21:01,060 --> 00:21:03,550 That change is all garbage. 310 00:21:04,000 --> 00:21:10,720 So you need to remove all these changes and then you have to reapply them again. 311 00:21:11,050 --> 00:21:14,570 So that's another thing that the database does 312 00:21:14,590 --> 00:21:22,270 in the case of a crash. So it's a complicated process and they take care of all of these situations. 313 00:21:22,270 --> 00:21:32,050 So that process that we talked about just now, the flushing of these dirty pages so that 314 00:21:32,050 --> 00:21:37,270 we can get rid of the WAL, is called checkpointing, and 315 00:21:38,390 --> 00:21:44,660 it really happens at the most random time, because checkpointing, 316 00:21:45,320 --> 00:21:52,280 if you think about it, is a data-intensive, I/O-intensive operation. 317 00:21:52,850 --> 00:21:56,540 Because now what I'm doing is checkpointing. 318 00:21:56,930 --> 00:21:59,180 It depends really on the database implementation. 319 00:21:59,180 --> 00:22:00,590 But I believe
320 00:22:01,540 --> 00:22:09,370 things need to get paused during a checkpoint, because you really need to make sure that no new things 321 00:22:09,370 --> 00:22:11,680 come to the WAL as you checkpoint. 322 00:22:12,010 --> 00:22:16,330 You can argue that, hey, I'm going to put a checkpoint marker on my WAL. 323 00:22:16,330 --> 00:22:18,760 This is where I'm checkpointing right now. 324 00:22:18,910 --> 00:22:21,520 Write everything, flush everything to disk. 325 00:22:21,520 --> 00:22:26,620 And then, yeah, the WAL can continue to grow and transactions can keep coming in, and you can implement 326 00:22:26,620 --> 00:22:27,640 something like that. 327 00:22:27,760 --> 00:22:28,040 Right? 328 00:22:28,090 --> 00:22:30,850 With a little bit of finesse in the engineering, I guess. 329 00:22:31,450 --> 00:22:32,100 Right. 330 00:22:33,190 --> 00:22:36,880 But yeah, I believe it's going to be a little bit complex. 331 00:22:37,180 --> 00:22:43,690 But this checkpointing operation, Postgres warns about it, MySQL warns about it, every database 332 00:22:43,690 --> 00:22:52,930 warns about it: hey, just watch out for this, because CPU, RAM and disk activity will spike 333 00:22:54,250 --> 00:22:58,840 as a checkpoint happens, and this could happen at the most random times. 334 00:22:58,840 --> 00:23:01,900 You might have a very intensive workload, 335 00:23:02,920 --> 00:23:03,550 right, 336 00:23:03,550 --> 00:23:07,540 that is happening at the same time as a checkpoint, and that will suffer as a result. 337 00:23:07,810 --> 00:23:09,160 Do you want to suffer? 338 00:23:09,640 --> 00:23:12,520 Well, life is suffering, unfortunately. 339 00:23:12,520 --> 00:23:14,020 So we all need to suffer. 340 00:23:16,970 --> 00:23:18,650 Sometimes we cannot escape this, though.
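
The commit, crash recovery and checkpoint steps described so far can be sketched in a few dozen lines. This is a deliberately simplified toy under several assumptions, not any real engine's design: `ToyDB`, its JSON record format and its file names are hypothetical, it logs whole key/value changes rather than page-level changes, it syncs the log at commit time only, it assumes no transaction is open during a checkpoint, it does not handle a torn (partially written) log record, and the checkpoint writes a fresh data file and renames it instead of overwriting pages in place.

```python
import json
import os
import tempfile


class ToyDB:
    """Toy key-value store: one data file plus a write-ahead log (hypothetical design)."""

    def __init__(self, directory):
        self.data_path = os.path.join(directory, "data.json")
        self.wal_path = os.path.join(directory, "wal.log")
        self.pages = {}  # in-memory state, standing in for the buffer pool
        self._recover()
        self.wal = open(self.wal_path, "a", encoding="utf-8")

    def _recover(self):
        # Start from the last checkpointed data file (old but consistent)...
        if os.path.exists(self.data_path):
            with open(self.data_path, encoding="utf-8") as f:
                self.pages = json.load(f)
        if not os.path.exists(self.wal_path):
            return
        # ...then redo the changes of every transaction that has a commit record.
        pending = {}
        with open(self.wal_path, encoding="utf-8") as f:
            for line in f:
                rec = json.loads(line)
                if rec["op"] == "set":
                    pending.setdefault(rec["tx"], []).append(rec)
                elif rec["op"] == "commit":
                    for change in pending.pop(rec["tx"], []):
                        self.pages[change["key"]] = change["value"]
        # Changes that never got a commit record are ignored.

    def set(self, tx, key, value):
        # Journal the change first, then touch the in-memory copy.
        rec = {"op": "set", "tx": tx, "key": key, "value": value}
        self.wal.write(json.dumps(rec) + "\n")
        self.pages[key] = value

    def commit(self, tx):
        # Commit = the log, including the commit record, is forced to disk.
        self.wal.write(json.dumps({"op": "commit", "tx": tx}) + "\n")
        self.wal.flush()
        os.fsync(self.wal.fileno())

    def checkpoint(self):
        # Persist the in-memory state to the data file, then purge the WAL.
        tmp = self.data_path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(self.pages, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.data_path)
        self.wal.close()
        self.wal = open(self.wal_path, "w", encoding="utf-8")  # truncates the log


with tempfile.TemporaryDirectory() as d:
    db = ToyDB(d)
    db.set(1, "row7", "hello")
    db.commit(1)
    db.set(2, "row8", "uncommitted")
    db.wal.flush()  # the change reaches the log, but no commit record follows
    db.wal.close()  # simulated crash: memory is lost, the data file was never written

    recovered = ToyDB(d)
    state_after_crash = dict(recovered.pages)
    recovered.checkpoint()
    wal_size_after_checkpoint = os.path.getsize(recovered.wal_path)
    recovered.wal.close()

print(state_after_crash)
print(wal_size_after_checkpoint)
```

In this run the committed change survives the simulated crash because recovery redoes it from the log, the uncommitted change is dropped, and after the checkpoint the log is empty again.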
341 00:23:20,320 --> 00:23:29,240 But you can understand now that if this happens, right, this explains why sometimes it takes 342 00:23:29,240 --> 00:23:31,240 a fraction of a millisecond, 343 00:23:31,240 --> 00:23:35,540 and sometimes it takes 50 milliseconds out of nowhere. 344 00:23:35,560 --> 00:23:36,970 Like, what is this? 345 00:23:37,000 --> 00:23:39,490 Because the database might be doing something. 346 00:23:39,760 --> 00:23:40,870 So what do you do? 347 00:23:41,590 --> 00:23:46,510 Well, you try to make the WAL as short as possible, right? 348 00:23:46,510 --> 00:23:48,200 As small as possible. 349 00:23:48,220 --> 00:23:54,340 If you make the WAL as small as possible, then the checkpoint size will be smaller. 350 00:23:54,340 --> 00:23:56,500 So the flushing will be more frequent. 351 00:23:56,500 --> 00:23:58,390 The checkpointing will be more frequent. 352 00:23:58,930 --> 00:23:59,830 And 353 00:24:00,820 --> 00:24:09,010 in this case, you're only flushing a certain amount of data at a given time, and that could be tolerable 354 00:24:09,010 --> 00:24:11,020 and it will be almost consistent. 355 00:24:11,050 --> 00:24:17,230 And you can argue, well, let's make the WAL so large such that, yeah, I don't want to deal with 356 00:24:17,230 --> 00:24:27,400 checkpointing until, I don't know, a certain time. But if you keep it for a long time, 357 00:24:27,400 --> 00:24:30,100 it can also grow into other problems. 358 00:24:31,300 --> 00:24:32,320 Everything is a trade-off. 359 00:24:32,320 --> 00:24:35,470 Unfortunately, I don't have solutions to any of this stuff. 360 00:24:35,560 --> 00:24:41,080 It's just understanding that this exists and we have to deal with it, all 361 00:24:41,080 --> 00:24:42,910 for what? For durability. 362 00:24:43,120 --> 00:24:45,760 We are doing all of this for durability.
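
A back-of-the-envelope calculation shows the trade-off between WAL size and checkpoint frequency. It rests on a simplifying assumption that is not from the talk: a checkpoint fires exactly when the WAL fills up, and the WAL grows at a constant rate. The function name and the numbers are made up for illustration.

```python
def checkpoint_interval_seconds(wal_limit_mb, wal_write_rate_mb_per_s):
    """Seconds between checkpoints if one fires each time the WAL reaches its limit."""
    return wal_limit_mb / wal_write_rate_mb_per_s


# Hypothetical workload that generates 8 MB of WAL per second.
for limit_mb in (64, 1024):
    interval = checkpoint_interval_seconds(limit_mb, wal_write_rate_mb_per_s=8)
    print(f"WAL limit {limit_mb} MB -> checkpoint roughly every {interval:.0f} s")
```

Under these assumptions a 64 MB limit means a small checkpoint every 8 seconds, while a 1024 MB limit means one every 128 seconds, with far more dirty data to flush each time and more log to redo after a crash.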
363 00:24:45,760 --> 00:24:50,740 We want to be durable and we want to recover and be consistent in case of a crash. 364 00:24:51,930 --> 00:24:56,440 If you can guarantee that you will never crash, then remove the WAL. 365 00:24:56,460 --> 00:25:00,450 I dare you, remove the WAL. 366 00:25:07,550 --> 00:25:14,540 So some say computer scientists built a wall and the DBAs 367 00:25:16,380 --> 00:25:17,400 paid for it. 368 00:25:18,210 --> 00:25:18,810 Get it? 369 00:25:20,050 --> 00:25:21,730 No, that was a bad joke. 370 00:25:23,020 --> 00:25:23,590 All right. 371 00:25:27,750 --> 00:25:36,120 So we talked about redo logs, which is the WAL, the write-ahead log, and go to the configuration 372 00:25:36,120 --> 00:25:43,890 and you will get to see this plastered everywhere: WAL this, WAL that, WAL time, WAL flush time. 373 00:25:44,100 --> 00:25:46,470 What do you want, fsync or not, right? 374 00:25:47,460 --> 00:25:47,870 Yeah. 375 00:25:47,880 --> 00:25:48,600 Let's talk about this, 376 00:25:48,900 --> 00:25:50,730 actually, the fsync and the WAL. 377 00:25:51,780 --> 00:25:56,340 So you see, if you are building your database on top of an operating system... 378 00:25:58,010 --> 00:25:58,490 You might say, 379 00:25:58,490 --> 00:26:01,340 what kind of stupid statement is that, Hussein? 380 00:26:01,370 --> 00:26:04,610 What else are you going to build your database on? 381 00:26:04,790 --> 00:26:10,850 Hey, I have to be very specific, because you might build your own operating system 382 00:26:10,850 --> 00:26:12,250 that is a database. 383 00:26:12,260 --> 00:26:13,340 Yeah, because you might, 384 00:26:13,340 --> 00:26:13,790 you might, 385 00:26:13,820 --> 00:26:15,860 you might build your own OS 386 00:26:15,860 --> 00:26:21,260 that happens to be a database, without the bloat of the operating system.
387 00:26:23,030 --> 00:26:31,250 But if you decide, like any database does, to build your database so that it lives on top of this bloat, 388 00:26:31,310 --> 00:26:34,580 it's called the OS, the general purpose 389 00:26:35,470 --> 00:26:35,750 Right. 390 00:26:35,830 --> 00:26:42,640 OS, then you have to live with the rules of the OS, which is Linux or Windows or Mac. 391 00:26:44,850 --> 00:26:46,050 Or TempleOS. 392 00:26:46,080 --> 00:26:48,450 Is that an OS? 393 00:26:50,250 --> 00:26:54,270 Then there is something that the operating system does because it's general purpose. 394 00:26:54,270 --> 00:26:55,050 It doesn't trust 395 00:26:55,050 --> 00:26:58,110 any app. It says, hey, all apps are stupid. 396 00:26:59,040 --> 00:27:01,380 So if the, if, so. 397 00:27:01,380 --> 00:27:06,720 So the apps tend to do some repetitive job. 398 00:27:06,720 --> 00:27:12,720 They write to the same file multiple times in the same microsecond. 399 00:27:13,020 --> 00:27:23,070 You know, if the operating system let that write go to disk immediately, your SSD will be dead effectively. 400 00:27:23,070 --> 00:27:23,370 Right. 401 00:27:23,370 --> 00:27:25,200 And in a few. 402 00:27:26,170 --> 00:27:27,220 Because let's say. 403 00:27:27,820 --> 00:27:28,150 Right. 404 00:27:28,970 --> 00:27:29,030 Right. 405 00:27:29,410 --> 00:27:33,350 Because like, how many times do you hit save? Ctrl+S, Ctrl+S, Ctrl+S. 406 00:27:33,400 --> 00:27:37,780 Imagine all these writes going immediately to the disk. 407 00:27:38,620 --> 00:27:39,310 No. 408 00:27:41,110 --> 00:27:44,770 What the system has is like it has a file system cache. 409 00:27:44,770 --> 00:27:47,650 So if you write something, it says. 410 00:27:48,450 --> 00:27:49,380 You really want to write? 411 00:27:49,470 --> 00:27:50,550 Listen, let's just wait.
412 00:27:50,550 --> 00:27:59,190 Let's buffer these writes. So it buffers these writes in memory and it keeps them in memory in the 413 00:27:59,190 --> 00:28:04,770 cache, hoping that that same page will receive more writes because. 414 00:28:05,730 --> 00:28:06,750 Same problem. 415 00:28:07,170 --> 00:28:07,650 Right. 416 00:28:07,860 --> 00:28:11,310 Because if I write one byte, I have to flush the whole disk. 417 00:28:11,310 --> 00:28:12,510 It's just not worth it. 418 00:28:12,690 --> 00:28:14,850 Not disk, the whole page, right? 419 00:28:14,930 --> 00:28:15,930 It's just not worth it. 420 00:28:15,930 --> 00:28:17,430 So let's just wait for more 421 00:28:17,430 --> 00:28:23,310 writes that hopefully this dirty page receives so I can flush the whole page with, 422 00:28:23,310 --> 00:28:27,030 with rich dirty writes. 423 00:28:28,010 --> 00:28:34,970 The more writes the page receives in memory, the better the economics of writing a page. 424 00:28:34,970 --> 00:28:41,030 Because otherwise, if you change one character and that writes to disk, then you change 425 00:28:41,040 --> 00:28:42,760 another character and that writes to disk. 426 00:28:42,770 --> 00:28:49,120 You're writing 8K, 8K, 8K, 8K 427 00:28:50,180 --> 00:28:54,800 every microsecond, every millisecond that you're writing. 428 00:28:55,010 --> 00:28:56,270 And that's too much, right? 429 00:28:58,010 --> 00:29:04,970 So the operating system has the cache and waits for this cache to fill and then flushes that cache to 430 00:29:04,970 --> 00:29:05,630 disk. 431 00:29:06,620 --> 00:29:07,010 Right. 432 00:29:09,580 --> 00:29:10,450 So. 433 00:29:13,790 --> 00:29:14,840 That's a problem. 434 00:29:15,650 --> 00:29:17,540 We talked about the WAL, the write-ahead log. 435 00:29:17,540 --> 00:29:17,960 Right. 436 00:29:19,760 --> 00:29:20,930 If I, if I.
437 00:29:21,050 --> 00:29:29,250 If I told you to flush the WAL, to change that WAL, I want to make sure the WAL is actually persisted. 438 00:29:29,270 --> 00:29:31,100 Don't put it in the cache. 439 00:29:31,400 --> 00:29:32,990 Don't try to be smart. 440 00:29:33,860 --> 00:29:35,900 Don't try to be efficient, operating system. 441 00:29:36,740 --> 00:29:38,120 Go to disk. 442 00:29:39,100 --> 00:29:43,720 I want a way to bypass this cache that you have. 443 00:29:44,510 --> 00:29:54,500 And so that's why most databases do that in certain operations like the WAL, because the WAL is critical. 444 00:29:54,710 --> 00:29:55,160 Right. 445 00:29:55,280 --> 00:29:56,510 It is tiny things. 446 00:29:56,990 --> 00:30:00,140 But we need to flush them directly to disk. 447 00:30:01,390 --> 00:30:01,740 Yeah. 448 00:30:04,530 --> 00:30:06,060 And that's called fsync. 449 00:30:06,480 --> 00:30:07,980 That option is called fsync. 450 00:30:08,790 --> 00:30:13,140 So fsync is enabled by default, I believe. 451 00:30:14,180 --> 00:30:14,660 Now. 452 00:30:14,660 --> 00:30:18,860 If fsync is disabled, right, which is off, 453 00:30:19,070 --> 00:30:20,660 then it goes to the cache. 454 00:30:20,690 --> 00:30:28,430 If fsync is enabled, they force the sync, make the change go directly to disk, bypass. 455 00:30:28,460 --> 00:30:28,820 It's. 456 00:30:28,820 --> 00:30:35,030 It makes it a write-through cache and punches a hole through the cache. 457 00:30:37,440 --> 00:30:43,350 So yeah, you can turn it off if you want and you get a little bit more speed, but at the danger 458 00:30:43,350 --> 00:30:45,090 of losing your data. 459 00:30:45,390 --> 00:30:48,780 I have that option off, by the way, in my testing database because I don't care. 460 00:30:48,780 --> 00:30:53,010 It's testing data and I load the data on a daily basis.
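To make the fsync idea concrete, here is a minimal Python sketch of appending a record to a log file and asking the operating system to push it to the storage device before moving on. It is an illustration only, not how any particular database implements its WAL: the function name append_durably, the file name demo.wal and the record contents are all made up for this example. One detail worth stating: os.fsync does not literally skip the file system cache, it tells the OS to flush what it has cached for that file and blocks until the OS reports that it is done.

```python
import os
import tempfile


def append_durably(path, record):
    """Append one record to a log file and wait for the OS to flush it."""
    # O_APPEND keeps every write at the end of the file, like a log.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, record)  # lands in the OS file system cache first
        os.fsync(fd)          # blocks until the OS reports the data flushed
    finally:
        os.close(fd)


# Hypothetical WAL file and records, used only for this illustration.
wal_path = os.path.join(tempfile.mkdtemp(), "demo.wal")
append_durably(wal_path, b"txn=1 set row=7 name=b\n")
append_durably(wal_path, b"txn=1 commit\n")
```

Dropping the os.fsync call would still leave the bytes readable from the file, but after a power loss there would be no guarantee that they ever left memory, which is the speed versus safety trade described here.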
461 00:30:53,010 --> 00:30:57,030 So if there is a crash, which is highly unlikely, I just reload again. 462 00:30:57,390 --> 00:31:02,280 And I didn't really notice much difference, to be honest, but never mind. 463 00:31:03,090 --> 00:31:08,720 So databases have this enabled most of the time, not necessarily. 464 00:31:08,750 --> 00:31:13,830 Postgres doesn't have it enabled for like reading and writing pages because we know that it's okay, 465 00:31:14,550 --> 00:31:18,120 let's write it to the cache, the operating system cache. 466 00:31:18,120 --> 00:31:20,400 Let's read from the operating system cache. 467 00:31:20,970 --> 00:31:21,840 That's fine. 468 00:31:22,860 --> 00:31:27,960 So if you think about it, Postgres has almost two caches: the cache in the buffer pool. 469 00:31:27,990 --> 00:31:30,330 I think it's called the working memory. 470 00:31:31,320 --> 00:31:32,840 Or maybe it's called the buffer pool. 471 00:31:33,070 --> 00:31:33,780 I forgot. 472 00:31:34,140 --> 00:31:36,720 And then there is the operating system cache itself. 473 00:31:39,670 --> 00:31:40,720 So two layers of cache. 474 00:31:43,450 --> 00:31:44,080 So yeah. 475 00:31:44,080 --> 00:31:46,270 So that's something to watch out for. 476 00:31:47,830 --> 00:31:50,140 So, yeah, database engineering is very complex. 477 00:31:50,530 --> 00:31:53,770 So the final thing we're going to talk about is the undo log. 478 00:31:57,270 --> 00:31:57,870 Hussein. 479 00:31:59,340 --> 00:32:00,570 I wrote changes. 480 00:32:01,870 --> 00:32:02,280 Right. 481 00:32:04,440 --> 00:32:09,250 And by the way, not all databases have this undo log thing. 482 00:32:09,270 --> 00:32:12,330 Postgres doesn't because it does it differently. 483 00:32:13,050 --> 00:32:16,650 The undo log was designed specifically 484 00:32:17,790 --> 00:32:18,810 for: 485 00:32:20,030 --> 00:32:22,550 give me the state of the row 486 00:32:23,920 --> 00:32:25,840 before it was changed.
487 00:32:27,550 --> 00:32:28,460 You say, Hussein. 488 00:32:28,850 --> 00:32:29,390 What? 489 00:32:30,900 --> 00:32:31,740 Why? 490 00:32:31,770 --> 00:32:33,000 Why are you doing this? 491 00:32:33,720 --> 00:32:39,270 Well, if you are in a running transaction, a long running transaction, you're changing, changing, 492 00:32:39,270 --> 00:32:40,320 changing, changing. 493 00:32:40,320 --> 00:32:40,740 Right. 494 00:32:42,700 --> 00:32:43,200 Right. 495 00:32:43,240 --> 00:32:47,140 You're making changes directly to the page, right, in memory. 496 00:32:47,650 --> 00:32:51,400 And you're writing the changes you made to the WAL. 497 00:32:52,630 --> 00:32:53,560 So now. 498 00:32:55,740 --> 00:33:02,370 What happens to transactions that started before you? 499 00:33:04,770 --> 00:33:05,400 Right. 500 00:33:07,110 --> 00:33:11,520 Transactions have this thing that's called isolation level. Since they started 501 00:33:11,520 --> 00:33:14,040 before you made this change and you still didn't commit, 502 00:33:15,330 --> 00:33:19,830 right, in almost all isolation levels except read uncommitted, 503 00:33:21,280 --> 00:33:27,430 those transactions need the old state of the row before you changed it. 504 00:33:28,090 --> 00:33:33,190 So they cannot read from the page that is dirty because it has the latest stuff. 505 00:33:33,220 --> 00:33:35,680 They do not need the latest stuff. 506 00:33:37,690 --> 00:33:39,790 They need the old state. 507 00:33:40,840 --> 00:33:50,860 So what databases do is they keep a record in the undo log, specifically how this row looked 508 00:33:50,860 --> 00:33:56,620 in its entirety, right, the old row, 509 00:33:58,230 --> 00:34:05,220 in a specific log area that's called the undo log. Oracle, SQL Server, I believe. 510 00:34:06,900 --> 00:34:13,110 I'm not 100% sure about SQL Server, but MySQL and Oracle have this model, which is the undo log. 511 00:34:14,020 --> 00:34:14,290 Right.
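The undo log idea can be sketched as a toy in Python: the page always holds the latest version of a row, every update first copies the previous version into an undo list, and a reader that is not allowed to see the latest writer walks back through those saved versions. This is a simplified model under stated assumptions, not the layout MySQL or Oracle actually use. The class TinyUndoStore, its methods, and the idea of passing a set of visible transaction ids to read are all hypothetical names chosen for this illustration.

```python
class TinyUndoStore:
    """Toy model: latest row versions in the page, older ones in an undo log."""

    def __init__(self):
        # row_id -> (value, writer_txn, index of previous version in undo_log)
        self.page = {}
        # Saved before-images, each shaped like a page entry.
        self.undo_log = []

    def update(self, txn, row_id, value):
        """Overwrite the row in place, saving the old version first."""
        previous = self.page.get(row_id)
        prev_ptr = None
        if previous is not None:
            self.undo_log.append(previous)
            prev_ptr = len(self.undo_log) - 1
        self.page[row_id] = (value, txn, prev_ptr)

    def read(self, row_id, visible_txns):
        """Return the newest version written by a transaction the reader may see."""
        version = self.page.get(row_id)
        while version is not None:
            value, writer, prev_ptr = version
            if writer in visible_txns:
                return value
            # The page holds something too new: step back through the undo log.
            version = self.undo_log[prev_ptr] if prev_ptr is not None else None
        return None


store = TinyUndoStore()
store.update(1, "r1", "old")         # transaction 1, assumed committed
store.update(2, "r1", "new")         # transaction 2, still running
print(store.read("r1", {1}))         # an older reader gets the old row
print(store.read("r1", {1, 2}))      # transaction 2 sees its own change
```

The extra hop through undo_log in read is the additional work that readers pay when the page only holds the latest version.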
512 00:34:14,290 --> 00:34:24,730 Postgres doesn't because it uses versioning. Versioning is that Postgres makes the changes in the 513 00:34:24,730 --> 00:34:25,450 page. 514 00:34:25,450 --> 00:34:26,310 Right? 515 00:34:26,320 --> 00:34:29,710 If you made an update, it's a new row in Postgres. 516 00:34:29,710 --> 00:34:33,010 So the old row already lives, which is perfect, right? 517 00:34:33,280 --> 00:34:36,610 I love this design much better if you think about it. 518 00:34:38,000 --> 00:34:39,340 Again, it's all personal. 519 00:34:39,550 --> 00:34:42,220 At the end of the day, preferences. 520 00:34:43,240 --> 00:34:45,640 But now, with the undo log model, old transactions 521 00:34:47,070 --> 00:34:50,460 that want to read rows that have been changed, 522 00:34:51,090 --> 00:34:56,770 those rows don't exist on the page, so they have to do a little bit more extra work. 523 00:34:56,790 --> 00:35:00,660 They have to go to the undo log, crack it open. 524 00:35:00,960 --> 00:35:02,030 I don't know what that means. 525 00:35:02,040 --> 00:35:03,060 Crack it open. 526 00:35:03,390 --> 00:35:05,700 I just want to make it clear that it's slower. 527 00:35:05,820 --> 00:35:06,840 That's why I said that. 528 00:35:07,680 --> 00:35:16,880 And then take the page, the row from the page, and then undo the changes, because the row is the latest. 529 00:35:16,890 --> 00:35:19,020 You want to undo the changes. 530 00:35:19,020 --> 00:35:23,490 You want to apply changes to make it older, if you will. 531 00:35:23,900 --> 00:35:24,420 Right. 532 00:35:25,440 --> 00:35:33,900 So you want to undo these changes so that it goes back effectively 533 00:35:35,870 --> 00:35:39,170 to the old state so you can read it. 534 00:35:39,410 --> 00:35:41,180 So a little bit more work 535 00:35:42,110 --> 00:35:42,860 for 536 00:35:44,400 --> 00:35:48,990 older transactions or, for that matter, newer transactions. 537 00:35:49,470 --> 00:35:50,040 Right.
538 00:35:50,790 --> 00:35:51,720 Because. 539 00:35:52,480 --> 00:35:57,040 The problem is we have a transaction that did not commit. 540 00:35:57,340 --> 00:35:59,620 We have a long running transaction. 541 00:35:59,620 --> 00:36:07,300 That's why a long running transaction is the worst for performance reasons, because the undo log will 542 00:36:07,300 --> 00:36:14,800 keep filling up and those transactions will have to go to the log and crack it open and apply the changes 543 00:36:15,040 --> 00:36:18,970 and roll back, if you will, to read the old rows. 544 00:36:19,060 --> 00:36:24,430 It has to do this all the time and you might say you can do all sorts of caching, but this is work 545 00:36:24,430 --> 00:36:26,920 that the database has to do. 546 00:36:26,950 --> 00:36:30,130 If you think about it, right, it's all work. 547 00:36:30,670 --> 00:36:33,040 Hussein, do you persist the undo log? 548 00:36:33,850 --> 00:36:35,470 I believe you should. 549 00:36:36,110 --> 00:36:36,600 Right. 550 00:36:37,510 --> 00:36:38,710 Let's think this through. 551 00:36:38,740 --> 00:36:45,460 You should do the same with the undo log as you do with the redo log, because the redo log tells 552 00:36:45,460 --> 00:36:48,600 you what's the final state, right? 553 00:36:48,790 --> 00:36:52,240 The undo log tells you the 554 00:36:53,440 --> 00:36:55,120 the older states. 555 00:36:56,020 --> 00:36:56,590 Right. 556 00:36:57,620 --> 00:37:04,580 And the correct way to do this is in case of a crash, you're going to have a bunch of WAL, which 557 00:37:04,580 --> 00:37:11,690 is the redo log, and then you're going to have the undo logs and you're going to have the state of the 558 00:37:11,690 --> 00:37:13,880 page as it existed on disk. 559 00:37:13,880 --> 00:37:18,410 So in case of a crash, you have the undo log, you have the redo log, you have the old page. 560 00:37:19,190 --> 00:37:20,180 So what do you do?
561 00:37:20,210 --> 00:37:25,070 The correct order is take the page, and whatever is in the WAL is the truth. 562 00:37:25,190 --> 00:37:30,530 Apply all the WAL changes, apply all of them. 563 00:37:30,920 --> 00:37:38,090 And then because you might have applied stuff from a transaction that has been rolled back, right? 564 00:37:38,570 --> 00:37:49,970 Go and take the undo log and then redo, not redo, undo the changes up until the point where the transaction 565 00:37:49,970 --> 00:37:51,020 has been rolled back. 566 00:37:51,870 --> 00:37:52,390 All right. 567 00:37:52,390 --> 00:37:55,360 So you're going to redo, redo and then undo, undo, undo. 568 00:37:55,360 --> 00:38:00,360 And until you reach a consistent state, right. 569 00:38:00,370 --> 00:38:03,070 That's one implementation that I can think of. 570 00:38:03,370 --> 00:38:03,830 Right. 571 00:38:04,000 --> 00:38:11,590 Can you not persist the undo log and only apply the redo logs for transactions that have committed? 572 00:38:12,340 --> 00:38:22,960 I suppose you can, but then you have to differentiate committed transactions versus not committed transactions 573 00:38:22,960 --> 00:38:24,070 in the WAL. 574 00:38:24,520 --> 00:38:24,850 Right. 575 00:38:24,850 --> 00:38:27,910 Which I believe that information doesn't exist. 576 00:38:28,130 --> 00:38:33,370 A WAL has changes and it doesn't know if this is from this transaction versus that transaction. 577 00:38:33,670 --> 00:38:36,340 I suppose you can store this information in another 578 00:38:37,360 --> 00:38:42,160 data file containing the transactions that have been committed. 579 00:38:42,470 --> 00:38:44,050 But it's just easier this way. 580 00:38:44,500 --> 00:38:46,810 I don't know, guys.
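The redo-then-undo order described here can be sketched as a small Python toy. Everything in it is an assumption made for illustration: the record shapes (transaction id, row id, value), the function name recover, and especially the unfinished_txns argument, which simply hands the recovery routine the set of transactions that never committed. How a real database learns that set is outside this sketch.

```python
def recover(disk_page, wal, undo_log, unfinished_txns):
    """Toy crash recovery: replay every redo record, then undo unfinished work."""
    page = dict(disk_page)  # start from the page as it existed on disk

    # Redo phase: whatever is in the WAL is the truth, so apply all of it in order.
    for txn, row_id, new_value in wal:
        page[row_id] = new_value

    # Undo phase: walk the undo log backwards and restore the before-images
    # of transactions that never committed.
    for txn, row_id, old_value in reversed(undo_log):
        if txn in unfinished_txns:
            if old_value is None:
                page.pop(row_id, None)  # the row did not exist before
            else:
                page[row_id] = old_value
    return page


# Hypothetical state after a crash: transaction 1 committed, transaction 2 did not.
disk_page = {"a": 1, "b": 10}
wal = [(1, "a", 2), (2, "b", 20), (2, "b", 30), (1, "c", 5)]
undo_log = [(1, "a", 1), (2, "b", 10), (2, "b", 20), (1, "c", None)]

recovered = recover(disk_page, wal, undo_log, unfinished_txns={2})
print(recovered)
```

After the redo phase the page briefly contains transaction 2's value for row b. The undo phase then steps it back through 20 to 10, while the changes from transaction 1 stay in place.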
581 00:38:47,350 --> 00:38:54,550 This has been an episode of the Backend Engineering Show discussing the logs of the database, and pretty 582 00:38:54,550 --> 00:38:59,080 sure I missed some stuff, but I believe these are some of the most important things. 583 00:38:59,080 --> 00:39:05,110 You know, logs: undo log, WAL and the redo logs. 584 00:39:06,490 --> 00:39:07,510 Gonna see you on the next one. 585 00:39:07,540 --> 00:39:08,680 Hope you enjoyed this one. 586 00:39:08,770 --> 00:39:08,950 Stay 587 00:39:08,950 --> 00:39:09,370 awesome. 588 00:39:09,660 --> 00:39:10,040 Goodbye.