1 00:00:00,060 --> 00:00:03,040 ‫Welcome to another Q&A lecture. 2 00:00:03,060 --> 00:00:05,130 ‫Today we have a question from Peter. 3 00:00:05,850 --> 00:00:06,780 ‫Peter asked. 4 00:00:06,990 --> 00:00:10,160 ‫Isolation, repeatable read versus snapshot. 5 00:00:10,170 --> 00:00:12,060 ‫So the question is right. 6 00:00:12,060 --> 00:00:18,720 ‫So Peter is basically trying to understand and give us an example between repeatable rate versus snapshot 7 00:00:18,720 --> 00:00:19,380 ‫isolation. 8 00:00:20,340 --> 00:00:22,260 ‫Again, isolation levels. 9 00:00:22,260 --> 00:00:24,980 ‫Or think of it as just a single instance. 10 00:00:24,990 --> 00:00:28,290 ‫We're not talking about distribute or anything because I got that question as well. 11 00:00:28,650 --> 00:00:30,720 ‫It has nothing to do with distributed. 12 00:00:30,720 --> 00:00:31,980 ‫It's just a single instant. 13 00:00:31,980 --> 00:00:34,290 ‫That's where you're isolating the transactions. 14 00:00:35,130 --> 00:00:39,510 ‫So we have here's an example he gave he pulled it from the Internet execution order. 15 00:00:39,510 --> 00:00:46,380 ‫So we have select pet ID, pet name score from pet shots, a table called picture where score equal 16 00:00:46,380 --> 00:00:48,570 ‫select max score from picture. 17 00:00:48,960 --> 00:00:56,820 ‫So he's effectively he's getting it the pet ID who got the most score, whatever that means. 18 00:00:56,820 --> 00:00:57,140 ‫Right. 19 00:00:57,150 --> 00:00:59,360 ‫That's what it was trying to do. 20 00:00:59,370 --> 00:01:05,850 ‫So there is an insight in equity that is inside another query effectively, and Max will return almost 21 00:01:05,850 --> 00:01:06,540 ‫one value. 22 00:01:06,540 --> 00:01:08,910 ‫So we want to find out which one Is that. 23 00:01:09,390 --> 00:01:13,090 ‫The way I would I would have written the query differently, Right? 24 00:01:13,140 --> 00:01:18,390 ‫I would I would just do an order by it's like cheaper in that particular case because these are two 25 00:01:18,390 --> 00:01:27,180 ‫queries versus I'm going to do select pet name score or thereby score descending and then limit one. 26 00:01:27,180 --> 00:01:28,590 ‫I'm going to fetch the first one. 27 00:01:28,830 --> 00:01:31,770 ‫That's that's basically how how I would do it. 28 00:01:31,860 --> 00:01:33,480 ‫I don't know which one is faster to be honest. 29 00:01:34,060 --> 00:01:34,240 ‫Right. 30 00:01:34,650 --> 00:01:39,780 ‫This will this is fast because, Max, we know exactly where the score is, assuming the score is indexed. 31 00:01:40,380 --> 00:01:47,310 ‫And I know we're talking about data modeling here and performance, but I thought it was going to be 32 00:01:47,310 --> 00:01:47,850 ‫useful. 33 00:01:48,090 --> 00:01:54,900 ‫The score in this particular case, that if it's an index, this will be immediate because I know the 34 00:01:54,900 --> 00:01:56,160 ‫last page in the index. 35 00:01:56,190 --> 00:01:59,910 ‫Go and fetch that and that will give me the last score. 36 00:01:59,910 --> 00:02:01,080 ‫So that's fast. 37 00:02:01,560 --> 00:02:12,390 ‫And then this is ab3 index scan to find that value, which we just found in order to find out the beat 38 00:02:12,390 --> 00:02:13,590 ‫ID, the bid name. 39 00:02:13,590 --> 00:02:14,160 ‫Right. 40 00:02:14,160 --> 00:02:17,310 ‫And the score of that score. 41 00:02:17,340 --> 00:02:17,820 ‫Right. 42 00:02:17,820 --> 00:02:22,710 ‫So in this particular thing, we're going to find a score and then go jump back to the heap, get the 43 00:02:22,710 --> 00:02:24,210 ‫pet ID and other columns. 44 00:02:24,210 --> 00:02:24,570 ‫Right. 45 00:02:24,990 --> 00:02:34,320 ‫So two queries versus I'm just going to do a select one query or thereby score. 46 00:02:34,320 --> 00:02:41,310 ‫And in this particular sense, since the order bias index, I'm going to go to the I'm just going to 47 00:02:41,310 --> 00:02:42,570 ‫walk the index. 48 00:02:43,380 --> 00:02:44,790 ‫In a reverse order. 49 00:02:45,420 --> 00:02:47,780 ‫No, it's actually faster to do an order by. 50 00:02:48,250 --> 00:02:54,360 ‫Yeah, because I'm going to go to the last one and find the last entry and immediately jump to that. 51 00:02:54,870 --> 00:02:55,230 ‫Right. 52 00:02:55,230 --> 00:02:59,750 ‫So it's, it's, it's one query versus two. 53 00:02:59,760 --> 00:03:00,720 ‫That makes sense. 54 00:03:00,960 --> 00:03:01,230 ‫Right. 55 00:03:03,680 --> 00:03:04,610 ‫Yeah, definitely. 56 00:03:04,610 --> 00:03:07,820 ‫It's faster to do a select, but it doesn't matter, right? 57 00:03:07,850 --> 00:03:08,270 ‫It's not. 58 00:03:08,570 --> 00:03:09,710 ‫It's not that question. 59 00:03:10,400 --> 00:03:16,130 ‫So the second so I have one transaction that does this select a query like max score of someone and 60 00:03:16,130 --> 00:03:25,670 ‫there is a query one where we have updates the score and, and when you have the score, obviously this 61 00:03:25,670 --> 00:03:26,710 ‫might change, right? 62 00:03:26,720 --> 00:03:34,260 ‫And that's what he's asking is like if I am in my transaction and I query this twice and meanwhile I, 63 00:03:34,260 --> 00:03:40,790 ‫if I were a first query, gotta get a better ID, then someone updates the score and commit and then 64 00:03:40,790 --> 00:03:41,870 ‫I query it again. 65 00:03:43,280 --> 00:03:49,760 ‫And what is the behavior in repeatable rate versus isolate snapshot? 66 00:03:49,760 --> 00:03:50,200 ‫Right. 67 00:03:50,390 --> 00:03:57,020 ‫So if you do a repeatable read, if your tw x one is a repeatable read, what will happen is you can 68 00:03:57,020 --> 00:03:57,980 ‫execute the query. 69 00:03:57,980 --> 00:04:03,890 ‫Someone will commit something, change the score, so the ID will change and you're going to execute 70 00:04:03,890 --> 00:04:05,180 ‫it again, right? 71 00:04:07,240 --> 00:04:15,250 ‫If your query is repeatable, if the transaction one is actually isolation is is repeatable. 72 00:04:15,250 --> 00:04:15,790 ‫Read. 73 00:04:16,650 --> 00:04:17,370 ‫Then. 74 00:04:18,410 --> 00:04:22,790 ‫What will happen, unfortunately, is you're going to get different result. 75 00:04:22,820 --> 00:04:26,090 ‫You might say what I thought repeatable will give you a repeatable rate. 76 00:04:26,120 --> 00:04:30,710 ‫Well, repeatable rate will give you a repeatable rate for things that you actually physically read 77 00:04:30,800 --> 00:04:32,150 ‫a row that you read. 78 00:04:32,390 --> 00:04:32,810 ‫Right. 79 00:04:32,810 --> 00:04:39,320 ‫If you read a row and that row has changed, reading that same row will give you the same value. 80 00:04:39,410 --> 00:04:41,090 ‫But that's not the case here. 81 00:04:41,480 --> 00:04:44,900 ‫What changed here is completely something completely different. 82 00:04:44,900 --> 00:04:47,900 ‫No rows were changed, rows were added. 83 00:04:48,170 --> 00:04:48,710 ‫Right. 84 00:04:48,710 --> 00:04:54,550 ‫And when rows are added, the nature of the query is now because of the rate repeatable. 85 00:04:54,560 --> 00:04:56,330 ‫Read is the read committed. 86 00:04:56,360 --> 00:04:58,900 ‫It's going to see the new stuff. 87 00:04:58,910 --> 00:05:05,480 ‫This is a this is a side effect of something called Phantom Reader, which we talked about, right? 88 00:05:05,930 --> 00:05:07,520 ‫Because we're inserting thing. 89 00:05:07,520 --> 00:05:10,130 ‫My query is now detecting those new things. 90 00:05:10,130 --> 00:05:16,490 ‫All of a sudden Phantom reads cannot be solved by repeatable read isolation level. 91 00:05:17,000 --> 00:05:22,370 ‫Even that statement is not true anymore because databases break the rules all the time. 92 00:05:23,300 --> 00:05:28,520 ‫And it's not that the rules are great, but they break it for a reason, for performance reasons. 93 00:05:28,520 --> 00:05:28,880 ‫Right. 94 00:05:29,630 --> 00:05:31,100 ‫And correctness. 95 00:05:31,100 --> 00:05:35,090 ‫So in this particular thing, a repeatable read will give you different results, right? 96 00:05:35,090 --> 00:05:37,700 ‫Because you are technically. 97 00:05:38,590 --> 00:05:40,960 ‫Uh, not reading the same, bro. 98 00:05:41,000 --> 00:05:45,970 ‫You're dating a different RO because now you select score, Max score. 99 00:05:46,270 --> 00:05:49,060 ‫We'll give you a different result because there is a new score. 100 00:05:49,090 --> 00:05:52,570 ‫Right now we just added a new score so we can get that score. 101 00:05:52,570 --> 00:05:55,240 ‫And as a result, you're going to get the P tied for the new score. 102 00:05:55,540 --> 00:05:58,420 ‫So repeatable rate will not solve it. 103 00:05:58,510 --> 00:06:00,520 ‫A snapshot will definitely solve it. 104 00:06:00,520 --> 00:06:04,420 ‫Snapchat will give you a version say, Hey, I don't care. 105 00:06:04,690 --> 00:06:06,340 ‫I started this transaction. 106 00:06:06,700 --> 00:06:12,130 ‫My current version or timestamp is PT seven. 107 00:06:12,640 --> 00:06:15,100 ‫Please read anything PT seven or older. 108 00:06:15,250 --> 00:06:21,520 ‫Never read anything t seven or never read T eight or ten or t nine. 109 00:06:21,520 --> 00:06:22,240 ‫I don't care. 110 00:06:22,660 --> 00:06:31,330 ‫So if those things are committed in PT ten, I'll never see them because I filtered my snapshot. 111 00:06:31,330 --> 00:06:35,890 ‫Isolation is filtering based on a version based on a timestamp. 112 00:06:36,340 --> 00:06:37,930 ‫That's very powerful, right? 113 00:06:38,320 --> 00:06:41,830 ‫And that's how basically politics works. 114 00:06:41,920 --> 00:06:49,090 ‫Postgres doesn't have a concept of snapshot isolation because a repeatable read is a snapshot isolation. 115 00:06:49,330 --> 00:06:51,820 ‫Yeah, I just, I just contradicted myself. 116 00:06:52,390 --> 00:06:59,020 ‫The way postcodes implemented Repeatable reader is a snap as is as a snapshot isolation level because 117 00:06:59,020 --> 00:07:01,330 ‫postcards only have versioning. 118 00:07:01,570 --> 00:07:04,070 ‫Because I would talk about the idea of version, right? 119 00:07:04,180 --> 00:07:09,280 ‫Every every time you touch a row and new version is created for that row. 120 00:07:09,280 --> 00:07:12,370 ‫So it's by default souls, the Phantom reads. 121 00:07:12,370 --> 00:07:19,420 ‫As a result, Peter's problem is not a thing here and applescript and I'm not sure in my sequel I might 122 00:07:19,420 --> 00:07:23,110 ‫be it might be the same can be quickly verified. 123 00:07:23,110 --> 00:07:27,870 ‫Just spin up on my SQL instance on can on a docker container. 124 00:07:27,880 --> 00:07:28,600 ‫Try it up. 125 00:07:28,990 --> 00:07:35,860 ‫It should do a max in another transaction, update the score and in the same transaction, obviously 126 00:07:35,860 --> 00:07:40,750 ‫begin the transaction and then do a max and then update the score and then do our max again. 127 00:07:41,050 --> 00:07:44,710 ‫You can if you got the same thing and the repeatable read. 128 00:07:45,560 --> 00:07:46,700 ‫That means repeatable. 129 00:07:46,760 --> 00:07:50,390 ‫It is implemented as a snapshot isolation level. 130 00:07:50,510 --> 00:07:53,570 ‫I hope you enjoy this lecture on the next one. 131 00:07:53,870 --> 00:07:54,530 ‫You guys have some. 132 00:07:54,530 --> 00:07:54,950 ‫Good bye.