1 00:00:00,090 --> 00:00:06,630 ‫All right, consistency, guys. So I really needed to talk about atomicity and isolation first 2 00:00:06,990 --> 00:00:09,690 ‫to talk about consistency, right. 3 00:00:09,720 --> 00:00:12,780 ‫Although consistency is the second property of ACID. 4 00:00:12,780 --> 00:00:13,080 ‫Right. 5 00:00:13,830 --> 00:00:14,910 ‫And the reason is. 6 00:00:15,870 --> 00:00:23,740 ‫Atomicity and isolation lead to consistency, in my opinion, and the way I see it. 7 00:00:23,760 --> 00:00:26,700 ‫There are two types of consistency, the way I see it here. 8 00:00:26,910 --> 00:00:30,790 ‫And the first one is consistency in your data. 9 00:00:31,050 --> 00:00:34,590 ‫And the second one is consistency in reading the data. 10 00:00:34,620 --> 00:00:42,690 ‫And let's talk about each one of them, and let's talk about which one of those the NoSQL guys 11 00:00:42,690 --> 00:00:48,990 ‫came and tried to improve upon, OK, and how it can be relaxed here. 12 00:00:49,710 --> 00:00:54,600 ‫Consistency in data: this is essentially defined by the user. 13 00:00:54,630 --> 00:00:59,880 ‫It's something that the user defines in their table schema. 14 00:00:59,970 --> 00:01:06,300 ‫It says, hey, this view and this view should be consistent, like, say, the sum of the money 15 00:01:06,300 --> 00:01:12,210 ‫in this table should equal the sum of all the balances, or the number of likes 16 00:01:12,210 --> 00:01:19,630 ‫on this picture should equal the number of users who actually liked that picture. 17 00:01:19,800 --> 00:01:22,770 ‫So this is a consistent view that the user defines. 18 00:01:22,770 --> 00:01:28,420 ‫OK, usually it's enforced by referential integrity, like foreign keys and primary keys. 19 00:01:28,420 --> 00:01:33,780 ‫So you ensure consistency in your data, and it's also ensured by atomicity and isolation. 20 00:01:34,500 --> 00:01:36,480 ‫We saw that with number one.
21 00:01:36,480 --> 00:01:38,890 ‫We got the blue screen of death in the middle, right. 22 00:01:39,150 --> 00:01:41,190 ‫We just lost my data. 23 00:01:41,310 --> 00:01:41,570 ‫Right. 24 00:01:41,700 --> 00:01:46,290 ‫It just went away and I got an inconsistent balance sheet. 25 00:01:46,410 --> 00:01:52,980 ‫As a result, we got nine hundred and five hundred, where a hundred just went away into thin air. 26 00:01:53,400 --> 00:01:59,110 ‫So a total lack of atomicity leads to inconsistency in your data. 27 00:01:59,560 --> 00:01:59,790 ‫OK. 28 00:02:00,690 --> 00:02:01,650 ‫Isolation. 29 00:02:01,680 --> 00:02:03,130 ‫Do I need to say anything? 30 00:02:03,600 --> 00:02:10,170 ‫We saw all that stuff with isolation, right, giving all these false reports as a result of lack 31 00:02:10,170 --> 00:02:11,220 ‫of isolation. 32 00:02:11,220 --> 00:02:17,480 ‫Right, the moment people are changing my product sales while I'm reading this data, 33 00:02:17,730 --> 00:02:19,400 ‫I'm going to get a bad result. 34 00:02:19,410 --> 00:02:26,430 ‫I'm going to get an inconsistent result, although in my product sales the sum is one thirty. 35 00:02:26,760 --> 00:02:27,570 ‫I can see it. 36 00:02:27,570 --> 00:02:28,290 ‫I can sum it up. 37 00:02:28,470 --> 00:02:31,920 ‫But the actual total is saying one fifty five. That's bad. 38 00:02:33,030 --> 00:02:36,330 ‫So a lack of isolation leads to inconsistency. 39 00:02:36,510 --> 00:02:41,680 ‫It's up to you as a user if you can handle it, if you're okay with that inconsistency or not. 40 00:02:42,750 --> 00:02:45,240 ‫Sometimes it leads to corruption, inconsistency. 41 00:02:45,240 --> 00:02:52,110 ‫Like a bank transaction: you probably cannot be happy with that. Product 42 00:02:52,110 --> 00:02:52,550 ‫sales? 43 00:02:52,830 --> 00:02:53,660 ‫Probably not. 44 00:02:53,670 --> 00:02:54,360 ‫That's that.
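Note: the atomicity point above (a crash in the middle of a transfer leaving nine hundred and five hundred, with a hundred gone) can be sketched in a few lines. This is only a minimal illustration using Python's built-in sqlite3 module. The table name `accounts`, the starting balances of 1000 and 500, and the helpers `make_db`, `transfer`, and `total` are assumptions made for the example, not something shown in the lecture.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    # Assumed starting balances: 1000 and 500, so the total is 1500.
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 1000), (2, 500)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount, crash_midway=False):
    """Move money between accounts inside one transaction."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if the block raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            if crash_midway:
                # Stand-in for the crash between the debit and the credit.
                raise RuntimeError("simulated crash between debit and credit")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
    except RuntimeError:
        pass  # the debit was rolled back, so nothing was lost


def total(conn):
    """Sum of all balances; this is the invariant the user cares about."""
    return conn.execute("SELECT SUM(balance) FROM accounts").fetchone()[0]


if __name__ == "__main__":
    db = make_db()
    transfer(db, 1, 2, 100, crash_midway=True)
    print(total(db))  # the total is unchanged because the debit rolled back
```

Because the debit and the credit sit in one transaction, the simulated crash rolls the debit back and the sum of the balances stays the same. Without the transaction, the debit alone would survive and a hundred would disappear.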
45 00:02:55,680 --> 00:03:01,200 ‫Pictures and likes: we're going to show that example. This is an example, right, where 46 00:03:01,200 --> 00:03:04,140 ‫inconsistency in my data is really not a big deal. 47 00:03:04,500 --> 00:03:06,440 ‫Like, we have two pictures here. 48 00:03:06,780 --> 00:03:07,950 ‫There's a blob of the picture. 49 00:03:07,980 --> 00:03:10,260 ‫Let's say this is an Instagram implementation. 50 00:03:10,890 --> 00:03:16,250 ‫We have a blob of the picture and we have the number of likes. Hussein, 51 00:03:16,300 --> 00:03:19,350 ‫you might say, hey, why are you adding a field called likes? 52 00:03:19,410 --> 00:03:20,790 ‫What is that? 53 00:03:20,790 --> 00:03:23,740 ‫Why don't you just query this table and just get the number of likes? 54 00:03:23,850 --> 00:03:27,480 ‫Performance is the best short answer for this. 55 00:03:28,050 --> 00:03:28,310 ‫Right. 56 00:03:28,350 --> 00:03:35,730 ‫So you would add a likes field here, which will contain the total number of likes that 57 00:03:35,730 --> 00:03:36,590 ‫this picture got. 58 00:03:37,050 --> 00:03:42,870 ‫And there is another table called picture likes, which includes, like, hey, John liked picture number 59 00:03:42,870 --> 00:03:45,910 ‫one, Edman liked picture number one, John liked picture number two. 60 00:03:45,930 --> 00:03:48,480 ‫So if you sum this, right, 61 00:03:49,720 --> 00:03:55,330 ‫picture number one got two likes, from John and Edman, that's correct, and then picture two got one like, 62 00:03:55,330 --> 00:03:56,260 ‫this one from John. 63 00:03:56,440 --> 00:03:57,970 ‫So that's a consistent view. 64 00:03:58,360 --> 00:04:04,900 ‫But if you saw that, for example, this is four and this view doesn't reflect that, that's an inconsistent 65 00:04:04,900 --> 00:04:05,160 ‫view. 66 00:04:05,500 --> 00:04:09,820 ‫It's up to you if you're happy with that inconsistency or not.
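Note: the pictures and likes example above can be checked mechanically. What follows is a rough sketch, again with Python's sqlite3 module, of the two tables described and a query that reports pictures whose stored counter disagrees with the rows in the likes table. The table and column names (`pictures`, `picture_likes`, `image`, `user_name`) and the helper names are assumptions for illustration. The lecture only describes the tables verbally.

```python
import sqlite3


def make_db():
    """Build the two tables from the example with the data described."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE pictures (
            id    INTEGER PRIMARY KEY,
            image BLOB,
            likes INTEGER NOT NULL DEFAULT 0  -- denormalized counter, kept for performance
        );
        CREATE TABLE picture_likes (
            user_name  TEXT NOT NULL,
            picture_id INTEGER NOT NULL REFERENCES pictures(id),
            PRIMARY KEY (user_name, picture_id)
        );
        INSERT INTO pictures (id, likes) VALUES (1, 2), (2, 1);
        INSERT INTO picture_likes VALUES ('John', 1), ('Edman', 1), ('John', 2);
        """
    )
    return conn


def inconsistent_pictures(conn):
    """Return (picture id, stored counter, actual like rows) where they differ."""
    return conn.execute(
        """
        SELECT p.id, p.likes, COUNT(l.user_name)
        FROM pictures AS p
        LEFT JOIN picture_likes AS l ON l.picture_id = p.id
        GROUP BY p.id, p.likes
        HAVING p.likes <> COUNT(l.user_name)
        ORDER BY p.id
        """
    ).fetchall()


if __name__ == "__main__":
    db = make_db()
    print(inconsistent_pictures(db))  # nothing to report while the views agree
    # Someone bumps the counter without adding rows to picture_likes.
    db.execute("UPDATE pictures SET likes = 4 WHERE id = 1")
    print(inconsistent_pictures(db))  # picture 1 now shows 4 stored versus 2 actual
```

Whether a mismatch like this is acceptable is exactly the trade-off discussed here: the counter is cheap to read, and the price is that it can drift from the detail rows.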
67 00:04:10,060 --> 00:04:15,910 ‫And that's a very critical question, because based on that, you can adjust the performance, you can 68 00:04:15,910 --> 00:04:23,920 ‫adjust the consistency, you can adjust scalability, you can adjust isolation, based on what you can 69 00:04:23,920 --> 00:04:32,410 ‫give up. As an engineer, as a software engineer, you guys really need to think about every single aspect 70 00:04:32,410 --> 00:04:32,800 ‫of that. 71 00:04:33,430 --> 00:04:34,380 ‫What can you give up? 72 00:04:35,200 --> 00:04:39,850 ‫That's why you have to understand the requirements and then ask yourself, what can you give up? 73 00:04:40,130 --> 00:04:48,250 ‫You're telling me that on Instagram, if Kylie Jenner gets, like, five million likes on a picture, five million 74 00:04:48,250 --> 00:04:55,030 ‫and thirty two likes, and I clicked on that to view the five million, are you telling me that they're 75 00:04:55,030 --> 00:04:58,450 ‫going to match? I'm betting everything you want: 76 00:04:58,450 --> 00:05:06,280 ‫no matching. It is impossible to get a consistent view on this, and they don't care, because 77 00:05:06,280 --> 00:05:13,180 ‫nobody is going to fiddle through five million users anyway, so they can give you an approximate number 78 00:05:13,270 --> 00:05:14,590 ‫and they can be off. 79 00:05:14,770 --> 00:05:18,550 ‫Those two views can be off by even a hundred thousand. 80 00:05:19,450 --> 00:05:21,340 ‫YouTube has the same with subscribers. 81 00:05:21,340 --> 00:05:21,630 ‫Right. 82 00:05:21,850 --> 00:05:28,710 ‫Why do you think YouTube is showing you, like, my subscribers as 6K or 6.2K? 83 00:05:28,960 --> 00:05:36,910 ‫Because they cannot guarantee you that I have exactly six thousand thirty two subscribers, OK, or 84 00:05:36,910 --> 00:05:39,550 ‫twenty three. Or even PewDiePie. 85 00:05:39,550 --> 00:05:39,760 ‫Right. 86 00:05:39,770 --> 00:05:43,810 ‫He has, like, right now, almost ninety five million subscribers.
87 00:05:44,350 --> 00:05:52,120 ‫That number is not exactly accurate to the actual number of subscribers that he actually got. 88 00:05:52,120 --> 00:05:54,760 ‫It's an approximation because they cannot. 89 00:05:55,030 --> 00:06:00,910 ‫They decided that it's not important to maintain the consistency of this thing. Because why? 90 00:06:00,910 --> 00:06:05,080 ‫First, because nobody's going through ninety nine million subscribers. 91 00:06:05,230 --> 00:06:09,730 ‫So they are preferring performance over this consistency. 92 00:06:09,730 --> 00:06:10,420 ‫Sorry about that, guys. 93 00:06:10,870 --> 00:06:14,560 ‫I'm going off on a tangent, but as I said, it's critical to this. 94 00:06:15,220 --> 00:06:18,580 ‫Let's talk about the other concept. This was consistency in data. 95 00:06:18,730 --> 00:06:20,860 ‫Sometimes it matters, sometimes it doesn't. 96 00:06:21,280 --> 00:06:24,460 ‫Consistency in reads, same thing. 97 00:06:24,820 --> 00:06:25,880 ‫Sometimes it doesn't, 98 00:06:25,880 --> 00:06:26,820 ‫it doesn't matter. 99 00:06:26,830 --> 00:06:27,930 ‫So we'll think about that. 100 00:06:28,720 --> 00:06:32,890 ‫Now, consistency in reads is very interesting. 101 00:06:34,000 --> 00:06:34,390 ‫And. 102 00:06:35,950 --> 00:06:41,830 ‫What it means is, like, if I update something, if a transaction updates something in the database, I 103 00:06:41,840 --> 00:06:47,320 ‫say I've added a value X and that X gets persisted, and then another transaction starts reading that, 104 00:06:47,680 --> 00:06:49,180 ‫it better get the value X. 105 00:06:50,230 --> 00:06:51,710 ‫Hussein, what are you saying? 106 00:06:51,730 --> 00:07:00,850 ‫That's obvious stuff, right? A read obviously has to give the same value, but not necessarily. 107 00:07:01,240 --> 00:07:05,520 ‫We're going to talk about the concept of eventual consistency here. 108 00:07:06,670 --> 00:07:07,880 ‫So that's consistency in reads.
109 00:07:07,910 --> 00:07:11,780 ‫If I committed something, can new transactions see it immediately? 110 00:07:12,550 --> 00:07:14,790 ‫OK, and here's the thing. 111 00:07:15,160 --> 00:07:24,200 ‫This type of inconsistency is there on both relational and NoSQL databases. 112 00:07:24,200 --> 00:07:32,950 ‫It's a big statement, but both kinds of databases suffer from this inconsistency, including Oracle, including 113 00:07:32,950 --> 00:07:34,780 ‫Postgres, including MySQL. 114 00:07:35,620 --> 00:07:38,600 ‫Those databases are not consistent in reads. 115 00:07:38,620 --> 00:07:39,910 ‫And let me explain here. 116 00:07:40,630 --> 00:07:46,800 ‫When you have one server and you commit something to it and you read from that one server, 117 00:07:47,020 --> 00:07:48,910 ‫life is good, right? 118 00:07:49,150 --> 00:07:50,260 ‫Life is perfect. 119 00:07:51,960 --> 00:07:58,140 ‫Because you have one freaking server. But the moment you start adding other servers, let's look at what 120 00:07:58,170 --> 00:07:59,700 ‫this is with MySQL. 121 00:08:00,270 --> 00:08:05,460 ‫What you're going to do, essentially, is you're going to have one server and you create a replica of 122 00:08:05,460 --> 00:08:06,090 ‫that server. 123 00:08:06,210 --> 00:08:12,860 ‫And that is, like, a one-way replica, where the server is going to start pumping data to the replica. 124 00:08:12,870 --> 00:08:13,180 ‫Right. 125 00:08:13,770 --> 00:08:21,840 ‫And now, obviously, you need to add multiple replicas for horizontal scalability, and you cannot serve 126 00:08:21,840 --> 00:08:25,470 ‫seven million people off one database. 127 00:08:25,490 --> 00:08:25,840 ‫Right.
128 00:08:26,610 --> 00:08:32,640 ‫YouTube started with one database and they scaled it up to multiple, and they implemented 129 00:08:32,640 --> 00:08:39,480 ‫Vitess, which is this new fancy stuff on top of MySQL that gives you, like, sharding on the 130 00:08:39,480 --> 00:08:40,650 ‫fly, sharding stuff. 131 00:08:40,650 --> 00:08:43,410 ‫We're going to talk about it in another video. But essentially, 132 00:08:44,750 --> 00:08:52,510 ‫the moment you break things up into these replicas, into, essentially, the other databases, the 133 00:08:52,520 --> 00:08:54,140 ‫secondary databases, 134 00:08:57,120 --> 00:09:03,410 ‫leader and follower nodes, right, the moment you start doing that, you are inconsistent, 135 00:09:03,420 --> 00:09:10,860 ‫sir. You will become inconsistent because you write to the primary node and someone else reads from 136 00:09:10,860 --> 00:09:11,940 ‫the secondary node. 137 00:09:12,030 --> 00:09:16,280 ‫The secondary takes time to get the value propagated. 138 00:09:16,290 --> 00:09:18,120 ‫There is networking going on. 139 00:09:18,130 --> 00:09:23,080 ‫There are delays, latency, until the secondary node gets the new value. 140 00:09:23,310 --> 00:09:26,880 ‫So you're going to get an old value, my friend. 141 00:09:27,120 --> 00:09:28,890 ‫You're going to get an old value. 142 00:09:29,220 --> 00:09:29,510 ‫Right. 143 00:09:29,940 --> 00:09:32,100 ‫And that is inconsistency. 144 00:09:32,910 --> 00:09:33,330 ‫So. 145 00:09:35,570 --> 00:09:40,770 ‫This problem, and a lot of people get this wrong: inconsistency. 146 00:09:40,790 --> 00:09:48,440 ‫So relational databases are inconsistent in reads, right, the moment you start breaking them up for horizontal 147 00:09:48,440 --> 00:09:52,240 ‫scalability, into multiple servers. You are going to get inconsistency, right? 148 00:09:52,550 --> 00:09:56,690 ‫So they are consistent when there is one big nice server. 149 00:09:56,930 --> 00:09:57,220 ‫Right.
150 00:09:57,350 --> 00:10:01,940 ‫The moment you break them up, they are inconsistent in reads. You're going to write it, but someone will 151 00:10:01,940 --> 00:10:03,470 ‫get to read an old value. 152 00:10:03,770 --> 00:10:05,030 ‫And it's up to you. 153 00:10:05,030 --> 00:10:12,770 ‫Now, as an engineer, are you happy with this old value? If you get a slightly older subscriber count, 154 00:10:13,010 --> 00:10:13,520 ‫OK. 155 00:10:14,810 --> 00:10:22,190 ‫That's completely fine. Or your video, the latest view count, and that's what YouTube does, right? There 156 00:10:22,190 --> 00:10:28,700 ‫are many servers and all these writes going to multiple databases, and then they eventually sync back 157 00:10:28,700 --> 00:10:31,400 ‫into one big server, one big database. 158 00:10:31,410 --> 00:10:31,670 ‫Right. 159 00:10:32,060 --> 00:10:38,930 ‫And then if I'm reading that view count versus someone from Germany versus someone from Japan 160 00:10:38,930 --> 00:10:44,000 ‫reading that same video, we're going to get different results because we're reading different replicas. 161 00:10:44,030 --> 00:10:46,750 ‫We're reading from different follower nodes. 162 00:10:47,120 --> 00:10:48,080 ‫And that's OK. 163 00:10:48,630 --> 00:10:49,000 ‫Right. 164 00:10:49,190 --> 00:10:50,450 ‫Sometimes that's OK. 165 00:10:50,450 --> 00:10:51,500 ‫Sometimes it's not. 166 00:10:51,980 --> 00:10:52,700 ‫So now. 167 00:10:53,670 --> 00:10:59,220 ‫There are a bunch of people who said, you know what, we don't want to enforce that thing. All 168 00:10:59,220 --> 00:11:02,930 ‫this stuff you guys are doing is silly, this ACID thing is silly. 169 00:11:03,480 --> 00:11:10,140 ‫And the fact is that scaling relational databases has always been hard. 170 00:11:10,530 --> 00:11:17,430 ‫Right, because it was designed in the 70s to be one big beefy machine, one big beefy server. 171 00:11:17,730 --> 00:11:19,250 ‫The database is there and that's it. 172 00:11:19,470 --> 00:11:19,720 ‫Right.
173 00:11:19,890 --> 00:11:25,830 ‫In this era, those guys said, OK, you know what, I'm going to give up consistency because you guys 174 00:11:26,080 --> 00:11:28,110 ‫already don't have consistency. 175 00:11:28,170 --> 00:11:28,470 ‫Right? 176 00:11:28,740 --> 00:11:33,630 ‫I'm going to relax a little bit of these four properties and I'm going to give you 177 00:11:35,180 --> 00:11:38,060 ‫better scalability and performance. 178 00:11:39,110 --> 00:11:44,750 ‫So I'm going to scale horizontally, I'm going to add a bunch of other servers on the sides, and then 179 00:11:44,750 --> 00:11:51,070 ‫we can start sharding, doing all these things in a distributed manner. 180 00:11:51,800 --> 00:11:54,950 ‫So that's essentially what they did. 181 00:11:54,950 --> 00:11:58,100 ‫And thus the NoSQL databases came into the picture. 182 00:11:58,490 --> 00:12:08,120 ‫So they gave up consistency in favor of scalability, which in relational databases is really, 183 00:12:08,120 --> 00:12:13,220 ‫really hard to do. In relational databases, you have to implement the follower node and leader node, 184 00:12:13,220 --> 00:12:16,130 ‫and they start replicating, pumping changes. 185 00:12:16,790 --> 00:12:22,070 ‫And the second one gives you scalability. 186 00:12:22,070 --> 00:12:24,800 ‫You can just add nodes and it will scale nicely. 187 00:12:25,550 --> 00:12:33,320 ‫But if you're looking for isolation and all that fancy stuff, you're not going to get it. 188 00:12:33,530 --> 00:12:35,480 ‫Obviously, right? Now, 189 00:12:35,540 --> 00:12:37,970 ‫you're not going to get all these properties out of this. 190 00:12:38,150 --> 00:12:44,300 ‫You will get this new concept called eventual consistency, which to me is just a marketing 191 00:12:44,300 --> 00:12:46,790 ‫term. Again, everything is eventually consistent.
192 00:12:48,320 --> 00:12:53,630 ‫So it's like, if you read the value, you're going to get an old value. If you wait a little bit, you're going 193 00:12:53,630 --> 00:12:56,570 ‫to get the latest value, essentially. 194 00:12:56,570 --> 00:12:56,840 ‫Right. 195 00:12:56,870 --> 00:12:59,630 ‫So to me, eventual consistency: 196 00:12:59,630 --> 00:13:06,020 ‫Both the relational databases and non-relational databases suffer from this eventual consistency.
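Note: the stale read described above (write to the primary, read from a follower before the change has propagated, then read again after it catches up) can be imitated with a toy model. This is a deliberately simplified sketch in plain Python and not how MySQL, Postgres, or any NoSQL store actually replicates. The class names `Primary` and `Replica`, the method `catch_up`, and the subscriber numbers are all made up for the illustration.

```python
class Primary:
    """Toy leader node: applies writes locally and records them in a log."""

    def __init__(self):
        self.data = {}
        self.log = []  # ordered (key, value) changes waiting to be shipped

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))


class Replica:
    """Toy follower node: only sees changes after it has caught up."""

    def __init__(self, primary):
        self.primary = primary
        self.data = {}
        self.applied = 0  # number of log entries already applied here

    def read(self, key):
        # Reads are served from local state, which may lag the primary.
        return self.data.get(key)

    def catch_up(self):
        # Stand-in for the network delay: nothing arrives until this runs.
        for key, value in self.primary.log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.primary.log)


if __name__ == "__main__":
    primary = Primary()
    follower = Replica(primary)

    primary.write("subscribers", 6032)
    follower.catch_up()

    primary.write("subscribers", 6033)
    print(follower.read("subscribers"))  # still the old value: a stale read
    follower.catch_up()
    print(follower.read("subscribers"))  # now the latest value
```

The gap between the write and `catch_up` is the window in which a reader on the follower sees an old value. Waiting "a little bit" in the lecture's terms corresponds to the follower applying the remaining log entries.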