1 00:00:00,060 --> 00:00:04,860 We talked about atomicity, we talked about isolation. 2 00:00:05,400 --> 00:00:09,930 We talked about what a transaction really is, which is the most important thing to understand 3 00:00:09,930 --> 00:00:11,970 before jumping into these properties. 4 00:00:12,000 --> 00:00:20,190 And now we really need to talk about consistency, which is one of the properties that were 5 00:00:20,280 --> 00:00:24,240 traded off in different platforms. 6 00:00:24,780 --> 00:00:29,640 When I say different database platforms, I mean NoSQL versus relational versus graph. Consistency 7 00:00:29,640 --> 00:00:31,080 played a big role there. 8 00:00:31,320 --> 00:00:38,290 So some databases sacrifice consistency for speed, performance, and scalability. 9 00:00:38,310 --> 00:00:41,070 I'm not going to talk about all that stuff, but consistency, really: 10 00:00:41,500 --> 00:00:44,640 what do we mean by consistency? 11 00:00:44,970 --> 00:00:48,420 You have to understand that consistency 12 00:00:49,560 --> 00:00:57,990 is really twofold. There is what I call consistency in data, which represents the state 13 00:00:58,260 --> 00:01:01,140 of the data that is currently persisted. 14 00:01:01,560 --> 00:01:06,060 OK, so what do you actually have on disk? Is what you have on disk 15 00:01:06,540 --> 00:01:12,060 consistent with the data model that you have? And we're going to talk about what that 16 00:01:12,060 --> 00:01:12,390 means. 17 00:01:12,810 --> 00:01:15,570 And there is another consistency, 18 00:01:16,050 --> 00:01:17,850 which I call consistency in reads. 19 00:01:18,060 --> 00:01:27,120 So your data might be consistent on disk, but the reading of the data becomes inconsistent because 20 00:01:27,120 --> 00:01:31,350 you have multiple instances and they are slightly out of sync.
21 00:01:31,590 --> 00:01:34,830 So this applies to the system as a whole. 22 00:01:35,220 --> 00:01:40,140 So if you have many shards or many partitions or many, many database instances, 23 00:01:40,530 --> 00:01:44,730 right, this applies to the system as a whole. As a user, 24 00:01:44,730 --> 00:01:46,710 I don't care what database instances I have. 25 00:01:46,980 --> 00:01:49,320 I'm just issuing a read, and that read better 26 00:01:49,320 --> 00:01:56,340 be consistent. And consistency in data is within a given cluster, a given database instance: 27 00:01:56,760 --> 00:01:57,960 what is going on 28 00:01:57,960 --> 00:02:01,770 there? Is my data even consistent at that level, right? 29 00:02:02,430 --> 00:02:04,440 Let's take some examples. 30 00:02:05,100 --> 00:02:06,660 Let's talk about consistency in data. 31 00:02:06,720 --> 00:02:08,490 This is really defined by the user. 32 00:02:09,210 --> 00:02:12,870 This is something that the user defines, and by the user 33 00:02:12,880 --> 00:02:19,350 I mean the DBA, really, or whoever builds out the data model and the database, right? 34 00:02:19,830 --> 00:02:28,260 And most of the time it really comes down to enforcing referential integrity and foreign keys 35 00:02:28,260 --> 00:02:29,370 in the database, right? 36 00:02:30,060 --> 00:02:37,110 I know some people coming from NoSQL don't have this concept, but believe it or not, you do have referential 37 00:02:37,110 --> 00:02:37,710 integrity. 38 00:02:37,710 --> 00:02:43,770 Whether you like it or not, you might have a document here that is referring to another document, and 39 00:02:43,770 --> 00:02:45,750 the moment there is a reference, 40 00:02:46,170 --> 00:02:48,690 there must be integrity in this reference.
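The point about documents referring to other documents can be sketched in a few lines. This is only an illustration: the document store is a plain Python dict, and the ids, the `author` field, and the `dangling_references` helper are all made-up names, not any real NoSQL API.

```python
def dangling_references(documents, ref_field):
    """Return (doc_id, missing_target_id) pairs for references that point nowhere."""
    return [
        (doc_id, doc[ref_field])
        for doc_id, doc in documents.items()
        if doc.get(ref_field) is not None and doc[ref_field] not in documents
    ]


# Hypothetical document store: a plain dict keyed by document id.
documents = {
    "user:1": {"name": "John"},
    "post:1": {"title": "Hello", "author": "user:1"},   # valid reference
    "post:2": {"title": "Orphan", "author": "user:7"},  # user:7 does not exist
}

broken = dangling_references(documents, "author")
print(broken)  # [('post:2', 'user:7')]
```

Nothing in the store itself stops `post:2` from being saved, so the check has to live somewhere, either in the database or in the application.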
41 00:02:48,850 --> 00:02:56,460 I'll show an example that clarifies what that really means. What also ensures consistency 42 00:02:56,460 --> 00:02:58,980 in the data is, I guess, atomicity. 43 00:02:59,490 --> 00:03:05,940 If we had a crash while we debited one account, and then restarted the database and that account remains 44 00:03:05,940 --> 00:03:10,920 debited, with the transfer only half done, right, we just lost $100 into thin air. 45 00:03:10,950 --> 00:03:13,050 That's an inconsistency in data. 46 00:03:13,440 --> 00:03:15,630 The data is just persisted wrong. 47 00:03:15,930 --> 00:03:17,220 It's corrupt. 48 00:03:17,700 --> 00:03:20,790 As we call it in general, this is corruption. 49 00:03:20,790 --> 00:03:23,130 You effectively corrupted your database. 50 00:03:24,000 --> 00:03:25,290 That's pretty bad. 51 00:03:25,980 --> 00:03:32,460 Isolation can also result in inconsistency of data, because you could... 52 00:03:33,060 --> 00:03:37,980 And we talked about that in the isolation lecture, where you can issue a read and you get a bunch of 53 00:03:37,980 --> 00:03:39,420 data back, right? 54 00:03:39,840 --> 00:03:47,040 But then you issue another read and you get different results based on your isolation level, and that 55 00:03:47,040 --> 00:03:49,620 leads to inconsistent results. 56 00:03:50,040 --> 00:03:57,000 So, yeah, your data might be correct, but it is changing, constantly changing. 57 00:03:57,360 --> 00:04:04,950 And this change is giving you a different view, and as a result your view of the data becomes 58 00:04:04,950 --> 00:04:05,340 inconsistent. 59 00:04:06,830 --> 00:04:07,610 Here's an example.
60 00:04:07,970 --> 00:04:13,400 Let's assume you're building some sort of Instagram data model here. So you have the pictures table: 61 00:04:13,730 --> 00:04:20,420 you have an ID, a blob of the picture, a binary large object, and the number of likes that this 62 00:04:20,660 --> 00:04:21,440 picture got. 63 00:04:22,310 --> 00:04:29,510 And then we have a tracking table that says, OK, John liked picture one, and Edmond also liked 64 00:04:29,520 --> 00:04:32,420 picture one, and John also liked picture two. 65 00:04:32,480 --> 00:04:38,210 So you have a certain referential integrity here. 66 00:04:38,210 --> 00:04:38,810 What is that? 67 00:04:39,110 --> 00:04:45,260 This is defined by you as the DBA, or whoever built this, because the number of likes here better 68 00:04:45,260 --> 00:04:48,050 be equal to the count of these rows. 69 00:04:48,320 --> 00:04:48,620 Right? 70 00:04:48,710 --> 00:04:55,640 Because if you do a SELECT COUNT from this table where the picture ID equals one, I better get two likes, 71 00:04:55,640 --> 00:04:56,730 and these should be equal. 72 00:04:57,200 --> 00:04:59,360 This is basically consistency in data. 73 00:04:59,360 --> 00:05:03,050 If this is inconsistent, then bad things can happen. 74 00:05:03,530 --> 00:05:06,890 Now, who cares if the number of likes is out of sync? 75 00:05:07,190 --> 00:05:09,980 People give that up because, really, nobody cares. 76 00:05:09,980 --> 00:05:11,110 Nobody's going to browse 77 00:05:11,120 --> 00:05:14,860 the 1.8 million likes on a Kylie Jenner picture. 78 00:05:15,170 --> 00:05:17,500 Nobody's going to go there, but you get the point. 79 00:05:17,510 --> 00:05:18,980 This is one example. 80 00:05:20,450 --> 00:05:26,720 Another example is if you have an entry here pointing to a picture that doesn't exist anymore. 81 00:05:27,050 --> 00:05:27,410 Right?
82 00:05:27,770 --> 00:05:30,170 That's also an inconsistency, right? 83 00:05:30,440 --> 00:05:37,520 And this is what we really mean when we talk about consistency in ACID transactions, right? 84 00:05:37,550 --> 00:05:42,740 We're talking about the ACID consistency here. 85 00:05:43,550 --> 00:05:51,680 So I'll give you a few moments here to spot the inconsistencies in this data 86 00:05:51,680 --> 00:05:52,040 model. 87 00:05:53,000 --> 00:05:53,510 Take a look. 88 00:06:00,940 --> 00:06:07,480 So you might have already spotted that picture one has five likes, but if you actually query the likes table, 89 00:06:07,900 --> 00:06:13,480 you can see that picture one only got two likes, so something got out of sync. 90 00:06:13,840 --> 00:06:16,600 This is inconsistent referential integrity. 91 00:06:16,810 --> 00:06:22,960 Another referential integrity inconsistency is that you have Edmond liking picture four. 92 00:06:23,380 --> 00:06:29,440 But if you actually want to get the actual picture from the pictures table, picture four doesn't even 93 00:06:29,440 --> 00:06:29,890 exist. 94 00:06:30,580 --> 00:06:33,190 So that's also an inconsistent result. 95 00:06:33,400 --> 00:06:40,510 So someone deleted this picture, but they didn't have a cascading delete to remove the picture's likes, 96 00:06:40,660 --> 00:06:44,350 effectively leaving those orphaned. 97 00:06:44,820 --> 00:06:45,010 All right. 98 00:06:45,280 --> 00:06:50,290 And you can have the database enforce this constraint, or you can enforce it at the application level. 99 00:06:50,290 --> 00:06:51,040 It's really up to you. 100 00:06:51,040 --> 00:06:53,170 But this is what we mean by inconsistency here. 101 00:06:54,460 --> 00:06:55,570 Is that all there is? 102 00:06:55,720 --> 00:06:56,410 No, not really. 103 00:06:56,770 --> 00:07:01,800 Consistency in reads is another thing that can go wrong, right?
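Before moving on to reads, the two problems spotted in the data model can be found with a couple of queries. This is a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`pictures`, `picture_likes`, `user_name`) are assumptions, the blob column is left out, and the stored count of 1 for picture two is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No foreign key is declared here, so nothing stops the bad rows from being inserted.
conn.executescript("""
    CREATE TABLE pictures (id INTEGER PRIMARY KEY, likes INTEGER NOT NULL);
    CREATE TABLE picture_likes (user_name TEXT NOT NULL, picture_id INTEGER NOT NULL);
    INSERT INTO pictures (id, likes) VALUES (1, 5), (2, 1);
    INSERT INTO picture_likes (user_name, picture_id) VALUES
        ('John', 1), ('Edmond', 1), ('John', 2), ('Edmond', 4);
""")

# Pictures whose stored counter disagrees with the rows in the tracking table.
count_mismatches = conn.execute("""
    SELECT p.id, p.likes, COUNT(l.picture_id)
    FROM pictures p LEFT JOIN picture_likes l ON l.picture_id = p.id
    GROUP BY p.id, p.likes
    HAVING p.likes <> COUNT(l.picture_id)
    ORDER BY p.id
""").fetchall()

# Likes that point at a picture that no longer exists.
orphaned_likes = conn.execute("""
    SELECT l.user_name, l.picture_id
    FROM picture_likes l LEFT JOIN pictures p ON p.id = l.picture_id
    WHERE p.id IS NULL
""").fetchall()

print(count_mismatches)  # [(1, 5, 2)]
print(orphaned_likes)    # [('Edmond', 4)]
```

To let the database prevent the orphan instead of detecting it afterwards, one option in SQLite is to declare `picture_id` as a foreign key with `ON DELETE CASCADE` and run `PRAGMA foreign_keys = ON` on each connection. The like counter still has to be kept in sync by a trigger or by the application.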
104 00:07:02,590 --> 00:07:03,340 Let's take an example. 105 00:07:04,840 --> 00:07:05,860 You have a database 106 00:07:05,860 --> 00:07:05,980 system, 107 00:07:07,510 --> 00:07:13,030 and you can look at this as a single database or multiple, it doesn't matter, but assume you're 108 00:07:13,030 --> 00:07:17,350 talking to one reverse proxy, effectively, that proxies the request to the database. 109 00:07:18,580 --> 00:07:28,450 So you're updating a value X, and now the database persists the value X, and the next read after 110 00:07:28,450 --> 00:07:31,960 this commit must give you the value X. 111 00:07:32,290 --> 00:07:37,360 That's what we mean by consistency in reads. You might say that's just obvious. 112 00:07:38,210 --> 00:07:39,340 Well, not all the time. 113 00:07:39,820 --> 00:07:40,720 Let's take an example. 114 00:07:42,340 --> 00:07:47,650 If a transaction committed a change, will a new transaction immediately see the change? 115 00:07:47,860 --> 00:07:51,190 That's what it really boils down to, right? 116 00:07:51,490 --> 00:07:58,210 If I committed a change to a database and then I read it immediately and I don't see that change, 117 00:07:58,480 --> 00:08:05,230 then you are inconsistent, effectively, because you just read something that is not really the value that 118 00:08:05,230 --> 00:08:07,180 you are supposed to get. 119 00:08:07,810 --> 00:08:09,340 And it affects the system as a whole. 120 00:08:09,640 --> 00:08:18,280 An example that can get you is when you have a master, or main, or primary 121 00:08:18,520 --> 00:08:23,110 database, and you have multiple worker databases, right? 122 00:08:23,110 --> 00:08:25,390 And you're syncing the changes to them. 123 00:08:25,490 --> 00:08:27,640 I have an entire section talking about replication, 124 00:08:27,650 --> 00:08:30,250 so check that out when you finish with this lecture.
125 00:08:31,690 --> 00:08:37,090 But effectively, as you write to the primary, the primary will sync the changes to the replica. 126 00:08:37,360 --> 00:08:43,960 And in that time, if you read from the replica, you might get an old value, and that's what we mean by 127 00:08:44,260 --> 00:08:45,640 inconsistency here. 128 00:08:47,200 --> 00:08:49,090 So that's why it affects the system as a whole. 129 00:08:49,700 --> 00:08:57,220 Both relational and NoSQL databases suffer from this kind of inconsistency, and NoSQL even suffers from 130 00:08:57,220 --> 00:08:57,910 the first kind. 131 00:08:57,940 --> 00:09:04,240 If you think about it, right, it still does have this idea of inconsistent results in the persisted 132 00:09:04,240 --> 00:09:04,690 model. 133 00:09:05,170 --> 00:09:11,140 If you don't ensure referential integrity between your, I don't know, MongoDB documents, you're 134 00:09:11,140 --> 00:09:13,070 going to get the same exact problem, right? 135 00:09:13,620 --> 00:09:13,930 Right. 136 00:09:13,960 --> 00:09:17,140 If you have users and a followers list, I don't know, 137 00:09:17,380 --> 00:09:18,250 what do you do? 138 00:09:18,640 --> 00:09:23,410 You have to ensure the referential integrity, effectively, that you set, the rules that you built. 139 00:09:24,070 --> 00:09:29,230 And then we come back to this marketing term that is called eventual consistency. 140 00:09:29,230 --> 00:09:31,720 Eventual consistency really says, 141 00:09:31,750 --> 00:09:37,030 OK, what it really means is, hey, I'm not consistent, but I'm eventually going to be 142 00:09:37,030 --> 00:09:37,600 consistent. 143 00:09:37,670 --> 00:09:37,840 I'm 144 00:09:40,560 --> 00:09:44,350 kind of torn on this term, to be honest. 145 00:09:44,560 --> 00:09:46,510 I'll give my thoughts in another lecture.
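The stale read from a lagging replica can be shown with a toy model. This is a sketch only: `Primary`, `Replica`, and the `pending` queue are invented for illustration and stand in for whatever replication mechanism a real database uses.

```python
from collections import deque


class Replica:
    def __init__(self):
        self.data = {}


class Primary:
    """Toy primary that ships committed writes to its replicas later."""

    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self.pending = deque()  # committed changes not yet applied on the replicas

    def write(self, key, value):
        self.data[key] = value             # commit on the primary
        self.pending.append((key, value))  # replication happens later

    def replicate(self):
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica.data[key] = value


replica = Replica()
primary = Primary([replica])
primary.write("x", "old")
primary.replicate()

primary.write("x", "new")
stale = replica.data.get("x")  # read hits the replica before it caught up
primary.replicate()
fresh = replica.data.get("x")  # same read after replication finished
print(stale, fresh)  # old new
```

The write was committed on the primary the whole time. Only the read path was behind, and it caught up on its own, which is the "eventual" part.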
146 00:09:46,510 --> 00:09:53,620 But in a nutshell, most of the things that are currently not consistent will eventually become consistent. 147 00:09:54,040 --> 00:10:01,750 But this is very important to understand, because with consistency in data, if your 148 00:10:01,750 --> 00:10:05,140 data is corrupt and referential integrity is broken, 149 00:10:05,530 --> 00:10:09,130 there is no eventual consistency coming out of that, right? 150 00:10:09,280 --> 00:10:10,810 Because, hey, the data is just corrupt. 151 00:10:10,990 --> 00:10:16,450 You have five likes, but only two records that represent the likes. 152 00:10:16,570 --> 00:10:24,340 There is no eventual consistency coming from that unless you have some sort of repair batch job 153 00:10:24,340 --> 00:10:25,630 that fixes it. 154 00:10:26,170 --> 00:10:31,330 So eventual consistency really applies only to reads, because if you continue issuing 155 00:10:31,330 --> 00:10:35,410 a read, you're going to get old values, but eventually you're going to get the correct value. 156 00:10:35,860 --> 00:10:41,380 That's what they mean by eventual consistency, and there are many terms: strong eventual consistency 157 00:10:41,380 --> 00:10:43,270 and weak eventual consistency. 158 00:10:43,510 --> 00:10:47,050 There is so much stuff you can go into with this. 159 00:10:47,050 --> 00:10:52,000 I could really make an entire course just talking about eventual consistency. 160 00:10:53,110 --> 00:10:59,050 So it is valid in certain cases, but the term was invented to kind of differentiate it from the corruption 161 00:10:59,050 --> 00:11:01,270 that can happen to your data, effectively. 162 00:11:01,750 --> 00:11:07,780 So in summary, there are two types of consistency: consistency in data, which is enforced by referential 163 00:11:07,780 --> 00:11:09,070 integrity, right?
164 00:11:09,340 --> 00:11:11,170 Things that are persisted in the data. 165 00:11:11,290 --> 00:11:17,250 And there is no eventual consistency coming out of that, because the data just becomes corrupted, 166 00:11:18,370 --> 00:11:20,380 based on what you define as corruption here. 167 00:11:20,710 --> 00:11:23,860 And the other type is consistency in reads. 168 00:11:24,070 --> 00:11:30,220 Inconsistency in reads is when you update a value and then a transaction tries to read that value 169 00:11:30,220 --> 00:11:35,650 after it was committed, and you get the old version. That's an inconsistent result, right? 170 00:11:36,280 --> 00:11:37,930 So 171 00:11:38,140 --> 00:11:45,730 most databases heal from this kind of inconsistency through eventual consistency. 172 00:11:45,730 --> 00:11:55,900 That means that as their replication process completes, you will eventually get the final 173 00:11:57,580 --> 00:12:00,820 result, effectively, from that read, right? 174 00:12:01,060 --> 00:12:06,970 And this can be enforced by something called synchronous replication, which I believe I talked about 175 00:12:07,510 --> 00:12:08,920 in this course. 176 00:12:09,130 --> 00:12:13,960 Yeah, synchronous replication versus asynchronous replication: one is slower than the other, but one 177 00:12:13,960 --> 00:12:16,170 gives you strong consistency and the other 178 00:12:16,190 --> 00:12:17,680 gives you eventual consistency. 179 00:12:18,010 --> 00:12:25,480 At the end of the day, it all comes down to what you really want, effectively. 180 00:12:25,750 --> 00:12:32,320 And eventual consistency is something both relational databases and NoSQL suffer from. 181 00:12:32,350 --> 00:12:36,670 So that was basically a summary of consistency. 182 00:12:37,330 --> 00:12:40,870 How about we jump to the final lecture: durability. 183 00:12:40,930 --> 00:12:41,470 My favorite.
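The trade-off between synchronous and asynchronous replication mentioned in the summary can be sketched as a single flag. This is an illustration under simple assumptions: the primary and replicas are plain dicts, `commit` and `drain` are made-up helpers, and the number of acknowledgements waited for is used as a rough stand-in for write latency.

```python
def commit(primary, replicas, backlog, key, value, synchronous):
    """Apply a write; return how many replica acknowledgements the writer waited for."""
    primary[key] = value
    if synchronous:
        for replica in replicas:
            replica[key] = value  # writer blocks until every replica has applied it
        return len(replicas)
    backlog.append((key, value))  # writer returns at once; replicas catch up later
    return 0


def drain(replicas, backlog):
    """Apply every queued change to every replica, oldest first."""
    while backlog:
        key, value = backlog.pop(0)
        for replica in replicas:
            replica[key] = value


primary, replicas, backlog = {}, [{}, {}], []

waited_sync = commit(primary, replicas, backlog, "x", 1, synchronous=True)
read_after_sync = replicas[0].get("x")    # already the new value

waited_async = commit(primary, replicas, backlog, "x", 2, synchronous=False)
read_after_async = replicas[0].get("x")   # still the previous value
drain(replicas, backlog)
read_after_drain = replicas[0].get("x")   # caught up

print(waited_sync, read_after_sync)                       # 2 1
print(waited_async, read_after_async, read_after_drain)   # 0 1 2
```

The synchronous write pays for two acknowledgements and every later read is current. The asynchronous write pays nothing up front and the replica read is stale until the backlog is applied.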