1 00:00:00,090 --> 00:00:11,490 ‫Hash tables are effective for caching database joins even partitioning distributed databases sharding 2 00:00:12,030 --> 00:00:14,600 ‫sets to check if something is there or not. 3 00:00:14,610 --> 00:00:16,110 ‫It's very, very effective. 4 00:00:16,110 --> 00:00:23,490 ‫Even load balancing a lot of programming languages implement that in-house in natively in the language. 5 00:00:23,490 --> 00:00:25,350 ‫You can build these hash tables. 6 00:00:25,590 --> 00:00:31,920 ‫However, what I want to discuss in this episode of the back engineering show is the limitations and 7 00:00:31,920 --> 00:00:37,890 ‫the cost of using hash tables and, you know, things that we really don't think about. 8 00:00:37,890 --> 00:00:45,450 ‫It comes back to the leaky abstractions of a there is a black box, it's fast, let me use it. 9 00:00:45,450 --> 00:00:47,700 ‫But we don't understand how it works. 10 00:00:47,700 --> 00:00:52,320 ‫And when you don't understand how something works, you can get away with it. 11 00:00:52,320 --> 00:00:57,120 ‫But sometimes when it doesn't work, that's where you really get screwed. 12 00:00:57,120 --> 00:01:01,620 ‫And, and that's where really a good engineer is. 13 00:01:01,620 --> 00:01:06,870 ‫I identify him or herself from the other engineers. 14 00:01:06,870 --> 00:01:09,780 ‫How about we jump into it and discuss this all? 15 00:01:09,780 --> 00:01:12,060 ‫Come to the back of engineering show with your host Hussein. 16 00:01:12,060 --> 00:01:20,940 ‫Arsalan this is our laid back shell discussion podcast, kind of video audio format. 17 00:01:20,940 --> 00:01:26,460 ‫If you can listen on the podcast, if you're listening on Spotify and Apple Podcasts, Google Podcasts, 18 00:01:26,460 --> 00:01:27,480 ‫you can listen on that. 19 00:01:27,660 --> 00:01:32,010 ‫And this is basically I don't I don't include graphics or I don't include anything. 20 00:01:32,010 --> 00:01:41,340 ‫It's just a long form discussion about one topic and I try to go as deep as possible to my extent of 21 00:01:41,340 --> 00:01:42,840 ‫knowledge on this topic. 22 00:01:42,840 --> 00:01:49,290 ‫To tease apart this topic, today's topic is hash tables. 23 00:01:50,520 --> 00:01:57,840 ‫In order to discuss what hash tables are, we really need to understand what arrays are. 24 00:01:57,900 --> 00:02:00,000 ‫You might say, Hussein, I don't care. 25 00:02:00,030 --> 00:02:01,450 ‫I understand what arrays are. 26 00:02:01,450 --> 00:02:02,760 ‫And do you think I'm dumb? 27 00:02:02,940 --> 00:02:06,960 ‫I know what our arrays and I know how they work. 28 00:02:08,840 --> 00:02:12,350 ‫You'll be surprised that you. 29 00:02:12,500 --> 00:02:21,740 ‫We take a lot of things for granted are raises one of them the concept of array is a consecutive slots 30 00:02:21,740 --> 00:02:25,700 ‫of memory and this is the key word memory. 31 00:02:25,700 --> 00:02:27,710 ‫You cannot put an array on disk. 32 00:02:27,740 --> 00:02:29,480 ‫It doesn't make any sense. 33 00:02:29,510 --> 00:02:29,930 ‫Right. 34 00:02:29,930 --> 00:02:34,130 ‫Well, until Ultra Ram is a thing and we can do byte address ability in the disk. 35 00:02:34,640 --> 00:02:43,310 ‫But arrays are a property of the random access memory, you know, and the power of this is because 36 00:02:43,310 --> 00:02:45,080 ‫it is consecutive. 37 00:02:47,080 --> 00:02:51,940 ‫No matter how large the array is given the index. 38 00:02:52,780 --> 00:02:56,260 ‫I can go to the first node. 39 00:02:57,430 --> 00:03:04,840 ‫The first element of that Ray or I can go to the middle one or I can go to any to the last one or any 40 00:03:04,840 --> 00:03:10,900 ‫element in almost constant cost because of one. 41 00:03:10,900 --> 00:03:17,650 ‫As the computer science guys say, it's it's so fast because of this byte address ability, because 42 00:03:17,650 --> 00:03:22,120 ‫every index is associated with a memory address. 43 00:03:22,120 --> 00:03:27,910 ‫And to get the next one is literally just add one to the memory address and we can get there. 44 00:03:27,910 --> 00:03:36,820 ‫And if you ask the CPU to fetch a value of that, whatever the the element has, whether an integer 45 00:03:36,820 --> 00:03:43,240 ‫or a string or anything, you can immediately go there and fetch the value. 46 00:03:43,240 --> 00:03:45,070 ‫And that's the power of the array. 47 00:03:45,070 --> 00:03:50,920 ‫And hash tables are nothing but a glorified arrays, to be honest. 48 00:03:50,920 --> 00:03:59,440 ‫So you can ask the CPU to fetch a value from an array in memory by giving it its index. 49 00:03:59,440 --> 00:04:07,360 ‫From the index we're going to find the first address and going to add X amount of value to find that 50 00:04:07,360 --> 00:04:08,170 ‫particular index. 51 00:04:08,170 --> 00:04:14,620 ‫So so the cost is really is an addition plus the cost to go to the memory from the CPU. 52 00:04:14,830 --> 00:04:15,190 ‫Right? 53 00:04:15,190 --> 00:04:21,700 ‫So the first one is constant addition is cheap, it is done in the CPU itself. 54 00:04:21,730 --> 00:04:28,980 ‫The other cost to actually jump to the memory is really depending on your architecture of the CPU and 55 00:04:28,990 --> 00:04:31,480 ‫we can really dive deep into this. 56 00:04:31,480 --> 00:04:34,630 ‫This is probably another episode itself. 57 00:04:34,660 --> 00:04:44,830 ‫The cost to fetch a value from the RAM to the CPU really depends on the distance from where the CPU 58 00:04:44,830 --> 00:04:45,790 ‫lives. 59 00:04:45,970 --> 00:04:46,720 ‫The memory. 60 00:04:47,050 --> 00:04:55,870 ‫You know, in multicore architecture you have multiple CPUs and you have a single memory and they compete 61 00:04:55,870 --> 00:04:56,770 ‫in the memory. 62 00:04:56,770 --> 00:05:02,080 ‫Only one CPU can access the memory at a given time. 63 00:05:02,080 --> 00:05:05,530 ‫And so there's there's some sort of a competing computing going on. 64 00:05:05,950 --> 00:05:15,850 ‫So what happens is some in some certain architecture, there will be two memory locations. 65 00:05:15,850 --> 00:05:16,450 ‫Right? 66 00:05:16,450 --> 00:05:21,010 ‫And, and in each CPU access, it's local memory. 67 00:05:21,010 --> 00:05:23,940 ‫And that's obviously becomes faster, right? 68 00:05:23,950 --> 00:05:26,290 ‫Because there is no competing anymore. 69 00:05:26,290 --> 00:05:28,140 ‫No CPU is competing on the memory. 70 00:05:28,150 --> 00:05:37,600 ‫The problem with this is now this CPU sometimes might need to access a memory all over there that is 71 00:05:37,630 --> 00:05:40,210 ‫that is available that's closer to other CPU. 72 00:05:40,510 --> 00:05:47,020 ‫And and that's what it's called a numa architecture, the non-uniform memory access, you know, so 73 00:05:47,020 --> 00:05:56,050 ‫this can be slow and this is where guys like Apple is innovating to making the cost of access in the 74 00:05:56,050 --> 00:06:02,770 ‫moment as fast as possible by having everything into a single SOC chip, which is the MX one pro. 75 00:06:03,130 --> 00:06:12,580 ‫And then they just came up with an M1 ultra which fuses to M1, you know, SOC chips such that the cost 76 00:06:12,580 --> 00:06:16,960 ‫to to jump between different memory is negligible. 77 00:06:16,960 --> 00:06:24,160 ‫They just made this great, you know, design such that you don't as a programmer, you don't really 78 00:06:24,160 --> 00:06:31,990 ‫need to, oh, if my array is all over there and I really have to execute my code on this CPU, you 79 00:06:31,990 --> 00:06:34,690 ‫know, this is the normal aware programming. 80 00:06:34,690 --> 00:06:40,030 ‫Obviously, this is a little bit of discussion we can go into other day, but. 81 00:06:40,970 --> 00:06:43,820 ‫The cost is, you can't call it constant. 82 00:06:43,970 --> 00:06:44,910 ‫It's fast. 83 00:06:44,930 --> 00:06:47,060 ‫It can be faster than certain architecture. 84 00:06:47,300 --> 00:06:48,410 ‫So that's the array. 85 00:06:49,640 --> 00:06:50,500 ‫The array. 86 00:06:50,510 --> 00:06:53,780 ‫The problem with the array is it's a it's an integer based. 87 00:06:54,140 --> 00:06:56,720 ‫So zero one, two, three, four, five. 88 00:06:56,930 --> 00:07:06,720 ‫So if you're if you can have your key as that index, you one, right. 89 00:07:07,130 --> 00:07:13,250 ‫The problem is like how many applications where the key is actually an integer and it is actually this 90 00:07:13,250 --> 00:07:16,240 ‫this particular index that's almost very little. 91 00:07:16,250 --> 00:07:16,670 ‫Right. 92 00:07:17,150 --> 00:07:24,270 ‫This is where hashing table comes to use the concept of arrays in a very intelligent manner. 93 00:07:24,290 --> 00:07:25,100 ‫What do they do? 94 00:07:26,640 --> 00:07:27,350 ‫Second example. 95 00:07:27,350 --> 00:07:28,490 ‫So I have. 96 00:07:30,150 --> 00:07:30,730 ‫I don't know. 97 00:07:30,750 --> 00:07:36,200 ‫I have an ID student, you know, and I want to find their name. 98 00:07:36,240 --> 00:07:38,970 ‫Let's say this is a key value store, you know? 99 00:07:41,310 --> 00:07:45,480 ‫The ID is obviously a large number has nothing to do with the index. 100 00:07:45,480 --> 00:07:49,770 ‫So what you do is you want giving the ID. 101 00:07:49,770 --> 00:07:52,500 ‫I want immediately to find the name of the students. 102 00:07:52,620 --> 00:08:00,420 ‫So the value is the name, the the, the, the integer, the key as the ID of the student. 103 00:08:00,450 --> 00:08:06,120 ‫So given that ID student, what do you do as you build a hash table? 104 00:08:07,410 --> 00:08:08,460 ‫What do you do? 105 00:08:08,490 --> 00:08:17,640 ‫You effectively take the IDs and you create a hash of of them using a hash function that gives you a 106 00:08:17,640 --> 00:08:19,930 ‫garbage value, right? 107 00:08:20,130 --> 00:08:23,490 ‫So that if you hash the same value again, you're going to get the same hash. 108 00:08:24,030 --> 00:08:29,640 ‫Given that hash, you then take that value and then. 109 00:08:30,410 --> 00:08:33,800 ‫Calculate an index that points to an array. 110 00:08:35,110 --> 00:08:35,950 ‫That's the trick. 111 00:08:36,130 --> 00:08:36,880 ‫And. 112 00:08:38,270 --> 00:08:41,360 ‫The most easy way is to do a modular function. 113 00:08:41,360 --> 00:08:45,780 ‫So let's say your hash table size, your array size is ten, you know. 114 00:08:46,070 --> 00:08:52,760 ‫So if you calculate, if you give it this ID, you do modulo ten, you do a hash and then you do a modular 115 00:08:52,760 --> 00:08:54,710 ‫ten and you get a value between zero and nine. 116 00:08:54,710 --> 00:09:02,000 ‫And that's the power because you eventually have an index, you can immediately go to that array value 117 00:09:02,000 --> 00:09:06,710 ‫in that index and fetch the value, which is the name of the student in this particular case. 118 00:09:06,710 --> 00:09:09,170 ‫And that's the power of hash, hash tables. 119 00:09:09,170 --> 00:09:15,920 ‫And so now given an actual key, I can find the value using the power of array. 120 00:09:16,340 --> 00:09:18,890 ‫Well, you might say there are problems with this. 121 00:09:18,890 --> 00:09:23,270 ‫Obviously, it's like about collision. 122 00:09:23,450 --> 00:09:30,770 ‫You know, sometimes I might get to student IDs that maps to the same hash, which then maps to the 123 00:09:30,770 --> 00:09:31,730 ‫same index. 124 00:09:32,470 --> 00:09:34,480 ‫Well, that's your responsibility to solve. 125 00:09:34,480 --> 00:09:40,900 ‫And we're going to talk about this and the limitations option or portion of this video. 126 00:09:42,370 --> 00:09:44,680 ‫So now we know we no hash hash tables. 127 00:09:44,680 --> 00:09:50,800 ‫So now we understand hash table the idea of hashing something to get eventually get an index, which 128 00:09:50,800 --> 00:09:56,080 ‫we're going to use to point to an array which has our values. 129 00:09:56,320 --> 00:10:02,020 ‫So one popular use cases database joining. 130 00:10:02,020 --> 00:10:08,050 ‫So if you have two relations and you want to join these relations based on a foreign key, you know, 131 00:10:08,380 --> 00:10:11,710 ‫you're going to use the idea of. 132 00:10:12,430 --> 00:10:13,840 ‫Hashing hash tables. 133 00:10:14,110 --> 00:10:20,890 ‫Let's say you want to join the company table to the employee table where you have a company ID field. 134 00:10:20,890 --> 00:10:23,230 ‫So this employee belongs to this company, right? 135 00:10:23,680 --> 00:10:30,310 ‫So in order to join the two tables, you're going to pick one of the relations and you're going to create 136 00:10:30,310 --> 00:10:33,820 ‫a hash table and focus on this word. 137 00:10:33,850 --> 00:10:36,290 ‫Create a hash table. 138 00:10:36,310 --> 00:10:37,940 ‫The hash table doesn't exist. 139 00:10:37,960 --> 00:10:39,130 ‫You create it. 140 00:10:39,130 --> 00:10:42,940 ‫And there is a cost to creating a hash table. 141 00:10:44,060 --> 00:10:44,750 ‫So. 142 00:10:45,400 --> 00:10:52,150 ‫Usually in a joint you pick the smaller relation because creating hash table is expensive because you 143 00:10:52,150 --> 00:11:02,410 ‫get a loop through every single one and take their value, hash it and get the index and then go to 144 00:11:02,410 --> 00:11:11,320 ‫that array that you hopefully you have already pre allocated and then store the value for that index 145 00:11:11,320 --> 00:11:17,170 ‫in that particular array and you going to do the every thing for every single value. 146 00:11:17,170 --> 00:11:24,160 ‫So that's that means the hash table really has to have a fixed value to start with. 147 00:11:24,160 --> 00:11:30,250 ‫You start with Ayano I'm going to create I think this relation is around 1000, right? 148 00:11:30,250 --> 00:11:36,100 ‫It's going to increase 1000 if you know the size that's even better and you say okay, I want 1000, 149 00:11:36,100 --> 00:11:45,070 ‫a thousand array element and so, so you ask the ram, hey give me 1000 integer consecutive array values 150 00:11:45,070 --> 00:11:48,250 ‫and then you start building the hash tables and you fill all those thousand. 151 00:11:48,250 --> 00:11:51,460 ‫So the first value might go to index 999. 152 00:11:51,490 --> 00:11:54,550 ‫The second value might go to the index 78. 153 00:11:54,580 --> 00:11:57,910 ‫The third value might go to index zero and you start failing. 154 00:11:58,500 --> 00:12:00,010 ‫It's start filling this hash table. 155 00:12:00,010 --> 00:12:07,900 ‫So now once you build that hash table, you have a beautiful array in memory with with indexes, right? 156 00:12:07,900 --> 00:12:11,500 ‫But your hash values mean something. 157 00:12:11,500 --> 00:12:17,410 ‫There is a mapping between the hash, between the key to the hash, obviously, and then between the 158 00:12:17,410 --> 00:12:18,460 ‫hash to the index. 159 00:12:18,460 --> 00:12:20,860 ‫And that mapping is what's critical now. 160 00:12:20,860 --> 00:12:21,070 ‫Okay. 161 00:12:21,070 --> 00:12:22,210 ‫We have this hash table. 162 00:12:22,210 --> 00:12:23,080 ‫What do I do with it? 163 00:12:23,680 --> 00:12:30,610 ‫Now, you go to the other relation, whether whatever it is, is the company there or the employee, 164 00:12:30,610 --> 00:12:34,240 ‫whichever is larger in this case and you loop through it. 165 00:12:35,510 --> 00:12:36,710 ‫You pick the first one? 166 00:12:37,780 --> 00:12:39,640 ‫First entry of the company. 167 00:12:39,640 --> 00:12:44,380 ‫ID So at this point you don't really know the value is just picking one variable at a time. 168 00:12:44,380 --> 00:12:48,140 ‫Oh, this is company number 705. 169 00:12:48,160 --> 00:12:52,690 ‫Let's hash it and let's get the index by using a modulo right. 170 00:12:52,690 --> 00:12:58,390 ‫And then, oh, it's an index number 773. 171 00:12:58,390 --> 00:12:59,740 ‫Oh, let's go to the memory. 172 00:13:00,190 --> 00:13:01,690 ‫Right, let's go to the array. 173 00:13:01,690 --> 00:13:03,550 ‫So ask the CPA to fetch that value. 174 00:13:03,550 --> 00:13:09,880 ‫And then when you reach the value, you give that company complete row, which is the name of the company, 175 00:13:09,880 --> 00:13:14,920 ‫the whatever, how many employees, whatever you put the entire row that you need. 176 00:13:14,920 --> 00:13:20,640 ‫Obviously you don't put the entire row you put exactly based on the select query that joined, you know, 177 00:13:20,650 --> 00:13:26,410 ‫so that there is like a blob fields you don't just put it if the user didn't ask for it, right. 178 00:13:26,650 --> 00:13:32,680 ‫So that's why it select star sometimes kills the performance because look at all this stuff that it 179 00:13:32,680 --> 00:13:33,550 ‫needs to do, right? 180 00:13:33,820 --> 00:13:38,710 ‫It has to put in the hash table, has to fetch it from the database, from disk. 181 00:13:38,830 --> 00:13:45,520 ‫So select store is really harmful, starting to do it if you don't need to and yeah. 182 00:13:45,520 --> 00:13:46,330 ‫And you saw just. 183 00:13:47,950 --> 00:13:49,700 ‫Probing the second. 184 00:13:49,720 --> 00:13:50,320 ‫This is the second. 185 00:13:50,320 --> 00:13:51,660 ‫So it's called probing your probe. 186 00:13:53,410 --> 00:13:54,710 ‫And this probe is closed. 187 00:13:54,730 --> 00:14:01,090 ‫Big of one, big old one because it's very fast, because the hash has a cost is constant and the go 188 00:14:01,090 --> 00:14:04,930 ‫and give me that index also has a fixed cost. 189 00:14:04,960 --> 00:14:11,540 ‫Again, depends on whether you are numa versus not numa architecture and your CPU and how far your severe 190 00:14:11,590 --> 00:14:13,990 ‫from the ram all that jazz. 191 00:14:13,990 --> 00:14:16,750 ‫But it's almost let's say it's constant, right? 192 00:14:17,530 --> 00:14:20,670 ‫If you're at that stage, that's where you basically optimize. 193 00:14:21,520 --> 00:14:23,560 ‫That's a different level altogether. 194 00:14:25,210 --> 00:14:29,230 ‫Frankly, above my head, you know, it's just so, so much details. 195 00:14:30,010 --> 00:14:32,780 ‫But yeah, that's you go there and you fetch the value. 196 00:14:32,800 --> 00:14:36,470 ‫So now if you find the value, that means there is a match. 197 00:14:36,490 --> 00:14:41,260 ‫If you didn't, that means, hey, this value doesn't exist, so I don't join it and you create an output 198 00:14:41,260 --> 00:14:42,640 ‫relation as a result. 199 00:14:42,670 --> 00:14:43,780 ‫That's how you join. 200 00:14:44,550 --> 00:14:45,540 ‫That's a hash joint. 201 00:14:46,230 --> 00:14:50,580 ‫Obviously, there is optimisation and different hash joints, and I think I'm going to make another 202 00:14:50,850 --> 00:14:53,940 ‫episode talking about hash joints because they're like my methods. 203 00:14:54,120 --> 00:14:56,850 ‫What I just explained is the classic old hash joint, you know. 204 00:14:58,170 --> 00:15:06,570 ‫It becomes a problem because what happened if if the I cannot fit this stuff in memory, you know, 205 00:15:06,960 --> 00:15:08,700 ‫that's what a part of the limitation. 206 00:15:09,580 --> 00:15:12,310 ‫Hash tables are useless if you can't fit them in memory. 207 00:15:12,400 --> 00:15:12,790 ‫Right. 208 00:15:12,790 --> 00:15:18,490 ‫Because the whole point is because the whole thing is in array, you have to fit the array into memory 209 00:15:18,490 --> 00:15:22,030 ‫so you can use the byte address ability of the ram again. 210 00:15:22,240 --> 00:15:29,550 ‫Unless someone invents a byte addressable desk, all of these problems will go away hopefully. 211 00:15:31,330 --> 00:15:32,890 ‫I think there are a lot of attempts. 212 00:15:33,450 --> 00:15:42,160 ‫Ultra Ram was the latest that I'm aware of, you know, which is basically a persistent disk value that 213 00:15:42,160 --> 00:15:45,010 ‫is just as fast as the RAM, but it's also persistent. 214 00:15:45,040 --> 00:15:47,950 ‫You know, I'm going to reference the video that I talk about Altra Ram. 215 00:15:48,820 --> 00:15:56,860 ‫But yeah, a lot of the storage people, you know, they talk about this, they eat this for dinner 216 00:15:57,190 --> 00:15:58,000 ‫effectively. 217 00:15:58,600 --> 00:16:06,430 ‫But yeah, this is the idea of hash tables, hash joins, very, very popular concept. 218 00:16:06,460 --> 00:16:10,720 ‫Now I want to talk about the limitations of hash tables. 219 00:16:10,720 --> 00:16:16,480 ‫And this is where you really kind of link to the previous topic or hash joints. 220 00:16:18,610 --> 00:16:21,190 ‫Basically, watch out again. 221 00:16:21,190 --> 00:16:22,930 ‫Hash tables have to fit the memory. 222 00:16:24,680 --> 00:16:26,150 ‫To be utilized. 223 00:16:27,380 --> 00:16:30,760 ‫And that's the biggest, biggest problem, I guess. 224 00:16:30,770 --> 00:16:33,530 ‫You know, it's like what happened if you. 225 00:16:34,810 --> 00:16:39,910 ‫Because that's the first thing people will say is, hey, I'm going to build my entire key value store 226 00:16:39,910 --> 00:16:41,020 ‫as a hash table. 227 00:16:41,500 --> 00:16:44,020 ‫Hey, let's put that into our customer database, a hash table. 228 00:16:44,410 --> 00:16:54,130 ‫You can't how what kind of how how much memory do you need to put the entire thing in memory? 229 00:16:54,130 --> 00:16:55,690 ‫Because that's what you need to do, right? 230 00:16:56,200 --> 00:16:58,930 ‫Let alone just building that God -- thing. 231 00:16:58,930 --> 00:17:00,640 ‫You know, you need to build it first. 232 00:17:01,180 --> 00:17:05,680 ‫And to build it is costly because to build it, you have to scan the entire thing. 233 00:17:05,710 --> 00:17:12,760 ‫That's a big of when you're just scanning the entire thing, pull it in memory to build it, and that's 234 00:17:12,760 --> 00:17:13,550 ‫a cost. 235 00:17:13,570 --> 00:17:16,480 ‫Maybe you can consider this a fixed cost, right? 236 00:17:16,510 --> 00:17:22,040 ‫And then once you build that, then you have to do all of that, get it in memory. 237 00:17:22,060 --> 00:17:27,310 ‫Otherwise, if it's not a memory, you have no idea which parts will be fetched next. 238 00:17:27,730 --> 00:17:35,590 ‫I think you can play games here where you can play the idea of having I don't know if you have low level 239 00:17:35,590 --> 00:17:42,160 ‫access to the RAM where you say, hey, if if someone asks X is this location and it's not in Ram, 240 00:17:42,160 --> 00:17:43,580 ‫go fetch it from disk. 241 00:17:43,600 --> 00:17:44,890 ‫You can build. 242 00:17:46,310 --> 00:17:52,670 ‫Data structures like that, I believe, but I don't believe is going to be easy to do that. 243 00:17:53,060 --> 00:17:55,970 ‫Maybe they do exist and I'm not aware of them. 244 00:17:56,060 --> 00:17:57,260 ‫Please let me know if they. 245 00:17:57,290 --> 00:17:59,270 ‫If you have worked with such like. 246 00:17:59,810 --> 00:18:02,240 ‫Like disk hybrid model. 247 00:18:02,780 --> 00:18:03,530 ‫But that's the idea. 248 00:18:03,580 --> 00:18:05,210 ‫The limitation is the memory cost. 249 00:18:05,630 --> 00:18:16,640 ‫You know, it can't just have a billion size hash table, you know, plus the that's one thing, the 250 00:18:16,640 --> 00:18:22,940 ‫key size and the other thing is the actual values which is not really. 251 00:18:23,240 --> 00:18:27,290 ‫Might not be as problematic because they can live somewhere else. 252 00:18:27,290 --> 00:18:34,690 ‫The actual pointers you know it's just the indices is what what is critical and then the pointers can 253 00:18:34,700 --> 00:18:37,730 ‫can live somewhere else can in the heap of the memory. 254 00:18:37,730 --> 00:18:40,910 ‫They don't have to be consecutive like a string. 255 00:18:40,940 --> 00:18:46,640 ‫It's not really literally a string that lives in the element itself of the array. 256 00:18:46,670 --> 00:18:50,000 ‫It's a pointer to some other location that we don't care about. 257 00:18:50,030 --> 00:18:54,610 ‫As long as we have that pointer where we can jump to that memory location, then it's fine. 258 00:18:54,620 --> 00:19:02,360 ‫Yeah, it's an extra cost of the CPU, but until an AMD and all those people are taking care of these 259 00:19:02,360 --> 00:19:05,930 ‫optimizations, well, this is on to limitations. 260 00:19:05,930 --> 00:19:08,960 ‫The cost of building the hash table is a cost. 261 00:19:09,640 --> 00:19:09,860 ‫Right. 262 00:19:09,860 --> 00:19:12,560 ‫That's why you always pick the smallest in case of a join. 263 00:19:12,560 --> 00:19:12,970 ‫Right. 264 00:19:13,490 --> 00:19:15,350 ‫The other one is the. 265 00:19:16,270 --> 00:19:17,610 ‫Is a memory like you have. 266 00:19:18,670 --> 00:19:20,170 ‫You have to make sure it fits in the memory. 267 00:19:20,420 --> 00:19:22,660 ‫There are workarounds for this, I believe one. 268 00:19:22,830 --> 00:19:31,900 ‫One alternative to to hash joins is to avoid probing all over again because you have to scan this other 269 00:19:31,900 --> 00:19:34,450 ‫relation to always just go back and jump. 270 00:19:34,450 --> 00:19:34,720 ‫Right. 271 00:19:34,720 --> 00:19:37,810 ‫And that can be costly if you do it over and over and over again. 272 00:19:37,810 --> 00:19:40,570 ‫So you can effectively partition stuff. 273 00:19:40,640 --> 00:19:47,530 ‫So say, okay, let me just work with a smaller subset instead of their entire table. 274 00:19:47,530 --> 00:19:50,860 ‫Let me just create partitions. 275 00:19:50,860 --> 00:19:56,110 ‫Let's any value that I'm scanning, I'm going to create a partition that is closer to each other and 276 00:19:56,110 --> 00:20:01,180 ‫then I'm going to store it to disk and then read the partition from disk and then hash table that. 277 00:20:01,180 --> 00:20:06,880 ‫This way you don't have to scan the entire bill, the entire hash of the entire relation. 278 00:20:06,880 --> 00:20:09,670 ‫You just partition it and chunk it. 279 00:20:09,670 --> 00:20:11,620 ‫You work with that smaller, smaller size. 280 00:20:11,620 --> 00:20:13,260 ‫So that's another limitation. 281 00:20:13,270 --> 00:20:14,170 ‫There are workarounds. 282 00:20:14,170 --> 00:20:14,800 ‫Obviously. 283 00:20:14,800 --> 00:20:20,410 ‫The final thing is really with the biggest problem with hash tables is what happened. 284 00:20:20,410 --> 00:20:26,080 ‫If I want I want to keep adding new keys when I delete a key. 285 00:20:28,710 --> 00:20:30,750 ‫I want to add a new one and delete one. 286 00:20:31,440 --> 00:20:33,390 ‫Deleting and adding. 287 00:20:34,700 --> 00:20:36,140 ‫To a hash table. 288 00:20:36,880 --> 00:20:38,740 ‫Really screws up everything. 289 00:20:39,130 --> 00:20:39,710 ‫Why? 290 00:20:39,730 --> 00:20:45,640 ‫Because, remember, we're we're doing a hash and then we do a module on the size of the hash table. 291 00:20:45,940 --> 00:20:54,280 ‫So if you have a value and you hash it and a value of like from 0 to 10, like you should you you do 292 00:20:54,280 --> 00:20:57,880 ‫a module on ten, then you get a value from zero and nine, right? 293 00:20:57,940 --> 00:20:59,380 ‫So let's say you get eight. 294 00:20:59,380 --> 00:21:01,390 ‫But what happens if you add another element? 295 00:21:02,250 --> 00:21:04,740 ‫Right now, your hash table is 11. 296 00:21:05,250 --> 00:21:07,890 ‫Now the modulo operation. 297 00:21:07,920 --> 00:21:12,900 ‫If you do it modular 11, you're going to get a different value. 298 00:21:13,550 --> 00:21:16,100 ‫And now you're not pointing to the same index anymore. 299 00:21:16,580 --> 00:21:19,070 ‫So you have to do a remapping. 300 00:21:20,220 --> 00:21:22,230 ‫After you add or delete. 301 00:21:23,420 --> 00:21:26,270 ‫And that is costly, right? 302 00:21:26,540 --> 00:21:29,630 ‫So you have to scan the entire thing and then just reshift things. 303 00:21:29,630 --> 00:21:31,310 ‫It might not be as expensive. 304 00:21:31,340 --> 00:21:34,760 ‫Sometimes it's easy, sometimes it's slow, you know? 305 00:21:34,940 --> 00:21:37,820 ‫But but that cost of remapping, you have to keep it in mind. 306 00:21:38,210 --> 00:21:40,700 ‫That's why people will just say, okay, I'm going to work with. 307 00:21:40,730 --> 00:21:41,510 ‫I don't. 308 00:21:41,510 --> 00:21:43,670 ‫I'm not going to resize my hash table. 309 00:21:43,670 --> 00:21:47,030 ‫It's always going to be I don't know, it's always going to be ten, you know. 310 00:21:47,900 --> 00:21:56,360 ‫And as a result, you you don't you don't you don't anticipate that it's going to be increasing. 311 00:21:56,360 --> 00:21:56,780 ‫Right. 312 00:22:00,710 --> 00:22:05,840 ‫But again, if you fix it too large, then you're consuming a lot of memory and you have you might have 313 00:22:05,840 --> 00:22:06,620 ‫gaps. 314 00:22:06,650 --> 00:22:12,320 ‫If it's too small, then you might add values to the hash table, and that will cause remapping. 315 00:22:12,320 --> 00:22:15,760 ‫And that's a problem with Cassandra and SHA. 316 00:22:15,770 --> 00:22:18,140 ‫Any sharding database, a distributed database. 317 00:22:18,140 --> 00:22:18,500 ‫Right. 318 00:22:19,430 --> 00:22:22,490 ‫The problem is to do sharding. 319 00:22:23,600 --> 00:22:25,940 ‫You kind of use a hash table, right? 320 00:22:26,060 --> 00:22:32,780 ‫Because if I'm doing a query and this is my node, right? 321 00:22:33,350 --> 00:22:37,360 ‫You don't really specify a node per se, but you're basically based on the key. 322 00:22:37,370 --> 00:22:48,770 ‫So like if I'm inserting value eight, you hash that and then the value will map to a node or a shard 323 00:22:49,460 --> 00:22:53,120 ‫or a partition even partitioning you can do hash partitioning, right? 324 00:22:53,210 --> 00:22:59,450 ‫I'm talking about hash partitioning here, not range partition range is easy for like from 0 to 10000 325 00:22:59,450 --> 00:23:03,770 ‫goes to this server from 10001 to 20000. 326 00:23:03,770 --> 00:23:05,930 ‫Go to this server from 20,000. 327 00:23:06,140 --> 00:23:14,300 ‫You know, you get this idea, but if you are using hash based sharding, then every value is hashed 328 00:23:14,300 --> 00:23:18,500 ‫and then you calculate an index and that index pointer are array. 329 00:23:18,500 --> 00:23:23,990 ‫And the value of that value of that index in the array gives you the server name. 330 00:23:24,350 --> 00:23:25,670 ‫Then you had that server. 331 00:23:25,850 --> 00:23:26,540 ‫But what happened? 332 00:23:26,540 --> 00:23:29,750 ‫If you add a new server, then the whole thing changes. 333 00:23:30,620 --> 00:23:31,190 ‫Right. 334 00:23:31,640 --> 00:23:36,560 ‫Then what do you do if the change is then all of a sudden, if you try to read a key and instead of 335 00:23:36,560 --> 00:23:43,400 ‫going to that particular server, the the remapping has changed, now you're going to another server 336 00:23:43,400 --> 00:23:44,930 ‫and you're not going to find the key. 337 00:23:45,410 --> 00:23:50,450 ‫That's why Cassandra sometimes struggle with this. 338 00:23:50,530 --> 00:23:53,450 ‫If you when you do a remapping, it's really hard. 339 00:23:53,780 --> 00:23:55,940 ‫You know, they solved all these problems. 340 00:23:55,940 --> 00:23:59,600 ‫But I'm just discussing the the original problem here. 341 00:23:59,810 --> 00:24:00,740 ‫I say struggle. 342 00:24:00,770 --> 00:24:02,230 ‫I say initially. 343 00:24:02,240 --> 00:24:02,690 ‫Right. 344 00:24:03,140 --> 00:24:09,500 ‫So, like, if you add something I know to the ring, then the neighboring rings has to share values 345 00:24:09,500 --> 00:24:11,030 ‫because now you're remapping keys. 346 00:24:11,030 --> 00:24:19,220 ‫So the oh, if I'm if I'm adding value to a node, I need to join this value to shrink this, this cluster. 347 00:24:19,220 --> 00:24:25,040 ‫And now I have to borrow values from the my neighbors based on this remapping, you know, because now 348 00:24:25,040 --> 00:24:27,560 ‫oh, I no longer hold this key after this remapping. 349 00:24:27,560 --> 00:24:29,260 ‫Oh, this, this guy houses key. 350 00:24:29,270 --> 00:24:30,500 ‫So you have to rishard. 351 00:24:30,530 --> 00:24:37,340 ‫And this is where another concept was invented in 1997, I believe called consistent hashing, where 352 00:24:37,340 --> 00:24:44,990 ‫they try to minimize the effect of this remapping because remapping is expensive. 353 00:24:45,140 --> 00:24:45,740 ‫Right. 354 00:24:46,450 --> 00:24:47,860 ‫So consistent hashing. 355 00:24:47,860 --> 00:24:55,420 ‫Maybe another topic solves this to minimize that the effect after adding or deleting a node. 356 00:24:55,570 --> 00:25:00,070 ‫Because remember in a distributed architecture you're going to add and remove nodes all the time. 357 00:25:00,070 --> 00:25:06,730 ‫So consistent hashing the idea of having to consistently always hash to the same server no matter what 358 00:25:07,510 --> 00:25:09,100 ‫is very, very critical. 359 00:25:09,220 --> 00:25:13,180 ‫And maybe that's another topic for another day. 360 00:25:13,180 --> 00:25:14,530 ‫I hope you enjoyed this episode. 361 00:25:14,530 --> 00:25:17,530 ‫I'm going to see you on the next one, you guys, the awesome goodbye. 362 00:25:28,430 --> 00:25:33,350 ‫In a previous episode of the Vector Engineering Show, I talked about hash tables and how. 363 00:25:34,190 --> 00:25:39,590 ‫A powerful and very commonly used, they are given a key. 364 00:25:39,770 --> 00:25:46,550 ‫We can find its corresponding value in no time, in zero sick time. 365 00:25:46,550 --> 00:25:49,100 ‫We're not searching, we're not scanning or not doing anything. 366 00:25:49,100 --> 00:25:53,310 ‫It's a single access in memory retrieving that value. 367 00:25:53,330 --> 00:26:00,830 ‫The power behind hash tables is the use of the arrays, which is a very common data structure, obviously 368 00:26:00,860 --> 00:26:03,260 ‫a very, very common data structure. 369 00:26:03,620 --> 00:26:12,260 ‫Knowing the array position, the the index, you can get the value of the array immediately because 370 00:26:12,260 --> 00:26:19,190 ‫you know, how how does memory work if you know the address of the ram, where your cell exists, where 371 00:26:19,190 --> 00:26:22,610 ‫your values exist, you can get the value immediately. 372 00:26:22,880 --> 00:26:26,000 ‫The CPU can fetch the value immediately for you, right? 373 00:26:26,510 --> 00:26:29,810 ‫So once you find the index, you find the address, you can the value. 374 00:26:29,810 --> 00:26:35,960 ‫So what the hash table guys did, it says, okay, we're going to take your name, your string, your 375 00:26:35,960 --> 00:26:38,600 ‫color, your car, your anything. 376 00:26:38,600 --> 00:26:43,880 ‫The result, this key, we're going to hash it using a one way function and then convert that into an 377 00:26:43,880 --> 00:26:47,000 ‫index using a modulo function based on the array size. 378 00:26:47,000 --> 00:26:53,240 ‫So we eventually from the name we convert in getting an index and that will give us the address and 379 00:26:53,240 --> 00:26:54,560 ‫the memory and we can get the value. 380 00:26:54,560 --> 00:26:57,590 ‫So that's the trick the hash table guys use now. 381 00:26:57,590 --> 00:27:03,590 ‫So the only cost you add is like what the hashing function, which is not really that bad, but that's 382 00:27:03,590 --> 00:27:04,850 ‫the power of the hash table. 383 00:27:04,850 --> 00:27:11,210 ‫But the problem is the moment the size of the array changes, the hash table size changes. 384 00:27:11,630 --> 00:27:21,020 ‫This all falls apart because what what if blue, the key blue used to fit in index number 11. 385 00:27:21,110 --> 00:27:28,220 ‫If you increase or decrease the array size and you know where I'm going with this, then the blue will 386 00:27:28,220 --> 00:27:34,820 ‫fit into index 99 now and as a result, right, you either won't find the value or you're going to start 387 00:27:34,820 --> 00:27:40,190 ‫writing the values, duplicate and everything will basically be bad. 388 00:27:40,190 --> 00:27:45,830 ‫So you now have to really move things around when you resize and it's really a big problem. 389 00:27:45,830 --> 00:27:52,430 ‫So why I'm mentioning that the problem that we are seeing today with Distributed System is as follows 390 00:27:53,000 --> 00:28:00,650 ‫If my database are increasing in size and it no longer can fit in a single instance, i. 391 00:28:00,680 --> 00:28:04,250 ‫I bumped up the vertical scaling to its maximum. 392 00:28:04,250 --> 00:28:14,660 ‫It's now 48 core CPU, three gigahertz, whatever, you know, it's 512 gigabytes or even one terabyte 393 00:28:15,320 --> 00:28:16,040 ‫ram. 394 00:28:16,040 --> 00:28:18,860 ‫But it's still I have billions of rows. 395 00:28:19,250 --> 00:28:21,770 ‫I cannot I added all the indexes. 396 00:28:21,770 --> 00:28:25,760 ‫I try to partition it horizontally in the same instance by rain. 397 00:28:25,760 --> 00:28:29,390 ‫So but it's still the even the partitions are so large. 398 00:28:29,390 --> 00:28:36,650 ‫So I that machine cannot handle the queries throwing at my beautiful database. 399 00:28:36,650 --> 00:28:37,640 ‫So what do we do? 400 00:28:37,670 --> 00:28:38,690 ‫We just abuse it. 401 00:28:38,690 --> 00:28:40,910 ‫We shard the database. 402 00:28:40,910 --> 00:28:45,860 ‫That's I really don't like to go there unless I can exhaust all my options. 403 00:28:45,860 --> 00:28:49,820 ‫And believe me, people don't even look at the options anymore. 404 00:28:49,820 --> 00:28:55,760 ‫People are very quick to follow modern things, you know, without actually going to the basics and 405 00:28:56,030 --> 00:28:58,670 ‫trying to tune your database and getting a better performance. 406 00:28:58,670 --> 00:29:04,400 ‫But regardless, let's assume your YouTube scale lets you if you Google scale and you run out of ideas 407 00:29:04,400 --> 00:29:11,000 ‫that a single instance can't handle anymore, you need to distribute that billions of rows table or 408 00:29:11,000 --> 00:29:15,530 ‫key dictionary of collection and to multiple servers. 409 00:29:15,530 --> 00:29:15,830 ‫Right? 410 00:29:15,830 --> 00:29:20,780 ‫So instead of having billions, let's have a few millions on this server and a few millions in this 411 00:29:20,780 --> 00:29:20,980 ‫server. 412 00:29:20,980 --> 00:29:21,500 ‫A few millions. 413 00:29:21,500 --> 00:29:25,490 ‫And the server phenomenon in the server immediately a problem occur. 414 00:29:25,610 --> 00:29:28,700 ‫The problem is like, how do I know, right? 415 00:29:29,300 --> 00:29:31,670 ‫If I have a key, I want to look it up. 416 00:29:32,120 --> 00:29:37,370 ‫We introduced an intermediately an intermediary problem. 417 00:29:37,370 --> 00:29:37,630 ‫Right. 418 00:29:37,700 --> 00:29:38,990 ‫That didn't exist before. 419 00:29:39,960 --> 00:29:40,370 ‫What? 420 00:29:40,380 --> 00:29:46,360 ‫First, if I have a key, I go to my database server and I ask it and I immediately give it to me. 421 00:29:46,380 --> 00:29:48,010 ‫I only go to one hop. 422 00:29:48,030 --> 00:29:56,250 ‫Now, if you are distributed and you only have the key, you have to answer the first question Which 423 00:29:56,250 --> 00:29:59,370 ‫server should I connect to? 424 00:29:59,400 --> 00:30:04,160 ‫To retrieve the key, which server hosts my key? 425 00:30:04,170 --> 00:30:06,110 ‫And that is the problem here. 426 00:30:06,120 --> 00:30:10,890 ‫That is the original and only problem that we have in distributed system. 427 00:30:10,920 --> 00:30:15,630 ‫Which server should I connect to to fetch that key? 428 00:30:17,760 --> 00:30:19,980 ‫So the trick was always. 429 00:30:20,950 --> 00:30:26,200 ‫Let's figure out the server from the key, and that is the concept of hashing. 430 00:30:26,200 --> 00:30:27,840 ‫Hashing tables appear here. 431 00:30:27,850 --> 00:30:31,690 ‫So in this episode of the show, I'd like to talk about that a little bit. 432 00:30:31,690 --> 00:30:34,630 ‫Distributed, hashing and then. 433 00:30:35,310 --> 00:30:40,620 ‫What problems did we have and how consistent hashing solved this problem? 434 00:30:40,770 --> 00:30:44,120 ‫And obviously nothing is perfect in this hoard. 435 00:30:44,130 --> 00:30:50,250 ‫So I'd also like like to talk about the problems, consistent hashing I actually have today. 436 00:30:50,280 --> 00:30:55,620 ‫Welcome to the back of the energy show with your host say and also and the concept of distributed system 437 00:30:55,620 --> 00:30:58,590 ‫is a must when you get to a certain scale. 438 00:30:58,590 --> 00:31:07,590 ‫Yeah I always try as much as possible to, you know, push people against being distributed if they 439 00:31:07,590 --> 00:31:13,100 ‫can do things to have their single instance, you know, be more performant when it comes to query. 440 00:31:13,110 --> 00:31:21,060 ‫Because you see a lot of a lot of people are engineers hurry up to scale, right. 441 00:31:21,060 --> 00:31:28,440 ‫And spend more money to start to work with a distributor without actually while their query is actually 442 00:31:28,440 --> 00:31:37,650 ‫using 500% of their CPU and then doing like a million logical reads, you know, where they can tweak 443 00:31:37,650 --> 00:31:44,580 ‫it a little bit and tweak it and tune the database a little bit, understand their queries so they can 444 00:31:45,060 --> 00:31:50,160 ‫have or even, you know, lower that cost, you know, as a result. 445 00:31:50,160 --> 00:31:51,870 ‫But we don't think this way anymore. 446 00:31:51,870 --> 00:31:54,600 ‫We always take the shortcut, unfortunately. 447 00:31:54,600 --> 00:31:55,560 ‫But regardless. 448 00:31:55,560 --> 00:32:02,100 ‫So usually advance and adept DBA is try to optimize a single query. 449 00:32:02,100 --> 00:32:09,450 ‫As a result, if you if you can get a query to consume less CPU, even if it's a small query is scaling 450 00:32:09,450 --> 00:32:10,770 ‫that query right. 451 00:32:10,770 --> 00:32:14,070 ‫Will eventually gives you a better scaling on your instance. 452 00:32:14,070 --> 00:32:15,630 ‫But that's not our topic. 453 00:32:15,630 --> 00:32:16,590 ‫That's not today. 454 00:32:16,590 --> 00:32:20,910 ‫But let's say you reached a state where you exhausted all your options. 455 00:32:20,910 --> 00:32:23,980 ‫You know, when it comes to a single instance, right? 456 00:32:24,120 --> 00:32:28,050 ‫Then you moved, it says, hey, I have to move to a distributed system. 457 00:32:28,440 --> 00:32:29,670 ‫It's just too large. 458 00:32:30,210 --> 00:32:30,990 ‫You got to move. 459 00:32:31,110 --> 00:32:32,430 ‫So people what they did. 460 00:32:32,430 --> 00:32:33,120 ‫So it's okay. 461 00:32:33,360 --> 00:32:34,350 ‫I have a key. 462 00:32:35,350 --> 00:32:36,940 ‫And no, I don't know which. 463 00:32:36,940 --> 00:32:39,110 ‫I have ten servers. 464 00:32:39,670 --> 00:32:43,030 ‫I don't know where this QI lives in this ten server environment. 465 00:32:43,030 --> 00:32:48,580 ‫So say, all right, so we're going to do let's say I have four server, I cluster with four servers 466 00:32:48,580 --> 00:32:53,050 ‫and I want to distribute my values across these four servers. 467 00:32:53,050 --> 00:32:57,380 ‫So server as zero as one is two and three, right. 468 00:32:57,430 --> 00:32:59,350 ‫What are we going to do is like giving the key. 469 00:32:59,350 --> 00:33:06,640 ‫I need to know the server name, the server IP, and that's something that we are introducing as a problem. 470 00:33:06,640 --> 00:33:12,580 ‫Like we introduced a friction that didn't exist before first previously. 471 00:33:12,700 --> 00:33:15,070 ‫So I just one server we know the server, right? 472 00:33:15,070 --> 00:33:21,520 ‫But now we have to figure out from the key, we have to figure out the IP address of the server to connect 473 00:33:21,520 --> 00:33:24,640 ‫to in order where our key actually exists. 474 00:33:24,670 --> 00:33:29,830 ‫And that answer can be answered using a very simple hashing function. 475 00:33:29,830 --> 00:33:31,090 ‫So we're going to hash the value. 476 00:33:31,090 --> 00:33:36,940 ‫Let's say I have value number four, going to hash it, get some value and then def value. 477 00:33:36,940 --> 00:33:40,540 ‫We're going to do modulo number four and that gives you server zero. 478 00:33:40,540 --> 00:33:41,080 ‫So. 479 00:33:41,860 --> 00:33:45,430 ‫Server server zero right here in this case. 480 00:33:45,610 --> 00:33:47,140 ‫So four modulo four is zero. 481 00:33:47,140 --> 00:33:50,080 ‫And then, okay, how about key number five, right. 482 00:33:50,680 --> 00:33:53,230 ‫Or even whatever the key used to be. 483 00:33:53,240 --> 00:33:55,930 ‫There may be let's say it's a blue or red. 484 00:33:55,930 --> 00:33:56,770 ‫Right, red. 485 00:33:56,770 --> 00:34:01,420 ‫You're going to hash it, get that value number five, which is a number and then five, modulo four 486 00:34:01,420 --> 00:34:03,820 ‫will give you a swan and it's one. 487 00:34:03,820 --> 00:34:04,780 ‫We'll go right here. 488 00:34:04,780 --> 00:34:05,110 ‫Right. 489 00:34:05,110 --> 00:34:06,700 ‫So five. 490 00:34:06,910 --> 00:34:09,250 ‫Right lives in this one and same thing. 491 00:34:09,250 --> 00:34:13,330 ‫Six will live in a stew and seven will live in as three. 492 00:34:13,360 --> 00:34:18,010 ‫We're just go to the module four because four is your server pool size, right? 493 00:34:18,040 --> 00:34:22,660 ‫And then let's say going back, let's say eight, then eight, modulo four oh, so back to zero and 494 00:34:22,660 --> 00:34:27,310 ‫then you get the point which is, which works perfectly from the key. 495 00:34:28,310 --> 00:34:30,590 ‫I was able to figure out. 496 00:34:32,380 --> 00:34:33,250 ‫The server. 497 00:34:33,370 --> 00:34:35,530 ‫So cost is zero is nothing. 498 00:34:35,560 --> 00:34:38,650 ‫Here's we solve distributed system right there. 499 00:34:40,110 --> 00:34:41,000 ‫Here's one problem, though. 500 00:34:41,010 --> 00:34:46,680 ‫As long as you have four servers, you love his beautiful, you don't have to worry about anything. 501 00:34:47,310 --> 00:34:52,590 ‫But now even those four servers reached their limit again. 502 00:34:52,830 --> 00:34:54,210 ‫Whatever reach the limit. 503 00:34:54,210 --> 00:34:57,250 ‫That means your your at the globe scale. 504 00:34:57,270 --> 00:35:00,210 ‫You know you have so many users, right. 505 00:35:00,210 --> 00:35:07,860 ‫So to me that I need four machine more than four machines database instance that means I am at the global 506 00:35:07,860 --> 00:35:13,020 ‫scale and I have users left and right, millions of heads, you know. 507 00:35:13,020 --> 00:35:15,390 ‫So that's the scale we're talking about, right? 508 00:35:15,390 --> 00:35:21,330 ‫So I'd say, okay, all right, let's just add another server server for S4. 509 00:35:21,330 --> 00:35:25,920 ‫So we have s zero as one as two as three and s four. 510 00:35:25,950 --> 00:35:26,940 ‫Sounds good. 511 00:35:28,050 --> 00:35:29,910 ‫Comes back to the problem of hashing tables. 512 00:35:31,100 --> 00:35:36,680 ‫The moment you change the size that whatever value where original so. 513 00:35:37,490 --> 00:35:45,170 ‫The values, their original key values are now all shuffled right previously value number for use. 514 00:35:45,230 --> 00:35:51,440 ‫You do a for modulo for the key for use to live in server as zero because four modulo four is zero. 515 00:35:51,440 --> 00:35:59,040 ‫But now for modulo five which is now the new servers we have five servers is actually gives you a value 516 00:35:59,040 --> 00:35:59,360 ‫of four. 517 00:35:59,360 --> 00:36:01,760 ‫So it's now as an is four. 518 00:36:02,240 --> 00:36:12,140 ‫So now what you need to do is like move the key with value for from the server zero to server four and 519 00:36:12,140 --> 00:36:14,810 ‫now you have to move everything else. 520 00:36:14,810 --> 00:36:18,860 ‫And this is, this is basically the same thing five modulo five, right? 521 00:36:18,860 --> 00:36:26,800 ‫The server five, the key four will move to server four, the key eight will move to server three. 522 00:36:26,810 --> 00:36:29,570 ‫The key five will move to server zero. 523 00:36:29,600 --> 00:36:37,040 ‫The key six will move to server one and the key seven will move to the key seven will move to S1, right? 524 00:36:37,040 --> 00:36:45,980 ‫So everything will be shuffled and the operation of adding a new server will cost us a huge amount of 525 00:36:45,980 --> 00:36:48,560 ‫effort to kind of shuffle things around. 526 00:36:48,590 --> 00:36:54,770 ‫So look, think of networking, think about database usage just to add another server to not only bothering 527 00:36:54,770 --> 00:37:01,640 ‫one server, you're bothering the entire cluster with your operation because you have to shovel things 528 00:37:01,640 --> 00:37:02,110 ‫on. 529 00:37:02,120 --> 00:37:02,630 ‫Right. 530 00:37:02,870 --> 00:37:04,940 ‫Because the key no longer maps. 531 00:37:05,480 --> 00:37:10,850 ‫So what what people think, what the smart people say, okay, let's invent something that might solve 532 00:37:10,850 --> 00:37:13,190 ‫this problem, which is called constant hashing. 533 00:37:13,190 --> 00:37:15,260 ‫So how does it really work? 534 00:37:15,260 --> 00:37:22,730 ‫So they go they went back to the idea of having so they went back to idea the core idea. 535 00:37:22,760 --> 00:37:26,810 ‫We use the index as the server names here. 536 00:37:26,810 --> 00:37:27,410 ‫Right. 537 00:37:27,980 --> 00:37:29,720 ‫Let's flip this. 538 00:37:29,720 --> 00:37:31,310 ‫Let's change this a little bit. 539 00:37:31,340 --> 00:37:38,720 ‫Let's not be very discrete like oh, server zero one, two, three, four, let's build a ring. 540 00:37:39,170 --> 00:37:41,480 ‫Let's actually build a ring of these servers. 541 00:37:41,480 --> 00:37:41,990 ‫All right. 542 00:37:41,990 --> 00:37:47,090 ‫So instead, they think of this idea as an actual ring, as an actual circle. 543 00:37:47,090 --> 00:37:49,460 ‫So values go back, right. 544 00:37:49,460 --> 00:37:50,060 ‫And. 545 00:37:51,430 --> 00:37:55,480 ‫Remember, a lot of people thought a lot of all this to solve this problem. 546 00:37:55,480 --> 00:38:04,600 ‫And the key here is instead of the output of the key hash going to a server, we're going to approximate 547 00:38:04,600 --> 00:38:09,430 ‫it, you know, by by using this concept of the rank. 548 00:38:09,430 --> 00:38:10,900 ‫So let's take an example. 549 00:38:10,900 --> 00:38:11,950 ‫So clarify it. 550 00:38:12,160 --> 00:38:15,910 ‫The first thing we're going to do is the server rooms themselves. 551 00:38:15,940 --> 00:38:20,380 ‫We need to fit them on the rank, which this is something we didn't do before. 552 00:38:20,380 --> 00:38:23,740 ‫So the ring is think of this ring as, as a circle. 553 00:38:23,740 --> 00:38:24,250 ‫So. 554 00:38:24,980 --> 00:38:26,570 ‫The degree is based on the degrees. 555 00:38:26,570 --> 00:38:28,460 ‫This is a 360 degrees, right? 556 00:38:28,580 --> 00:38:32,570 ‫So a server, let's say server one, you're going to do a module 360. 557 00:38:33,170 --> 00:38:34,940 ‫Let's say we're going to get a zero, right? 558 00:38:34,940 --> 00:38:41,240 ‫So we're going to put server server one and ask we're going to call it zero based on the degree here. 559 00:38:41,850 --> 00:38:46,610 ‫And then let's say server two IP address ten 003 modulo 360. 560 00:38:46,610 --> 00:38:47,900 ‫We got 90, right? 561 00:38:47,900 --> 00:38:53,070 ‫And again, I'm picking and picking values that are so rounded up. 562 00:38:53,090 --> 00:38:57,230 ‫This is just, for example, you never get a gate, you're hardly going to get 90, right? 563 00:38:57,350 --> 00:38:58,350 ‫Exactly right. 564 00:38:58,370 --> 00:39:01,430 ‫You're going to probably get X as 27. 565 00:39:01,610 --> 00:39:03,290 ‫But yeah, x 90. 566 00:39:03,770 --> 00:39:04,310 ‫Same thing. 567 00:39:04,310 --> 00:39:10,670 ‫Server three or server four are going to get S 180 if after you do that and then server three going 568 00:39:10,670 --> 00:39:18,160 ‫to get SW 20 to 70, we have zero S 90, s 180 and SW 270. 569 00:39:18,170 --> 00:39:20,930 ‫You know, that's that completes an actual range. 570 00:39:20,930 --> 00:39:26,270 ‫And for people listening in the podcast, think of this as an actual ring with the values for servers 571 00:39:26,270 --> 00:39:27,980 ‫in each corner effectively. 572 00:39:27,980 --> 00:39:29,420 ‫So why are we doing this? 573 00:39:29,450 --> 00:39:33,950 ‫The beauty here is we have values and this is the key here. 574 00:39:34,700 --> 00:39:38,090 ‫Let's have a key value of 1000. 575 00:39:38,120 --> 00:39:39,830 ‫Same thing you're going to do, right? 576 00:39:40,310 --> 00:39:41,180 ‫The 1000. 577 00:39:41,180 --> 00:39:43,610 ‫The key value thousand. 578 00:39:43,610 --> 00:39:44,210 ‫Right. 579 00:39:44,240 --> 00:39:45,770 ‫Why over this key exists. 580 00:39:45,770 --> 00:39:49,790 ‫Maybe it was blue and then you did a hash and then you got the number thousand. 581 00:39:49,820 --> 00:39:52,100 ‫You take that thousand, right? 582 00:39:52,100 --> 00:39:53,930 ‫And you do a module of 360. 583 00:39:54,020 --> 00:39:57,320 ‫Will you get an actual server immediately? 584 00:39:57,320 --> 00:39:58,010 ‫The server index? 585 00:39:58,010 --> 00:39:59,420 ‫No, you're not going to get that. 586 00:39:59,810 --> 00:40:02,990 ‫So now let's let's actually put it in an action example right here. 587 00:40:02,990 --> 00:40:03,370 ‫Right. 588 00:40:04,340 --> 00:40:06,980 ‫So now I'm about to insert a value. 589 00:40:06,980 --> 00:40:08,690 ‫I'm going to insert a new key. 590 00:40:09,440 --> 00:40:11,240 ‫Value 1500. 591 00:40:11,240 --> 00:40:11,570 ‫That's. 592 00:40:11,900 --> 00:40:12,980 ‫That's my key right there. 593 00:40:12,980 --> 00:40:13,880 ‫So what are you going to do? 594 00:40:13,880 --> 00:40:22,430 ‫Is you going to do modulo 360, which is the circle again don't don't be very specific on the value 595 00:40:22,430 --> 00:40:23,000 ‫360. 596 00:40:23,000 --> 00:40:26,240 ‫You can double this and double this and double this to give more values. 597 00:40:26,240 --> 00:40:28,100 ‫But again, this is just an example. 598 00:40:28,100 --> 00:40:29,240 ‫So 360 here. 599 00:40:29,660 --> 00:40:32,960 ‫So if I take 1500, right. 600 00:40:33,200 --> 00:40:34,820 ‫And module 360. 601 00:40:35,550 --> 00:40:39,840 ‫I'm going to get a value of 60, if you think about it. 602 00:40:39,900 --> 00:40:40,950 ‫60 doesn't exist. 603 00:40:40,950 --> 00:40:43,580 ‫There is no server marked 60. 604 00:40:43,590 --> 00:40:45,600 ‫But here's a trick. 605 00:40:45,600 --> 00:40:49,920 ‫We do have a server mark as zero and we have an SX 90. 606 00:40:49,920 --> 00:40:53,040 ‫So 60 really fits between the zero and the 90. 607 00:40:53,040 --> 00:40:58,440 ‫And what you do is you pick the next one, so 65th between zero and 90 immediately. 608 00:40:58,440 --> 00:41:02,100 ‫So the key will go to the next value, which is s 90. 609 00:41:02,100 --> 00:41:05,730 ‫And that's, that's how you do it basically now. 610 00:41:06,940 --> 00:41:07,990 ‫Visually. 611 00:41:07,990 --> 00:41:11,350 ‫This is easy to understand right computationally. 612 00:41:11,350 --> 00:41:16,100 ‫This is not as simple, because what does that mean between zero and 90? 613 00:41:16,120 --> 00:41:17,630 ‫I have I have four servers. 614 00:41:17,630 --> 00:41:25,690 ‫And that's that's easy to find where where your value how 60 fits between zero and 90 is actually computational. 615 00:41:25,690 --> 00:41:30,850 ‫And that's another binary search to actually find that value that's a little bit outside us outside 616 00:41:30,850 --> 00:41:31,870 ‫the scope of this video. 617 00:41:31,870 --> 00:41:40,240 ‫But now once we have that value, we have the 60 now OC oh 60 is like what's right after 60 is 90 now. 618 00:41:40,240 --> 00:41:40,510 ‫Okay. 619 00:41:40,510 --> 00:41:43,600 ‫So the value of 1500 fits on server is 90. 620 00:41:43,990 --> 00:41:44,950 ‫That's how we solve it. 621 00:41:44,950 --> 00:41:45,700 ‫So. 622 00:41:46,400 --> 00:41:50,720 ‫By doing this scan, what is the next one? 623 00:41:50,900 --> 00:41:56,090 ‫We really remove that discreet value lock up. 624 00:41:56,090 --> 00:42:01,370 ‫You know, we instead changed it with a scan and the beauty of this scan. 625 00:42:02,310 --> 00:42:04,890 ‫It's going to save us a lot of things in the future. 626 00:42:04,920 --> 00:42:06,360 ‫Let's continue this example. 627 00:42:06,390 --> 00:42:07,580 ‫All right, 1500. 628 00:42:07,590 --> 00:42:10,650 ‫Let's take another example as, say, 2000, 2000. 629 00:42:10,650 --> 00:42:18,360 ‫Model 360 is 200, 250 between server s 180 and server is 270 again 180 and 270. 630 00:42:18,360 --> 00:42:19,950 ‫Think of that as degrees, right? 631 00:42:19,950 --> 00:42:28,710 ‫So right there in this case, OC what's exactly right after 200, it's 270, so 2002 70. 632 00:42:29,100 --> 00:42:32,050 ‫So think of it like you're looking at clockwise. 633 00:42:32,050 --> 00:42:34,620 ‫So whatever is the next value, you're going to get that. 634 00:42:34,620 --> 00:42:41,820 ‫And then you put another value, say 3000 modulo 360, that's 121 25th between server 90 and server 635 00:42:41,820 --> 00:42:42,480 ‫180. 636 00:42:42,510 --> 00:42:47,550 ‫So like move directly to the 180 and then you store the value 3000 there. 637 00:42:47,550 --> 00:42:51,270 ‫How about 20,000 modulo 360 is actually gives you 280. 638 00:42:51,300 --> 00:42:51,900 ‫Wait a minute. 639 00:42:51,900 --> 00:42:56,370 ‫280 is the there are no server after the last server. 640 00:42:56,400 --> 00:42:58,980 ‫The largest value is actually is 270. 641 00:42:58,980 --> 00:43:00,780 ‫There is no larger after that. 642 00:43:00,780 --> 00:43:01,060 ‫Right. 643 00:43:01,080 --> 00:43:04,050 ‫There is no larger than two server 270. 644 00:43:04,050 --> 00:43:05,300 ‫This is values to 80. 645 00:43:05,310 --> 00:43:08,700 ‫So what do you do if there is no larger values? 646 00:43:08,700 --> 00:43:10,740 ‫You go back to the circle, right? 647 00:43:10,740 --> 00:43:12,750 ‫And this is very nicely. 648 00:43:12,750 --> 00:43:19,050 ‫When you actually draw the circle, you see that it's two eight is between is 270 and is zero. 649 00:43:19,050 --> 00:43:26,400 ‫So you immediately go and put the value thousand in server zero and that's how you build the ring. 650 00:43:26,430 --> 00:43:29,100 ‫Let's take another example for completeness here. 651 00:43:29,610 --> 00:43:36,480 ‫4000, 4000, modulo 30, 60 gives you 40, 40 is between zero and 90, right? 652 00:43:36,480 --> 00:43:38,100 ‫Oh, that's very nice. 653 00:43:38,100 --> 00:43:39,040 ‫So four. 654 00:43:39,060 --> 00:43:44,250 ‫So now we have two keys lives in server is 90 in this case. 655 00:43:44,770 --> 00:43:46,500 ‫Yeah, that's very interesting. 656 00:43:46,830 --> 00:43:48,630 ‫Let's spice things up now. 657 00:43:48,630 --> 00:43:52,830 ‫What if I added a new server here? 658 00:43:52,830 --> 00:43:55,140 ‫Look at the beauty of this here. 659 00:43:55,170 --> 00:43:58,870 ‫Now, if I had a new server here, let's say it's IP address. 660 00:43:58,870 --> 00:44:01,740 ‫Is that ten zero zero 13. 661 00:44:01,740 --> 00:44:02,280 ‫Right. 662 00:44:02,760 --> 00:44:06,540 ‫I do it modulo 360 and I got 50. 663 00:44:06,540 --> 00:44:12,810 ‫So now my server is actually s 50 fits right between zero and is 90. 664 00:44:13,140 --> 00:44:15,660 ‫Keep that in mind now. 665 00:44:16,620 --> 00:44:18,450 ‫Things change because we. 666 00:44:19,410 --> 00:44:24,360 ‫Still need to move data around, but not as much. 667 00:44:24,390 --> 00:44:25,620 ‫Let's figure that out. 668 00:44:25,620 --> 00:44:26,790 ‫How much data? 669 00:44:27,150 --> 00:44:29,310 ‫I need to move around? 670 00:44:29,310 --> 00:44:32,190 ‫Well, the algorithm goes like this, right? 671 00:44:33,750 --> 00:44:37,530 ‫Remember, if you have values, if you add value says 50. 672 00:44:37,560 --> 00:44:44,640 ‫In this case, anything between the zero and the 50 really needs to be fed in server 50. 673 00:44:44,640 --> 00:44:45,210 ‫Right. 674 00:44:45,210 --> 00:44:46,500 ‫Because that's the algorithm. 675 00:44:46,500 --> 00:44:49,140 ‫That's the range here that we're looking at right here. 676 00:44:49,530 --> 00:44:57,060 ‫And so if you the algorithms look like this, like, okay, so what is the server right after me? 677 00:44:57,420 --> 00:45:00,330 ‫Well, the server right after me is right here is 90, right. 678 00:45:00,840 --> 00:45:09,840 ‫So let's query that server, find out all the hash values now, all the hash values where the actual 679 00:45:09,840 --> 00:45:11,570 ‫hash value is less than 40. 680 00:45:11,580 --> 00:45:15,570 ‫Right, because it's 90 would have stored everything between zero and 90. 681 00:45:15,570 --> 00:45:16,050 ‫Right. 682 00:45:16,050 --> 00:45:22,590 ‫So find out everything that is less than 50 because I should store this now. 683 00:45:22,590 --> 00:45:23,130 ‫Right. 684 00:45:23,250 --> 00:45:27,780 ‫So we found one entry which is 40 oh 40 doesn't belong 90 anymore. 685 00:45:27,780 --> 00:45:33,690 ‫So go ahead and talk to server 90, establish a communication, could be TCP can be anything and then 686 00:45:33,690 --> 00:45:35,220 ‫start moving data around. 687 00:45:35,220 --> 00:45:35,750 ‫Right? 688 00:45:35,760 --> 00:45:40,980 ‫Hey, it's 90 and do you have anything with a hash value that is less than 40? 689 00:45:40,980 --> 00:45:42,770 ‫It's less than 50, which is me. 690 00:45:42,780 --> 00:45:43,200 ‫Oh yeah. 691 00:45:43,200 --> 00:45:45,780 ‫I have this value move it to me. 692 00:45:45,780 --> 00:45:49,200 ‫And as a result, we just establish the data here. 693 00:45:49,200 --> 00:45:53,460 ‫So now think of it as, as I add another server. 694 00:45:54,580 --> 00:45:57,130 ‫The only change is my neighbor. 695 00:45:57,130 --> 00:46:01,660 ‫Really, I'm only going to bother my neighbor and only the neighbor right after me. 696 00:46:01,660 --> 00:46:02,320 ‫Right. 697 00:46:02,470 --> 00:46:05,320 ‫That's that's what I'm going to do in this case. 698 00:46:05,320 --> 00:46:05,740 ‫Right. 699 00:46:05,890 --> 00:46:08,710 ‫And so instead of actually. 700 00:46:09,560 --> 00:46:13,820 ‫Bothering all the servers in my cluster, I'm only bothering one server. 701 00:46:13,820 --> 00:46:17,480 ‫So that's much, much, much better than actual hashing. 702 00:46:17,480 --> 00:46:20,300 ‫But is still there is a cost to it, right? 703 00:46:20,480 --> 00:46:21,980 ‫That is complexity. 704 00:46:21,980 --> 00:46:24,380 ‫You need to build all that out. 705 00:46:24,380 --> 00:46:26,630 ‫What if the operation failed? 706 00:46:26,630 --> 00:46:28,220 ‫How do you roll back? 707 00:46:28,220 --> 00:46:30,770 ‫What if do you add the server any way? 708 00:46:30,770 --> 00:46:33,320 ‫What if one of the keys didn't transmit? 709 00:46:33,320 --> 00:46:35,900 ‫So we're adding so much complexity. 710 00:46:35,900 --> 00:46:39,440 ‫So distributed systems are not easy, you guys, right? 711 00:46:40,070 --> 00:46:44,960 ‫Even if you think about it, let's take another example where we're removing a server. 712 00:46:44,960 --> 00:46:52,910 ‫If we go back to our original case where we have zero as 90 is 180 and it's 27 and let's say I'm about 713 00:46:52,910 --> 00:46:55,070 ‫to remove it's 90 altogether. 714 00:46:55,070 --> 00:47:03,530 ‫So now what happens here if you remove is 90, then anything that is 90 had must go to S1 80 if you 715 00:47:03,530 --> 00:47:04,040 ‫think about it. 716 00:47:04,040 --> 00:47:05,780 ‫Right, because any value. 717 00:47:05,780 --> 00:47:12,290 ‫So now what happens is the operation should go okay remove server is 90 that the operation that we need 718 00:47:12,290 --> 00:47:18,140 ‫to do right I'm going to remove that physically remove it not talking about a crash that's a completely 719 00:47:18,140 --> 00:47:19,520 ‫different story, right? 720 00:47:19,850 --> 00:47:20,190 ‫Crash. 721 00:47:20,210 --> 00:47:21,950 ‫You don't even have time to move anything. 722 00:47:21,950 --> 00:47:22,370 ‫Right. 723 00:47:22,880 --> 00:47:25,850 ‫So if you that's why you have to have redundancy. 724 00:47:25,850 --> 00:47:27,680 ‫That's another complication right there. 725 00:47:27,680 --> 00:47:33,860 ‫But if I have a physical operation, let's say, okay, I want to remove this for maintenance, remove 726 00:47:33,860 --> 00:47:39,290 ‫this 90 well, it's 90 holes and all that is between zero and 90. 727 00:47:39,290 --> 00:47:45,050 ‫So all of those should go directly to the server right after me clockwise, which is 180. 728 00:47:45,050 --> 00:47:50,840 ‫So remove all those puppies and stack them to where to server 180. 729 00:47:50,840 --> 00:47:53,180 ‫That's a very expensive operation as well. 730 00:47:53,180 --> 00:47:57,530 ‫So if you think about it, consistent hashing are powerful is powerful. 731 00:47:57,530 --> 00:47:59,930 ‫The algorithm of consistency are very powerful. 732 00:47:59,990 --> 00:48:07,130 ‫But the limitation here becomes you still need to move data around and the more data you have, you 733 00:48:07,130 --> 00:48:12,740 ‫know, in these instances, then the transmit of these will take more time. 734 00:48:12,740 --> 00:48:19,860 ‫So adding or removing server is actually not a trivial operation, you know, and then you really didn't 735 00:48:19,910 --> 00:48:22,010 ‫think need to think about that. 736 00:48:22,700 --> 00:48:28,490 ‫Another thing that is called replication, you need to duplicate this data as much as possible because 737 00:48:28,550 --> 00:48:29,750 ‫a server might crash. 738 00:48:29,750 --> 00:48:31,220 ‫So you need to have a backup. 739 00:48:31,220 --> 00:48:35,330 ‫So just using this algorithm blindly is not enough. 740 00:48:35,330 --> 00:48:40,700 ‫You have to cater for crashes, you know, and as a result, you have to have a backup. 741 00:48:40,700 --> 00:48:46,550 ‫Oh, if this sort of wasn't unavailable, let's, let's have a mapping server that actually directly 742 00:48:46,550 --> 00:48:48,380 ‫copies that data immediately. 743 00:48:48,380 --> 00:48:48,740 ‫Right? 744 00:48:48,950 --> 00:48:50,150 ‫So that's what you do. 745 00:48:50,270 --> 00:48:51,080 ‫You have to do that. 746 00:48:51,080 --> 00:48:54,290 ‫Well, what if two servers mapped to the same hash? 747 00:48:54,290 --> 00:48:55,010 ‫You can't have it. 748 00:48:55,180 --> 00:49:00,680 ‫If you have a lot of more than 360 servers, you're going to you're bound to have the value that fits 749 00:49:00,680 --> 00:49:01,730 ‫in the same server. 750 00:49:01,730 --> 00:49:04,400 ‫And I'm not sure what you can do here. 751 00:49:04,400 --> 00:49:10,160 ‫You can what you can do is like, I suppose in this case, you can treat it as a as a replication. 752 00:49:10,160 --> 00:49:16,400 ‫So you have like a backup scenario where we have this server, you can either put it in both. 753 00:49:16,970 --> 00:49:18,980 ‫That was an episode of the consistent hashing. 754 00:49:18,980 --> 00:49:20,180 ‫I hope you enjoyed this video. 755 00:49:20,210 --> 00:49:21,480 ‫I'm going to see you in the next one, guys. 756 00:49:21,950 --> 00:49:22,220 ‫Goodbye.