1 00:00:00,120 --> 00:00:01,180 ‫What's going on, guys? 2 00:00:01,200 --> 00:00:07,010 ‫My name is Hussein, and in this video I want to discuss indexing, database indexing. 3 00:00:07,020 --> 00:00:14,140 ‫It's a very interesting topic and I think everybody can engineer specifically if you are involved with 4 00:00:14,160 --> 00:00:20,190 ‫databases and relational databases, any kind of databases, you need to know what indexes and how indexes 5 00:00:20,190 --> 00:00:22,650 ‫work and what is the benefit of indexing. 6 00:00:22,660 --> 00:00:26,600 ‫So on this video will describe what is indexing. 7 00:00:26,610 --> 00:00:31,050 ‫We're going to work with indexes, we're going to work without indexing, and we're going to see how 8 00:00:31,050 --> 00:00:35,790 ‫the performance differ, whether you have an index or you don't have index. 9 00:00:35,940 --> 00:00:38,130 ‫So first of all, what is an index? 10 00:00:38,490 --> 00:00:47,940 ‫An index is is a data structure that you build and you assign on top of an existing table that basically 11 00:00:47,940 --> 00:00:57,270 ‫looks through your table and then tries to analyze it and summarizes so that it can create kind of shortcuts. 12 00:00:57,810 --> 00:01:03,930 ‫They the best way I can think of an index is like, if you know these, Fairvale, if you ever seen 13 00:01:03,930 --> 00:01:12,480 ‫a secretary's hand bulk or phone book where you have like thick big book and it is labeled, there is 14 00:01:12,480 --> 00:01:16,470 ‫like a label colored A, B, C, D until Z. 15 00:01:16,470 --> 00:01:16,840 ‫Right. 16 00:01:17,250 --> 00:01:25,530 ‫So the the letter A. starts with all the companies phone books that that start with the letter A and 17 00:01:25,530 --> 00:01:25,880 ‫B.. 18 00:01:26,160 --> 00:01:27,330 ‫So with all the B's. 19 00:01:27,330 --> 00:01:27,610 ‫Right. 20 00:01:27,630 --> 00:01:28,440 ‫And C's all that. 21 00:01:28,560 --> 00:01:35,940 ‫So if you want to look through, find a specific company name, phone number and you know like oh is 22 00:01:35,940 --> 00:01:37,860 ‫it, I don't know, SQL Xebra. 23 00:01:37,980 --> 00:01:40,920 ‫So you go immediately, you go to the Z, right. 24 00:01:40,980 --> 00:01:42,630 ‫And then you start searching there. 25 00:01:42,970 --> 00:01:50,250 ‫That's the simplest thing I can think of that and X Y and basically the index guys are are are two types 26 00:01:50,250 --> 00:01:58,110 ‫so far that we know of something called B three and LSM trees, and I'm not going to go through that. 27 00:01:59,250 --> 00:02:01,890 ‫The depth of how they are constructed. 28 00:02:01,890 --> 00:02:06,810 ‫This is outside of the scope of this video, but essentially it's a, it's a data structure and you 29 00:02:06,810 --> 00:02:07,470 ‫can look it up. 30 00:02:08,280 --> 00:02:12,030 ‫It's basically allowing you to search very effectively. 31 00:02:12,060 --> 00:02:12,450 ‫Right. 32 00:02:13,050 --> 00:02:16,410 ‫And you find quickly what you're looking for. 33 00:02:16,800 --> 00:02:25,500 ‫So guys, here I have a table called employees with around 11 million rows. 34 00:02:26,160 --> 00:02:28,680 ‫That's a quite large table. 35 00:02:29,490 --> 00:02:33,750 ‫So I have an ID field is integer. 36 00:02:34,110 --> 00:02:36,390 ‫It's not null and it's sequential. 37 00:02:36,390 --> 00:02:41,670 ‫So every time I enter a row, I don't touch this field and it will automatically insert and increment 38 00:02:41,700 --> 00:02:42,180 ‫the value. 39 00:02:42,630 --> 00:02:44,310 ‫And it's also a primary keep. 40 00:02:45,060 --> 00:02:45,960 ‫That means unique. 41 00:02:45,960 --> 00:02:46,320 ‫Right. 42 00:02:47,040 --> 00:02:51,030 ‫And it also has an index by default. 43 00:02:51,210 --> 00:02:54,990 ‫Every primary key has an index by default, and it's a B three index. 44 00:02:56,490 --> 00:02:58,680 ‫So that's what I have here. 45 00:02:59,540 --> 00:03:05,960 ‫And I also have a name field on this, which is we do have an index is just a bunch of characters, 46 00:03:06,500 --> 00:03:07,160 ‫nothing fancy. 47 00:03:07,160 --> 00:03:08,420 ‫It's not really the actual name. 48 00:03:08,420 --> 00:03:13,550 ‫I just spent some time generating a hundred million random strength's. 49 00:03:13,550 --> 00:03:13,880 ‫Right. 50 00:03:14,410 --> 00:03:25,730 ‫So so now if I, for example, to clear this up and do select ideas from employees where I'd equal thousand, 51 00:03:27,320 --> 00:03:31,070 ‫I will return the thousands. 52 00:03:31,070 --> 00:03:32,030 ‫I'd write that. 53 00:03:32,060 --> 00:03:33,650 ‫That's not fancy. 54 00:03:33,950 --> 00:03:38,840 ‫But if I do select start, I will get everything back. 55 00:03:39,770 --> 00:03:45,860 ‫I.D. and name and one is a name, so don't don't pay attention to actual name is just this drink and 56 00:03:45,860 --> 00:03:47,150 ‫that's the idea of the strike here. 57 00:03:47,660 --> 00:03:52,910 ‫So guys, what I want to do here is I want to go through a steps of different queries. 58 00:03:52,910 --> 00:03:58,180 ‫I'm going to show you the performance of each of these queries in order to do that in Postgres. 59 00:03:58,430 --> 00:04:07,100 ‫That's the database I'm using, by the way, Postgres, I'm going to do explain, analyze, which will 60 00:04:07,100 --> 00:04:12,140 ‫actually explain the query, then I'm going to do and I'm going to tell me how long it took. 61 00:04:12,170 --> 00:04:12,420 ‫Right. 62 00:04:12,440 --> 00:04:13,870 ‫Which is which is really neat. 63 00:04:14,210 --> 00:04:20,750 ‫So I'm going to do a select ID from employees where ID equal, let's say two thousand because the first 64 00:04:20,750 --> 00:04:21,380 ‫one is cached. 65 00:04:21,380 --> 00:04:23,980 ‫I don't want to mess up the statistics. 66 00:04:24,350 --> 00:04:24,680 ‫All right. 67 00:04:24,690 --> 00:04:25,780 ‫So what happened here guys? 68 00:04:26,360 --> 00:04:28,340 ‫Let's let's decipher what happened. 69 00:04:28,610 --> 00:04:35,420 ‫It says, OK, your predicate filter, you're trying to find ID two thousand. 70 00:04:35,420 --> 00:04:38,590 ‫That's the employee with ID two thousand and an order. 71 00:04:38,600 --> 00:04:46,790 ‫I find that I didn't go through the actual table, 11 million and I searched sequentially. 72 00:04:47,120 --> 00:04:55,820 ‫No, I actually scanned the index, which is way faster because the index is always smaller than the 73 00:04:55,820 --> 00:05:02,360 ‫actual table, because it tells you that OK, it is structured in a way so that if you search and I'm 74 00:05:02,360 --> 00:05:07,790 ‫going to put an animation here a little bit, you can see like if I went if I'd idea number two thousand, 75 00:05:08,300 --> 00:05:13,280 ‫start searching, OK, two thousand is between four thousand and and one thousand. 76 00:05:13,280 --> 00:05:15,950 ‫So it was OK, let me go to this three. 77 00:05:15,950 --> 00:05:21,650 ‫So it goes you go to another three and then search for from another between this number and this number. 78 00:05:21,650 --> 00:05:23,980 ‫So until you actually find the number. 79 00:05:24,020 --> 00:05:24,130 ‫Right. 80 00:05:24,200 --> 00:05:26,120 ‫That's how Beatriz's basically work. 81 00:05:26,360 --> 00:05:26,590 ‫Right. 82 00:05:27,290 --> 00:05:33,250 ‫So now what I did, I only scanned the index, which is very, very effective. 83 00:05:33,280 --> 00:05:33,600 ‫Right. 84 00:05:34,130 --> 00:05:34,990 ‫He pitches. 85 00:05:35,210 --> 00:05:36,950 ‫That's another interesting thing. 86 00:05:36,950 --> 00:05:41,480 ‫So says the value that I actually queried, which is the ID. 87 00:05:41,840 --> 00:05:48,350 ‫I did not have to go to the heap to fetch this information because the ID, since that's the only thing 88 00:05:48,350 --> 00:05:50,480 ‫you're selecting is in the index. 89 00:05:50,600 --> 00:05:52,250 ‫So I just pulled it in line. 90 00:05:52,250 --> 00:05:53,570 ‫That's called inline query. 91 00:05:53,840 --> 00:05:59,630 ‫If the information is available in the index, that is the sweetest query to our database engineer. 92 00:06:00,020 --> 00:06:07,220 ‫If you can put as much information as you can in the index so that as I, as I scan the index and you 93 00:06:07,220 --> 00:06:10,750 ‫only select the ID, that is the sweetest thing you can do. 94 00:06:10,940 --> 00:06:18,800 ‫So the whole thing took point six millisecond and to plan the query and to to study what we go with 95 00:06:19,850 --> 00:06:27,530 ‫the I the the fact of oh should we use the index or should you scan the table is a query is a decision 96 00:06:27,650 --> 00:06:29,000 ‫and that's called the planning time. 97 00:06:29,180 --> 00:06:32,570 ‫The execution is that is the actual work to go and do the work. 98 00:06:32,590 --> 00:06:32,840 ‫Right. 99 00:06:34,190 --> 00:06:35,960 ‫So let's spice things a little bit. 100 00:06:36,170 --> 00:06:39,170 ‫What I want to do here in this case, I want to. 101 00:06:41,510 --> 00:06:50,900 ‫I want to select 3000 in this case, and you can see that point one point eight, what I want to do 102 00:06:50,900 --> 00:06:54,290 ‫is I'm going to change the ID five thousand. 103 00:06:55,040 --> 00:07:00,200 ‫And instead this time I'm going to ask for the name and compare that cost. 104 00:07:00,200 --> 00:07:02,690 ‫Here, guys, look how long it took. 105 00:07:03,260 --> 00:07:10,690 ‫So selecting only the ID, where do you know the idea, which is a Cassilly query, if you ask me, 106 00:07:10,890 --> 00:07:13,910 ‫it's a little bit of silico because you know the idea, why are you just like me? 107 00:07:13,970 --> 00:07:16,010 ‫But it's just for science. 108 00:07:16,100 --> 00:07:17,060 ‫So we're doing this for science. 109 00:07:17,220 --> 00:07:18,800 ‫So it's pretty cool, right? 110 00:07:18,920 --> 00:07:22,300 ‫Point one milliseconds, less than a millisecond, which is quite fast. 111 00:07:22,730 --> 00:07:31,310 ‫However, when we actually selected the name, which is a column in the table, it is not the column 112 00:07:31,310 --> 00:07:32,330 ‫is not in the index. 113 00:07:32,330 --> 00:07:33,350 ‫It is on the table. 114 00:07:33,350 --> 00:07:34,070 ‫It's in the heap. 115 00:07:34,070 --> 00:07:36,160 ‫It's in a different data structure. 116 00:07:36,170 --> 00:07:36,530 ‫Right. 117 00:07:37,790 --> 00:07:45,980 ‫And in this case, it took us to thinking two and a half milliseconds, not not seconds, two and a 118 00:07:45,980 --> 00:07:49,990 ‫half millisecond, which is which is quite fast, but it's still there. 119 00:07:49,990 --> 00:07:50,690 ‫It may speak. 120 00:07:50,690 --> 00:07:51,660 ‫It's a little bit slow. 121 00:07:51,740 --> 00:07:53,840 ‫This is a slower query. 122 00:07:53,840 --> 00:07:59,030 ‫And you guys need to understand when you execute these squares what it is really mean, because I found 123 00:07:59,030 --> 00:08:08,720 ‫the ID in the index, but I had to jump into the table row on disk, which is a different structure. 124 00:08:08,720 --> 00:08:12,290 ‫By the way, the table is in in a different place than the index. 125 00:08:12,530 --> 00:08:15,770 ‫The index is a data structure and the table is a different data structure. 126 00:08:15,950 --> 00:08:16,300 ‫Right. 127 00:08:16,510 --> 00:08:18,230 ‫Table is actually the heavier thing. 128 00:08:18,260 --> 00:08:24,800 ‫We try to avoid going to the table as much as possible because there's eleven stinking million rules 129 00:08:24,800 --> 00:08:25,100 ‫there. 130 00:08:25,400 --> 00:08:28,340 ‫So we try not to touch it as much as well, but sometimes we do. 131 00:08:28,880 --> 00:08:36,130 ‫So what we did is we found the ID and then we actually tried to go and seek to the desk. 132 00:08:36,890 --> 00:08:39,290 ‫I mean I haven't this is this or that is not seek. 133 00:08:39,290 --> 00:08:43,700 ‫We go to the page that has this information and we retrieve it from disk. 134 00:08:44,360 --> 00:08:46,160 ‫So that's another read from that. 135 00:08:46,170 --> 00:08:48,380 ‫So it's a little bit slower, took us to milliseconds. 136 00:08:48,860 --> 00:08:52,670 ‫And obviously if I execute the same thing again, it's faster. 137 00:08:52,670 --> 00:08:53,060 ‫Why? 138 00:08:53,090 --> 00:08:55,830 ‫Because caching because we cache things, cache. 139 00:08:55,910 --> 00:09:03,170 ‫There's so many caches going on, SSD caches, the control is the caches, the database caches, everyone 140 00:09:03,170 --> 00:09:03,650 ‫caches. 141 00:09:03,830 --> 00:09:06,950 ‫So God knows what wise is faster now. 142 00:09:07,190 --> 00:09:09,770 ‫But that's why, because I'm executing the same query. 143 00:09:10,010 --> 00:09:12,380 ‫But if I change this to something else, 144 00:09:15,410 --> 00:09:16,760 ‫I might not get it right. 145 00:09:16,760 --> 00:09:19,210 ‫But you get the idea like two milliseconds again. 146 00:09:20,210 --> 00:09:20,690 ‫All right. 147 00:09:21,320 --> 00:09:22,880 ‫So now I want to spice things a little bit. 148 00:09:23,150 --> 00:09:35,470 ‫I'm going to do explain, analyze, select ideas from employees where name is equal zests. 149 00:09:36,940 --> 00:09:39,970 ‫Let's do it, you guys, are you feeling it? 150 00:09:40,420 --> 00:09:41,360 ‫You feeling it right? 151 00:09:42,080 --> 00:09:44,340 ‫That was slow. 152 00:09:44,740 --> 00:09:48,000 ‫That was that was deeply slow. 153 00:09:48,010 --> 00:09:48,580 ‫Why? 154 00:09:49,090 --> 00:09:54,340 ‫Because that's what the name column does not have an index. 155 00:09:54,640 --> 00:10:00,760 ‫And that means the only way to actually search through the name the value Z. 156 00:10:00,760 --> 00:10:09,050 ‫S is to actually go one by one and do a sequential scan on employees table. 157 00:10:09,160 --> 00:10:10,380 ‫That's the worst thing. 158 00:10:10,390 --> 00:10:11,790 ‫It's called the full table scan. 159 00:10:12,340 --> 00:10:18,730 ‫You want to avoid full table scans as much as possible over tries to be a little bit smarter by executing 160 00:10:19,000 --> 00:10:26,170 ‫multiple threads, worker threads and does does the sequential scan on parallel, which is still a good. 161 00:10:26,890 --> 00:10:30,280 ‫So the planning two point eighty three milliseconds, which is not much because. 162 00:10:30,640 --> 00:10:35,890 ‫Well, does this does this column, which is the name equal, this filter? 163 00:10:36,040 --> 00:10:38,740 ‫Does this have a workload that does this have an index? 164 00:10:38,920 --> 00:10:42,040 ‫No, it does not have an index move on full tables. 165 00:10:42,130 --> 00:10:42,970 ‫It doesn't have a choice. 166 00:10:43,200 --> 00:10:43,540 ‫Right. 167 00:10:43,780 --> 00:10:45,280 ‫So it went through and scanned. 168 00:10:45,400 --> 00:10:49,910 ‫11 million rows were not necessary. 169 00:10:49,940 --> 00:10:51,110 ‫Well, yeah, it isn't. 170 00:10:51,120 --> 00:10:52,540 ‫So they have to scan the whole thing. 171 00:10:52,810 --> 00:10:54,190 ‫Otherwise, how do you know that? 172 00:10:55,120 --> 00:10:55,820 ‫All of them. 173 00:10:55,830 --> 00:10:56,070 ‫Right. 174 00:10:56,090 --> 00:10:59,160 ‫You have to check every single one of them, which is the slowest thing. 175 00:11:00,080 --> 00:11:03,270 ‫How long it took, it took three seconds. 176 00:11:03,720 --> 00:11:04,420 ‫Yikes. 177 00:11:04,850 --> 00:11:05,640 ‫Yikes. 178 00:11:06,090 --> 00:11:08,100 ‫See, that is so slow. 179 00:11:08,120 --> 00:11:08,570 ‫Yeah. 180 00:11:08,570 --> 00:11:12,390 ‫So it is pretty slow, guys, to do this. 181 00:11:12,410 --> 00:11:19,640 ‫Another another thing that we notice all the time, as I used to do this all the time without knowing 182 00:11:20,080 --> 00:11:26,960 ‫this, that is a command and sequel called like like if you don't know, hey, is this before and after 183 00:11:27,200 --> 00:11:33,110 ‫you just add this percentage like this is the war squarish. 184 00:11:34,280 --> 00:11:34,730 ‫Why? 185 00:11:34,730 --> 00:11:42,620 ‫Because it is literally has to go through all the rows and does matching in this case. 186 00:11:42,620 --> 00:11:42,980 ‫Right. 187 00:11:43,260 --> 00:11:47,020 ‫It doesn't really matter because we don't have an index began to show you with an index. 188 00:11:47,030 --> 00:11:48,420 ‫How does it does it do it right. 189 00:11:48,740 --> 00:11:54,260 ‫So what this does is it will return all rows that have anything in the beginning, anything of the end, 190 00:11:54,260 --> 00:12:02,000 ‫but has to have Z capital as small in the middle, slow to go one point one one second to do all that 191 00:12:02,000 --> 00:12:02,450 ‫stuff. 192 00:12:02,450 --> 00:12:02,890 ‫But yeah. 193 00:12:02,900 --> 00:12:04,100 ‫So 11 million row. 194 00:12:04,100 --> 00:12:06,710 ‫But we had two workers to do that work for us. 195 00:12:06,830 --> 00:12:07,400 ‫All right, guys. 196 00:12:07,550 --> 00:12:08,010 ‫All right, guys. 197 00:12:08,020 --> 00:12:16,190 ‫So what I want to do here is actually going to create an index on I'm going to name it employee's name 198 00:12:16,880 --> 00:12:21,370 ‫on employees and it is on the name column. 199 00:12:21,680 --> 00:12:24,770 ‫So this was going to take some time to create. 200 00:12:24,770 --> 00:12:25,220 ‫Why? 201 00:12:25,490 --> 00:12:26,510 ‫Because guess what, guys? 202 00:12:26,510 --> 00:12:31,610 ‫It is building the B three bitmap index on top of all the stuff. 203 00:12:31,610 --> 00:12:32,040 ‫Right. 204 00:12:32,240 --> 00:12:38,990 ‫So it has to fit all the 11 million rows and does all this magic to actually create the fancy 205 00:12:41,300 --> 00:12:41,920 ‫index. 206 00:12:43,040 --> 00:12:43,820 ‫So that's a way for it. 207 00:12:44,960 --> 00:12:48,470 ‫All right, guys, we've finished creating our index. 208 00:12:48,780 --> 00:12:55,080 ‫So now let's try the same query that we have done just with all the like, just directly equal. 209 00:12:56,210 --> 00:12:59,780 ‫So now if I do this query, look how fast it is. 210 00:13:00,050 --> 00:13:01,400 ‫Still a little bit slow. 211 00:13:01,550 --> 00:13:03,080 ‫Forty seven millisecond. 212 00:13:03,080 --> 00:13:06,820 ‫But damn, that was quick. 213 00:13:06,830 --> 00:13:07,160 ‫Why? 214 00:13:07,160 --> 00:13:11,810 ‫Because now we're doing a bitmap HEB scan on employees. 215 00:13:11,810 --> 00:13:12,130 ‫Right. 216 00:13:12,290 --> 00:13:17,420 ‫So we're using the actual bitmap index scan on employee's name. 217 00:13:17,720 --> 00:13:22,850 ‫So we're actually scanning the employee's name index that we created. 218 00:13:23,000 --> 00:13:31,450 ‫And this is way faster because we have fewer rows to work with and we're going to actually jump between 219 00:13:31,460 --> 00:13:36,350 ‫few rows until we find exactly what we are looking for. 220 00:13:36,610 --> 00:13:36,840 ‫Right. 221 00:13:37,530 --> 00:13:44,540 ‫And obviously, guys, the same thing applies where A, if you only PILT pulled the name, which is 222 00:13:44,540 --> 00:13:48,640 ‫already in this index, is going to be faster than pulling the idea in the name. 223 00:13:49,100 --> 00:13:51,380 ‫I'm not quite sure about this. 224 00:13:51,410 --> 00:13:59,330 ‫Maybe this actually doesn't hit the heap because the index, the primary key I know this is a MySQL. 225 00:13:59,330 --> 00:14:07,850 ‫I'm not sure about Bosco's, but the primary key is usually stored with every single index in in in 226 00:14:08,030 --> 00:14:09,200 ‫in other databases. 227 00:14:09,200 --> 00:14:09,390 ‫Right. 228 00:14:09,410 --> 00:14:12,530 ‫So the name index here, the primary will be store. 229 00:14:12,530 --> 00:14:19,910 ‫So pulling the primary key from the name index will be fast because it's it's right there, right with 230 00:14:19,910 --> 00:14:20,050 ‫it. 231 00:14:20,300 --> 00:14:25,520 ‫So however, if you're if you have more than other tables, then we had to go to the table structure 232 00:14:25,520 --> 00:14:26,880 ‫and that will be a little bit slow. 233 00:14:27,080 --> 00:14:32,120 ‫All right, guys, so what will happen if I add like here? 234 00:14:32,210 --> 00:14:33,380 ‫What do you think will happen? 235 00:14:36,270 --> 00:14:40,630 ‫Back to this local area, back to the exact same local area. 236 00:14:40,670 --> 00:14:48,530 ‫Guys, why let's call this again as the let's do something else so we don't have we had the cash like 237 00:14:48,540 --> 00:14:51,150 ‫this back to the same slow query. 238 00:14:51,150 --> 00:14:51,600 ‫Why? 239 00:14:51,840 --> 00:14:55,050 ‫Because we could not scan the index. 240 00:14:55,050 --> 00:14:55,560 ‫Why? 241 00:14:55,770 --> 00:14:58,570 ‫This is this is a lot of people make this mistake. 242 00:14:58,950 --> 00:14:59,430 ‫Yeah. 243 00:14:59,580 --> 00:15:02,520 ‫The name column has an index. 244 00:15:02,850 --> 00:15:08,550 ‫But what you did is you're not actually asking for a for literal value. 245 00:15:08,880 --> 00:15:11,100 ‫You're asking for an expression. 246 00:15:11,670 --> 00:15:20,160 ‫And there is no index that will satisfy this expression because we cannot search the index on this on 247 00:15:20,160 --> 00:15:23,750 ‫this expression because this is not a single value. 248 00:15:24,270 --> 00:15:30,510 ‫So the planner quickly detects this, says, OK, planning time, we check less than then. 249 00:15:30,900 --> 00:15:32,790 ‫Sorry, the filter sucks. 250 00:15:33,480 --> 00:15:40,590 ‫And that's why you have to do explain on your queries and see why are they actually slow because of 251 00:15:40,590 --> 00:15:40,950 ‫this? 252 00:15:41,190 --> 00:15:42,270 ‫Because look at this. 253 00:15:42,270 --> 00:15:44,700 ‫We did a parallel sequential scan. 254 00:15:44,700 --> 00:15:45,480 ‫Again, parallel. 255 00:15:45,540 --> 00:15:49,380 ‫Not all databases do parallel scanning, by the way, but possibly still. 256 00:15:50,280 --> 00:15:51,330 ‫We we had to do it. 257 00:15:51,330 --> 00:15:56,980 ‫We had to go to the thinking table and scan my 11 million rolls if I have a billion rows, is going 258 00:15:56,980 --> 00:15:58,340 ‫to be even slower than that. 259 00:15:58,740 --> 00:16:00,930 ‫So we are going through all that stuff. 260 00:16:00,930 --> 00:16:04,230 ‫And I'm going to explain all this number and then other videos, because I don't want to make this video 261 00:16:04,230 --> 00:16:04,860 ‫longer than that. 262 00:16:04,860 --> 00:16:08,210 ‫I'm going to make an advance video discussing these actual numbers. 263 00:16:08,240 --> 00:16:08,850 ‫What do they mean? 264 00:16:08,850 --> 00:16:09,480 ‫All that stuff. 265 00:16:09,480 --> 00:16:13,320 ‫But but yeah, we did a parallel scan on employees. 266 00:16:13,320 --> 00:16:16,740 ‫Guys, look at that is so slow we could undo in lining. 267 00:16:16,890 --> 00:16:18,660 ‫We couldn't do any optimizations. 268 00:16:19,000 --> 00:16:20,430 ‫This is an expression, obviously. 269 00:16:20,430 --> 00:16:22,920 ‫So that's we had to we had to do all that stuff. 270 00:16:22,920 --> 00:16:24,240 ‫So we took a hit. 271 00:16:24,510 --> 00:16:30,680 ‫We took a hit despite us having an index, guys, despite X having an index. 272 00:16:30,720 --> 00:16:36,660 ‫So, guys, that's very quickly indexing in a nutshell in a very, very quick video. 273 00:16:36,660 --> 00:16:37,980 ‫I wanted to explain of that. 274 00:16:38,160 --> 00:16:43,260 ‫Yeah, having an index does not mean that the database will always use it. 275 00:16:43,260 --> 00:16:48,180 ‫It's going to plan and according to the planner, will decide to use the index or not. 276 00:16:48,330 --> 00:16:56,220 ‫It's up to you as as the engineer who is executing queries against the database to actually give hints 277 00:16:56,220 --> 00:16:58,920 ‫to the database to actually use the index or not. 278 00:16:59,490 --> 00:17:01,680 ‫This is a bad query in general. 279 00:17:01,830 --> 00:17:04,290 ‫Select Star is a bad query in general. 280 00:17:04,290 --> 00:17:04,770 ‫Why? 281 00:17:04,920 --> 00:17:12,270 ‫Because if because if you think about the cost to go to the actual disk and pull all the information. 282 00:17:13,520 --> 00:17:15,350 ‫That is expensive, right? 283 00:17:15,770 --> 00:17:22,490 ‫Really, really depends on how much like, let's say you have a blob or column on the field, on the 284 00:17:22,490 --> 00:17:24,730 ‫table that is expensive to pull up. 285 00:17:24,740 --> 00:17:26,600 ‫So if you don't need it, don't ask for it. 286 00:17:26,960 --> 00:17:27,290 ‫Right. 287 00:17:27,440 --> 00:17:30,320 ‫And only ask for things that are absolutely there. 288 00:17:30,650 --> 00:17:34,700 ‫And if you can ask for things that are in the index. 289 00:17:34,700 --> 00:17:39,020 ‫Right, I'm going to talk about that in another video where is like a multi column index where you can 290 00:17:39,260 --> 00:17:44,150 ‫add other columns in the index that you're about to create. 291 00:17:44,180 --> 00:17:51,200 ‫This is called In Lining where where as I search my beautiful index, my neat, efficient index, I 292 00:17:51,200 --> 00:17:53,050 ‫want their name, for example. 293 00:17:53,330 --> 00:17:53,530 ‫Right. 294 00:17:53,540 --> 00:17:55,580 ‫That's bad example, but I want the name. 295 00:17:55,580 --> 00:18:01,220 ‫I'm searching for the ID and I don't want the name or the name is right there is just literally sitting 296 00:18:01,220 --> 00:18:02,060 ‫in my index. 297 00:18:02,060 --> 00:18:03,750 ‫So I just put and retrieve it to the user. 298 00:18:03,770 --> 00:18:10,040 ‫I don't have to go back on disk and find the location, seek and desk if that's a mechanical drive or 299 00:18:10,370 --> 00:18:12,230 ‫pull the page in case of SSD. 300 00:18:13,370 --> 00:18:15,010 ‫All right, guys, that's it for me today. 301 00:18:15,200 --> 00:18:16,430 ‫I'm going to see you in the next one. 302 00:18:16,580 --> 00:18:17,510 ‫You guys stay awesome. 303 00:18:17,690 --> 00:18:18,110 ‫Goodbye.