1 00:00:00,090 --> 00:00:05,820 ‫In this video, I want to go through some of the best practices that you can create indexes with so 2 00:00:05,820 --> 00:00:09,420 ‫that you can gain the best performance for your queries. 3 00:00:09,600 --> 00:00:15,150 ‫I'm going to execute four different queries on this beautiful table that I have here. 4 00:00:15,750 --> 00:00:21,960 ‫I have a table of three integers, A, B and C, and I'm going to play with the indexes on A and B, 5 00:00:22,140 --> 00:00:27,720 ‫so I'm going to create a configuration where I have an index on both A and B, I'm going to create a 6 00:00:27,720 --> 00:00:33,210 ‫configuration where I have a composite index on A and B, and I'm going to create a different kind of 7 00:00:33,210 --> 00:00:33,670 ‫configuration. 8 00:00:33,670 --> 00:00:38,430 ‫I want to have a composite index and a normal index on the right and I'm going to execute the following 9 00:00:38,430 --> 00:00:45,930 ‫queries where I'm going to query only on a and a great query only on B, and I'm going to query A and 10 00:00:45,930 --> 00:00:51,540 ‫B and query on A or B, because that's how most of the stuff that we do on the back end. 11 00:00:51,540 --> 00:00:53,990 ‫We query with these particular filters. 12 00:00:54,000 --> 00:00:58,890 ‫I understand we have in the statements we have other kind of aggregates, but we'll take it step by 13 00:00:58,890 --> 00:00:59,210 ‫step. 14 00:00:59,220 --> 00:01:05,100 ‫I you know, something useful with this video that said, let's just jump into this and play with this 15 00:01:05,100 --> 00:01:05,310 ‫thing. 16 00:01:05,490 --> 00:01:12,570 ‫All right, guys, so let's start with creating an index on a at another index on B, it's going to 17 00:01:12,570 --> 00:01:20,370 ‫create an index on this A, by the way, I think I have around 12 million rows in this table, not much, 18 00:01:20,370 --> 00:01:23,660 ‫but enough for us to experiment with. 19 00:01:25,130 --> 00:01:27,980 ‫So I'm going to create another index on B. 20 00:01:29,810 --> 00:01:38,140 ‫All right, all done as a test, so we have an index on an index number, so let's check our queries. 21 00:01:38,600 --> 00:01:45,460 ‫So the first question I want to do is select C from test where A's equals 70. 22 00:01:46,040 --> 00:01:53,050 ‫And I'm going to prefect's this was explain and analyze that will tell us how much of the analysis take 23 00:01:53,270 --> 00:02:02,600 ‫talk and the plan that has been used to do this thing about jump into it, do it fast, faster, this 24 00:02:02,930 --> 00:02:03,840 ‫very faster. 25 00:02:04,520 --> 00:02:04,820 ‫All right. 26 00:02:04,820 --> 00:02:05,480 ‫So what happened? 27 00:02:05,780 --> 00:02:10,550 ‫Obviously, the Postgres decided to use the index. 28 00:02:10,550 --> 00:02:16,040 ‫However, it created a bitmap in order to query this index because there are a lot of rows. 29 00:02:16,040 --> 00:02:17,390 ‫Look at how many rows we have. 30 00:02:17,390 --> 00:02:19,900 ‫Nine hundred nine thousand roll. 31 00:02:20,030 --> 00:02:27,230 ‫So it has to create a bitmap Y because we have to jump back to the heap to pull C. 32 00:02:27,470 --> 00:02:28,860 ‫C is not in the index. 33 00:02:28,860 --> 00:02:30,680 ‫C is not included in the index. 34 00:02:30,830 --> 00:02:35,700 ‫So we have to jump back to the heap to pull the value of C right after looking at. 35 00:02:35,870 --> 00:02:42,530 ‫So we are building a map of all the values, the role, the tuples in order to jump back to the table 36 00:02:42,530 --> 00:02:43,060 ‫and pull that. 37 00:02:43,220 --> 00:02:48,810 ‫And that explains why we have a bitmap index scan scanning the index itself that we just created on 38 00:02:48,830 --> 00:02:54,950 ‫a and then we jump back to the PEOP to actually pull those roles and we have to talk to one hundred 39 00:02:54,960 --> 00:02:56,350 ‫fifty three millisecond. 40 00:02:56,630 --> 00:02:57,770 ‫Not that bad. 41 00:02:58,520 --> 00:02:58,960 ‫All right. 42 00:02:59,210 --> 00:03:05,360 ‫So that's, that's why we have done and what I want to do here and show you that we can easily change 43 00:03:05,360 --> 00:03:12,680 ‫this kind of infer the decision of Postgres by actually limiting the number. 44 00:03:12,680 --> 00:03:12,930 ‫Of course. 45 00:03:12,990 --> 00:03:19,520 ‫So if I say OK, give me only two queries back two rows back, look what the public has decided to do 46 00:03:19,970 --> 00:03:22,340 ‫because we have only two rows came back. 47 00:03:23,210 --> 00:03:27,950 ‫Postgres doesn't need to take the overhead to build a bitmap because building A takes a little bit of 48 00:03:27,950 --> 00:03:28,280 ‫time. 49 00:03:28,590 --> 00:03:30,530 ‫You have to compile this structure. 50 00:03:30,530 --> 00:03:35,570 ‫You have to come up with old values, but it says, Oh, I only found two rows. 51 00:03:35,570 --> 00:03:36,820 ‫I don't need to build a MacNab. 52 00:03:36,830 --> 00:03:39,080 ‫So I decided to do an index scan. 53 00:03:39,080 --> 00:03:43,370 ‫A clean index against these roles are are continuous. 54 00:03:43,370 --> 00:03:44,890 ‫They are order, they are sorted. 55 00:03:45,080 --> 00:03:49,100 ‫So the values that I'm going to get, I don't need I don't need a map for this. 56 00:03:49,100 --> 00:03:54,950 ‫I just come go pull the those values directly from the heap. 57 00:03:54,950 --> 00:03:55,330 ‫All right. 58 00:03:56,540 --> 00:04:00,320 ‫So we use the index about to go to the other query. 59 00:04:00,530 --> 00:04:04,900 ‫My other query is B is equal hundred. 60 00:04:04,940 --> 00:04:07,340 ‫So stop going see what we're being Soquel. 61 00:04:07,340 --> 00:04:08,200 ‫OK, let's take a look. 62 00:04:09,230 --> 00:04:11,480 ‫So it took two hundred and fifty millisecond. 63 00:04:11,660 --> 00:04:13,100 ‫We also use the index. 64 00:04:13,100 --> 00:04:16,820 ‫This time we use the B index because the allocating on B natural. 65 00:04:16,830 --> 00:04:23,450 ‫Now I'm going to use then next one B and then I hit the table directly with that heap scan using the 66 00:04:23,450 --> 00:04:25,280 ‫values that I picked up from B. 67 00:04:26,150 --> 00:04:26,480 ‫Right. 68 00:04:27,050 --> 00:04:28,310 ‫This is also a bad idea. 69 00:04:28,430 --> 00:04:30,950 ‫Right, because you're not going to pull ten thousand rolls. 70 00:04:30,950 --> 00:04:34,520 ‫What are you going to do with ten thousand rolls unless you're doing some sort of aggregate. 71 00:04:34,520 --> 00:04:35,540 ‫That's a different story. 72 00:04:35,930 --> 00:04:38,650 ‫But all right, so we have the value so we're using the index. 73 00:04:38,660 --> 00:04:38,980 ‫All right. 74 00:04:39,020 --> 00:04:39,920 ‫So let's do this. 75 00:04:40,430 --> 00:04:47,570 ‫The query now, but we're going to query on both A and B, so I'm going to do equal, I don't know, 76 00:04:47,570 --> 00:04:49,760 ‫a hundred and B is equal to hundred. 77 00:04:50,600 --> 00:04:52,810 ‫Let's see what world will decide. 78 00:04:52,880 --> 00:04:55,430 ‫So now we're going to do an add between those square. 79 00:04:55,440 --> 00:05:03,980 ‫So both because the size is OK in parallel, I'm going to scan the index and I'm going to scan the B 80 00:05:03,980 --> 00:05:06,080 ‫in this because you you gave me two values. 81 00:05:06,080 --> 00:05:06,380 ‫Right? 82 00:05:06,620 --> 00:05:11,090 ‫And Skåne, those two values, I'm going to scan them and I'm going to build a bitmap. 83 00:05:11,300 --> 00:05:11,570 ‫Right. 84 00:05:11,750 --> 00:05:16,730 ‫And literally you've got a bitmap and then literally just and those bits, maps, you're going to end 85 00:05:16,730 --> 00:05:21,230 ‫up with a certain values, like some values will zero up some voters want. 86 00:05:21,770 --> 00:05:26,330 ‫And then you're going to end up with a set of tuples that you're going to hit the heap and pull the 87 00:05:26,330 --> 00:05:26,750 ‫table. 88 00:05:27,260 --> 00:05:28,610 ‫And that is extremely fast. 89 00:05:29,090 --> 00:05:33,470 ‫As you can see, it's just happened is like how how many values to do really return? 90 00:05:33,470 --> 00:05:33,710 ‫Right. 91 00:05:33,710 --> 00:05:35,000 ‫We return six rows. 92 00:05:35,030 --> 00:05:36,380 ‫That's why it's pretty fast. 93 00:05:36,590 --> 00:05:40,690 ‫And the ending operation is extremely fast as well. 94 00:05:40,730 --> 00:05:41,870 ‫So that's what postcodes do. 95 00:05:41,870 --> 00:05:45,710 ‫In case of an ad, the Steiner, what would an order do? 96 00:05:46,760 --> 00:05:53,240 ‫I'm just going to change this to an R pu with an OR it took obviously longer. 97 00:05:53,240 --> 00:05:59,570 ‫And that explains a lot, obviously, because or always brings more rolls than and. 98 00:05:59,690 --> 00:06:00,140 ‫Right. 99 00:06:00,290 --> 00:06:01,880 ‫And now we did exactly the same. 100 00:06:01,910 --> 00:06:03,440 ‫We scanned this index. 101 00:06:03,440 --> 00:06:12,350 ‫We scanned this index, we brought the bitmap, we ordered them got a lot more rows than we are or where 102 00:06:12,350 --> 00:06:12,950 ‫we wanted. 103 00:06:13,220 --> 00:06:14,420 ‫And then we got back. 104 00:06:14,420 --> 00:06:18,050 ‫That is also pay attention to what this really happened. 105 00:06:18,050 --> 00:06:18,350 ‫Right. 106 00:06:18,500 --> 00:06:23,360 ‫We're doing double the work we're doing, scanning this index scanning index and going to the table, 107 00:06:23,360 --> 00:06:26,500 ‫giving it just rolls and then going to the table to pull these rolls. 108 00:06:26,690 --> 00:06:27,140 ‫All right. 109 00:06:27,630 --> 00:06:28,010 ‫All right. 110 00:06:28,010 --> 00:06:29,090 ‫So now I'm going to. 111 00:06:29,280 --> 00:06:39,540 ‫All of these suckers and I create a composite index instead, as do that just to be all right, both 112 00:06:40,290 --> 00:06:41,150 ‫indexes dropped. 113 00:06:41,380 --> 00:06:50,790 ‫Now going to do is create index on test A B, so this is a composite index a little bit different. 114 00:06:50,790 --> 00:06:51,100 ‫Right. 115 00:06:51,300 --> 00:06:54,200 ‫So we have one index that have both values. 116 00:06:54,330 --> 00:07:00,750 ‫So it's going to take longer to create and it's going to be more efficient for queries that have both 117 00:07:00,750 --> 00:07:07,020 ‫A and B, especially the and cases, because the cases we did a lot of work and guys don't pay attention 118 00:07:07,020 --> 00:07:13,470 ‫to the values that the how long it took, because this is really depends on the machine, depends on 119 00:07:13,470 --> 00:07:14,250 ‫so many things. 120 00:07:14,370 --> 00:07:19,980 ‫I wanted to pay attention to how much work postscripts do, and that's what I'm explaining essentially. 121 00:07:20,670 --> 00:07:21,180 ‫All right. 122 00:07:21,510 --> 00:07:24,580 ‫Let's do the first query on this. 123 00:07:25,230 --> 00:07:30,140 ‫So now I'm going to only query against air and let's see what we're going to do. 124 00:07:30,270 --> 00:07:30,900 ‫Look at that. 125 00:07:31,410 --> 00:07:39,570 ‫Actually, Postgres decided to use the A B index despite me querying on air. 126 00:07:39,630 --> 00:07:40,110 ‫Why? 127 00:07:40,290 --> 00:07:47,370 ‫Because AEL is in the left hand side when I created that and that dang that matters because it's in 128 00:07:47,370 --> 00:07:50,870 ‫the left hand side, the values can be easily scanned. 129 00:07:50,880 --> 00:07:52,440 ‫That's the PostgreSQL rule. 130 00:07:52,500 --> 00:07:55,320 ‫You can scan the values from the left, but you cannot scan of them. 131 00:07:55,320 --> 00:07:55,560 ‫Right. 132 00:07:55,560 --> 00:07:57,840 ‫Just let the index are both left or right. 133 00:07:57,840 --> 00:07:58,190 ‫Right. 134 00:07:58,890 --> 00:08:05,580 ‫And when you build that query, the index decided to use the postgres, decided to use their index and 135 00:08:05,580 --> 00:08:06,840 ‫then jump back to the heap. 136 00:08:06,840 --> 00:08:07,060 ‫Right. 137 00:08:07,080 --> 00:08:11,910 ‫So it took two to fifty millisecond to find all those values. 138 00:08:12,450 --> 00:08:19,350 ‫And I believe if I do a limit, let's say limit ten, as you can see, it's an index scan instead of 139 00:08:19,350 --> 00:08:21,890 ‫a beat with the bitmap. 140 00:08:22,080 --> 00:08:23,640 ‫So it's like a little bit faster. 141 00:08:23,820 --> 00:08:25,430 ‫All right, let's do the B. 142 00:08:25,590 --> 00:08:29,100 ‫What do fossickers do when I do A B? 143 00:08:32,040 --> 00:08:33,990 ‫Oh, you felt that, huh? 144 00:08:34,740 --> 00:08:36,190 ‫It's a battle sequence. 145 00:08:36,570 --> 00:08:40,270 ‫Why Bosco's will not use the index. 146 00:08:40,320 --> 00:08:40,710 ‫Why? 147 00:08:40,710 --> 00:08:48,360 ‫Because you cannot use this composite index when you're scanning a filter that is not the left hand 148 00:08:48,360 --> 00:08:48,580 ‫side. 149 00:08:48,600 --> 00:08:54,250 ‫It's either A and B or A, you cannot use B, so be very careful. 150 00:08:54,480 --> 00:08:59,670 ‫So if you have a composite index on IMDB querying, OMB will not use the index. 151 00:08:59,970 --> 00:09:06,540 ‫So what the table does, what Postgres does is just jump directly to the table and does a full table 152 00:09:06,540 --> 00:09:10,020 ‫scan and using multiple walker threats to do that. 153 00:09:10,050 --> 00:09:10,280 ‫Right. 154 00:09:10,350 --> 00:09:13,210 ‫So obviously very, very slow for very careful. 155 00:09:13,300 --> 00:09:15,060 ‫Very careful with that. 156 00:09:15,210 --> 00:09:15,720 ‫All right. 157 00:09:16,320 --> 00:09:22,890 ‫What happens if I do A and B equals 70 and B equal to whatever values? 158 00:09:23,220 --> 00:09:25,200 ‫Look how vast this thing is. 159 00:09:25,530 --> 00:09:27,730 ‫Point five, half a millisecond. 160 00:09:28,140 --> 00:09:31,080 ‫This is absolutely insane. 161 00:09:31,290 --> 00:09:32,760 ‫That is the best case scenario. 162 00:09:32,760 --> 00:09:39,740 ‫If you really all the time if your queries are all the time like this, have that value add value. 163 00:09:40,290 --> 00:09:43,680 ‫It's a great way to create an index composite index. 164 00:09:43,870 --> 00:09:50,580 ‫If if if you if your application only does that and hopefully only does that and you're also you're 165 00:09:50,580 --> 00:09:56,820 ‫coding in a best case scenario is to create a composite index. 166 00:09:57,070 --> 00:09:57,660 ‫Yeah. 167 00:09:57,690 --> 00:10:03,570 ‫It's more costly to create a composite index compared to a single index because it stores more values 168 00:10:03,570 --> 00:10:04,170 ‫obviously. 169 00:10:04,200 --> 00:10:04,490 ‫Right. 170 00:10:04,860 --> 00:10:08,010 ‫But the performance is remarkable. 171 00:10:08,070 --> 00:10:08,340 ‫Right. 172 00:10:09,120 --> 00:10:09,560 ‫All right. 173 00:10:09,570 --> 00:10:16,000 ‫How about or how about all of this or this scar or. 174 00:10:17,640 --> 00:10:21,880 ‫No, or and or all bets are off. 175 00:10:21,900 --> 00:10:24,060 ‫You cannot use any of the indexes. 176 00:10:24,060 --> 00:10:24,300 ‫Right? 177 00:10:25,110 --> 00:10:30,920 ‫I mean, Postgres decide in this case, it's useless for me to query and try to use that because I know 178 00:10:30,960 --> 00:10:37,650 ‫what might be faster for me just to go and jump to the table and do a full table scan instead. 179 00:10:37,680 --> 00:10:39,960 ‫So that's what both scores and scores does. 180 00:10:40,590 --> 00:10:44,040 ‫Most databases do all these decisions on behalf of us? 181 00:10:44,330 --> 00:10:47,310 ‫OK, maybe it's better to do this, but there it is. 182 00:10:47,520 --> 00:10:51,900 ‫I mean, if you look at this, OK, you have an index, Owney, you can query on a I mean, you have 183 00:10:51,900 --> 00:10:56,570 ‫an index on IMDB, but you can query, you can use the index for a but decided. 184 00:10:56,580 --> 00:10:58,500 ‫No, you know what, it's way slower. 185 00:10:58,500 --> 00:10:59,190 ‫I'm going to do this. 186 00:10:59,190 --> 00:11:02,190 ‫I'm going to do a bit more and then go for B, I don't have an index. 187 00:11:02,190 --> 00:11:08,670 ‫So anyway, I have to go jump there and do the full scan anyway. 188 00:11:08,670 --> 00:11:08,900 ‫Right. 189 00:11:09,180 --> 00:11:13,830 ‫What happens if I do a limit limit one second just as can on this. 190 00:11:13,830 --> 00:11:17,880 ‫It's not even parallel, just sequential because we know we're going to pull only the first row. 191 00:11:18,180 --> 00:11:21,180 ‫That's fast and that's directly on the heap. 192 00:11:21,390 --> 00:11:21,780 ‫All right. 193 00:11:22,830 --> 00:11:23,280 ‫All right, guys. 194 00:11:23,280 --> 00:11:25,650 ‫The first final thing we want to do is. 195 00:11:26,680 --> 00:11:32,830 ‫We know what we know that, B, suffered queries on BSR when I have a composite index, so what I'm 196 00:11:32,830 --> 00:11:37,750 ‫going to do here is show you that I have a composite index on IMDB. 197 00:11:37,840 --> 00:11:39,700 ‫What I'm going to do is create an index. 198 00:11:40,930 --> 00:11:48,820 ‫On test beep in addition to this index and then execute the same queries again and see how things are 199 00:11:48,820 --> 00:11:49,120 ‫doing. 200 00:11:49,330 --> 00:11:56,560 ‫All right, so what will happen now if I query on and we saw this because we're going to use the composite 201 00:11:56,560 --> 00:11:56,970 ‫index. 202 00:11:56,980 --> 00:11:57,610 ‫That's fine. 203 00:11:57,610 --> 00:11:58,650 ‫That's fast, right? 204 00:11:58,660 --> 00:11:59,350 ‫That's nice. 205 00:11:59,620 --> 00:11:59,980 ‫Right. 206 00:12:00,340 --> 00:12:03,340 ‫How about if I use B, if I use B now. 207 00:12:03,660 --> 00:12:07,720 ‫So we're going to use your beautiful index that you just created for us. 208 00:12:07,730 --> 00:12:12,790 ‫So that's kind of a compromise as you have the backing engine, as you database engineer, you can know, 209 00:12:12,860 --> 00:12:15,430 ‫OK, I might add some queries. 210 00:12:15,850 --> 00:12:21,220 ‫I'm going to use the B to query this so I might need an index there. 211 00:12:22,000 --> 00:12:27,160 ‫So you going to create an index in this case to speed up these queries? 212 00:12:27,160 --> 00:12:27,450 ‫Right. 213 00:12:27,670 --> 00:12:33,610 ‫And then obviously we're going to head the table and and the heap in order to do that. 214 00:12:33,640 --> 00:12:34,240 ‫All right. 215 00:12:35,050 --> 00:12:37,510 ‫Let's do A and B and B. 216 00:12:38,980 --> 00:12:41,200 ‫Absolutely genius and fast look at this. 217 00:12:42,150 --> 00:12:47,430 ‫Right, we're going to use the composite index because to be right, right, how about A or B? 218 00:12:51,580 --> 00:12:59,120 ‫Or now that we have an index on B and an index on a B now because oh, what is that? 219 00:12:59,160 --> 00:13:06,100 ‫And actually we have an index and maybe I can query the index on B, do A, build A bitmap and then 220 00:13:06,100 --> 00:13:06,900 ‫I get a query. 221 00:13:07,390 --> 00:13:11,690 ‫Since we're counting on E, I can still use the composite index because it's a left hand side. 222 00:13:11,910 --> 00:13:18,690 ‫I can use the query index and then build my bitmap beautiful but not below single bitmap jump to the 223 00:13:18,690 --> 00:13:21,630 ‫heap and then pull the rolls that I want. 224 00:13:21,870 --> 00:13:30,440 ‫All right, guys, that was some of the best practices that you can do to work with indexes, right? 225 00:13:30,490 --> 00:13:32,360 ‫This is just just some of them, right? 226 00:13:32,400 --> 00:13:35,340 ‫Guys, sometimes indexes can shoot you in the foot. 227 00:13:35,520 --> 00:13:38,550 ‫Sometimes they are great, but just learned how to use them. 228 00:13:39,120 --> 00:13:45,090 ‫I can't possibly in ten minutes explain all possible combinations, but hopefully this helps you kind 229 00:13:45,090 --> 00:13:49,920 ‫of understand what the database is doing well when you are actually acquiring the index. 230 00:13:50,040 --> 00:13:52,110 ‫This is very, very clearly the cause of death and injury. 231 00:13:52,110 --> 00:13:57,840 ‫And this is one of the skill set that you have to put in your belt when you're building a database. 232 00:13:58,230 --> 00:14:00,740 ‫Applications as and should hope you enjoyed this video. 233 00:14:00,960 --> 00:14:03,090 ‫Give it a like if you like it, give it a dislike. 234 00:14:03,090 --> 00:14:05,340 ‫If you don't like it, I'm going to see in the next one. 235 00:14:05,370 --> 00:14:06,160 ‫You guys stay awesome. 236 00:14:06,580 --> 00:14:07,070 ‫Good bye.