1
00:00:00,120 --> 00:00:01,180
‫What's going on, guys?

2
00:00:01,200 --> 00:00:07,010
‫My name is Hussein, and in this video I want to discuss indexing, database indexing.

3
00:00:07,020 --> 00:00:14,140
‫It's a very interesting topic and I think everybody can engineer specifically if you are involved with

4
00:00:14,160 --> 00:00:20,190
‫databases and relational databases, any kind of databases, you need to know what indexes and how indexes

5
00:00:20,190 --> 00:00:22,650
‫work and what is the benefit of indexing.

6
00:00:22,660 --> 00:00:26,600
‫So on this video will describe what is indexing.

7
00:00:26,610 --> 00:00:31,050
‫We're going to work with indexes, we're going to work without indexing, and we're going to see how

8
00:00:31,050 --> 00:00:35,790
‫the performance differ, whether you have an index or you don't have index.

9
00:00:35,940 --> 00:00:38,130
‫So first of all, what is an index?

10
00:00:38,490 --> 00:00:47,940
‫An index is is a data structure that you build and you assign on top of an existing table that basically

11
00:00:47,940 --> 00:00:57,270
‫looks through your table and then tries to analyze it and summarizes so that it can create kind of shortcuts.

12
00:00:57,810 --> 00:01:03,930
‫They the best way I can think of an index is like, if you know these, Fairvale, if you ever seen

13
00:01:03,930 --> 00:01:12,480
‫a secretary's hand bulk or phone book where you have like thick big book and it is labeled, there is

14
00:01:12,480 --> 00:01:16,470
‫like a label colored A, B, C, D until Z.

15
00:01:16,470 --> 00:01:16,840
‫Right.

16
00:01:17,250 --> 00:01:25,530
‫So the the letter A. starts with all the companies phone books that that start with the letter A and

17
00:01:25,530 --> 00:01:25,880
‫B..

18
00:01:26,160 --> 00:01:27,330
‫So with all the B's.

19
00:01:27,330 --> 00:01:27,610
‫Right.

20
00:01:27,630 --> 00:01:28,440
‫And C's all that.

21
00:01:28,560 --> 00:01:35,940
‫So if you want to look through, find a specific company name, phone number and you know like oh is

22
00:01:35,940 --> 00:01:37,860
‫it, I don't know, SQL Xebra.

23
00:01:37,980 --> 00:01:40,920
‫So you go immediately, you go to the Z, right.

24
00:01:40,980 --> 00:01:42,630
‫And then you start searching there.

25
00:01:42,970 --> 00:01:50,250
‫That's the simplest thing I can think of that and X Y and basically the index guys are are are two types

26
00:01:50,250 --> 00:01:58,110
‫so far that we know of something called B three and LSM trees, and I'm not going to go through that.

27
00:01:59,250 --> 00:02:01,890
‫The depth of how they are constructed.

28
00:02:01,890 --> 00:02:06,810
‫This is outside of the scope of this video, but essentially it's a, it's a data structure and you

29
00:02:06,810 --> 00:02:07,470
‫can look it up.

30
00:02:08,280 --> 00:02:12,030
‫It's basically allowing you to search very effectively.

31
00:02:12,060 --> 00:02:12,450
‫Right.

32
00:02:13,050 --> 00:02:16,410
‫And you find quickly what you're looking for.

33
00:02:16,800 --> 00:02:25,500
‫So guys, here I have a table called employees with around 11 million rows.

34
00:02:26,160 --> 00:02:28,680
‫That's a quite large table.

35
00:02:29,490 --> 00:02:33,750
‫So I have an ID field is integer.

36
00:02:34,110 --> 00:02:36,390
‫It's not null and it's sequential.

37
00:02:36,390 --> 00:02:41,670
‫So every time I enter a row, I don't touch this field and it will automatically insert and increment

38
00:02:41,700 --> 00:02:42,180
‫the value.

39
00:02:42,630 --> 00:02:44,310
‫And it's also a primary keep.

40
00:02:45,060 --> 00:02:45,960
‫That means unique.

41
00:02:45,960 --> 00:02:46,320
‫Right.

42
00:02:47,040 --> 00:02:51,030
‫And it also has an index by default.

43
00:02:51,210 --> 00:02:54,990
‫Every primary key has an index by default, and it's a B three index.

44
00:02:56,490 --> 00:02:58,680
‫So that's what I have here.

45
00:02:59,540 --> 00:03:05,960
‫And I also have a name field on this, which is we do have an index is just a bunch of characters,

46
00:03:06,500 --> 00:03:07,160
‫nothing fancy.

47
00:03:07,160 --> 00:03:08,420
‫It's not really the actual name.

48
00:03:08,420 --> 00:03:13,550
‫I just spent some time generating a hundred million random strength's.

49
00:03:13,550 --> 00:03:13,880
‫Right.

50
00:03:14,410 --> 00:03:25,730
‫So so now if I, for example, to clear this up and do select ideas from employees where I'd equal thousand,

51
00:03:27,320 --> 00:03:31,070
‫I will return the thousands.

52
00:03:31,070 --> 00:03:32,030
‫I'd write that.

53
00:03:32,060 --> 00:03:33,650
‫That's not fancy.

54
00:03:33,950 --> 00:03:38,840
‫But if I do select start, I will get everything back.

55
00:03:39,770 --> 00:03:45,860
‫I.D. and name and one is a name, so don't don't pay attention to actual name is just this drink and

56
00:03:45,860 --> 00:03:47,150
‫that's the idea of the strike here.

57
00:03:47,660 --> 00:03:52,910
‫So guys, what I want to do here is I want to go through a steps of different queries.

58
00:03:52,910 --> 00:03:58,180
‫I'm going to show you the performance of each of these queries in order to do that in Postgres.

59
00:03:58,430 --> 00:04:07,100
‫That's the database I'm using, by the way, Postgres, I'm going to do explain, analyze, which will

60
00:04:07,100 --> 00:04:12,140
‫actually explain the query, then I'm going to do and I'm going to tell me how long it took.

61
00:04:12,170 --> 00:04:12,420
‫Right.

62
00:04:12,440 --> 00:04:13,870
‫Which is which is really neat.

63
00:04:14,210 --> 00:04:20,750
‫So I'm going to do a select ID from employees where ID equal, let's say two thousand because the first

64
00:04:20,750 --> 00:04:21,380
‫one is cached.

65
00:04:21,380 --> 00:04:23,980
‫I don't want to mess up the statistics.

66
00:04:24,350 --> 00:04:24,680
‫All right.

67
00:04:24,690 --> 00:04:25,780
‫So what happened here guys?

68
00:04:26,360 --> 00:04:28,340
‫Let's let's decipher what happened.

69
00:04:28,610 --> 00:04:35,420
‫It says, OK, your predicate filter, you're trying to find ID two thousand.

70
00:04:35,420 --> 00:04:38,590
‫That's the employee with ID two thousand and an order.

71
00:04:38,600 --> 00:04:46,790
‫I find that I didn't go through the actual table, 11 million and I searched sequentially.

72
00:04:47,120 --> 00:04:55,820
‫No, I actually scanned the index, which is way faster because the index is always smaller than the

73
00:04:55,820 --> 00:05:02,360
‫actual table, because it tells you that OK, it is structured in a way so that if you search and I'm

74
00:05:02,360 --> 00:05:07,790
‫going to put an animation here a little bit, you can see like if I went if I'd idea number two thousand,

75
00:05:08,300 --> 00:05:13,280
‫start searching, OK, two thousand is between four thousand and and one thousand.

76
00:05:13,280 --> 00:05:15,950
‫So it was OK, let me go to this three.

77
00:05:15,950 --> 00:05:21,650
‫So it goes you go to another three and then search for from another between this number and this number.

78
00:05:21,650 --> 00:05:23,980
‫So until you actually find the number.

79
00:05:24,020 --> 00:05:24,130
‫Right.

80
00:05:24,200 --> 00:05:26,120
‫That's how Beatriz's basically work.

81
00:05:26,360 --> 00:05:26,590
‫Right.

82
00:05:27,290 --> 00:05:33,250
‫So now what I did, I only scanned the index, which is very, very effective.

83
00:05:33,280 --> 00:05:33,600
‫Right.

84
00:05:34,130 --> 00:05:34,990
‫He pitches.

85
00:05:35,210 --> 00:05:36,950
‫That's another interesting thing.

86
00:05:36,950 --> 00:05:41,480
‫So says the value that I actually queried, which is the ID.

87
00:05:41,840 --> 00:05:48,350
‫I did not have to go to the heap to fetch this information because the ID, since that's the only thing

88
00:05:48,350 --> 00:05:50,480
‫you're selecting is in the index.

89
00:05:50,600 --> 00:05:52,250
‫So I just pulled it in line.

90
00:05:52,250 --> 00:05:53,570
‫That's called inline query.

91
00:05:53,840 --> 00:05:59,630
‫If the information is available in the index, that is the sweetest query to our database engineer.

92
00:06:00,020 --> 00:06:07,220
‫If you can put as much information as you can in the index so that as I, as I scan the index and you

93
00:06:07,220 --> 00:06:10,750
‫only select the ID, that is the sweetest thing you can do.

94
00:06:10,940 --> 00:06:18,800
‫So the whole thing took point six millisecond and to plan the query and to to study what we go with

95
00:06:19,850 --> 00:06:27,530
‫the I the the fact of oh should we use the index or should you scan the table is a query is a decision

96
00:06:27,650 --> 00:06:29,000
‫and that's called the planning time.

97
00:06:29,180 --> 00:06:32,570
‫The execution is that is the actual work to go and do the work.

98
00:06:32,590 --> 00:06:32,840
‫Right.

99
00:06:34,190 --> 00:06:35,960
‫So let's spice things a little bit.

100
00:06:36,170 --> 00:06:39,170
‫What I want to do here in this case, I want to.

101
00:06:41,510 --> 00:06:50,900
‫I want to select 3000 in this case, and you can see that point one point eight, what I want to do

102
00:06:50,900 --> 00:06:54,290
‫is I'm going to change the ID five thousand.

103
00:06:55,040 --> 00:07:00,200
‫And instead this time I'm going to ask for the name and compare that cost.

104
00:07:00,200 --> 00:07:02,690
‫Here, guys, look how long it took.

105
00:07:03,260 --> 00:07:10,690
‫So selecting only the ID, where do you know the idea, which is a Cassilly query, if you ask me,

106
00:07:10,890 --> 00:07:13,910
‫it's a little bit of silico because you know the idea, why are you just like me?

107
00:07:13,970 --> 00:07:16,010
‫But it's just for science.

108
00:07:16,100 --> 00:07:17,060
‫So we're doing this for science.

109
00:07:17,220 --> 00:07:18,800
‫So it's pretty cool, right?

110
00:07:18,920 --> 00:07:22,300
‫Point one milliseconds, less than a millisecond, which is quite fast.

111
00:07:22,730 --> 00:07:31,310
‫However, when we actually selected the name, which is a column in the table, it is not the column

112
00:07:31,310 --> 00:07:32,330
‫is not in the index.

113
00:07:32,330 --> 00:07:33,350
‫It is on the table.

114
00:07:33,350 --> 00:07:34,070
‫It's in the heap.

115
00:07:34,070 --> 00:07:36,160
‫It's in a different data structure.

116
00:07:36,170 --> 00:07:36,530
‫Right.

117
00:07:37,790 --> 00:07:45,980
‫And in this case, it took us to thinking two and a half milliseconds, not not seconds, two and a

118
00:07:45,980 --> 00:07:49,990
‫half millisecond, which is which is quite fast, but it's still there.

119
00:07:49,990 --> 00:07:50,690
‫It may speak.

120
00:07:50,690 --> 00:07:51,660
‫It's a little bit slow.

121
00:07:51,740 --> 00:07:53,840
‫This is a slower query.

122
00:07:53,840 --> 00:07:59,030
‫And you guys need to understand when you execute these squares what it is really mean, because I found

123
00:07:59,030 --> 00:08:08,720
‫the ID in the index, but I had to jump into the table row on disk, which is a different structure.

124
00:08:08,720 --> 00:08:12,290
‫By the way, the table is in in a different place than the index.

125
00:08:12,530 --> 00:08:15,770
‫The index is a data structure and the table is a different data structure.

126
00:08:15,950 --> 00:08:16,300
‫Right.

127
00:08:16,510 --> 00:08:18,230
‫Table is actually the heavier thing.

128
00:08:18,260 --> 00:08:24,800
‫We try to avoid going to the table as much as possible because there's eleven stinking million rules

129
00:08:24,800 --> 00:08:25,100
‫there.

130
00:08:25,400 --> 00:08:28,340
‫So we try not to touch it as much as well, but sometimes we do.

131
00:08:28,880 --> 00:08:36,130
‫So what we did is we found the ID and then we actually tried to go and seek to the desk.

132
00:08:36,890 --> 00:08:39,290
‫I mean I haven't this is this or that is not seek.

133
00:08:39,290 --> 00:08:43,700
‫We go to the page that has this information and we retrieve it from disk.

134
00:08:44,360 --> 00:08:46,160
‫So that's another read from that.

135
00:08:46,170 --> 00:08:48,380
‫So it's a little bit slower, took us to milliseconds.

136
00:08:48,860 --> 00:08:52,670
‫And obviously if I execute the same thing again, it's faster.

137
00:08:52,670 --> 00:08:53,060
‫Why?

138
00:08:53,090 --> 00:08:55,830
‫Because caching because we cache things, cache.

139
00:08:55,910 --> 00:09:03,170
‫There's so many caches going on, SSD caches, the control is the caches, the database caches, everyone

140
00:09:03,170 --> 00:09:03,650
‫caches.

141
00:09:03,830 --> 00:09:06,950
‫So God knows what wise is faster now.

142
00:09:07,190 --> 00:09:09,770
‫But that's why, because I'm executing the same query.

143
00:09:10,010 --> 00:09:12,380
‫But if I change this to something else,

144
00:09:15,410 --> 00:09:16,760
‫I might not get it right.

145
00:09:16,760 --> 00:09:19,210
‫But you get the idea like two milliseconds again.

146
00:09:20,210 --> 00:09:20,690
‫All right.

147
00:09:21,320 --> 00:09:22,880
‫So now I want to spice things a little bit.

148
00:09:23,150 --> 00:09:35,470
‫I'm going to do explain, analyze, select ideas from employees where name is equal zests.

149
00:09:36,940 --> 00:09:39,970
‫Let's do it, you guys, are you feeling it?

150
00:09:40,420 --> 00:09:41,360
‫You feeling it right?

151
00:09:42,080 --> 00:09:44,340
‫That was slow.

152
00:09:44,740 --> 00:09:48,000
‫That was that was deeply slow.

153
00:09:48,010 --> 00:09:48,580
‫Why?

154
00:09:49,090 --> 00:09:54,340
‫Because that's what the name column does not have an index.

155
00:09:54,640 --> 00:10:00,760
‫And that means the only way to actually search through the name the value Z.

156
00:10:00,760 --> 00:10:09,050
‫S is to actually go one by one and do a sequential scan on employees table.

157
00:10:09,160 --> 00:10:10,380
‫That's the worst thing.

158
00:10:10,390 --> 00:10:11,790
‫It's called the full table scan.

159
00:10:12,340 --> 00:10:18,730
‫You want to avoid full table scans as much as possible over tries to be a little bit smarter by executing

160
00:10:19,000 --> 00:10:26,170
‫multiple threads, worker threads and does does the sequential scan on parallel, which is still a good.

161
00:10:26,890 --> 00:10:30,280
‫So the planning two point eighty three milliseconds, which is not much because.

162
00:10:30,640 --> 00:10:35,890
‫Well, does this does this column, which is the name equal, this filter?

163
00:10:36,040 --> 00:10:38,740
‫Does this have a workload that does this have an index?

164
00:10:38,920 --> 00:10:42,040
‫No, it does not have an index move on full tables.

165
00:10:42,130 --> 00:10:42,970
‫It doesn't have a choice.

166
00:10:43,200 --> 00:10:43,540
‫Right.

167
00:10:43,780 --> 00:10:45,280
‫So it went through and scanned.

168
00:10:45,400 --> 00:10:49,910
‫11 million rows were not necessary.

169
00:10:49,940 --> 00:10:51,110
‫Well, yeah, it isn't.

170
00:10:51,120 --> 00:10:52,540
‫So they have to scan the whole thing.

171
00:10:52,810 --> 00:10:54,190
‫Otherwise, how do you know that?

172
00:10:55,120 --> 00:10:55,820
‫All of them.

173
00:10:55,830 --> 00:10:56,070
‫Right.

174
00:10:56,090 --> 00:10:59,160
‫You have to check every single one of them, which is the slowest thing.

175
00:11:00,080 --> 00:11:03,270
‫How long it took, it took three seconds.

176
00:11:03,720 --> 00:11:04,420
‫Yikes.

177
00:11:04,850 --> 00:11:05,640
‫Yikes.

178
00:11:06,090 --> 00:11:08,100
‫See, that is so slow.

179
00:11:08,120 --> 00:11:08,570
‫Yeah.

180
00:11:08,570 --> 00:11:12,390
‫So it is pretty slow, guys, to do this.

181
00:11:12,410 --> 00:11:19,640
‫Another another thing that we notice all the time, as I used to do this all the time without knowing

182
00:11:20,080 --> 00:11:26,960
‫this, that is a command and sequel called like like if you don't know, hey, is this before and after

183
00:11:27,200 --> 00:11:33,110
‫you just add this percentage like this is the war squarish.

184
00:11:34,280 --> 00:11:34,730
‫Why?

185
00:11:34,730 --> 00:11:42,620
‫Because it is literally has to go through all the rows and does matching in this case.

186
00:11:42,620 --> 00:11:42,980
‫Right.

187
00:11:43,260 --> 00:11:47,020
‫It doesn't really matter because we don't have an index began to show you with an index.

188
00:11:47,030 --> 00:11:48,420
‫How does it does it do it right.

189
00:11:48,740 --> 00:11:54,260
‫So what this does is it will return all rows that have anything in the beginning, anything of the end,

190
00:11:54,260 --> 00:12:02,000
‫but has to have Z capital as small in the middle, slow to go one point one one second to do all that

191
00:12:02,000 --> 00:12:02,450
‫stuff.

192
00:12:02,450 --> 00:12:02,890
‫But yeah.

193
00:12:02,900 --> 00:12:04,100
‫So 11 million row.

194
00:12:04,100 --> 00:12:06,710
‫But we had two workers to do that work for us.

195
00:12:06,830 --> 00:12:07,400
‫All right, guys.

196
00:12:07,550 --> 00:12:08,010
‫All right, guys.

197
00:12:08,020 --> 00:12:16,190
‫So what I want to do here is actually going to create an index on I'm going to name it employee's name

198
00:12:16,880 --> 00:12:21,370
‫on employees and it is on the name column.

199
00:12:21,680 --> 00:12:24,770
‫So this was going to take some time to create.

200
00:12:24,770 --> 00:12:25,220
‫Why?

201
00:12:25,490 --> 00:12:26,510
‫Because guess what, guys?

202
00:12:26,510 --> 00:12:31,610
‫It is building the B three bitmap index on top of all the stuff.

203
00:12:31,610 --> 00:12:32,040
‫Right.

204
00:12:32,240 --> 00:12:38,990
‫So it has to fit all the 11 million rows and does all this magic to actually create the fancy

205
00:12:41,300 --> 00:12:41,920
‫index.

206
00:12:43,040 --> 00:12:43,820
‫So that's a way for it.

207
00:12:44,960 --> 00:12:48,470
‫All right, guys, we've finished creating our index.

208
00:12:48,780 --> 00:12:55,080
‫So now let's try the same query that we have done just with all the like, just directly equal.

209
00:12:56,210 --> 00:12:59,780
‫So now if I do this query, look how fast it is.

210
00:13:00,050 --> 00:13:01,400
‫Still a little bit slow.

211
00:13:01,550 --> 00:13:03,080
‫Forty seven millisecond.

212
00:13:03,080 --> 00:13:06,820
‫But damn, that was quick.

213
00:13:06,830 --> 00:13:07,160
‫Why?

214
00:13:07,160 --> 00:13:11,810
‫Because now we're doing a bitmap HEB scan on employees.

215
00:13:11,810 --> 00:13:12,130
‫Right.

216
00:13:12,290 --> 00:13:17,420
‫So we're using the actual bitmap index scan on employee's name.

217
00:13:17,720 --> 00:13:22,850
‫So we're actually scanning the employee's name index that we created.

218
00:13:23,000 --> 00:13:31,450
‫And this is way faster because we have fewer rows to work with and we're going to actually jump between

219
00:13:31,460 --> 00:13:36,350
‫few rows until we find exactly what we are looking for.

220
00:13:36,610 --> 00:13:36,840
‫Right.

221
00:13:37,530 --> 00:13:44,540
‫And obviously, guys, the same thing applies where A, if you only PILT pulled the name, which is

222
00:13:44,540 --> 00:13:48,640
‫already in this index, is going to be faster than pulling the idea in the name.

223
00:13:49,100 --> 00:13:51,380
‫I'm not quite sure about this.

224
00:13:51,410 --> 00:13:59,330
‫Maybe this actually doesn't hit the heap because the index, the primary key I know this is a MySQL.

225
00:13:59,330 --> 00:14:07,850
‫I'm not sure about Bosco's, but the primary key is usually stored with every single index in in in

226
00:14:08,030 --> 00:14:09,200
‫in other databases.

227
00:14:09,200 --> 00:14:09,390
‫Right.

228
00:14:09,410 --> 00:14:12,530
‫So the name index here, the primary will be store.

229
00:14:12,530 --> 00:14:19,910
‫So pulling the primary key from the name index will be fast because it's it's right there, right with

230
00:14:19,910 --> 00:14:20,050
‫it.

231
00:14:20,300 --> 00:14:25,520
‫So however, if you're if you have more than other tables, then we had to go to the table structure

232
00:14:25,520 --> 00:14:26,880
‫and that will be a little bit slow.

233
00:14:27,080 --> 00:14:32,120
‫All right, guys, so what will happen if I add like here?

234
00:14:32,210 --> 00:14:33,380
‫What do you think will happen?

235
00:14:36,270 --> 00:14:40,630
‫Back to this local area, back to the exact same local area.

236
00:14:40,670 --> 00:14:48,530
‫Guys, why let's call this again as the let's do something else so we don't have we had the cash like

237
00:14:48,540 --> 00:14:51,150
‫this back to the same slow query.

238
00:14:51,150 --> 00:14:51,600
‫Why?

239
00:14:51,840 --> 00:14:55,050
‫Because we could not scan the index.

240
00:14:55,050 --> 00:14:55,560
‫Why?

241
00:14:55,770 --> 00:14:58,570
‫This is this is a lot of people make this mistake.

242
00:14:58,950 --> 00:14:59,430
‫Yeah.

243
00:14:59,580 --> 00:15:02,520
‫The name column has an index.

244
00:15:02,850 --> 00:15:08,550
‫But what you did is you're not actually asking for a for literal value.

245
00:15:08,880 --> 00:15:11,100
‫You're asking for an expression.

246
00:15:11,670 --> 00:15:20,160
‫And there is no index that will satisfy this expression because we cannot search the index on this on

247
00:15:20,160 --> 00:15:23,750
‫this expression because this is not a single value.

248
00:15:24,270 --> 00:15:30,510
‫So the planner quickly detects this, says, OK, planning time, we check less than then.

249
00:15:30,900 --> 00:15:32,790
‫Sorry, the filter sucks.

250
00:15:33,480 --> 00:15:40,590
‫And that's why you have to do explain on your queries and see why are they actually slow because of

251
00:15:40,590 --> 00:15:40,950
‫this?

252
00:15:41,190 --> 00:15:42,270
‫Because look at this.

253
00:15:42,270 --> 00:15:44,700
‫We did a parallel sequential scan.

254
00:15:44,700 --> 00:15:45,480
‫Again, parallel.

255
00:15:45,540 --> 00:15:49,380
‫Not all databases do parallel scanning, by the way, but possibly still.

256
00:15:50,280 --> 00:15:51,330
‫We we had to do it.

257
00:15:51,330 --> 00:15:56,980
‫We had to go to the thinking table and scan my 11 million rolls if I have a billion rows, is going

258
00:15:56,980 --> 00:15:58,340
‫to be even slower than that.

259
00:15:58,740 --> 00:16:00,930
‫So we are going through all that stuff.

260
00:16:00,930 --> 00:16:04,230
‫And I'm going to explain all this number and then other videos, because I don't want to make this video

261
00:16:04,230 --> 00:16:04,860
‫longer than that.

262
00:16:04,860 --> 00:16:08,210
‫I'm going to make an advance video discussing these actual numbers.

263
00:16:08,240 --> 00:16:08,850
‫What do they mean?

264
00:16:08,850 --> 00:16:09,480
‫All that stuff.

265
00:16:09,480 --> 00:16:13,320
‫But but yeah, we did a parallel scan on employees.

266
00:16:13,320 --> 00:16:16,740
‫Guys, look at that is so slow we could undo in lining.

267
00:16:16,890 --> 00:16:18,660
‫We couldn't do any optimizations.

268
00:16:19,000 --> 00:16:20,430
‫This is an expression, obviously.

269
00:16:20,430 --> 00:16:22,920
‫So that's we had to we had to do all that stuff.

270
00:16:22,920 --> 00:16:24,240
‫So we took a hit.

271
00:16:24,510 --> 00:16:30,680
‫We took a hit despite us having an index, guys, despite X having an index.

272
00:16:30,720 --> 00:16:36,660
‫So, guys, that's very quickly indexing in a nutshell in a very, very quick video.

273
00:16:36,660 --> 00:16:37,980
‫I wanted to explain of that.

274
00:16:38,160 --> 00:16:43,260
‫Yeah, having an index does not mean that the database will always use it.

275
00:16:43,260 --> 00:16:48,180
‫It's going to plan and according to the planner, will decide to use the index or not.

276
00:16:48,330 --> 00:16:56,220
‫It's up to you as as the engineer who is executing queries against the database to actually give hints

277
00:16:56,220 --> 00:16:58,920
‫to the database to actually use the index or not.

278
00:16:59,490 --> 00:17:01,680
‫This is a bad query in general.

279
00:17:01,830 --> 00:17:04,290
‫Select Star is a bad query in general.

280
00:17:04,290 --> 00:17:04,770
‫Why?

281
00:17:04,920 --> 00:17:12,270
‫Because if because if you think about the cost to go to the actual disk and pull all the information.

282
00:17:13,520 --> 00:17:15,350
‫That is expensive, right?

283
00:17:15,770 --> 00:17:22,490
‫Really, really depends on how much like, let's say you have a blob or column on the field, on the

284
00:17:22,490 --> 00:17:24,730
‫table that is expensive to pull up.

285
00:17:24,740 --> 00:17:26,600
‫So if you don't need it, don't ask for it.

286
00:17:26,960 --> 00:17:27,290
‫Right.

287
00:17:27,440 --> 00:17:30,320
‫And only ask for things that are absolutely there.

288
00:17:30,650 --> 00:17:34,700
‫And if you can ask for things that are in the index.

289
00:17:34,700 --> 00:17:39,020
‫Right, I'm going to talk about that in another video where is like a multi column index where you can

290
00:17:39,260 --> 00:17:44,150
‫add other columns in the index that you're about to create.

291
00:17:44,180 --> 00:17:51,200
‫This is called In Lining where where as I search my beautiful index, my neat, efficient index, I

292
00:17:51,200 --> 00:17:53,050
‫want their name, for example.

293
00:17:53,330 --> 00:17:53,530
‫Right.

294
00:17:53,540 --> 00:17:55,580
‫That's bad example, but I want the name.

295
00:17:55,580 --> 00:18:01,220
‫I'm searching for the ID and I don't want the name or the name is right there is just literally sitting

296
00:18:01,220 --> 00:18:02,060
‫in my index.

297
00:18:02,060 --> 00:18:03,750
‫So I just put and retrieve it to the user.

298
00:18:03,770 --> 00:18:10,040
‫I don't have to go back on disk and find the location, seek and desk if that's a mechanical drive or

299
00:18:10,370 --> 00:18:12,230
‫pull the page in case of SSD.

300
00:18:13,370 --> 00:18:15,010
‫All right, guys, that's it for me today.

301
00:18:15,200 --> 00:18:16,430
‫I'm going to see you in the next one.

302
00:18:16,580 --> 00:18:17,510
‫You guys stay awesome.

303
00:18:17,690 --> 00:18:18,110
‫Goodbye.