1 00:00:01,510 --> 00:00:02,290 Hi, everyone. 2 00:00:02,310 --> 00:00:08,380 So in the last video, we implemented our own hash map: we implemented the insert function, the delete function 3 00:00:08,380 --> 00:00:11,090 and the search function. Now, in this video, 4 00:00:11,110 --> 00:00:14,350 let us try to find out the time complexity of our hash map. 5 00:00:15,040 --> 00:00:17,620 And then we will also discuss the topic of rehashing. 6 00:00:18,700 --> 00:00:24,490 So first, let us discuss the insert function; let us try to find out the time complexity of 7 00:00:24,490 --> 00:00:25,390 the insert function. 8 00:00:26,440 --> 00:00:29,470 Now, the insert function will take a key as input. 9 00:00:30,580 --> 00:00:37,720 Then the key will pass through a hash function, which will give me the bucket index, and then I will 10 00:00:37,720 --> 00:00:40,750 go to that bucket index. So for inserting, what do we have to do? 11 00:00:40,960 --> 00:00:46,300 First of all, we have to traverse the linked list to check whether the given key already exists or 12 00:00:46,300 --> 00:00:46,640 not. 13 00:00:47,200 --> 00:00:49,290 For example, you want to insert the key ABC. 14 00:00:49,990 --> 00:00:53,980 So first you have to check whether the key ABC exists in the linked list or not. 15 00:00:54,190 --> 00:00:56,980 So basically you have to traverse this linked list. 16 00:00:58,390 --> 00:01:00,040 So what are the time complexities? 17 00:01:00,220 --> 00:01:04,310 First of all, this hash function is not free. 18 00:01:04,330 --> 00:01:05,430 It will take some time. 19 00:01:05,650 --> 00:01:09,730 And similarly, you are iterating over this linked list, so that will also take some time. 20 00:01:11,170 --> 00:01:13,630 So first, let us define some parameters. 21 00:01:14,410 --> 00:01:19,930 So what is n? We will try to find out the complexity in terms of n, and n is actually the total number 22 00:01:19,930 --> 00:01:21,670 of entries present in the hash map.
23 00:01:22,860 --> 00:01:28,200 So n is the total number of entries, and it can be of the order of 10 to the power 5 entries. 24 00:01:29,640 --> 00:01:32,580 And what is L? So, for example, 25 00:01:33,560 --> 00:01:38,660 take this string ABC. This hash function will not take constant time; you have a string ABC, 26 00:01:39,810 --> 00:01:40,850 then what do you have to do? 27 00:01:40,860 --> 00:01:47,040 For finding out the hash code, you are traversing this string. So L is actually the 28 00:01:47,220 --> 00:01:49,700 length of the word, the length of the key. 29 00:01:50,040 --> 00:01:54,210 So in our case, we are taking a string as the key. 30 00:01:55,320 --> 00:01:58,140 So what would be the time taken by the hash function? 31 00:01:58,890 --> 00:02:01,620 So the time taken by the hash function is Big O of L. 32 00:02:03,090 --> 00:02:08,850 Simple. Now L will be very, very small, so you do not have to worry about L. L will be small, let's say 33 00:02:08,850 --> 00:02:10,800 4 or 5 or 10 or 20. 34 00:02:10,979 --> 00:02:14,580 So L will be small as compared to the total number of entries. 35 00:02:14,850 --> 00:02:21,510 So it will be, let's say, 4, 10, 20 in this range, but it will be very, very small as compared 36 00:02:21,510 --> 00:02:21,960 to n. 37 00:02:23,910 --> 00:02:31,480 Now, what are we doing? Our main problem is basically this one: we are traversing this list. 38 00:02:32,520 --> 00:02:33,970 So this is our main problem. 39 00:02:34,950 --> 00:02:36,240 So in the worst case, 40 00:02:37,530 --> 00:02:40,390 in the worst case, what will be the length of this linked list? 41 00:02:41,340 --> 00:02:46,260 So in the worst case, the length of the list will be Big O of n; basically, the length of the linked 42 00:02:46,320 --> 00:02:47,070 list will be n.
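The Big O of L cost of the hash function can be made concrete with a small sketch. This is a hypothetical polynomial hash (the multiplier 31 and the function names are illustrative, not the exact code from the previous video):

```python
def hash_code(key: str) -> int:
    """Polynomial rolling hash: it visits every character once,
    so it takes O(L) time, where L is the length of the key."""
    h = 0
    for ch in key:              # one step per character -> O(L)
        h = h * 31 + ord(ch)
    return h

def bucket_index(key: str, num_buckets: int) -> int:
    # compression function: map the hash code into [0, num_buckets)
    return hash_code(key) % num_buckets
```

For the key ABC and 5 buckets, `bucket_index("ABC", 5)` walks all three characters to build the hash code and then applies the mod to get the bucket index.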
43 00:02:48,140 --> 00:02:54,260 In the worst case, what will happen? All the entries will be present at this index only if 44 00:02:54,270 --> 00:02:56,150 our hash function is very, very bad, 45 00:02:56,390 --> 00:03:00,340 such that all the entries will be present at this index only. 46 00:03:00,800 --> 00:03:04,430 So in that case, the length of our linked list will be n, 47 00:03:05,370 --> 00:03:12,810 and the time complexity will become Big O of n, and n is of the order 10 to the power 5. So if you 48 00:03:12,810 --> 00:03:18,450 add the time complexity of the hash function and this time complexity of Big O of 49 00:03:18,450 --> 00:03:22,630 n, then you get the total time complexity of the insert function. 50 00:03:22,650 --> 00:03:29,260 So this is Big O of n for iterating over the list and Big O of L for finding out the hash code. 51 00:03:29,640 --> 00:03:33,690 Now see, n is actually much, much bigger than L. 52 00:03:34,230 --> 00:03:37,960 So what we can do, we can assume that computing our hash code takes constant time. 53 00:03:38,880 --> 00:03:44,490 So instead of Big O of L, what we can do, we can assume that our hash function is taking constant time, 54 00:03:44,490 --> 00:03:45,360 Big O of 1 time. 55 00:03:46,600 --> 00:03:48,430 So it is taking constant time. 56 00:03:49,360 --> 00:03:56,080 Now, what happens in the worst case? In the worst case, the total time complexity will be Big O of n, 57 00:03:56,350 --> 00:03:59,590 when all the entries are present at the same index. 58 00:04:00,670 --> 00:04:04,680 But a lot of research has been done in finding out a good hash function. 59 00:04:05,350 --> 00:04:07,610 So this case will never happen. 60 00:04:07,960 --> 00:04:09,310 This case will never be there. 61 00:04:09,970 --> 00:04:11,110 So I'm repeating myself: 62 00:04:12,620 --> 00:04:19,410 a lot of research work has been done in finding out a good hash function, and this case will never happen.
63 00:04:19,410 --> 00:04:23,840 So which case? The case that all the entries are present at one index only. 64 00:04:24,030 --> 00:04:25,510 So this case will never happen. 65 00:04:25,530 --> 00:04:29,130 So that means this will never happen: the time 66 00:04:29,130 --> 00:04:35,610 complexity will never be Big O of n. So what can we assume on an average? So, on an average: 67 00:04:36,510 --> 00:04:43,080 n is actually the total number of entries in the hash map, and B is basically the total number of 68 00:04:43,080 --> 00:04:44,400 buckets available to us. 69 00:04:44,410 --> 00:04:45,960 Basically, B is the bucket size. 70 00:04:48,640 --> 00:04:54,850 So in this case, in our case, I took the bucket size as, I think, 5. So B is the bucket size, so on 71 00:04:54,850 --> 00:04:58,860 an average we can assume that each bucket will contain n by B entries. 72 00:04:59,950 --> 00:05:03,820 So on an average, we can assume that each bucket 73 00:05:05,620 --> 00:05:07,450 will have n by B entries. 74 00:05:08,570 --> 00:05:10,250 So we can assume this on an average. 75 00:05:11,520 --> 00:05:12,150 Simple. 76 00:05:12,690 --> 00:05:19,800 So now what will be our time complexity? So on an average, while traversing the list, the linked 77 00:05:19,800 --> 00:05:22,790 list is actually of size n by B; it has n by B entries. 78 00:05:23,130 --> 00:05:26,970 So on an average, our time complexity will be n by B. 79 00:05:29,070 --> 00:05:31,530 So this is the average time complexity. 80 00:05:33,320 --> 00:05:40,130 So we call this number n by B... so this number has a name, and we call this number the load factor. 81 00:05:42,060 --> 00:05:46,830 So n by B is actually called the load factor. And now what will we do? 82 00:05:48,690 --> 00:05:54,600 So what we'll do, we will always ensure that n by B is less than zero point seven.
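The load factor described above is just the ratio of entries to buckets; a minimal sketch (the function name and the sample counts are illustrative):

```python
def load_factor(num_entries: int, num_buckets: int) -> float:
    """Load factor = n / B: the average number of entries per bucket,
    and hence the average chain length we traverse on insert/search."""
    return num_entries / num_buckets

# 100 entries spread over 200 buckets: the ratio stays well below 0.7,
# so on average each bucket holds less than one entry.
ratio = load_factor(100, 200)
print(ratio)  # 0.5
```

Keeping this ratio below 0.7 is what lets us treat the chain traversal as constant work on average.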
83 00:05:56,020 --> 00:06:03,190 So this thing we will always ensure: we will always ensure that n by B, so this ratio or load 84 00:06:03,190 --> 00:06:05,980 factor, is always less than zero point seven. 85 00:06:06,160 --> 00:06:12,310 So that means if there are 100 entries in the map, the number of buckets will be around 140. 86 00:06:13,330 --> 00:06:19,730 So this number is not fixed; it can be zero point six five, it can be zero point eight, something like that. 87 00:06:20,770 --> 00:06:27,370 So basically, if you look at these numbers, if you look at the load factor, what can we see? We 88 00:06:27,370 --> 00:06:29,530 can see that on an average, 89 00:06:30,670 --> 00:06:38,160 on an average, each bucket will have less than one entry; each bucket will have less than one entry. 90 00:06:38,500 --> 00:06:43,600 So this n by B is basically less than one; it is less than zero point seven, 91 00:06:43,630 --> 00:06:45,320 so obviously it is less than one. 92 00:06:45,790 --> 00:06:47,470 So our time complexity was n by B. 93 00:06:48,190 --> 00:06:50,960 So basically our time complexity is less than one. 94 00:06:51,310 --> 00:06:54,700 So on an average, our time complexity, we can say, is Big O of 1. 95 00:06:57,420 --> 00:07:02,850 I am repeating myself: on an average, each bucket will have less than one entry. 96 00:07:04,000 --> 00:07:09,810 So every bucket will have a constant number of entries. So let's say every bucket will have a 97 00:07:09,850 --> 00:07:10,570 constant number of entries. 98 00:07:10,570 --> 00:07:12,520 Let's say one bucket is having two entries. 99 00:07:12,820 --> 00:07:14,920 The other one can have only one entry. 100 00:07:15,310 --> 00:07:17,490 The next one can have three entries. 101 00:07:17,770 --> 00:07:22,550 So basically the entries will be of the order one or two or three. 102 00:07:22,720 --> 00:07:24,430 So basically the size of the linked list... 103 00:07:25,180 --> 00:07:26,740 so let's say this is a linked list.
104 00:07:26,770 --> 00:07:29,050 So basically the size of the linked list will be constant. 105 00:07:30,090 --> 00:07:34,840 So on an average it will be less than one, and with the help of the load factor, 106 00:07:35,190 --> 00:07:36,840 this thing we will always ensure. 107 00:07:38,200 --> 00:07:42,290 So on an average, each bucket will have less than one entry. 108 00:07:42,610 --> 00:07:48,310 There will basically be a constant number of entries; it might be less than one, because this thing we will 109 00:07:48,310 --> 00:07:48,970 always ensure. 110 00:07:49,120 --> 00:07:53,650 So this number is called the load factor, and we can say our time complexity is 111 00:07:53,650 --> 00:07:54,430 Big O of 1. 112 00:07:56,380 --> 00:08:01,720 So how will we ensure that our load factor is less than zero point seven? 113 00:08:03,140 --> 00:08:08,990 So what I want to say is, how are we going to ensure, how can we always ensure, that n by B is always 114 00:08:08,990 --> 00:08:12,630 less than zero point seven? So how can we ensure this thing? 115 00:08:14,060 --> 00:08:15,340 So what is happening here? 116 00:08:16,130 --> 00:08:22,040 Let's say the size of your bucket array is 20. Now you inserted 17 elements, then you inserted 18 117 00:08:22,040 --> 00:08:24,260 elements, then you inserted 19 elements. 118 00:08:24,530 --> 00:08:27,200 Basically, you are increasing the number of elements. 119 00:08:27,410 --> 00:08:32,240 So obviously, what will happen? The value of n is increasing, but the value of B is the same. 120 00:08:34,510 --> 00:08:40,570 You are inserting the key-value pairs inside the bucket array, so the value of n is increasing and the value 121 00:08:40,570 --> 00:08:43,830 of B is remaining the same. So how can we ensure this figure? 122 00:08:43,929 --> 00:08:45,320 How can we ensure this relation? 123 00:08:45,580 --> 00:08:49,150 So what do we have to do? The only way with us is basically we will increase 124 00:08:49,150 --> 00:08:52,840 B. We will increase B, so what will happen?
125 00:08:53,530 --> 00:08:55,450 B was 20 here. What we will do, 126 00:08:55,460 --> 00:08:56,470 we will double the size. 127 00:08:57,430 --> 00:09:01,190 So we will double the size; now the array is of size 40. 128 00:09:01,420 --> 00:09:08,360 So what happened? B becomes twice as big, so the denominator increased by two times. 129 00:09:08,410 --> 00:09:13,480 So basically our ratio will become half, because the denominator becomes twice as big. 130 00:09:13,690 --> 00:09:15,370 So this ratio will drop. 131 00:09:16,360 --> 00:09:23,320 So when the value of n is increasing, as soon as n by B becomes zero point seven, 132 00:09:23,590 --> 00:09:25,180 what we will do, we will... 133 00:09:25,840 --> 00:09:31,440 so we will increase the number of buckets by two times, and then this figure will suddenly become half. 134 00:09:31,780 --> 00:09:35,620 So basically n by B will become less than zero point three five. 135 00:09:36,590 --> 00:09:42,950 And again, what will happen? We will again push entries inside this, so what will happen? The value 136 00:09:42,950 --> 00:09:46,110 of n will again increase while the value of B remains the same. 137 00:09:46,550 --> 00:09:51,020 So, again, as soon as the value of n by B becomes greater than zero point seven, what will we do? 138 00:09:51,350 --> 00:09:53,320 We will increase B by two times. 139 00:09:53,330 --> 00:09:56,080 So this time B will become 80. 140 00:09:56,540 --> 00:09:57,620 So it was 40; 141 00:09:57,920 --> 00:09:59,930 now the value of B will become 80. 142 00:10:00,170 --> 00:10:03,680 So again, this value will suddenly become half. 143 00:10:04,340 --> 00:10:09,490 OK, so n by B will become less than zero point three five, and this will go on. 144 00:10:10,470 --> 00:10:11,380 Now, one more thing.
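A quick numeric check of the doubling rule described above (the counts 28 and 40 are just illustrative values, chosen so the ratio lands exactly on the threshold):

```python
n, b = 28, 40            # say 28 entries have gone into 40 buckets
assert n / b >= 0.7      # load factor has reached the threshold

b = b * 2                # double the bucket count: 40 -> 80
new_ratio = n / b        # the ratio halves immediately
print(new_ratio)         # 0.35, well below 0.7 again
```

Because B only doubles, the load factor falls back to half the threshold each time, and many more inserts can happen before the next doubling.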
145 00:10:12,030 --> 00:10:18,780 So initially, let's say, the array was of size 20. So when you are making the size of the array 40, when you increase 146 00:10:18,780 --> 00:10:22,320 the size, what do you have to do? You have to rehash all the values. 147 00:10:23,040 --> 00:10:25,500 You have to rehash all the key-value pairs. 148 00:10:27,750 --> 00:10:31,530 You have to rehash all the key-value pairs, and this is what we call rehashing. 149 00:10:33,180 --> 00:10:38,010 So if you look at the hash function, the hash function comprises two things. 150 00:10:38,340 --> 00:10:42,510 The first one is the hash code, which will give me a fixed number, 151 00:10:42,510 --> 00:10:44,520 and the second one is basically the compression function. 152 00:10:44,730 --> 00:10:47,740 So earlier we were taking mod 20 in this case. 153 00:10:48,570 --> 00:10:53,230 Now, when we rehash all these values, what will we do? Our compression function will be mod 154 00:10:53,250 --> 00:10:53,670 40. 155 00:10:54,640 --> 00:11:01,060 So it is possible that when entries were present in the same linked list under mod 20, now in the new 156 00:11:01,060 --> 00:11:05,430 bucket array the bucket index will be different, because this time we are taking mod 40. 157 00:11:06,310 --> 00:11:09,220 So obviously this rehashing will take some time. 158 00:11:09,700 --> 00:11:15,120 Rehashing will also take some time, but we will have to do rehashing only after very long intervals. 159 00:11:16,280 --> 00:11:21,740 OK, we have to do rehashing after very large intervals, so the frequency of rehashing will be very, 160 00:11:21,740 --> 00:11:22,340 very small. 161 00:11:22,920 --> 00:11:26,270 OK, so finally, what is the time complexity of the insert function? 162 00:11:27,670 --> 00:11:34,960 So finally, the time complexity is constant, because the hash function will take Big O of L time, 163 00:11:35,350 --> 00:11:40,520 and on an average each bucket will have only one or two entries.
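The rehashing step can be sketched like this. This is a minimal illustration, assuming each bucket is a plain Python list of (key, value) pairs and the compression function is the hash code mod the table size; the names are illustrative, not the exact code from the course:

```python
def rehash(buckets):
    """Double the bucket array and reinsert every (key, value) pair.
    A pair may land at a different index than before, because the
    compression function is now `mod new_size` instead of `mod old_size`."""
    new_size = 2 * len(buckets)
    new_buckets = [[] for _ in range(new_size)]
    for chain in buckets:                  # visit every old chain
        for key, value in chain:           # reinsert each entry
            idx = hash(key) % new_size     # new compression function
            new_buckets[idx].append((key, value))
    return new_buckets
```

Every entry is touched exactly once, so one rehash is O(n); but since it happens only when the load factor crosses the threshold, its cost is spread over many cheap inserts.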
164 00:11:40,870 --> 00:11:45,220 So basically, we can assume this is a constant amount of work, and Big O of L 165 00:11:45,250 --> 00:11:46,790 we can also assume is constant. 166 00:11:46,960 --> 00:11:51,060 So on an average, the insert function takes constant time, Big O of 1 time. 167 00:11:52,380 --> 00:11:58,260 So if the insert function is taking constant time, then search and delete are also the same, so they 168 00:11:58,260 --> 00:12:02,490 will also take constant time, and this is the required time complexity. 169 00:12:02,820 --> 00:12:08,300 So they also take constant time. Now, in the next video, what will we do? 170 00:12:08,820 --> 00:12:10,710 We will implement this rehashing. 171 00:12:11,940 --> 00:12:17,580 In the next video, we will implement rehashing with the help of the load factor: we will check the load 172 00:12:17,580 --> 00:12:20,330 factor and we will do rehashing. 173 00:12:21,150 --> 00:12:22,340 So I will see you in the next one. 174 00:12:23,010 --> 00:12:23,250 Bye.