Hold on, before going ahead in this session, let's have a glance at what we have done in all of our previous sessions, from the initial ones till now. Basically, from raw data to cleaning, lots of analysis, lots of techniques for feature encoding, outlier removal, and feature selection as well. We have automated all the processes for our model, and then we checked what exactly the accuracy of the different algorithms is. We have played with all these different kinds of algorithms, like linear regression, decision trees, random forest, and so on, and you can play with multiple regression and whatever you want. So in this session we have an assignment with a problem statement in which I have to hyper-tune my model: why there is a need for performing hyperparameter tuning, and why you have to apply it to your data and to your machine learning algorithm. So let's see: if I press Shift+Tab over here, you will see, with respect to this estimator, these are exactly all the default parameters selected for your decision tree. But there is no guarantee that these defaults are my best parameters. So what we are going to do is basically use a hyperparameter tuning approach.
For this we have RandomizedSearchCV and GridSearchCV, different cross-validation approaches. What they will do is basically return us the best parameters for our model, so that my training will happen in the best way and it will return the best score. That's what all these cross-validation based hyperparameter tuning approaches do. So basically there are two approaches that you can go ahead with. The first one is the randomized search approach, RandomizedSearchCV; the second one is grid search. You'll see there are also some more advanced approaches, like genetic algorithms, but they are quite advanced. So we are going to deal with grid search or randomized search, and it's all up to you; basically, we are going to deal with randomized search. So the very first thing I'm going to do is import it. I'm going to say from sklearn.model_selection I'm going to import my RandomizedSearchCV. So if I initialize it, you will see over here: I'm just going to copy, paste, and press Shift+Tab, and you will see all these different parameters.
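The import step described above can be sketched as follows; printing the constructor signature with `inspect` is a stand-in for the notebook's Shift+Tab pop-up (the parameter names shown come from scikit-learn's API, not this transcript):

```python
# A minimal sketch of the import step; in a Jupyter notebook, Shift+Tab on
# RandomizedSearchCV shows the same signature that inspect prints here.
import inspect
from sklearn.model_selection import RandomizedSearchCV

print(inspect.signature(RandomizedSearchCV.__init__))
```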
What exactly is the estimator? It is nothing but the object of your machine learning algorithm. Then I have param_distributions, in which I have to pass whatever parameters of my machine learning algorithm I want to tune, and I have to pass those parameters in the form of a dictionary. Then I have n_iter, the number of iterations that I want, and the scoring parameter; I have to play with all these different parameters as well. So what I'm going to do over here: let's say I take the random forest regressor, and if you press Shift+Tab, you will see all these different parameters in the case of random forest. This one is exactly my n_estimators; then what is your criterion, what is the max_depth of the tree, what is min_samples_split, what are your min_samples_leaf, and all these different parameters. So you have to play with all these parameters to hyper-tune the model. Now that we have all these different parameters, let's say what I'm going to do: I'm basically going to create a random_grid as a dictionary.
Here I'm basically going to define all the parameters of random forest in the form of a dictionary. The very first parameter is exactly my n_estimators, and n_estimators is nothing but the number of trees in the forest. So I will set the value of this key from here: I'm going to say for x in np.linspace. If you press Shift+Tab on np.linspace, you will see you give where you have to start, at what point you have to stop, and the number of items that you want in your array or list, whatever you ask for. So here I'm going to say my start is nothing but 100, I'm going to say I have to stop at, let's say, 1200, and for how many total estimator values I want, I'm going to say I just need, let's say, six. After that, I have to convert each value into an integer so that I can use it. This is exactly nothing but a piece of list-comprehension code. If you're not much comfortable with this list comprehension code, you can follow my Basics of Python course, in which I have covered all these basics of Python in approximately an hour. So now, what do we have to do next?
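The list-comprehension step can be sketched like this (100, 1200 and six values are the session's illustrative numbers, not tuned recommendations):

```python
import numpy as np

# np.linspace(start, stop, num) returns `num` evenly spaced values between
# start and stop (inclusive); int(x) casts each float to a whole tree count.
n_estimators = [int(x) for x in np.linspace(start=100, stop=1200, num=6)]
print(n_estimators)  # → [100, 320, 540, 760, 980, 1200]
```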
Here I have to mention my 'n_estimators' key, and it is exactly that list that you have to mention over here. So here I would say n_estimators equals that list. After that, we also have to set what my max_features are. max_features is all about the number of features to consider at every split of the decision tree. So for 'max_features' I am going to fill in a list: the very first value is nothing but 'auto', and the second one is nothing but 'sqrt'. After that, we have to deal with our max_depth, which is exactly the maximum number of levels in our decision tree. So here I am going to say 'max_depth', and it is nothing but, again, just a list. Let's say I'm just going to copy this entire linspace code and do some modifications over here; just copy and paste, and this time, let's say, I have to start from 5 and go till 30, and I'd say I just want four values. Then you have to assign this value to your 'max_depth' key.
After that, I still need some more parameters. The last one that we need is exactly min_samples_split. What this parameter is all about: the minimum number of samples required to split a node. Here I'm going to say you can consider these random values: 5, 10, 15 and 100. I'm considering these values from my own experience of working in the machine learning domain. So after that, we have to just execute it, and we also have to execute this dictionary, which is exactly your random_grid. Now, if I print this dictionary, random_grid, this is exactly the dictionary that you have to pass to RandomizedSearchCV in your param_distributions parameter. So then I have to initialize my RandomizedSearchCV, and here, in the estimator parameter, you have to pass the object of your random forest first. So let's say I'm going to create an object of random forest first. Here I'm going to say from sklearn.ensemble, just press Tab, I have to import my RandomForestRegressor. So I'm just going to import this, and just execute it. Now, what you have to do is simply initialize this.
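Putting the four keys together, the random_grid dictionary and the regressor import look roughly like this (the value ranges are the session's illustrative picks; note that the string 'auto' for max_features has been removed in recent scikit-learn releases, so newer versions reject it at fit time):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# The parameter grid assembled in the session. Keys must match the
# RandomForestRegressor argument names exactly.
random_grid = {
    "n_estimators": [int(x) for x in np.linspace(start=100, stop=1200, num=6)],
    "max_features": ["auto", "sqrt"],  # 'auto' is gone in newer sklearn; prefer 1.0 or 'sqrt'
    "max_depth": [int(x) for x in np.linspace(start=5, stop=30, num=4)],
    "min_samples_split": [5, 10, 15, 100],
}

regressor_rf = RandomForestRegressor()  # the estimator object to be tuned
print(random_grid)
```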
I'm going to say it is mine, let's say regressor_rf, or whatever you want to name it; it's all up to you, just execute it. Now, in the estimator parameter you have to pass this object, so I would say estimator equals regressor_rf, and in the very second parameter, param_distributions, you have to set this random_grid. So here I am going to set my random_grid parameter, and let's see: then I will set my cv, my cross-validation, equal to three; by default it is five. After that there is my verbose parameter, which I'll set to, say, 2; verbose is basically there to show you whatever activity is happening across your cells once you execute. That's what verbose does. And after that, I'm going to set my n_jobs parameter as, let's say, minus one. Whenever you pass this minus one, it means it will use all the cores; it means it will use all the resources of your system. Let's say I want to store it in some variable; here I would say that one is rf_random, so just execute it. Now what we have to do is simply fit our data. So I'm going to say rf_random.fit. What do we have to fit? We have to fit basically X_train and, definitely, y_train, and just execute it. It will take some couple of seconds.
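End to end, the search setup and fit can be sketched as below. X_train and y_train come from the earlier sessions and aren't shown here, so a tiny synthetic stand-in is used, and the grid is shrunk so the sketch runs in seconds ('auto' for max_features is omitted because recent scikit-learn rejects it):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Toy stand-in for the course's training data.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(80, 4))
y_train = X_train @ np.array([1.5, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=80)

small_grid = {"n_estimators": [50, 100], "max_depth": [5, 10], "min_samples_split": [2, 5]}

rf_random = RandomizedSearchCV(
    estimator=RandomForestRegressor(random_state=0),
    param_distributions=small_grid,  # the dictionary of candidate values
    n_iter=4,      # how many random combinations to try
    cv=3,          # 3-fold cross-validation (the default is 5)
    verbose=2,     # print a progress line for every fold that is fitted
    n_jobs=-1,     # -1 = use every CPU core available
    random_state=0,
)
rf_random.fit(X_train, y_train)
print(rf_random.best_params_)
```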
You will see all these things come across; that's because you have set your verbose equal to 2. It will take a couple of seconds, depending upon what processor you are using, depending upon the specifications of your system. Now you will see all your stuff gets executed, and it has taken that much time on my system with my own specifications; in your case, it will definitely depend upon what system you have. Now you will see these are all the parameters returned by my cross-validation, returned by my randomized search cross-validation. After that, what we are going to do: let's say I'm going to check what my best parameters are. So here I'm going to write rf_random.best_params_; just execute it, and you'll see these are my best parameters selected by my cross-validation. So now, the very first thing I'm going to do is a prediction. I'm going to say rf_random.predict, because you have to predict on your X_test data, and I'm going to store this prediction in, let's say, prediction. Let's say I'm going to execute it, and this time you can check what exactly the distribution is of your actual data minus whatever prediction you have done.
You will see this is the type of distribution that I have achieved using my randomized search. And if you want to check what exactly the accuracy is after doing all this hyperparameter tuning, for this I'm going to say metrics.r2_score, and here you have to pass what the actual data is and what exactly the prediction is. Just execute it, and you will see that now, in the case of random forest, you have somewhere approximately 83 percent accuracy. But before, when you used random forest without tuning, you can observe you had somewhere close to 80 percent accuracy. That is the power of hyper-tuning your model. Whenever you are going to work on real-world projects, you always have to hyper-tune your model; you always have to cross-validate your model. That's the power of your model: it will definitely increase your accuracy. Now let's say I have to save it; let's say I have to dump this best model that I have created over here. The very first thing I'm going to do is open some file in which I can dump this. For this, I'm going to say where I have to open it; I'm just going to copy this part, paste it over there, and here I'm going to say that this time my model name is, let's say, rf_random.pkl.
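The evaluation step can be sketched with placeholder arrays (in the session these are y_test and the output of rf_random.predict(X_test), and seaborn's distplot is used for the residual histogram):

```python
import numpy as np
from sklearn import metrics

# Placeholder actual / predicted values standing in for the course data.
y_test = np.array([3.0, -0.5, 2.0, 7.0])
prediction = np.array([2.5, 0.0, 2.0, 8.0])

residuals = y_test - prediction  # in the notebook: sns.distplot(y_test - prediction)
r2 = metrics.r2_score(y_test, prediction)  # 1.0 would be a perfect fit
print(r2)
```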
And then I have to say in what mode I have to open this file; so here I would say write binary. Just execute it. And now what we have to do is use pickle.dump, and here I have to say I have to dump this rf_random into my file. So just execute it, and now you will see over here, here is exactly your model that is being created right now. Let's say we have to do prediction using that model, using that model that I have created over here. So what we have to do first is load this model that we have dumped. So now I'm going to say, very first, what I have to open. Let's say I'm just going to copy this code, go up, and just paste it over here; let's say I have to load my previous model that I have created, which is exactly my model.pkl. And here I have to say to open it in read binary mode, because I have to read that file. After that, what I have to do: I'm going to save it in the file variable, and after that I basically have to load my model. Here I would say pickle.load, and I'm going to store it, let's say, in model. And after that, I would say I have to load this model.
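The dump-and-load round trip can be sketched like this; 'rf_random.pkl' is just the illustrative filename, and a tiny fitted regressor stands in for the tuned model:

```python
import pickle
from sklearn.ensemble import RandomForestRegressor

# Stand-in for the tuned model from the search above.
rf = RandomForestRegressor(n_estimators=10, random_state=0)
rf.fit([[0], [1], [2], [3]], [0, 1, 2, 3])

with open("rf_random.pkl", "wb") as f:  # 'wb' = write in binary mode
    pickle.dump(rf, f)                  # serialize the model to disk

with open("rf_random.pkl", "rb") as f:  # 'rb' = read in binary mode
    model = pickle.load(f)              # deserialize it back

print(type(model).__name__)  # → RandomForestRegressor
```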
So once you execute it, you will see this is that model of the random forest class that you imported earlier. Similarly, you can load this cross-validated model that you have dumped over here. Now let's say I have to do prediction using this model. So I'm going to say model.predict; you have to just use the predict function, and decide on what data you have to predict. So what am I going to do? Very first, you have to store this model somewhere; so let's say I'm going to store it again, let's say as forest, for this random forest; it's all up to you. And after that, using this forest, you have to call the predict function, and here you have to mention, let's say, X_test. And if you execute it, you'll see over here you have all your predictions with respect to this X_test data. That's it. Now I have to store these predictions somewhere, as, let's say, predictions_dump. So just execute it; all the predictions are exactly over here. Let's say I have to check the accuracy as well. So here I would say metrics.r2_score.
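Predicting with the reloaded model and re-checking R² can be sketched as below (an in-memory dumps/loads round trip replaces the file handling, and toy data stands in for the course's X_test and y_test):

```python
import pickle
import numpy as np
from sklearn import metrics
from sklearn.ensemble import RandomForestRegressor

# Fit a small stand-in model on toy data, round-trip it through pickle,
# then predict and score exactly as the session does with its loaded model.
X = np.arange(20, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel()
forest = RandomForestRegressor(n_estimators=20, random_state=0).fit(X, y)

model = pickle.loads(pickle.dumps(forest))  # dump + load in one step

X_test, y_test = X[5:15], y[5:15]
predictions_dump = model.predict(X_test)
print(metrics.r2_score(y_test, predictions_dump))
```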
And here you have to mention what exactly your actual data is and, after that, what exactly your predictions are, which is predictions_dump. So just execute it, and you will see this is the previous accuracy that you had achieved earlier, and similarly you can perform all of this for your cross-validated model as well. So in such scenarios, you will definitely get somewhere around the approximately 80 percent accuracy that you have achieved over here. That's what I'm trying to show you. So let's say you have some new data: you have to just pass that new data over here, and you will get your accuracy. So that's all about this project. Hopefully you'll love this project very much, and try to explore it on your own as much as you can. So thank you, guys. Have a nice day. Keep learning, keep growing, keep motivating.