Welcome back. Last video we saw how to tune one of our machine learning models, specifically the K-nearest neighbors classifier, by hand. Now, this is hyperparameter tuning, so what I've done is just adjust this little title here to say "hyperparameter tuning by hand".

But as you can imagine, if you had more than one parameter or hyperparameter to adjust (in our case we only tuned n_neighbors, which goes into the KNeighborsClassifier, so we only tuned this one parameter), wanting to tune all of these would get a bit tedious. We'd be writing for loops for days. So what we're going to do here is... and remember, we've also written off our K-nearest neighbors model in the search for better results. Because our logistic regression and random forest classifier are performing better, we're going to cut the K-nearest neighbors model out of our experimentation.

So now we're going to do hyperparameter tuning with RandomizedSearchCV. This is what we saw in the scikit-learn section, and if we search "RandomizedSearchCV sklearn", we'd find it here. Beautiful. So there's the documentation; you can read through that in your own time, but we're going to see it implemented.

And again, if you're wondering how you could find the hyperparameters to tune a certain machine learning model, well, here's what I would do for logistic regression: search "how to tune a logistic regression machine learning model in Python", something like that. If we switch that up: "logistic regression model training with scikit-learn", beautiful; "logistic regression using Python scikit-learn"; "hyperparameter optimization of machine learning models"; "tuning parameters for logistic regression". If we were to go through these, research, do our own experiments, figure things out and ask more questions, we'd probably find some pretty good answers.

But for now we're going to pretend we've gone through those steps and figured out that we can use RandomizedSearchCV and that we can tune a bunch of different parameters. Let's do that. We're going to tune our logistic regression model, which is that, and our random forest classifier, using RandomizedSearchCV. And again, if you need information about which parameters you can use for logistic regression... I've even spelled that wrong.
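To make the setup concrete, here's a minimal sketch of the imports this approach relies on (the variable names are my own for illustration, not necessarily the ones in the notebook):

    # RandomizedSearchCV plus the two models we're still experimenting with
    from sklearn.linear_model import LogisticRegression
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import RandomizedSearchCV

    # The two candidate models left after dropping K-nearest neighbors
    models = {"LogisticRegression": LogisticRegression(),
              "RandomForestClassifier": RandomForestClassifier()}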
Typical Daniel. For logistic regression and random forest classifier, we can go up here, just as we did for K-nearest neighbors, and search "logistic regression". If we look in here, this is going to give us a list of different parameters. You might read the documentation and, trust me, I did this: the first time, I read the documentation and went, what do all of these terms mean? Elastic net, L2, penalties, bool... actually, bool I know what it means, true or false, right? But all these things didn't mean much to me. It was only once I started to do some research, go through, and figure out what all these different parameters actually meant that I started to understand it. And the same goes for the random forest classifier. You could do the same thing: put that into the scikit-learn documentation, or change the search to "how to tune a random forest machine learning model in Python".

These are steps that people like machine learning engineers and data scientists take every single day to figure out how to improve their models. No one in the beginning knows how to do this off by heart. Even after doing it many times, I still have to look these things up.

So we go here, let's see how we would do it. The first thing to know about RandomizedSearchCV: if you're wondering what CV stands for, it stands for cross-validation, which we saw in the scikit-learn section. More specifically, instead of doing a normal train/test split like we've done before, creating one training split and one test split (a.k.a. 80 percent in the training split and 20 percent in the test split), cross-validation creates... this should really be k, but I've done five because it looks nice here. This can be k-fold cross-validation, and 5 is the default in the latest version of scikit-learn, but you can adjust it. What it's going to do is create five different versions of the training data and five different versions of the test data, then evaluate different parameters (or, because we're doing a hyperparameter search, different hyperparameters) on all of these versions of the training and test data, and work out which set of parameters, or hyperparameters, is best across those five different splits rather than just one single split. So if we go here, let's do it.
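As a rough sketch of what 5-fold cross-validation looks like in code (using a small made-up dataset here, rather than our actual data):

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    # Toy data standing in for our real features and labels
    X, y = make_classification(n_samples=300, random_state=42)

    # cv=5 trains and scores the model on 5 different train/test splits
    scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
    print(scores)         # one score per split
    print(scores.mean())  # the cross-validated score we'd report

Each of the five scores comes from a different split, so the mean is a fairer estimate of how the model performs than any single split would be.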
So what we need to do is create a hyperparameter grid for logistic regression. Reading the logistic regression documentation, as well as searching up here for logistic regression, we figure out that there are a few hyperparameters we can tune, such as the value for C. If we look in here and go down to C: inverse of regularization strength; must be a positive float. Again, you could read this the first time in the documentation and go, what do any of these words mean? That's why it requires research to check it out. And then there's another one called solver.

So we'll go here and have a look at what a parameter grid looks like: C. And again, there are more hyperparameters than the two we're using here. We're only using C and solver; you might find in your research that you could adjust the penalty, the fit intercept, a whole bunch of hyperparameters, but the overall concept of adjusting hyperparameters is what we're focused on.

So after our research, we're going to set this: we know that a good set of values is np.logspace(-4, 4, 50). And if you're wondering what logspace is (I'm kind of throwing these things out relatively quickly), what logspace does, if we go here: returns numbers spaced evenly on a log scale. So there's the start, the stop, and the number of values. This is going to be between negative four and four, and 50 of them. And if you're wondering what a log scale is, these are the ways you can look it up. Let's go: "what is a log scale"; "log space reduction and complexity"; "can you explain in simple words what is log space reduction". We could read those; again, a few complex words there, but after a little bit of effort you'll be able to figure it out.

So we go here. We can only really use one solver here, because after the research and finding which parameters we should adjust, we find that C is probably the most valuable one for logistic regression. So we're only going to use one value for solver. So really what we're testing here is a grid of numbers on a log scale between negative 4 and 4. So there we go. Oh, sorry, because it's a log space, it's 1 times 10 to the power of negative 4 up to 1 times 10 to the power of 4. So a pretty big space there; these numbers are pretty well separated.
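Written out, the grid we've just described might look something like this (the solver value is an assumption; any single supported solver, such as "liblinear", works for illustration):

    import numpy as np

    # Hyperparameter grid for LogisticRegression
    log_reg_grid = {"C": np.logspace(-4, 4, 50),  # 50 values from 1e-4 up to 1e4
                    "solver": ["liblinear"]}      # assumed: one solver choice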
So let's get rid of that, and now we'll create a hyperparameter grid for the random forest classifier. Wonderful. Through our research, we find that some of the best parameters to tune for a random forest include the number of estimators; if we're using a random forest, n_estimators is how many trees we have in our forest. So, np.arange. And you might be wondering why I'm using ranges here as I go through this. Well, the reason is in the documentation. I know in a previous video we haven't explicitly used a range, we've used a list, so just for a refresher on what arange does, let's check it out: it's basically just going to create a range of numbers spaced 50 apart (because of the 50 here) between 10 and 1000.

So if we look up RandomizedSearchCV (I think we already have it, maybe here, there we go, it's somewhere here): "it is highly recommended to use continuous distributions for continuous parameters". So that's why we create a range of different values; this is where we read that from. Rather than just an explicit list, we're creating a range, a continuous distribution, for our hyperparameters for RandomizedSearchCV. But that's just the documentation. You could still just use a list; the documentation written by the people behind scikit-learn simply recommends a range of values.

So I'm going to go max_depth here, which is another hyperparameter we found through our own research that we can tune. Remember, for any machine learning model, it's a quick search of going, hey, I'm using this model. I've figured out my problem, I've followed the scikit-learn machine learning map... do I have the map here? Yeah, we do. I've found the map, and I've decided I'm going to use a random forest classifier. How can I do hyperparameter tuning? Oh, well, that's where it comes to searching something like this, right?

So we go here, and we find another one is min_samples_split, and we're going to go with a range. Actually, I said that we're going to use distributions, but for max_depth we haven't used a distribution, we've used an explicit list. We'll use another range here. And then we'll also use min_samples_leaf: np.arange(1, 20, 2). Beautiful.
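Putting that together, the random forest grid might look like this (the max_depth list and the min_samples_split range are assumed values for illustration; n_estimators and min_samples_leaf follow the ranges mentioned above):

    import numpy as np

    # Hyperparameter grid for RandomForestClassifier
    rf_grid = {"n_estimators": np.arange(10, 1000, 50),   # trees in the forest, 10 to 1000 in steps of 50
               "max_depth": [None, 3, 5, 10],             # assumed explicit list
               "min_samples_split": np.arange(2, 20, 2),  # assumed range
               "min_samples_leaf": np.arange(1, 20, 2)}   # 1 to 20 in steps of 2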
So now we have two hyperparameter grids for the two models that we're going to try to hyperparameter tune using RandomizedSearchCV.

What we'll probably do is end this video here, then come back and use these two grids along with RandomizedSearchCV to try to improve the results of our logistic regression model and our random forest classifier beyond what we initially got, beyond these initial values. Remember, those values were obtained without using cross-validation, because we only used a single train and test split. If we go into the keynote on cross-validation: this is what we used to get our original scores, a single train and test split, whereas in cross-validation we're going to be using multiple train and test splits. Remember, in the scikit-learn section we discussed that if you're going to provide a metric for how a model is performing, it's probably best to use a cross-validation metric, especially with classification problems.

So let's revisit that in the next video: tuning our random forest and logistic regression with RandomizedSearchCV.
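As a preview of where we're heading, a minimal sketch of that next step might look like this (X_train and y_train are assumed to come from our earlier data split, and n_iter=20 is an assumed choice of how many combinations to sample):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import RandomizedSearchCV

    log_reg_grid = {"C": np.logspace(-4, 4, 50),
                    "solver": ["liblinear"]}  # assumed solver, as above

    # Randomly sample n_iter hyperparameter combinations from the grid,
    # scoring each one with 5-fold cross-validation
    rs_log_reg = RandomizedSearchCV(LogisticRegression(),
                                    param_distributions=log_reg_grid,
                                    cv=5,
                                    n_iter=20,    # assumed number of combinations
                                    verbose=True)
    rs_log_reg.fit(X_train, y_train)  # X_train/y_train: our earlier train split
    print(rs_log_reg.best_params_)    # the best combination found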