All righty. So we found a logistic regression model that performs pretty well, but we went and showed it to our boss, and she said we had to find a whole bunch of other things: hyperparameter tuning, feature importance, a confusion matrix, cross-validation. These things could sound made up to someone who isn't a budding data scientist or machine learning engineer like yourself, but being the budding machine learning engineer and data scientist that you are, you know data scientists make up words all the time. And the good thing is that these aren't made up. These are things we can actually find.

So now we've got a baseline model, and we know a model's first predictions aren't always what we should base our next steps off. What should we do? If we remind ourselves of the classification metrics and regression metrics: we're not worried about regression, right, because we're working on a classification problem. We've used the default metric, accuracy. Then there's precision, there's recall, there's F1, which are all part of the things the boss asked us for, and we've got the confusion matrix. Okay, we should probably make one of those, since we're working on a classification problem. Then we've got the classification report, which has a bunch of information in it like precision, recall, F1 score, support, accuracy, macro average. Far out. And then we've got hyperparameter tuning, which we've covered before. So we've got a lot of steps.

First we should put down what we're going to tackle, because we're in the experimental phase; that's what we're going to be working towards. We're going to take the baseline models we've got, experiment with them further, and see if we can improve them: does logistic regression still perform best on accuracy, or could a hyperparameter-tuned version of random forest beat it out, or maybe KNN could improve? Who knows. Let's write it down. Let's look at the following: hyperparameter tuning, feature importance, confusion matrix, cross-validation, precision, recall, F1 score, classification report, ROC curve, area under the curve (AUC). As we saw in the scikit-learn section, these are some of the things you should pay attention to when you're working on a classification model.
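For reference, most of what the boss asked for already lives in scikit-learn. Here's a rough sketch of where those tools come from, assuming clf is one of our fitted baseline classifiers and X, y, X_test and y_test are the data and splits we made earlier in the notebook:

```python
from sklearn.metrics import confusion_matrix, classification_report, roc_auc_score
from sklearn.model_selection import cross_val_score

# Predictions from a fitted classifier (clf, X_test, y_test assumed from earlier)
y_preds = clf.predict(X_test)

print(confusion_matrix(y_test, y_preds))       # confusion matrix
print(classification_report(y_test, y_preds))  # precision, recall, F1 score, support

# Area under the ROC curve, using predicted probabilities for the positive class
print(roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))

# 5-fold cross-validated accuracy on the full dataset
print(cross_val_score(clf, X, y, cv=5))
```

We'll build up to each of these properly over the rest of the section; this is just a map of the terrain.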
These two here are actually part of almost any machine learning model you'll be working on, but these are specific to classification. We've seen the difference between regression and classification metrics, but since we're focusing on a classification problem, whether or not someone has heart disease based on their health parameters, we're going to be focused on looking at these. So let's jump into it.

First things first: why don't we try hyperparameter tuning and cross-validation? And remember, what's hyperparameter tuning? Well, to cook your favourite dish you know to set the oven to 180 degrees and turn the grill on. But when your roommate cooks their favourite dish, they set the oven to 200 degrees and use the fan-forced mode. Same oven, different settings, different outcomes. It's the same for machine learning algorithms: you can use the same algorithm but change the settings, a.k.a. the hyperparameters, and get different results. But just like turning the oven up too high, if you tweak the settings too much you can make a model work well, or so well that it overfits. So what we're really doing here is looking for the Goldilocks model: one which does well on our dataset but also performs well on unseen examples.

Should we tune a KNeighborsClassifier? Mm hmm. Yeah, we might do that. Even though we've said goodbye to the KNN classifier, we might just see how you would tune it. And if you're wondering where I'd start if I were trying to approach this, this is what I'd start with: searching "how to tune a K neighbors classifier model". We get results like "In Depth: Parameter Tuning for KNN", "Model selection: tuning and evaluation" and "K-nearest neighbors", and if we were to read through these we'd probably find some of the information we're going to cover here. So let's start it out.

Let's make a little heading here: "Hyperparameter tuning". Let's tune KNN. We'll create a list of train scores, an empty list called train_scores, and then a list of test scores, another empty list called test_scores, because what we want to do is compare different versions of the same model, that is, the same model with different settings, and compare their scores on the two different datasets. Going from what we looked at before, we'd read into those resources; again, you can do this with almost any machine learning model, or basically any machine learning model.
This is part of the research, part of the experimentation step of many machine learning problems: searching up things like this, reading what you can find, and bringing it back to a scenario like the one we're working on here. But let's just pretend we've gone through that.

So we're going to create a list of different values for n_neighbors. Now, why n_neighbors? Because, as I said before, when we did our reading we found that if we search "sklearn KNN", we see in KNeighborsClassifier that n_neighbors is one of the parameters, and we can see this by reading through the scikit-learn documentation as well: the number of neighbors to use by default for kneighbors queries. In our research on how to tune a KNN model we found that we can adjust the number of neighbors, and if you want to find out more about this you can check out the documentation and read it up here.

So that's what we're going to do: create a list of different values. The default here is 5, so maybe what we might do is try different values from 1 to 20 or something like that. So let's see here: neighbors = range(1, 21). Wonderful. And then we're going to set up a KNN instance, so we'll go knn = ... what is it? You know this one: KNeighborsClassifier(). Beautiful.
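As a quick standalone aside before we write the loop, here's a minimal sketch of checking an estimator's current settings and changing them; the value 11 below is just illustrative:

```python
from sklearn.neighbors import KNeighborsClassifier

knn = KNeighborsClassifier()               # default settings
print(knn.get_params()["n_neighbors"])     # 5 — the default number of neighbors

knn.set_params(n_neighbors=11)             # same algorithm, different setting
print(knn.get_params()["n_neighbors"])     # 11
```

Nothing is fitted yet; set_params() only changes the configuration the model will use the next time we call fit().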
And then we need to loop through the different n_neighbors values, which is the range we made here: for i in neighbors. Thinking out loud here, we'll go knn.set_params(n_neighbors=i); remember, you can adjust the parameters of a machine learning model in scikit-learn using .set_params(), and we're setting it to i because we're looping through that range. Wonderful. And then we're going to fit the algorithm. What we're trying to do here is improve on the baseline score, which used the default of 5, so we're going to try 20 different versions, 1 to 20, and see if any of them do better. (Oh, I accidentally hit shift-enter; trigger happy, my fingers just default to shift-enter whenever they want.) Now we want to update the training scores list, so we'll go train_scores.append(knn.score(X_train, y_train)). Wonderful. And then we want to update the test scores list, so test_scores.append(knn.score(...)), and you can probably guess this is going to be on the test set. Beautiful, bonus points if you did.

Now what this is going to do is loop through this range of 1 to 20, create 20 different KNN models and append their scores to the lists. So let's check out those lists: train_scores, shift-enter, wonderful, and then test_scores (after fixing a typo). Now we've got 20 different scores for each, but these are probably best visualised, so let's see how we'd do that.

Let's go plt.plot(neighbors, train_scores) and give it label="Train score", of course. Then we'll add another plot, plt.plot(neighbors, test_scores, label="Test score"). Wonderful. Let's just see what this looks like. The xlabel will be "Number of neighbors", because that's what's on the x-axis; I nearly spelled neighbours the Australian way with a "u" there, well, maybe it's not just Australian, maybe it's just a different version, but in scikit-learn neighbors is spelled with no "u". So that's the x, and then the ylabel is going to be the scores, so "Model score", and then we want plt.legend(), and we'll see what comes up.

Actually, we want a little tidbit here: an f-string printing the maximum KNN score on the test data, max(test_scores); it's accuracy, so we'll multiply by 100 and format it with .2f, close the parentheses and close the string. That should work. Wonderful, okay. So we can see here that we're really paying attention to this test score, and the highest value looks to be at around about 11 for K-nearest neighbors. To see where it actually is on the graph, let's adjust the x-ticks: plt.xticks(np.arange(1, 21, 1)). What this is going to do is produce a range from 1 to 21 with a step of 1, which is exactly the same as what we're using for neighbors = range(1, 21). So let's do that and re-run the plot.
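Put together, the whole tuning-and-plotting cell might look something like this; it's a sketch that assumes X_train, X_test, y_train and y_test are the splits we created earlier in the notebook:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.neighbors import KNeighborsClassifier

train_scores = []
test_scores = []

# Different values of n_neighbors to try (the default is 5)
neighbors = range(1, 21)
knn = KNeighborsClassifier()

for i in neighbors:
    knn.set_params(n_neighbors=i)                      # change the hyperparameter
    knn.fit(X_train, y_train)                          # fit on the training data
    train_scores.append(knn.score(X_train, y_train))   # accuracy on the training set
    test_scores.append(knn.score(X_test, y_test))      # accuracy on the test set

# Visualise how the score changes with the number of neighbors
plt.plot(neighbors, train_scores, label="Train score")
plt.plot(neighbors, test_scores, label="Test score")
plt.xticks(np.arange(1, 21, 1))
plt.xlabel("Number of neighbors")
plt.ylabel("Model score")
plt.legend()

print(f"Maximum KNN score on the test data: {max(test_scores) * 100:.2f}%")
```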
With the x-ticks adjusted (they're just the little labels along the bottom), our earlier guess is confirmed: an n_neighbors value of 11 yields the best score on our test dataset. If we come back up here, the default, remember, is 5. So we've just done a little bit of hyperparameter tuning, and we were able to improve our KNN classifier's results on the test dataset by changing the n_neighbors parameter from the default of 5 to 11, giving us 75.41% versus our initial result back up here of 68%. Mm hmm.

Well, even after improving it to 75 or whatever it was, it's still far below our logistic regression and random forest models. So I think this has put the nail in the coffin for our KNN model, because even with hyperparameter tuning it still hasn't reached the scores we got with logistic regression or random forest. Because of this, we're going to discard KNN for now. This is part of our experiment: a big part of experimenting, machine learning step 6, is going through different machine learning models and figuring out which work and which don't. What we're trying to do here is, as quickly as possible, work through different little experiments like the one we've just done, tuning KNN, and see which model performs best on our data.

So what might we try next? We've tuned KNN by hand; we wrote out this little for loop here, and that was a bit tedious, right? If you had to do that for every single machine learning model you might run into some problems. It was okay here because we had one parameter to tune, but if you had more parameters, writing these for loops is going to be very inefficient. So what we might do next is see how we can tune logistic regression and random forest classifier using RandomizedSearchCV, where CV stands for cross-validation. Instead of us having to manually try different hyperparameters by hand, RandomizedSearchCV, which we've seen in the scikit-learn section, is going to try a number of different combinations of hyperparameters for us, evaluate which ones are best, and then save them for us. That's what we're going to have a look at in the next video.
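As a rough preview of where that's headed, a RandomizedSearchCV setup for logistic regression might look something like the sketch below; the hyperparameter grid is only an illustrative guess, not necessarily the one we'll end up using, and X_train, y_train (and X_test, y_test for the final score) are again assumed from our earlier split:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV

# Illustrative grid of hyperparameters to sample from (values are assumptions)
log_reg_grid = {"C": np.logspace(-4, 4, 20),
                "solver": ["liblinear"]}

rs_log_reg = RandomizedSearchCV(LogisticRegression(),
                                param_distributions=log_reg_grid,
                                cv=5,        # 5-fold cross-validation
                                n_iter=20,   # number of hyperparameter combinations to try
                                verbose=True)

rs_log_reg.fit(X_train, y_train)
print(rs_log_reg.best_params_)            # the best combination found
print(rs_log_reg.score(X_test, y_test))   # evaluate the tuned model on the test set
```

The same pattern applies to RandomForestClassifier with its own grid.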