Wonderful. We've seen how to quickly get a gauge of how our machine learning model is doing by evaluating it with the score method, which returns a default evaluation metric depending on the problem we're working on: in regression it returns the coefficient of determination, and in classification it returns the mean accuracy. However, when you get further into a problem, it's likely you'll want to start using some more powerful metrics to evaluate your model's performance. And so naturally, the next step up from using score is to use a custom scoring parameter. So if we see here, we've looked at this one — the estimator score method — and now we're going to have a look at using a scoring parameter, and model evaluation tools that use cross-validation. We'll see what that is in a moment. So let's dive into it. What we're going to do is work this out in code. Now, you'll have to bear with me for the next few videos, because we're going to cover a fair bit here. The reason is that evaluating a machine learning model is such an important step. It's one thing to call a fit function on some data, but the next most important thing is asking: hey, is that model actually working? Is it learning something? Could we use it to predict in the future? And that's why we're making sure we cover a lot of ground here. So let's get started.
What we're going to do is go from sklearn.model_selection import cross_val_score. And as always, we're going to run the code first before we dive into what's actually going on. What we do need is some classification code, so we can copy this — you're allowed to copy this, by the way, because we've written it a fair few times now. So the only thing different here from what we've done before is that we've imported cross_val_score from sklearn's model_selection. Now, the next thing we're going to do, because we've called fit there, is put a little semicolon so we don't get a big output. And you know what, this warning keeps coming up, so I might just keep changing n_estimators to equal 100. Wonderful. And so cross_val_score — it has the word score in it, but we've seen score; what is this "cross val" doing? Well, let's have a look. We'll call clf.score on the test data first, so we can compare, and then we'll do the same but this time using cross_val_score. And cross_val_score takes our classifier, it takes X data, and it takes y data — not the test and not the train sets; we'll see what's going on in a second. Huh, what's happening here? We're getting another warning again — this is just to say that the default value of cv will change from 3 to 5.
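As a sketch of the setup described above — using a synthetic dataset as a stand-in for the patient records from the video, and a RandomForestClassifier with n_estimators=100 as mentioned:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the classification data used in the video
X, y = make_classification(n_samples=300, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train);  # trailing semicolon suppresses the repr in a notebook

# score() returns a single number; cross_val_score returns one score per split.
# Note cross_val_score gets the full X and y, not the train/test splits.
single_score = clf.score(X_test, y_test)
cv_scores = cross_val_score(clf, X, y, cv=5)
print(single_score)
print(cv_scores)
```

The exact numbers will differ from the video's, since the data here is synthetic.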
So if we have a look at this with Shift+Tab and read the docstring: "Evaluate a score by cross-validation". What even is cross-validation? And the cv parameter is what's giving us that warning — this is where it's coming from. The default is cv=3; if we change it to 5, that warning is going to go away, and then we're going to get back an array of five different scores. So that's really the first difference you'll notice: cross_val_score returns an array, whereas score only returns a single number. Okay, so how can we figure this out? We need to figure out what cross-validation is, because that's, after all, what cross_val_score is doing. Come back into the docstring — "Evaluate a score by cross-validation". Luckily, here's one I prepared earlier to demonstrate. And as always, I understand things better visually — you might be the same — so let's use a picture to demonstrate what cross-validation is doing. So here's what we've done before in our normal train/test split.
So we've split our data into training and test sets. Say we started off with 100 patient records. We'd split off a training set — in our case we've used 80 percent — and this would contain X_train and y_train, 80 samples. I've used the number 100 here (we've really got more than that) just because the numbers work out visually better. And in our test data set we've got 20 percent of the data, which would contain X_test and y_test. Now, the difference here with cross-validation — and in our case this image is demonstrating five-fold cross-validation. What you'll probably see cross-validation referred to as is k-fold, where k is an arbitrary number, and the reason we're using five is because, if we come back here, we set cv=5 there — cv stands for cross-validation. And this is what I'm talking about: splitting our data into training and test sets using 20 percent for the test size. So that means naturally the other 80 percent goes to the training data, which is where this split here comes from. Now, what cross-validation does is make five different splits. So it will use the first 20 percent — actually, see how this one here is using the last 20 percent: it'll create a test data set here and a training data set here, then it'll move over here and use this section as the test data set, and then again and again and again.
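To make those five splits concrete, here's a small sketch using KFold on 100 hypothetical record indices — each fold holds out a different 20 records for testing. (For classifiers, cross_val_score actually stratifies the folds by class, but the splitting idea is the same.)

```python
import numpy as np
from sklearn.model_selection import KFold

kf = KFold(n_splits=5)
records = np.arange(100)  # 100 hypothetical patient record indices

splits = list(kf.split(records))
for fold, (train_idx, test_idx) in enumerate(splits):
    # Each fold: 80 records for training, a different contiguous 20 for testing
    print(f"Fold {fold}: {len(train_idx)} train, "
          f"test records {test_idx[0]}-{test_idx[-1]}")
```

Across the five folds, every record appears in a test set exactly once.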
And so what happens is that cross-validation trains five different versions of the model, and then it evaluates each of those models — each trained on a different training split — on five different versions of the test data. So what's the purpose of this? Well, as you can imagine, if we're only training one model, it could be a lucky split. Say this 80 percent of rows happened to contain a whole bunch of information the model was able to learn really well from — these 80 patient records — and then it got a really good score on this test set. Is that a true reflection of how well our model understands the data, or has figured out the patterns in the data? Well, not really, right? Because it's just luck. If we're splitting this randomly and somehow a bunch of easy patient records have ended up here, and the model's figured out certain patterns, and it's gone over to this test data set and got an amazing score — we could be tricking ourselves, we could be fooling ourselves, into thinking that our model is far better than it actually is. So that's where cross-validation comes into play. It aims to provide a solution: not training on just some of the data, and avoiding getting lucky scores on just a single split of data. It creates five different splits, so no matter what, our model is going to be trained on all of the data and evaluated on all of the data.
And so this is why, if we come back to cross_val_score, it gives us back five different scores here. So that's starting to make sense: if we call the score method on only our X_test data and y_test data, it gives back one score. But if we call cross_val_score — referring back to the graphic here — it's going to make five different splits. And remember, five-fold is just an arbitrary number: you could do 10-fold, you could do 3-fold, you could even do 100-fold, but five-fold is the default of the library, and it's usually pretty good depending on the size of your data. So we'll use five-fold to demonstrate here. Just to prove it to you, we can go in here to cross_val_score — we could even do 10. So this means it's just going to make 10 different splits, exactly the same idea as this, and then return 10 different scores. So you see here — this is a great example — on split 1 it's got a score of 0.9, so that could be 90 percent, which is higher than before, but on a later split it's got something much lower, 0.72.
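As a sketch (again with synthetic data standing in for the video's dataset), changing cv simply changes how many splits — and therefore how many scores — come back:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, random_state=42)
clf = RandomForestClassifier(n_estimators=100, random_state=42)

# cv controls the number of folds, and hence the length of the returned array
scores_5 = cross_val_score(clf, X, y, cv=5)
scores_10 = cross_val_score(clf, X, y, cv=10)
print(len(scores_5))   # 5
print(len(scores_10))  # 10
```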
And so what we do here, to get a more reliable performance metric — or evaluation metric — for our model, is take the average of these five scores. Let's see it happen. We'll do it all in one cell: set a random seed, and then get a single train/test split score — we're going to make clf_single_score equal clf.score(X_test, y_test); it's going to use the same data here. Wonderful. And then we're going to take the mean of the five-fold cross-validation scores: clf_cross_val_score equals np.mean of cross_val_score — we need to pass it our classifier, X, y, and cv=5 — and then compare the two: clf_single_score and then clf_cross_val_score. Boom. So what do you see here? Well, in our case, our original single score — which is now down here, the exact same number, because we're using a random seed and the same test data — is 0.85. But when we use cross-validation, when we use five splits because cv=5, we get a score of 0.82. So it's slightly lower, but in this case, if you were asked to report the accuracy of your model, even though it is lower, you'd prefer the cross-validation metric over the non-cross-validation metric. Now wait — we haven't even used the scoring parameter at all. Well, that's because by default it's set to None.
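The comparison described above can be sketched like this (synthetic data stands in for the video's patient records, so the exact 0.85 / 0.82 numbers won't reproduce):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

np.random.seed(42)
X, y = make_classification(n_samples=300, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

clf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)

# Single train/test split score
clf_single_score = clf.score(X_test, y_test)

# Mean of the 5-fold cross-validation scores
clf_cross_val_score = np.mean(cross_val_score(clf, X, y, cv=5))

print(clf_single_score, clf_cross_val_score)
```

If asked to report one number, the cross-validated mean is the more honest choice, since it doesn't depend on a single (possibly lucky) split.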
Let's have a look at the scoring parameter, set to None by default. So if we call cross_val_score(clf, X, y, cv=5), we can pass scoring here, and it's going to be set to None. How do I know this? Well, if we do Shift+Tab on this, luckily the docstring comes in handy — see, by default it's set to None. So if we keep scrolling down and move our notebook: scoring : string, callable or None, optional, default: None. Okay — a string (see the model evaluation documentation; that's what we've had to look up here), or a scorer callable object/function with signature scorer(estimator, X, y), which should return only a single value. If None, the estimator's default scorer (if available) is used. Okay, now this is why we know that this is accuracy: because if the scoring parameter of cross_val_score is None, it uses the default scoring metric of our estimator. In our case, what is the default scoring metric of a classifier? Mean accuracy. And where have we seen that before? We saw that in the last video — go to clf.score and hit Shift+Tab: "Returns the mean accuracy on the given test data and labels". So that means, when we have scoring set to None,
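A quick sketch of the point above — passing scoring=None explicitly is the same as leaving it out, and for a classifier both fall back to the estimator's default scorer, mean accuracy (synthetic data again stands in for the video's):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, random_state=42)
clf = RandomForestClassifier(n_estimators=100, random_state=42)

default_scores = cross_val_score(clf, X, y, cv=5)
none_scores = cross_val_score(clf, X, y, cv=5, scoring=None)

# Same folds, same (seeded) estimator, same default metric -> same scores
print(np.allclose(default_scores, none_scores))
```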
it's going to use the default evaluation metric of our classifier for cross-validation. So if we hit Shift+Enter, it's going to return the same kind of values — although they might be slightly different, right, because we haven't set up a seed in this cell, so these values are going to be different from the cross_val_score we see up there. If we'd run it with a seed in here, we would have seen similar values. Whoo — what have we covered here? Well, as you might have guessed, the scoring parameter can be changed. As the docstring says, we can pass in our own scoring parameter here — we can change this to something other than None. That is what we're going to start to cover in the next few videos: some other classification model evaluation metrics that we can use with cross-validation. And so why do we use cross-validation? Well, as we saw in the picture, cross-validation aims to solve the problem of not training on all the data — we're creating five models, so we end up having the model trained on all of the data — and avoiding getting lucky scores from training on a single split. And we saw that in action, with our single-split score here getting a slightly higher score than our cross-validation score. Whoo. That's a lot.
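As a small preview of changing the scoring parameter (a sketch on synthetic binary-classification data — the specific metric names here, "accuracy" and "precision", are standard scikit-learn scorer strings; the next videos cover the metrics themselves):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, random_state=42)
clf = RandomForestClassifier(n_estimators=100, random_state=42)

# scoring accepts a string naming the metric instead of None
acc_scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
prec_scores = cross_val_score(clf, X, y, cv=5, scoring="precision")
print(acc_scores.mean(), prec_scores.mean())
```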
Now, we've still got a bit more to go. Let's get some in-depth classification metrics happening. I'll see you in the next video.