Okay, you've made it. Look how far we've come. We've covered an absolutely massive chunk of the scikit-learn library, and we're still going to finish it off. We're up to the last section: putting it all together. We've been through an entire workflow. We've seen how to get the data ready. We've seen how to choose the right machine learning model for our problems. We've fit it to the data. We've used it to make predictions. We've evaluated a trained model. We've improved a trained model. We've saved and loaded a trained model. Now let's combine all of this. All of this! Could we do it? Could we do it? Now let's make a little heading: 7.0, putting it all together. First thing: give yourself a little pat on the back, right? Because we've covered a lot, and so far it seems to be all over the place, which it is, but not to worry. This is how machine learning projects often start out: as a whole bunch of different experiments and code all over the place. And then, once you've found something which works, the refinement process begins. Now, what would this refinement process look like? Well, let's use (remember?) our good old car sales regression problem, a.k.a. predicting the sale price of cars. We'll use that as an example, and to tidy things up...
We're going to be using scikit-learn's... let's have a look... scikit-learn Pipeline. To tidy things up, we're going to be using the Pipeline class. Now, you can imagine Pipeline as being a way to string together a number of different scikit-learn processes in one hit. So similar to writing a function, right? You're taking a couple of different steps and you're putting them together. "Pipeline of transforms with a final estimator." So a transform, or transforms, a.k.a. transforming data, getting data ready, and then a final estimator. An estimator, remember, in scikit-learn terms is a machine learning model. So, as a refresher, this is our car sales data. It's a thousand rows, and there's some missing data here, right? So there's missing here, missing here. And what we're trying to do is use Make, Colour, Odometer and Doors to predict the sale price of cars. Now again, this data is imperfect, because... could you do this? Could you figure this out, given enough time? You'd probably realistically need a few more parameters, a few more details about the cars, before you could figure out how much they'd sell for. So let's stop talking about it, let's get hands-on with the code. To start off, we'll import the data: go pd.read_csv... we've saved it in our data folder.
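To make the idea concrete, here's a minimal sketch of a Pipeline. The particular steps here (StandardScaler and Ridge) are toy stand-ins for illustration only; they aren't the car sales code, just the same "transforms, then a final estimator" shape.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge

# A Pipeline is a list of (name, step) tuples: zero or more transformers
# ("transforms"), then a final estimator (a model)
pipe = Pipeline(steps=[
    ("imputer", SimpleImputer(strategy="mean")),  # transformer: fill missing values
    ("scaler", StandardScaler()),                 # transformer: scale features
    ("model", Ridge())                            # final estimator
])

# Calling fit runs each transformer in order, then fits the final model
X = np.array([[1.0], [2.0], [np.nan], [4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])
pipe.fit(X, y)
preds = pipe.predict(X)
```

Calling `pipe.predict` pushes new data through the same fitted transforms before the model makes its predictions, which is exactly why pipelines keep things tidy.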
Car-sales-extended... hmm, it's in a data folder... no, that's not it, that's not up the chain. I want car sales... there we go: car-sales-extended-missing-data. That's what we want. And then we'll have a look. We'll have a quick peek at it. Wonderful. So this is just importing this CSV file that we have here. Come back now. What did we say? We said we want to use these columns, these four columns, to predict the Price column. We'll have a look at our data types. So you've got Make, which is an object, that's a string; Colour, which is an object, that's a string; then Odometer, Doors and Price, which... we've kind of talked about Doors being a category as well. So Odometer is our numerical column. And do you remember what we have to do for our data? We have to make sure that it's all in numerical format before we can build a machine learning model on it. And there's another little roadblock that we have as well before we can build our machine learning model, and that figure is NaN. So isna().sum(): we've got NaN values in some of our, well, all of our columns. We've got 50 missing data points in each of our columns. So that's also a big no-no with machine learning models.
We need to make sure that the data is numerical and that there are no missing values. If we refer back to our keynote: our data should be numerical (this is "things to remember", right?) and there should be no missing values. Okay, let's check it out. So what we'll have to do, if we want to put this all together... we've done this before, we've seen filling data, we've seen converting it, but this time we want to use a pipeline to do so. So scikit-learn's Pipeline's main input is steps, which is a list that contains tuples of (step name, action to take). We're going to see this in a second; for now we're just talking through it. And in our case, our steps are... what are our steps? Well, first we need to fill the missing data, then convert the data to numbers, and then build a machine learning model on it. So let's do it. We'll go here. I want to do this one so it's all in one cell. Let's put our steps here. What do we have to do? Steps. We want to do it all in one cell. Can we do it all in one cell? Because before, doing this took a fair bit of work. But let's say this is what we're doing: this is our refinement process, using Pipeline. So step one is fill missing data, and then step two is convert data to numbers. Wonderful.
And then step three is build a model on the data. All right. Cracks knuckles, gets fingers ready, let's do them. And if it seems like I'm going through this pretty fast, let me tell you: there's nothing here that we haven't covered, except for the Pipeline. But with what we've learned in the previous videos, you'll be able to deduce, as well as by reading the documentation here, what exactly is going on. So, import pandas as pd, because we want to do this all in one cell. And then from sklearn.compose (remember the ColumnTransformer?). And then from sklearn.pipeline, this is where we're going to import the Pipeline class that we're going to take advantage of here: import Pipeline. And then from sklearn.impute we're going to import SimpleImputer. This is what we're going to use to impute our data, so fill the missing data. And then from sklearn.preprocessing import OneHotEncoder. This is what we're going to use to convert our data to numbers. So: getting data ready. These are the imports we're going to use for that. And the next one is modelling, so from sklearn.ensemble import our trusty RandomForestRegressor.
Because we want to predict a number. And then from sklearn.model_selection import... is that right? Yeah, that looks right. I've been known to be prone to typos. train_test_split, and we want GridSearchCV, because we've seen grid search being used before, which is going to search for some optimal hyperparameters for our RandomForestRegressor. Okay, now we're going to set up a random seed, so we'll import numpy as np, np.random.seed(42). Wonderful. Now, for completeness, for doing it all in one cell, we're going to redo the data import, and we're going to drop the rows with missing labels. So data equals pd.read_csv, we're going to go to our data folder (this needs to be a string): car-sales-extended-missing-data. That's the one we're after, the thousand-rows version. And then, to drop the rows that have missing labels, we'll go data.dropna, and subset we'll set equal to the Price column, because that's what we want to predict and that's our label, and we're going to go inplace=True. Okay. So what we've done is we've just replicated this here, and now we've dropped the rows which contain missing Price values, a.k.a. missing labels. Now what we're going to do here is start to define the different features of the dataset and the different transformer pipelines that we want to apply to that data.
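The imports and data-loading steps just described look roughly like this. Since the CSV file itself isn't available here, a tiny made-up DataFrame with the column names I'm assuming from the video (Make, Colour, Odometer (KM), Doors, Price) stands in for pd.read_csv; treat the values as placeholders.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split, GridSearchCV

# Setup random seed for reproducibility
np.random.seed(42)

# In the notebook the data comes from a CSV:
# data = pd.read_csv("data/car-sales-extended-missing-data.csv")
# Here's a stand-in frame with the same columns (values are made up):
data = pd.DataFrame({
    "Make": ["Honda", "BMW", np.nan, "Toyota"],
    "Colour": ["White", np.nan, "Blue", "White"],
    "Odometer (KM)": [35431.0, np.nan, 84714.0, 66604.0],
    "Doors": [4.0, 5.0, np.nan, 4.0],
    "Price": [15323.0, 19943.0, np.nan, 9374.0],
})

# Drop the rows with missing labels (Price is what we're predicting)
data.dropna(subset=["Price"], inplace=True)
```

Note that only rows missing the label get dropped; missing feature values stay, because the pipeline is about to impute them.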
So if we come back up here, what transformations do we need to make? We need to change our Make and Colour into numbers. We need to fill the missing values of all the columns in here. And we need to change our Doors, because it's also a category; we need to change it to a categorical value. So we're going to... actually, no, forget what I said about Doors. Let's see what we're actually doing in the code. So: define different features and transformer pipelines. So for our categorical features (Doors is a special case, because it's already numerical): categorical_features equals Make and Colour. Now what we're going to do here is create a categorical transformer. So this just means modifying the data, and to do so we're going to use a Pipeline. This is the first time we're seeing Pipeline, but the steps in the pipeline are going to be: first, we're going to fill the missing values with an imputer. So this is where the Pipeline's steps input takes steps to take, right? One after the other, as a list of tuples. So let's see this happening. The name is "imputer", and (remember, this is the action to take) SimpleImputer(strategy="constant"), and then the fill value is going to be "missing".
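As a sketch, that categorical transformer looks like this. The two-row demo frame at the end is hypothetical, added just to show the one-hot output taking shape.

```python
import numpy as np
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder

# Fill missing categorical values with the string "missing",
# then one-hot encode, ignoring categories unseen at fit time
categorical_features = ["Make", "Colour"]
categorical_transformer = Pipeline(steps=[
    ("imputer", SimpleImputer(strategy="constant", fill_value="missing")),
    ("onehot", OneHotEncoder(handle_unknown="ignore"))
])

# Tiny demo: each column ends up with 2 categories -> 4 one-hot columns
demo = pd.DataFrame({"Make": ["Honda", np.nan], "Colour": [np.nan, "Blue"]})
encoded = categorical_transformer.fit_transform(demo)
```

Because the imputer runs first, "missing" simply becomes one more category for the encoder, so no row is lost to a NaN.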
So you might be able to deduce what's happening there. In this particular case, we're imputing the categorical features with the constant value of the string "missing". Now we're going to create a one-hot step, "onehot", with OneHotEncoder, and to handle the unknowns, the categories it hasn't seen before, we're going to tell it to ignore them. Wonderful. And then we'll make sure that we've lined up all of these things that we're going to need... we're missing something here. Need to check where our brackets are at. We're missing one here. There we go. See, half of this is figuring out where your missing brackets are. And then door feature: we're going to define our door_feature equals Doors. And so, the same thing again: we're going to create a door transformer. door_transformer equals Pipeline. Again we're using a Pipeline, because this will all make sense in a second, right? Why we're setting up pipelines. So to do this, we have "imputer": we want to do the same thing again, SimpleImputer, and the strategy is going to be constant, and the fill value for our doors, we're just going to fill it with 4, because four is the majority number in the Doors column. Then next we want numeric features. So our numeric feature is our Odometer (KM) column.
Wonderful. And then we're going to create a numeric transformer: numeric_transformer equals, first thing, we're going to do this, Pipeline(steps=...), and then again we need to impute values here. So "imputer": SimpleImputer(strategy="mean"). So this means it's going to fill the numerical columns, in our case the Odometer, with the strategy "mean". It's going to take the mean value of the Odometer column and fill all of the missing rows in Odometer with the mean of the rest of the values. That was a lot. Now, still not done. That's right. We're going to set up the preprocessing steps. So remember what we need to do: we need to fill missing data. So this is part of our preprocessing; these two are preprocessing our data, and this is modelling the data. So our preprocessing steps are going to be fill missing data and convert data to numbers, so we can do that: fill missing values, then convert to numbers. So to create a preprocessor, we'll instantiate a preprocessor using ColumnTransformer. Tab to autocomplete.
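The two remaining transformer pipelines can be sketched like so. The demo frames are hypothetical, added only to show the imputers in action.

```python
import numpy as np
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer

# Doors is categorical but already numeric: fill missing entries with 4,
# the most common number of doors in the data
door_feature = ["Doors"]
door_transformer = Pipeline(steps=[
    ("imputer", SimpleImputer(strategy="constant", fill_value=4))
])

# Odometer is truly numerical: fill missing entries with the column mean
numeric_features = ["Odometer (KM)"]
numeric_transformer = Pipeline(steps=[
    ("imputer", SimpleImputer(strategy="mean"))
])

# Demo: the NaN in Odometer becomes the mean of 10000 and 30000
demo = pd.DataFrame({"Odometer (KM)": [10000.0, np.nan, 30000.0]})
filled = numeric_transformer.fit_transform(demo)

# Demo: the NaN in Doors becomes the constant 4
demo2 = pd.DataFrame({"Doors": [5.0, np.nan]})
doors_filled = door_transformer.fit_transform(demo2)
```

A one-step Pipeline looks redundant on its own, but keeping every feature group as a Pipeline makes the next step, combining them in a ColumnTransformer, uniform.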
And we're going to pass it... ColumnTransformer takes transformers equals, and now this is going to be a list of the transformations we want to do on our data. And in our case, we have one called "cat", short for categorical. It's going to take the categorical transformer. We want those transformations that we've defined up here (the categorical transformer: fill the values with SimpleImputer and "missing", and one-hot encode Make and Colour) to be performed on the categorical features. Press tab to autocomplete: categorical_features. Wonderful. Now we pass it another tuple. So now we want it to do the same steps, but this time using the door transformer, which is going to fill the Doors column with the constant value of 4. So go door_feature. Wonderful. And finally, we want to finish up using the "num" transformer, which is going to take our numeric transformer and perform the imputation on the numeric features. Beautiful. Well, there's only one feature, so "features" as a plural is probably not correct. And now, now that we've done our preprocessing (so our data is getting filled, the missing values are getting filled, and then it's getting converted to numbers using what we've done here), the next step is to create a preprocessing and modelling pipeline. We can do this, as you maybe have guessed, because I've used the word pipeline.
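Put together, the ColumnTransformer step looks something like this. It's written self-contained here, and the demo frame at the bottom is made up just to show the combined output width (6 one-hot columns plus Doors plus Odometer).

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder

categorical_features = ["Make", "Colour"]
categorical_transformer = Pipeline(steps=[
    ("imputer", SimpleImputer(strategy="constant", fill_value="missing")),
    ("onehot", OneHotEncoder(handle_unknown="ignore"))
])

door_feature = ["Doors"]
door_transformer = Pipeline(steps=[
    ("imputer", SimpleImputer(strategy="constant", fill_value=4))
])

numeric_features = ["Odometer (KM)"]
numeric_transformer = Pipeline(steps=[
    ("imputer", SimpleImputer(strategy="mean"))
])

# Each tuple is (name, transformer, columns to apply it to)
preprocessor = ColumnTransformer(transformers=[
    ("cat", categorical_transformer, categorical_features),
    ("door", door_transformer, door_feature),
    ("num", numeric_transformer, numeric_features)
])

# Demo: 3 Make + 3 Colour one-hot columns, plus Doors, plus Odometer = 8 columns
demo = pd.DataFrame({
    "Make": ["Honda", "BMW", np.nan],
    "Colour": ["White", np.nan, "Blue"],
    "Odometer (KM)": [10000.0, np.nan, 30000.0],
    "Doors": [4.0, np.nan, 5.0],
})
transformed = preprocessor.fit_transform(demo)
```

Each named transformer only ever sees the columns listed in its tuple, which is what lets one preprocessor handle three differently typed feature groups at once.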
We need to create a preprocessing and modelling pipeline. So now, essentially, what we're doing is putting our preprocessing steps together with a modelling step. And because there are multiple steps, we'll use a Pipeline to combine them. So model equals Pipeline, and we pass it steps, which is a list containing tuples. And this is the name: so this is a "preprocessor" step, and we're going to pass it preprocessor. So the first step in our pipeline here, our model pipeline, is to run through this. And then for the next step, we'll put a little comma here; we need to create another tuple, which is going to be our "model": our RandomForestRegressor. Wonderful. So now we've got our model pipeline set up, which will go through preprocessing, and once the preprocessing is done, it'll build a random forest regressor on it. We can get our data ready. So, split data: X equals data.drop, we're going to remove the Price column, because that's what we want to predict, and axis is going to equal 1. y is going to be data["Price"]. Then we're going to create X_train, X_test, as we always do, y_train, y_test equals train_test_split(X, y, test_size...
We'll use 0.2. And then we want to fit and score the model. So model.fit(X_train, y_train), then model.score(X_test, y_test)... where did we get it wrong? handle_unknown. We've got a typo, of course we do: handle_unknown. There we go. Change that. We did it! We did the entire pipeline in one cell, taking advantage of the Pipeline class. So let's step through what we did. We imported the classes to get our data ready: pandas, ColumnTransformer, Pipeline, SimpleImputer, OneHotEncoder. We imported the classes to do our modelling: RandomForestRegressor, train_test_split, GridSearchCV. We set up a random seed so our results are reproducible, and then we defined different features and transformer pipelines. So we defined the categorical features here, the door feature, the numeric feature, and then for each of them we created a few steps for our pipeline to take: to not only impute missing values but to one-hot encode, so turn all of our data that wasn't numbers into numbers. And then we set up a preprocessing step that took all of our pipelines here, to take the steps that we wanted to take on the categorical features, on the door feature and on the numerical feature.
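The whole one-cell workflow described above can be sketched end to end like this. Since the car sales CSV isn't included here, a made-up 50-row DataFrame with the same columns stands in for pd.read_csv; the values, and therefore the score, are placeholders rather than the video's results.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

np.random.seed(42)

# Stand-in for pd.read_csv("data/car-sales-extended-missing-data.csv"):
# 50 made-up rows with the same columns and some missing values
n = 50
data = pd.DataFrame({
    "Make": (["Honda", "BMW", "Toyota", np.nan] * 13)[:n],
    "Colour": (["White", "Blue", np.nan] * 17)[:n],
    "Odometer (KM)": ([35431.0, np.nan, 84714.0, 66604.0, 120306.0] * 10)[:n],
    "Doors": ([4.0, 5.0, np.nan, 3.0] * 13)[:n],
    "Price": ([15323.0, 19943.0, np.nan, 9374.0, 21883.0] * 10)[:n],
})
data.dropna(subset=["Price"], inplace=True)  # drop rows with missing labels

# Transformer pipelines for each group of features
categorical_features = ["Make", "Colour"]
categorical_transformer = Pipeline(steps=[
    ("imputer", SimpleImputer(strategy="constant", fill_value="missing")),
    ("onehot", OneHotEncoder(handle_unknown="ignore"))
])
door_feature = ["Doors"]
door_transformer = Pipeline(steps=[
    ("imputer", SimpleImputer(strategy="constant", fill_value=4))
])
numeric_features = ["Odometer (KM)"]
numeric_transformer = Pipeline(steps=[
    ("imputer", SimpleImputer(strategy="mean"))
])

# Combine the preprocessing into one ColumnTransformer
preprocessor = ColumnTransformer(transformers=[
    ("cat", categorical_transformer, categorical_features),
    ("door", door_transformer, door_feature),
    ("num", numeric_transformer, numeric_features)
])

# Preprocessing plus modelling in a single pipeline
model = Pipeline(steps=[
    ("preprocessor", preprocessor),
    ("model", RandomForestRegressor())
])

# Split the data, then fit and score the whole pipeline
X = data.drop("Price", axis=1)
y = data["Price"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model.fit(X_train, y_train)
score = model.score(X_test, y_test)
```

Calling model.fit here runs every imputation and encoding step before the random forest ever sees the data, which is the whole point of the refactor.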
Then we created another pipeline to put all of these steps together, to not only preprocess our data but then model it. So then we split our data into X and y, and then into train and test sets. And so when we call model.fit, it's going to call this variable here, which is going to run this pipeline, which inherently is going to call the preprocessor step, which is going to run this pipeline, which is inherently going to run all of these three steps. Now, you might be thinking, "Daniel, this has got a lot going on," but if you go back through what you've looked at here, you'll be able to step through it and go, "Okay, I'm starting to piece together what's happening." And now, if you scroll right back up to where we originally did this (filling data with pandas, filling data with scikit-learn, and then modelling it on our car sales problem), you'll see that we did it over a number of different cells, and now we're doing it... we're getting a full-blown machine learning model in 55 lines of code, minus some comments, and we could probably tidy this up if we're being honest with ourselves. But now you're starting to see what I said in the beginning: machine learning is very experimental, throughout the whole thing. It's very experimental. And so what it takes at the start is just code everywhere, trying different things, seeing how they work, and then eventually you can bring it down to something like this.
Put it all into one big cell, in a pipeline, going through the steps that you've figured out along the way. Now, of course, we'd probably want to improve this using hyperparameter tuning, and that's what we might look at in the next video: how to use hyperparameter tuning with a Pipeline. It's slightly different to how we've seen it done with just a regular model. So take a little break, step back through what we've done here, because it was a lot. Don't worry if you don't take it in entirely. Have a read through the Pipeline class in the scikit-learn documentation, and see if you can line up what we've done here with what's in the documentation. Otherwise, I'll see you in the next video, and we'll see if we can improve this score.
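As a hedged preview of what that tuning might look like (this is my sketch, not necessarily the next video's exact code): GridSearchCV reaches parameters inside a pipeline with double-underscore names, step__substep__parameter. The grid values below are arbitrary examples, and the data is the same made-up stand-in used earlier.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split, GridSearchCV

np.random.seed(42)

# Same made-up stand-in for the car sales CSV as before
n = 50
data = pd.DataFrame({
    "Make": (["Honda", "BMW", "Toyota", np.nan] * 13)[:n],
    "Colour": (["White", "Blue", np.nan] * 17)[:n],
    "Odometer (KM)": ([35431.0, np.nan, 84714.0, 66604.0, 120306.0] * 10)[:n],
    "Doors": ([4.0, 5.0, np.nan, 3.0] * 13)[:n],
    "Price": ([15323.0, 19943.0, np.nan, 9374.0, 21883.0] * 10)[:n],
})
data.dropna(subset=["Price"], inplace=True)

preprocessor = ColumnTransformer(transformers=[
    ("cat", Pipeline(steps=[
        ("imputer", SimpleImputer(strategy="constant", fill_value="missing")),
        ("onehot", OneHotEncoder(handle_unknown="ignore"))]),
     ["Make", "Colour"]),
    ("door", Pipeline(steps=[
        ("imputer", SimpleImputer(strategy="constant", fill_value=4))]),
     ["Doors"]),
    ("num", Pipeline(steps=[
        ("imputer", SimpleImputer(strategy="mean"))]),
     ["Odometer (KM)"])
])
model = Pipeline(steps=[
    ("preprocessor", preprocessor),
    ("model", RandomForestRegressor())
])

X = data.drop("Price", axis=1)
y = data["Price"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Double-underscore keys drill into nested steps:
# pipeline step "preprocessor" -> transformer "num" -> step "imputer" -> strategy
pipe_grid = {
    "preprocessor__num__imputer__strategy": ["mean", "median"],
    "model__n_estimators": [10, 50],  # arbitrary example values
}
gs_model = GridSearchCV(model, pipe_grid, cv=5)
gs_model.fit(X_train, y_train)
```

The payoff of the naming scheme is that even preprocessing choices, like which imputation strategy to use, become tunable hyperparameters alongside the model's own.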