Modelling Part 4 is the final part of modelling, and it's comparison. What we want to know here is: how will our model perform in the real world?

After you've tuned and improved your model's performance through hyperparameter tuning, it's time to see how it performs on the test set. The test set is like the final exam for machine learning models. If you've created your data splits correctly, it should give you an indication of how your model will perform once deployed in production (meaning customer-facing) rather than just sitting on your local computer.

Since your model has never seen the data in the test set, evaluating your model on it is a good way to see how it generalizes. And remember, by generalizing I mean adapts to data it hasn't seen before, such as how a heart disease prediction machine learning model would perform at classifying whether a patient has heart disease or not, on a patient who wasn't in our original dataset.

A good model will yield similar results on the training, validation and test sets, and it's not uncommon to see a slight decline in performance from the training and validation sets to the test set. For example, your model might achieve 98 percent accuracy on the training dataset and 96 percent accuracy on the test set. What you should be worried about is if the training set performance is dramatically higher than the test set performance, a classic sign of overfitting, or if the test set performance is higher than the training set performance, which usually means something like data leakage has crept into your splits. Underfitting is the opposite problem: the model performs poorly even on the training set because it hasn't learned the patterns in the data well enough.

Overfitting and underfitting are both examples of a model not being able to generalize well, which is what we don't want. The ideal model shows up in the Goldilocks zone: it fits just right, not too well but not too poorly.

You can see it here with this machine learning model. If these green data points were your data, this line kind of fits the shape, but this would be classified as underfitting. This is not what we want our model to do. And this one over here, well, it's doing a good job of fitting all the data points, but it's getting far too close. It's an almost too perfect model, just snaking between them, so this example would mean the model has learned the patterns in this dataset too well. It would be like seeing the final exam before actually taking the final exam. This one has the Goldilocks zone right.
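To make this concrete, here's a minimal sketch of checking for those performance gaps yourself. The library, dataset and split sizes here are stand-ins I've chosen for illustration, not something the lesson prescribes; the pattern of comparing scores across all three splits is the point.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# A stand-in dataset; the lesson's heart disease data would work the same way
X, y = load_breast_cancer(return_X_y=True)

# 70/15/15 train/validation/test split via two calls to train_test_split
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.3, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=42)

# Fit on the training set only
model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)

# A good model yields similar scores on all three sets:
# train >> test hints at overfitting; low scores everywhere hint at underfitting
print(f"Train accuracy: {model.score(X_train, y_train):.3f}")
print(f"Val accuracy:   {model.score(X_val, y_val):.3f}")
print(f"Test accuracy:  {model.score(X_test, y_test):.3f}")
```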
This is an iterative process. Where exactly this Goldilocks zone of a balanced model sits will really depend on your data and the problem you're trying to solve. That's why, again, it's an iterative process, finding this balanced zone. After some experience and practice working on different machine learning problems, you'll start to be able to tell whether your model is overfitting or underfitting.

There are several reasons why overfitting and underfitting can happen, but two of the main ones are data leakage and data mismatch.

Data leakage happens when some of your test data leaks into your training data. This often results in misleadingly good test results, with a model doing better on the test set than on the training dataset. Remember, it's like if everyone got to look at the final exam as the practice exam: your machine learning model has just learned exactly what it's about to be tested on. So when it comes time to modelling, it has learned the data way too well and starts to fit it like the overfit example from before.

This is why it's important to do your splits correctly and ensure that machine learning model training happens only on the training dataset, validation and model tuning happen only on the validation (or training) dataset, and testing and model comparison happen only on the test dataset. And remember, some approaches use only a training and test set and do model tuning on the training set, but the test set always stays the same. It's like when you go to university and do a course: you want the final exam to be an indication of how well you actually understand things. Same with the test dataset in machine learning: it's used as an indication of how well your model will generalize in the real world. So you want to avoid data leakage.

Data mismatch happens when the data you're testing on is different to the data you're training on, such as having different features in the training data to the test data. Having this kind of mismatch can lead to models performing poorly on test data compared to training data. This is why it's important to ensure that training is done on the same kind of data as you'll be testing on, and as close as possible to what you'll be using in your future applications.
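One easy-to-miss form of leakage is preprocessing computed over the whole dataset before splitting. Here's a small sketch of that pitfall and one way around it, reusing the `X_train`/`X_test` splits from the earlier sketch; the scaler-plus-logistic-regression pairing is just an illustrative choice, not the lesson's.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# LEAKY: scaling X before splitting lets test-set statistics (mean/std)
# influence the data the model trains on.
# scaler = StandardScaler().fit(X)   # sees the test rows too!
# X_scaled = scaler.transform(X)

# SAFER: a pipeline learns the scaling from the training data only,
# then applies that same transform to the test data at evaluation time.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)          # preprocessing fitted on train only
print(f"Test accuracy: {model.score(X_test, y_test):.3f}")
```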
To combat underfitting, you can use a more advanced model. This could mean a totally different model, or increasing the number of hyperparameters on your current model. Remember when we were cooking our chicken dish, how we might alter one of the hyperparameters of our oven by turning it up? That might be something you do with a machine learning model: instead of only using two layers in a neural network, you might use four. We'll see more of this in a future project.

You could also reduce the number of features you're trying to model. Maybe your data has too many features and the model you're using is struggling to find patterns in them.

Finally, you could train your model for longer. Sometimes models take longer to train, or longer to learn, than you'd expect, so one of your experiments may involve a longer training phase.

To reduce overfitting, useful solutions are to collect more data. More data will provide more potential patterns for a model to find, and thus lower the chance of it simply memorizing them all. Or you could try using a less advanced model. This is uncommon, but it's a possibility: the model you're using is too good at learning, and it models your data too well. Be cautious of models performing too well, as they might lead to incorrect predictions. Remember, no model is perfect, so be sure to check your good results as much as you check your poor results.

Finally, when comparing two different models to each other, it's important to ensure you're comparing apples with apples and oranges with oranges: for example, Model 2 trained on dataset 1 versus Model 3 trained on dataset 1. During comparison you'll want to make sure you take into account not only the final result but what it took to get there. If Model 2 takes 1 second to make a prediction at 93.1 percent accuracy, and Model 3 takes 4 seconds to make a prediction at 94.7 percent accuracy, is that extra 1.6 percent accuracy worth the extra 3 seconds of prediction time? This will depend on what your goal is, but if you're optimizing for prediction time and want to make predictions as fast as possible, you might choose Model 2 because it makes predictions 4 times faster than Model 3, even though Model 3 has a higher accuracy.
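Here's a rough sketch of what that kind of side-by-side comparison could look like in code, again reusing the splits from the first sketch. The two candidate models (a single decision tree and a larger random forest) are hypothetical stand-ins for the lesson's Model 2 and Model 3; the point is measuring prediction time alongside accuracy on the same test set.

```python
import time

from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

def evaluate(model, X, y):
    """Return (accuracy, seconds taken to predict) for a fitted model."""
    start = time.perf_counter()
    preds = model.predict(X)
    elapsed = time.perf_counter() - start
    return (preds == y).mean(), elapsed

# Two candidates trained on the SAME training data (apples to apples)
candidates = {
    "Model 2 (single tree)":     DecisionTreeClassifier(random_state=42),
    "Model 3 (500-tree forest)": RandomForestClassifier(n_estimators=500, random_state=42),
}

for name, candidate in candidates.items():
    candidate.fit(X_train, y_train)
    acc, secs = evaluate(candidate, X_test, y_test)
    print(f"{name}: {acc:.1%} test accuracy, {secs:.4f}s to predict")

# Whether extra accuracy justifies extra prediction time depends on your use case.
```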
Again, this will be different depending on what kind of application or production use case you have. It's just something to keep in mind: there's more to choosing a model than just how well it performs.

A couple of things you want to remember from this lesson.

Avoid overfitting and underfitting: you want a model that heads towards generality. It's like when you do your practice exam. If you saw the final exam beforehand, you might just become an expert memorization machine rather than someone who could use that knowledge in the real world.

Keep the test set separate at all costs. When you split your data, create your training set, then lock the test dataset away. Once your model has been trained, you can unlock it, take it out of the safe, and see how your model performs.

When comparing models, compare apples to apples: Model 1 on dataset 1 versus Model 2 on dataset 1. You want to make sure the two models you're comparing have been created in the same sort of environment, so that you can ensure you're making a legitimate comparison.

Finally, one best performance metric does not equal the best model. Remember our example: you might be optimizing for prediction time. So although a model that makes faster predictions doesn't get as high an accuracy as another model that takes a little bit longer, that might not matter, because you need something that can predict as fast as possible.

That was a lot, but we'll see plenty more of this in action throughout the course. You'll also be using it throughout your entire machine learning career, so it's important to remember these concepts. Let's push on to the next lesson and see how we can put all of the previous steps together in Step 6: experimentation.