1 00:00:00,300 --> 00:00:01,333 But according to you, 2 00:00:01,333 --> 00:00:03,000 is it going to be the best model? 3 00:00:03,000 --> 00:00:03,766 Well, that's what 4 00:00:03,766 --> 00:00:08,700 we're going to figure out very quickly, because then we're going to run 5 00:00:09,000 --> 00:00:12,166 all the cells here for the polynomial regression. 6 00:00:12,166 --> 00:00:14,266 There we go. As you can see, it's going very fast. 7 00:00:14,266 --> 00:00:19,700 And, well, for the polynomial regression model with, remember, a degree of four, 8 00:00:19,900 --> 00:00:25,166 we get a final R-squared coefficient of 0.9458. 9 00:00:25,200 --> 00:00:28,866 So we've already built our multiple linear regression model, 10 00:00:28,866 --> 00:00:32,533 and therefore so far the best one is polynomial regression. 11 00:00:32,533 --> 00:00:36,300 Feel free to actually test some other degrees if you want. 12 00:00:36,600 --> 00:00:39,366 But now we're going to move on to support vector regression 13 00:00:39,366 --> 00:00:43,100 and see if it's going to beat that polynomial regression. 14 00:00:43,100 --> 00:00:44,666 I actually really like this model. 15 00:00:44,666 --> 00:00:47,066 And usually I get the best results with this. 16 00:00:47,066 --> 00:00:47,966 But let's see. 17 00:00:47,966 --> 00:00:53,800 Let's run all the cells, and in a flash we'll get the final performance. 18 00:00:54,033 --> 00:00:55,366 No, not yet. Not yet, actually. 19 00:00:55,366 --> 00:01:00,800 And by a very, very small margin, it does indeed beat the polynomial regression model. 20 00:01:00,800 --> 00:01:03,000 You know, 0.9480. 21 00:01:03,000 --> 00:01:06,666 And here we actually had 0.9458. 22 00:01:06,866 --> 00:01:10,233 So, so far the best model is the support vector regression. 23 00:01:10,233 --> 00:01:13,033 Can you see how super fast we're going here? Right. 24 00:01:13,033 --> 00:01:15,900 We only had to change the name of the data set here. 
25 00:01:15,900 --> 00:01:17,966 And all the rest is automatic. 26 00:01:17,966 --> 00:01:20,366 And that's the beauty of code templates. 27 00:01:20,366 --> 00:01:23,366 Now let's move on to the next one: decision tree regression. 28 00:01:23,400 --> 00:01:26,466 And let's see if it can beat the support vector regression, 29 00:01:26,466 --> 00:01:29,700 which, remember, has 0.948. 30 00:01:30,000 --> 00:01:30,833 So let's see. 31 00:01:30,833 --> 00:01:34,433 Let's run, you know, all the cells. 32 00:01:34,866 --> 00:01:40,100 And the final result is 0.922. Okay. 33 00:01:40,100 --> 00:01:41,600 So that's actually the worst. 34 00:01:41,600 --> 00:01:45,466 I think it is indeed worse than the multiple linear regression. Yes. 35 00:01:45,733 --> 00:01:48,733 So the decision tree regression model did not perform well here. 36 00:01:48,733 --> 00:01:49,900 But maybe that's 37 00:01:49,900 --> 00:01:53,866 because it didn't have enough team spirit to tackle these predictions. 38 00:01:53,866 --> 00:01:57,000 And that's what we're going to figure out with random forest regression. 39 00:01:57,000 --> 00:02:01,033 Because indeed, a random forest regression model is a bunch of trees 40 00:02:01,033 --> 00:02:04,033 teaming up to return an ultimate prediction. 41 00:02:04,200 --> 00:02:09,000 So now the ultimate question is, do you think that the final winner 42 00:02:09,333 --> 00:02:12,900 of this competition is going to be the support vector 43 00:02:12,900 --> 00:02:16,100 regression model, or the random forest regression model? 44 00:02:16,300 --> 00:02:19,200 Take your bet and let's see if you're right. 45 00:02:19,200 --> 00:02:19,900 So there we go. 46 00:02:19,900 --> 00:02:23,766 We're about to find out in a second, because we're going to click Run all now. 
47 00:02:23,966 --> 00:02:28,500 And we're going to get the final, final result with a final R-squared 48 00:02:28,500 --> 00:02:33,333 coefficient of actually 0.96, which therefore makes the random forest 49 00:02:33,333 --> 00:02:37,066 regression the big winner of this data competition. 50 00:02:37,433 --> 00:02:38,700 So congratulations. 51 00:02:38,700 --> 00:02:39,833 In just a few seconds, 52 00:02:39,833 --> 00:02:44,200 you were able to quickly identify and select the best regression model. 53 00:02:44,433 --> 00:02:45,333 And mostly, you know, 54 00:02:45,333 --> 00:02:49,266 the most important thing in all of this is to finally know how to answer 55 00:02:49,266 --> 00:02:54,000 this very frequently asked question: how do I select the best model? 56 00:02:54,000 --> 00:02:58,800 And, well, the answer is: you simply try all of them and, using the R-squared 57 00:02:58,800 --> 00:03:03,533 coefficient, you compare them and conclude which one is the best. 58 00:03:03,533 --> 00:03:06,533 And in our situation here, you know, for this data set, 59 00:03:06,600 --> 00:03:09,600 well, the best is the random forest regression. 60 00:03:09,966 --> 00:03:10,633 All right. 61 00:03:10,633 --> 00:03:12,233 So congratulations. 62 00:03:12,233 --> 00:03:13,866 Now you know a lot. 63 00:03:13,866 --> 00:03:17,100 You know, you have expertise in regression models. 64 00:03:17,300 --> 00:03:20,533 You not only know how to build them, but you also have all these cool templates 65 00:03:20,700 --> 00:03:24,600 which you can use very efficiently to select the best machine learning model 66 00:03:24,600 --> 00:03:28,533 for your regression problem, and that for any data set. 67 00:03:28,800 --> 00:03:32,466 And remember, if your data set has missing data or categorical data, 68 00:03:32,700 --> 00:03:36,766 well, you only have to grab your tools in the data preprocessing toolkit 69 00:03:36,900 --> 00:03:38,533 to handle these situations. 
70 00:03:38,533 --> 00:03:41,533 And then you can deploy your code templates. 71 00:03:41,700 --> 00:03:42,333 All right. 72 00:03:42,333 --> 00:03:44,433 So I'm super excited and super happy 73 00:03:44,433 --> 00:03:49,100 now that we've completed 100% of this part two on regression. 74 00:03:49,300 --> 00:03:53,700 And now, my friends, we're going to move on to a brand new branch of machine learning, 75 00:03:53,900 --> 00:03:56,333 which is classification, and where we're going to do 76 00:03:56,333 --> 00:04:00,200 exactly the same, but this time to predict a category. 77 00:04:00,466 --> 00:04:03,466 And same as what we did for this regression branch, 78 00:04:03,533 --> 00:04:06,300 well, we will build several classification models. 79 00:04:06,300 --> 00:04:09,666 And at the end I'll show you again how to select the best one. 80 00:04:09,966 --> 00:04:12,866 So I can't wait to see you in this part three. 81 00:04:12,866 --> 00:04:14,800 And until then, enjoy machine learning.
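
[Editor's sketch] The selection rule described in the video (try every model, compare the R-squared coefficients, keep the highest) could be written roughly as follows. This is a minimal plain-Python illustration, not the course's actual code templates: the helper names `r_squared` and `select_best` are hypothetical, and the scores are simply the values quoted aloud in the video. Multiple linear regression is left out because its score is not stated in this part.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1 - ss_res / ss_tot


def select_best(scores):
    """Return the (model name, score) pair with the highest R-squared."""
    return max(scores.items(), key=lambda item: item[1])


# R-squared values as quoted in the video for this data set.
# Multiple linear regression is omitted: its score is not given here.
reported_scores = {
    "Polynomial regression (degree 4)": 0.9458,
    "Support vector regression": 0.9480,
    "Decision tree regression": 0.922,
    "Random forest regression": 0.96,
}

best_name, best_score = select_best(reported_scores)
print(f"Best model: {best_name} (R-squared = {best_score})")
```

In practice each score would come from evaluating a fitted model on the test set, for instance by passing its predictions to a function like `r_squared` above. A ranking obtained this way holds only for the data set at hand.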