1 00:00:00,266 --> 00:00:03,300 So now let's move on to the next code template. 2 00:00:03,500 --> 00:00:04,933 Polynomial regression. 3 00:00:04,933 --> 00:00:08,800 Well, here it's the same, you know, the same data preprocessing phase 4 00:00:08,800 --> 00:00:11,833 with first importing the libraries, then importing the data set, 5 00:00:11,833 --> 00:00:14,833 where you only have to enter the name of your data set here, 6 00:00:14,866 --> 00:00:17,866 and then splitting the data set into the training set and test set. 7 00:00:18,200 --> 00:00:22,066 Then of course, we train the polynomial regression model on the training set. 8 00:00:22,533 --> 00:00:25,466 So that's exactly like what we did in this part two. 9 00:00:25,466 --> 00:00:28,200 You know, when we built it, you recognize degree equals four. 10 00:00:28,200 --> 00:00:30,266 You know, that's exactly the same code. 11 00:00:30,266 --> 00:00:34,766 Then we predict some test results just to compare our predictions 12 00:00:34,766 --> 00:00:36,000 and the real results. 13 00:00:36,000 --> 00:00:39,000 And finally we will evaluate the model performance. 14 00:00:39,200 --> 00:00:41,833 And I will reveal very soon how to do that. 15 00:00:41,833 --> 00:00:42,166 Okay. 16 00:00:42,166 --> 00:00:44,133 So that's for polynomial regression. 17 00:00:44,133 --> 00:00:45,933 Once again, very generic. 18 00:00:45,933 --> 00:00:48,600 You just have to enter here the name of your data set. 19 00:00:48,600 --> 00:00:51,566 And then this code template is ready to be deployed. 20 00:00:51,566 --> 00:00:54,100 All right, then support vector regression. 21 00:00:54,100 --> 00:00:55,533 So here that's the same. 22 00:00:55,533 --> 00:00:59,433 First the data preprocessing phase where we import the libraries. 23 00:00:59,433 --> 00:01:01,066 Then we import the data set. 
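As a rough sketch of the polynomial regression template described above, the steps might look like the following. It assumes scikit-learn and NumPy, which the transcript does not name explicitly, and it uses a synthetic data set in place of the file whose name you would enter in the template. Only `degree=4` comes from the transcript; the variable names and the split ratio are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the real data set (assumption): one feature, one target.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(200, 1))
y = 1.0 + 0.5 * X[:, 0] - X[:, 0] ** 2 + 0.3 * X[:, 0] ** 4 + rng.normal(0.0, 0.1, 200)

# Split into the training set and test set (20% held out is an assumption).
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Expand the features to polynomial terms of degree 4, then fit a linear model on them.
poly = PolynomialFeatures(degree=4)
regressor = LinearRegression()
regressor.fit(poly.fit_transform(X_train), y_train)

# Predict the test results and put them next to the real results for comparison.
y_pred = regressor.predict(poly.transform(X_test))
comparison = np.column_stack((y_pred, y_test))
```

Each row of `comparison` holds a prediction in the first column and the real value in the second, which is the side-by-side check mentioned in the transcript.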
24 00:01:01,066 --> 00:01:02,933 But then remember we have to reshape 25 00:01:02,933 --> 00:01:06,000 our dependent variable vector y because we have to feature 26 00:01:06,000 --> 00:01:07,966 scale it. Right. Because we are doing regression. 27 00:01:07,966 --> 00:01:11,733 So the dependent variable vector has continuous numerical values. 28 00:01:11,933 --> 00:01:15,600 And therefore for SVR we need to scale the dependent variable vector. 29 00:01:15,900 --> 00:01:19,666 That's exactly the same as what we saw together when building the SVR model. 30 00:01:20,100 --> 00:01:21,400 Then I added this. 31 00:01:21,400 --> 00:01:24,833 Of course, in order to split the data set into the training set and test set 32 00:01:24,866 --> 00:01:26,533 so that we can indeed evaluate 33 00:01:26,533 --> 00:01:30,300 the performance of SVR and compare it to the other models. 34 00:01:30,800 --> 00:01:34,800 Then of course, we have feature scaling, compulsory for the SVR, 35 00:01:35,033 --> 00:01:37,200 with, remember, our two scalers, one 36 00:01:37,200 --> 00:01:40,200 for the matrix of features and one for the dependent variable vector. 37 00:01:40,366 --> 00:01:43,966 Then we train, of course, the SVR model on the training set. 38 00:01:44,133 --> 00:01:46,566 You know this very well. We did it together. 39 00:01:46,566 --> 00:01:49,766 Then we predict the test results just to compare and have an idea 40 00:01:49,766 --> 00:01:52,766 of how good the predictions on new observations are. 41 00:01:52,933 --> 00:01:57,533 And finally we will evaluate the model performance with R squared. 42 00:01:58,033 --> 00:01:58,666 No worries. 43 00:01:58,666 --> 00:02:00,633 We'll get to that very, very soon. 44 00:02:00,633 --> 00:02:04,166 So that's for the SVR. Then for decision tree regression. 45 00:02:04,166 --> 00:02:05,700 Well, exactly the same. 46 00:02:05,700 --> 00:02:09,966 You know, the data preprocessing phase first, with no feature scaling, right? 
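The SVR steps listed above (reshape y, split, scale with two scalers, train, predict) can be sketched roughly as follows. This assumes scikit-learn and NumPy, a synthetic data set instead of the real file, and an RBF kernel; the kernel and the variable names are assumptions, not taken from the transcript.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for the real data set (assumption): two features, one target.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(200, 2))
y = 50.0 + 20.0 * np.sin(X[:, 0]) + 3.0 * X[:, 1] + rng.normal(0.0, 1.0, 200)

# Reshape y into a column, because StandardScaler expects a 2D array.
y = y.reshape(-1, 1)

# Split into the training set and test set (20% held out is an assumption).
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Two scalers: one for the matrix of features, one for the dependent variable.
sc_X = StandardScaler()
sc_y = StandardScaler()
X_train_scaled = sc_X.fit_transform(X_train)
y_train_scaled = sc_y.fit_transform(y_train)

# Train the SVR on the scaled training set (SVR.fit expects a 1D target).
regressor = SVR(kernel="rbf")
regressor.fit(X_train_scaled, y_train_scaled.ravel())

# Predict on the scaled test features, then map predictions back to the original scale.
y_pred_scaled = regressor.predict(sc_X.transform(X_test))
y_pred = sc_y.inverse_transform(y_pred_scaled.reshape(-1, 1))
```

Both scalers are fitted on the training set only and then reused on the test set, so nothing about the test set leaks into the scaling. The inverse transform at the end puts `y_pred` back in the same units as `y_test`, which makes the two directly comparable.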
47 00:02:09,966 --> 00:02:12,700 Remember we don't need feature scaling for decision trees. 48 00:02:12,700 --> 00:02:15,800 So once again we only have to change the name of the data set here. 49 00:02:15,800 --> 00:02:18,800 Then we split the data set into the training set and test set. 50 00:02:18,866 --> 00:02:21,866 Then we train the decision tree regression model on the training set, 51 00:02:22,000 --> 00:02:24,933 exactly the same as we did in our implementation 52 00:02:24,933 --> 00:02:26,433 when we built it together. 53 00:02:26,433 --> 00:02:28,200 Then we predict the test results 54 00:02:28,200 --> 00:02:32,200 in order to compare our predictions to the real results in y_test. 55 00:02:32,200 --> 00:02:35,100 And that's in order to have a first idea of the performance. 56 00:02:35,100 --> 00:02:38,700 And then of course, we will evaluate the model performance with R squared. 57 00:02:39,000 --> 00:02:42,300 And finally, for random forest regression, we have the exact same data preprocessing 58 00:02:42,300 --> 00:02:45,300 phase where you only have to enter the name of your data set here. 59 00:02:45,366 --> 00:02:48,633 And then we train the random forest regression model on the training 60 00:02:48,633 --> 00:02:52,400 set with the exact same implementation as how we did it together. 61 00:02:52,633 --> 00:02:56,366 Then we predict the test results in order to get a first idea of the performance. 62 00:02:56,533 --> 00:02:59,533 And finally we evaluate the model performance. 63 00:02:59,766 --> 00:03:04,300 All right, so as I told you, you have purely generic code templates 64 00:03:04,300 --> 00:03:07,233 which you can deploy for any of your future data sets 65 00:03:07,233 --> 00:03:10,666 as long as they have first the features and last the dependent variable. 66 00:03:10,800 --> 00:03:13,800 And as long as they don't have missing data or categorical data, 67 00:03:13,800 --> 00:03:15,133 in which case it's still fine. 
68 00:03:15,133 --> 00:03:18,633 You can use your data preprocessing toolkit, but there you go. 69 00:03:18,666 --> 00:03:22,400 You have this code template, and now I'm going to show you how to evaluate 70 00:03:22,400 --> 00:03:25,566 your regression models using the R-squared coefficient. 71 00:03:26,300 --> 00:03:28,033 All right. So let's start with R squared. 72 00:03:28,033 --> 00:03:29,133 You know, that 73 00:03:29,133 --> 00:03:33,600 final cell in each of the implementations, evaluating the model performance. 74 00:03:33,833 --> 00:03:36,266 Let's see how we're going to do this. 75 00:03:36,266 --> 00:03:41,300 Well, as I also want to train you on how to be independent in machine learning, 76 00:03:41,433 --> 00:03:45,033 we're going to pretend once again that I actually have no idea how 77 00:03:45,033 --> 00:03:48,800 to evaluate the model performance of regression models, and therefore that 78 00:03:48,800 --> 00:03:52,833 I have to go to the documentation online to figure out how to do it. 79 00:03:52,833 --> 00:03:53,500 All right. 80 00:03:53,500 --> 00:03:57,900 I'm just training you to be independent and quickly find information 81 00:03:57,900 --> 00:03:58,966 whenever you need it.
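For reference while following along, here is a minimal sketch of what the R-squared coefficient computes, written in plain Python so that the formula is visible: one minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the sample values are hypothetical. The transcript has not yet shown which library function the templates use; scikit-learn does provide `sklearn.metrics.r2_score(y_true, y_pred)` for this computation.

```python
def r_squared(y_true, y_pred):
    """Return R^2 = 1 - SS_res / SS_tot for two equal-length sequences."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: squared errors of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared errors of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical real results and three sets of predictions.
y_test = [3.0, 5.0, 7.0, 9.0]
perfect = r_squared(y_test, y_test)                 # 1.0: predictions match exactly
baseline = r_squared(y_test, [6.0, 6.0, 6.0, 6.0])  # 0.0: no better than the mean
close = r_squared(y_test, [3.5, 4.5, 7.5, 8.5])     # about 0.95
```

A value near 1 means the predictions explain most of the variation in the real results, and a value near 0 means the model does no better than always predicting the mean. Because each template ends with the same evaluation step on the same kind of test set, the R-squared values of the different regression models can be compared directly.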