1 00:00:00,300 --> 00:00:02,700 Hello my friends, welcome to this new tutorial. 2 00:00:02,700 --> 00:00:06,933 And now we're about to train the SVR model on the whole data set, 3 00:00:07,200 --> 00:00:10,300 after a successful data preprocessing phase 4 00:00:10,300 --> 00:00:14,066 with your new experience in feature scaling, 5 00:00:14,066 --> 00:00:17,700 because now you can handle all the different situations very well. 6 00:00:18,000 --> 00:00:18,366 All right. 7 00:00:18,366 --> 00:00:22,000 So let's train that SVR model on the whole data set, 8 00:00:22,000 --> 00:00:22,400 of course, 9 00:00:22,400 --> 00:00:26,933 because we did not split the data set into training and test sets. 10 00:00:27,166 --> 00:00:28,000 So there you go. 11 00:00:28,000 --> 00:00:30,066 Let's create a new code cell. 12 00:00:30,066 --> 00:00:35,366 And now let's build the SVR model, of course in the most efficient way. 13 00:00:35,366 --> 00:00:37,266 So we're going to build it with scikit-learn. 14 00:00:37,266 --> 00:00:41,266 Of course, I remind you that scikit-learn is the best data science library 15 00:00:41,266 --> 00:00:45,000 excluding deep learning, because of course we have TensorFlow and PyTorch. 16 00:00:45,000 --> 00:00:48,900 But for any machine learning model that is not based on neural networks, 17 00:00:49,066 --> 00:00:53,133 well, scikit-learn is definitely by far the best data science library. 18 00:00:53,133 --> 00:00:56,600 And for what we are interested in right now, you know, the SVR model, 19 00:00:56,700 --> 00:01:01,766 well, we will build it with a class called SVR, as simple as that, 20 00:01:01,933 --> 00:01:05,633 which belongs to a module of scikit-learn called svm. 21 00:01:05,833 --> 00:01:08,700 And so we have to start here with from, 22 00:01:08,700 --> 00:01:13,066 first sklearn, to access the scikit-learn library, from which 23 00:01:13,233 --> 00:01:17,400 we're going to get access to that svm module.
24 00:01:17,400 --> 00:01:23,500 That's the module from which we're going to import that SVR class. 25 00:01:23,500 --> 00:01:24,500 Perfect. 26 00:01:24,500 --> 00:01:25,833 Now we have the class. 27 00:01:25,833 --> 00:01:28,833 And you of course know the next natural step. 28 00:01:28,833 --> 00:01:31,833 Any time you import a class, the next natural step is, of course, 29 00:01:32,000 --> 00:01:36,066 to create an object or an instance of this class. 30 00:01:36,266 --> 00:01:39,000 And we're going to call that object regressor, 31 00:01:39,000 --> 00:01:40,233 just like before, 32 00:01:40,233 --> 00:01:44,400 because this instance or object of the SVR class 33 00:01:44,400 --> 00:01:48,600 is nothing else than the SVR regressor itself. 34 00:01:48,600 --> 00:01:51,900 You know, the support vector regression regressor. 35 00:01:52,266 --> 00:01:53,200 So regressor. 36 00:01:53,200 --> 00:01:56,633 And then of course we are going to call this class SVR. 37 00:01:56,800 --> 00:01:58,066 Add some parentheses. 38 00:01:58,066 --> 00:02:01,400 And this time we have to input a parameter, 39 00:02:01,833 --> 00:02:06,066 because indeed, remember from the intuition lecture on support vector regression, 40 00:02:06,066 --> 00:02:09,766 and you will also see it in the intuition lecture on the support 41 00:02:09,766 --> 00:02:12,766 vector machine in the next part, part three, classification, 42 00:02:12,933 --> 00:02:15,633 well, you actually have what we call kernels, 43 00:02:15,633 --> 00:02:19,900 which can learn either linear relationships, and that's the linear 44 00:02:19,900 --> 00:02:23,433 kernel, or non-linear relationships in your data set, 45 00:02:23,666 --> 00:02:27,966 and those are the non-linear kernels, such as the RBF, radial basis function. 46 00:02:27,966 --> 00:02:30,366 I will actually show it to you right away. 47 00:02:30,366 --> 00:02:35,300 You know, that's the Gaussian RBF kernel which is given by this formula.
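
The formula shown on screen at this point is not captured in the transcript. As a rough sketch, assuming the gamma parameterization that scikit-learn uses for its RBF kernel, K(x, z) = exp(-gamma * ||x - z||^2), the kernel value for two points could be computed like this. The function name rbf_kernel and the default gamma value are illustrative assumptions, not something from the lecture.

```python
import math


def rbf_kernel(x, z, gamma=1.0):
    """Gaussian RBF kernel value between two points given as sequences of floats.

    Hypothetical helper for illustration; gamma=1.0 is an arbitrary default.
    """
    # Squared Euclidean distance between x and z
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    # The value is 1 for identical points and decays toward 0 as they move apart
    return math.exp(-gamma * sq_dist)


same_point = rbf_kernel([2.0, 3.0], [2.0, 3.0])
far_apart = rbf_kernel([0.0, 0.0], [3.0, 4.0])
```

Identical points give the maximum value of 1, and the value shrinks as the distance grows, which is the bell shape seen in the plot of the kernel.
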
48 00:02:35,300 --> 00:02:38,300 And I can actually show you the plot of that kernel. 49 00:02:38,300 --> 00:02:39,133 Here it is. 50 00:02:39,133 --> 00:02:41,366 So that's the Gaussian RBF kernel. 51 00:02:41,366 --> 00:02:44,500 And then you also have some other kernels, which I've prepared 52 00:02:44,500 --> 00:02:48,333 here on this great website, showing clearly the different kernels 53 00:02:48,333 --> 00:02:52,333 of the support vector machine, but also the SVR, because SVR 54 00:02:52,333 --> 00:02:56,366 is nothing else than a support vector machine model for regression. 55 00:02:56,700 --> 00:02:58,033 So let's see, here they are. 56 00:02:58,033 --> 00:03:01,466 There is the polynomial kernel, adapted for a non-linear data set. 57 00:03:01,566 --> 00:03:04,800 You have the Gaussian kernel, which is a classic Gaussian function. 58 00:03:05,033 --> 00:03:06,900 Then the Gaussian radial basis function, 59 00:03:06,900 --> 00:03:08,833 the most widely used one. 60 00:03:08,833 --> 00:03:11,000 And that's actually the one we will use. 61 00:03:11,000 --> 00:03:14,266 Then you have the Laplace one, the hyperbolic tangent kernel as well. 62 00:03:14,266 --> 00:03:15,933 That's a popular one. The sigmoid one. 63 00:03:15,933 --> 00:03:18,300 Well, you have all of them here. 64 00:03:18,300 --> 00:03:21,000 And so if you're curious, yes, you can have a look at them. 65 00:03:21,000 --> 00:03:25,500 But the one we will use for our implementation, 66 00:03:25,500 --> 00:03:29,500 and that's the one I recommend each time you experiment with an SVR model, 67 00:03:29,633 --> 00:03:33,466 is the radial basis function kernel, the RBF kernel. 68 00:03:33,733 --> 00:03:36,933 And that's exactly what we have to input here in our parameters. 69 00:03:37,200 --> 00:03:39,700 And so the name of that parameter is kernel. 70 00:03:39,700 --> 00:03:41,400 So that's the name of the parameter.
71 00:03:41,400 --> 00:03:44,400 And then the value we want for that parameter, which, you know, 72 00:03:44,433 --> 00:03:49,933 corresponds to the radial basis function, has the code name, in quotes, 'rbf'. 73 00:03:50,133 --> 00:03:51,000 And that's it. 74 00:03:51,000 --> 00:03:53,766 That's all we have to input here. 75 00:03:53,766 --> 00:03:58,866 So that basically creates the SVR model with the radial basis function kernel. 76 00:03:59,100 --> 00:04:00,866 And so now that means only one thing. 77 00:04:00,866 --> 00:04:03,833 That means that we have already built the model. 78 00:04:03,833 --> 00:04:06,000 We already have the SVR model itself. 79 00:04:06,000 --> 00:04:09,133 And so now the final natural step is, of course, to 80 00:04:09,133 --> 00:04:13,000 train that regressor on the whole data set, 81 00:04:13,000 --> 00:04:15,100 right, which is also the training set. Okay. 82 00:04:15,100 --> 00:04:18,000 So let's do this. You know exactly how to do this. 83 00:04:18,000 --> 00:04:20,700 You know, from this point it's exactly the same as before. 84 00:04:20,700 --> 00:04:24,366 You know, the same method, which is of course the fit method, 85 00:04:24,633 --> 00:04:29,633 and which will train this SVR model on your whole data set, 86 00:04:29,633 --> 00:04:33,733 right, to learn the correlations between the position levels and the salaries. 87 00:04:33,966 --> 00:04:36,966 And all this is done with the radial basis function kernel. 88 00:04:37,000 --> 00:04:37,333 All right. 89 00:04:37,333 --> 00:04:40,266 So let's do this. Let's train our regressor. 90 00:04:40,266 --> 00:04:42,300 So we take our regressor object first. 91 00:04:42,300 --> 00:04:44,633 And as usual we add here a dot. 92 00:04:44,633 --> 00:04:49,666 And then the fit method, which takes as input this time not X_train 93 00:04:49,666 --> 00:04:53,100 and y_train, because we have not created a separate training set.
94 00:04:53,266 --> 00:04:57,933 But of course this time the whole data set, which, remember, was feature scaled, 95 00:04:57,933 --> 00:05:03,233 you know, both on the matrix of features X part and the dependent variable y part. 96 00:05:03,500 --> 00:05:08,400 And so that's exactly what we have to input here, both X and y. 97 00:05:08,600 --> 00:05:10,733 And that's it. Congratulations. 98 00:05:10,733 --> 00:05:14,400 Now you know how to build and train an SVR model 99 00:05:14,400 --> 00:05:19,333 after a successful data preprocessing phase, including special feature scaling. 100 00:05:20,000 --> 00:05:20,566 All right. 101 00:05:20,566 --> 00:05:24,300 And so now we have a very interesting next step. 102 00:05:24,566 --> 00:05:27,466 It is the step where we predict the new result. 103 00:05:27,466 --> 00:05:30,466 And this next step is very interesting because it will teach you 104 00:05:30,500 --> 00:05:34,533 how to reverse the scaling of your prediction. 105 00:05:34,533 --> 00:05:38,800 Because you will see that when we apply the predict method to predict 106 00:05:38,800 --> 00:05:43,800 this new result, well, it will be returned in the scale that was used for y. 107 00:05:43,800 --> 00:05:44,266 Right? 108 00:05:44,266 --> 00:05:47,100 The new scale of y after it was transformed. 109 00:05:47,100 --> 00:05:49,600 And so we will have to reverse this transformation. 110 00:05:49,600 --> 00:05:53,900 We'll have to reverse the scaling in order to get the original scale of y. 111 00:05:54,033 --> 00:05:57,200 And I will teach you exactly how to do it in the next tutorial.
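
The steps described in this tutorial (import the SVR class, create the regressor with the RBF kernel, fit it on the scaled data) can be sketched roughly as follows. This is a minimal sketch under stated assumptions: the data values are made up for illustration and are not the course data set, the scaler names sc_X and sc_y are assumed, and y is flattened with ravel() because scikit-learn's fit expects a one-dimensional target.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Hypothetical data: ten position levels and made-up salaries (not the course data set)
X = np.arange(1, 11, dtype=float).reshape(-1, 1)
y = np.array([40, 45, 55, 70, 95, 130, 180, 260, 450, 900], dtype=float).reshape(-1, 1) * 1000

# Feature scaling on both X and y, each with its own scaler (names are assumptions)
sc_X = StandardScaler()
sc_y = StandardScaler()
X_scaled = sc_X.fit_transform(X)
y_scaled = sc_y.fit_transform(y)

# Create the SVR regressor with the radial basis function kernel
regressor = SVR(kernel='rbf')

# Train on the whole scaled data set; ravel() turns the (n, 1) target into shape (n,)
regressor.fit(X_scaled, y_scaled.ravel())
```

Predictions from this regressor come back in the scaled units of y, which is why the scaling has to be reversed afterwards.
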