1
00:00:00,300 --> 00:00:02,733
Hello and welcome to this R tutorial.

2
00:00:02,733 --> 00:00:07,066
In the previous tutorial there were a lot of interesting things to learn about R

3
00:00:07,066 --> 00:00:11,033
and what it can do to provide insightful information about your model.

4
00:00:11,500 --> 00:00:15,900
And now the only thing left to do is to predict the test set results.

5
00:00:16,300 --> 00:00:19,300
As for simple linear regression, it will be very quick.

6
00:00:19,366 --> 00:00:22,066
We will just need one line for this and it's going to be

7
00:00:23,066 --> 00:00:23,766
y_pred.

8
00:00:23,766 --> 00:00:26,833
I'm introducing the vector of predictions here, y_pred.

9
00:00:27,833 --> 00:00:30,833
And then I'm going to use the predict function.

10
00:00:31,633 --> 00:00:34,200
And in the predict function I will just use two arguments.

11
00:00:34,200 --> 00:00:36,233
One is the regressor.

12
00:00:36,233 --> 00:00:38,733
You have to specify with which

13
00:00:38,733 --> 00:00:41,733
regressor you want to predict your test set results.

14
00:00:41,966 --> 00:00:44,000
And of course it's with our multiple linear

15
00:00:44,000 --> 00:00:47,000
regression regressor that we defined here.

16
00:00:47,833 --> 00:00:50,433
And the second argument is newdata,

17
00:00:50,433 --> 00:00:53,433
which is the set of the new observations

18
00:00:53,433 --> 00:00:56,433
for which you want to predict the result, that is, the profit.

19
00:00:56,900 --> 00:00:59,600
And so of course this newdata

20
00:00:59,600 --> 00:01:02,600
is going to be the test set.

21
00:01:03,566 --> 00:01:04,866
Don't worry if so far

22
00:01:04,866 --> 00:01:07,866
we are only making predictions about the test set.

23
00:01:07,933 --> 00:01:11,366
You will see that in the next sections we will be making predictions

24
00:01:11,366 --> 00:01:12,966
for new observations,

25
00:01:12,966 --> 00:01:16,500
which won't be the test set observations, but some single observations.
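The one-line prediction described above can be sketched as follows. This is a minimal, self-contained illustration rather than the tutorial's actual script: the startup dataset is not reproduced in this transcript, so the data frame below is synthetic, and its column names (`R.D.Spend`, `Administration`, `Marketing.Spend`, `Profit`), its 50 rows, and the random seed are all assumptions. Only `regressor`, `y_pred`, `test_set`, and the `newdata` argument of `predict` come from the narration.

```r
# Synthetic stand-in data (assumption): column names and values are
# hypothetical and only mimic a startup profit dataset.
set.seed(123)
n <- 50
dataset <- data.frame(
  R.D.Spend       = runif(n, 0, 165000),
  Administration  = runif(n, 50000, 180000),
  Marketing.Spend = runif(n, 0, 470000)
)
dataset$Profit <- 50000 + 0.8 * dataset$R.D.Spend +
  0.03 * dataset$Marketing.Spend + rnorm(n, sd = 9000)

# Hold out 10 observations as the test set (the split size is an assumption
# chosen to match the ten predictions mentioned in the tutorial).
test_idx     <- sample(seq_len(n), size = 10)
training_set <- dataset[-test_idx, ]
test_set     <- dataset[test_idx, ]

# Fit the multiple linear regression on the training set,
# using all other columns as independent variables.
regressor <- lm(Profit ~ ., data = training_set)

# Predict the test set results: one line, two arguments.
y_pred <- predict(regressor, newdata = test_set)
print(y_pred)
```

Typing `y_pred` in the console prints a named numeric vector with one predicted profit per test set row, where the names are the row labels of `test_set`.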
26
00:01:16,800 --> 00:01:19,533
You will understand why we will do this, and you will see that

27
00:01:19,533 --> 00:01:22,666
the purpose of doing that is actually going to be pretty fun.

28
00:01:23,466 --> 00:01:25,566
So, okay, that's ready.

29
00:01:25,566 --> 00:01:26,900
The predict function is ready.

30
00:01:26,900 --> 00:01:29,900
It has its two arguments, the regressor and the newdata.

31
00:01:29,900 --> 00:01:34,100
So we're ready to select this and execute.

32
00:01:34,400 --> 00:01:35,400
All right.

33
00:01:35,400 --> 00:01:39,733
And now our y_pred vector of predicted results of the test

34
00:01:39,733 --> 00:01:41,500
set is ready.

35
00:01:41,500 --> 00:01:45,300
So what I'm going to do now is to type

36
00:01:45,300 --> 00:01:48,300
y_pred in the console to have a look at it.

37
00:01:48,600 --> 00:01:49,933
Here it is.

38
00:01:49,933 --> 00:01:53,566
So as you can see, this contains ten predicted results.

39
00:01:53,566 --> 00:01:57,800
These are the ten predicted profits of the ten observations of the test set.

40
00:01:58,633 --> 00:02:03,266
So for example, let's compare the real profits and the predicted profits.

41
00:02:03,666 --> 00:02:06,766
So let's look at our test set here. Okay.

42
00:02:06,766 --> 00:02:09,233
So let's look at our startups.

43
00:02:09,233 --> 00:02:10,733
Let's look at startup number four.

44
00:02:10,733 --> 00:02:14,800
And startup number four has a $182,000 profit.

45
00:02:15,300 --> 00:02:19,333
And what about the prediction? $173,000 profit.

46
00:02:19,766 --> 00:02:23,333
So that's the predicted profit, which is not far from the real profit.

47
00:02:23,333 --> 00:02:25,500
So that's pretty good. Okay.

48
00:02:25,500 --> 00:02:26,533
What about the second one?

49
00:02:26,533 --> 00:02:32,633
166,000 for the real profit and 172,000 for the predicted profit.

50
00:02:32,833 --> 00:02:34,800
Not too bad either.

51
00:02:34,800 --> 00:02:38,466
Then the third observation: 155, 160.
Good.

52
00:02:39,800 --> 00:02:42,166
146, 135.

53
00:02:42,166 --> 00:02:44,400
Not too bad then. Okay.

54
00:02:44,400 --> 00:02:46,766
Then, for example, let's look at startup

55
00:02:46,766 --> 00:02:52,233
number 24. Startup number 24 had a real profit of $108,000,

56
00:02:52,600 --> 00:02:55,700
and our predicted profit is $110,000.

57
00:02:55,733 --> 00:02:58,400
So here we're pretty close. That's an accurate prediction.

58
00:02:58,400 --> 00:03:00,566
Good one. Okay.

59
00:03:00,566 --> 00:03:05,066
And oh, that's even better for the last two, because here it's 99,000.

60
00:03:05,100 --> 00:03:07,733
Here it's almost 99,000.

61
00:03:07,733 --> 00:03:11,500
And 97 here for the last one, and almost 97.

62
00:03:11,833 --> 00:03:14,000
So that's pretty good. That's a good model.

63
00:03:14,000 --> 00:03:19,066
And remember, that's the model with all the independent variables.

64
00:03:19,366 --> 00:03:22,566
But as we saw, there is only one truly significant variable,

65
00:03:22,566 --> 00:03:25,033
which was the R&D spend.

66
00:03:25,033 --> 00:03:29,066
And so if we try to recompute the regressor by only putting the R&D

67
00:03:29,066 --> 00:03:32,400
spend independent variable here, you will see that the predictions

68
00:03:32,400 --> 00:03:35,033
are going to be about the same, with about the same accuracy,

69
00:03:35,033 --> 00:03:40,066
because the R&D spend is the only strong predictor of the profit.

70
00:03:40,400 --> 00:03:41,666
So you can try that yourself.

71
00:03:41,666 --> 00:03:43,066
It could be good practice.

72
00:03:43,066 --> 00:03:46,066
You can play around with the different independent variables.

73
00:03:46,266 --> 00:03:49,500
So I will let you do it yourself and continue to analyze,

74
00:03:49,500 --> 00:03:52,500
interpret and play around with the multiple linear regression.

75
00:03:52,866 --> 00:03:54,300
That's the end of this tutorial.
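The suggested exercise, refitting the regressor with R&D spend as the only independent variable and comparing its predictions with the full model, might look like the sketch below. It is an illustration under stated assumptions, not the tutorial's code: the data are synthetic, the column names and the 40/10 split are hypothetical, and `rmse` is a small helper defined here for the comparison. Because the data are made up, the numbers it prints will not match the profits read out in the tutorial.

```r
# Synthetic stand-in data (assumption): hypothetical columns and values.
set.seed(123)
n <- 50
dataset <- data.frame(
  R.D.Spend       = runif(n, 0, 165000),
  Administration  = runif(n, 50000, 180000),
  Marketing.Spend = runif(n, 0, 470000)
)
dataset$Profit <- 50000 + 0.8 * dataset$R.D.Spend +
  0.03 * dataset$Marketing.Spend + rnorm(n, sd = 9000)

test_idx     <- sample(seq_len(n), size = 10)
training_set <- dataset[-test_idx, ]
test_set     <- dataset[test_idx, ]

# Model with all independent variables, and model with R&D spend only.
regressor_full <- lm(Profit ~ ., data = training_set)
regressor_rd   <- lm(Profit ~ R.D.Spend, data = training_set)

# Real profit next to both sets of predictions, one row per test observation.
comparison <- data.frame(
  real      = test_set$Profit,
  pred_full = predict(regressor_full, newdata = test_set),
  pred_rd   = predict(regressor_rd, newdata = test_set)
)
print(round(comparison))

# Hypothetical helper: root mean squared error on the test set.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
cat("RMSE, all variables: ", rmse(comparison$real, comparison$pred_full), "\n")
cat("RMSE, R&D spend only:", rmse(comparison$real, comparison$pred_rd), "\n")
```

Comparing the two RMSE values on your own data is one way to check how much the other independent variables really add beyond R&D spend.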
76
00:03:54,300 --> 00:03:58,900
I was very happy to teach you how to make a multiple linear regression model.

77
00:03:59,000 --> 00:04:02,133
I look forward to seeing you in the next section where we'll be talking

78
00:04:02,133 --> 00:04:05,400
about a new kind of regressor, because this time you're going to see that

79
00:04:05,400 --> 00:04:09,666
it's not a linear regressor, it's going to be a polynomial regressor.

80
00:04:09,966 --> 00:04:13,200
And I can't wait to show you how it can give powerful predictions.

81
00:04:13,533 --> 00:04:16,066
And we are going to use it on a very fun example.

82
00:04:16,066 --> 00:04:18,200
So there will be quite a challenge here.

83
00:04:18,200 --> 00:04:20,666
And just to give you a teaser of what we are going to do,

84
00:04:20,666 --> 00:04:22,766
we are going to make a kind of lie detector.

85
00:04:22,766 --> 00:04:24,866
So I look forward to seeing you in the next section.

86
00:04:24,866 --> 00:04:26,733
And until then, enjoy machine learning.