1 00:00:00,166 --> 00:00:02,633 Hello and welcome to this R tutorial. 2 00:00:02,633 --> 00:00:05,833 So in the previous tutorials we prepared our data set. 3 00:00:06,266 --> 00:00:10,566 Then we fitted the simple linear regression model to our training set. 4 00:00:10,900 --> 00:00:13,633 And now we're going to predict the test set results, 5 00:00:13,633 --> 00:00:16,533 because we trained our model on the training set 6 00:00:16,533 --> 00:00:20,033 and we want to see how it can predict new observations. 7 00:00:20,766 --> 00:00:25,200 So to do this we're going to create our vector of predictions, y_pred. 8 00:00:26,300 --> 00:00:27,433 I'm calling it y_pred 9 00:00:27,433 --> 00:00:29,733 because this is a vector of predictions. 10 00:00:29,733 --> 00:00:34,333 That is the vector that will contain the predicted values of the test set 11 00:00:34,333 --> 00:00:36,133 observations. 12 00:00:36,133 --> 00:00:39,166 And we are going to use the predict function. 13 00:00:39,700 --> 00:00:41,100 So I'm going to write that here. 14 00:00:41,100 --> 00:00:43,233 So that's the predict function, then parentheses. 15 00:00:43,233 --> 00:00:46,233 And in this predict function it's really really simple. 16 00:00:46,466 --> 00:00:49,466 We will just need to input two arguments. 17 00:00:49,600 --> 00:00:51,900 The first one is our regressor. 18 00:00:51,900 --> 00:00:54,733 So that's the simple linear regressor. 19 00:00:54,733 --> 00:00:56,433 And then comma. 20 00:00:56,433 --> 00:00:59,433 And then the second argument is newdata. 21 00:00:59,900 --> 00:01:01,633 So that's the name of the argument. 22 00:01:01,633 --> 00:01:05,200 And this newdata is the data 23 00:01:05,200 --> 00:01:08,500 that contains the new observations whose results we want to predict. 24 00:01:08,833 --> 00:01:11,233 And so obviously that's the test set. 25 00:01:12,633 --> 00:01:14,800 And actually that is ready. 
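As a rough sketch of the call just described, the two arguments could be written like this in R. The data values below are made up for illustration, the column names `YearsExperience` and `Salary` are assumptions, and fitting with `lm` is assumed to be what the previous tutorial did; only `predict`, `regressor`, `newdata`, `y_pred`, and the training/test split come from the tutorial itself.

```r
# Hypothetical toy data: these numbers are NOT the tutorial's dataset.
training_set <- data.frame(
  YearsExperience = c(1, 2, 3, 4, 5, 6),
  Salary = c(40000, 45000, 52000, 58000, 61000, 70000)
)
test_set <- data.frame(
  YearsExperience = c(1.5, 3.5, 5.5),
  Salary = c(43000, 54000, 66000)
)

# Assumed fitting step from the previous tutorial: simple linear regression
# of salary on years of experience, trained on the training set only.
regressor <- lm(formula = Salary ~ YearsExperience, data = training_set)

# First argument: the fitted regressor. Second argument: newdata, the test set.
y_pred <- predict(regressor, newdata = test_set)

# Typing the name in the console prints the vector of predictions,
# one predicted salary per test set observation.
y_pred
```

Note that `predict` only uses the predictor column of `newdata`; the real salaries in the test set are ignored when the predictions are computed.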
26 00:01:14,800 --> 00:01:17,400 We are ready to build our vector of predictions. 27 00:01:17,400 --> 00:01:19,566 I will just remind you what this is going to do. 28 00:01:19,566 --> 00:01:21,066 Here we have a test set. 29 00:01:21,066 --> 00:01:23,833 This is the column of the number of years of experience. 30 00:01:23,833 --> 00:01:26,833 So what will happen is that our simple linear regression 31 00:01:26,966 --> 00:01:31,566 will predict, for each of these test set observations, the salary. 32 00:01:32,300 --> 00:01:35,800 So it won't be the same salary here because these are the real salaries, 33 00:01:35,833 --> 00:01:38,833 the salaries that actually exist in real life. 34 00:01:38,933 --> 00:01:42,166 But since we saw that there is a strong linear 35 00:01:42,166 --> 00:01:46,433 dependency between the years of experience and the salary, the predictions 36 00:01:46,433 --> 00:01:50,633 returned by our simple linear regression model should be close to the real salaries. 37 00:01:50,866 --> 00:01:54,900 Anyway, we're going to see that in the final step of this simple linear 38 00:01:54,900 --> 00:01:58,700 regression model, because we will visualize on the graph all the results. 39 00:01:59,033 --> 00:02:02,066 But for now, that's just to explain what we are expecting. 40 00:02:02,700 --> 00:02:05,333 So let's go back to our model and let's select 41 00:02:05,333 --> 00:02:08,633 this and press Command or Control plus Enter to execute. 42 00:02:09,133 --> 00:02:10,500 And here we go. 43 00:02:10,500 --> 00:02:13,500 The vector of predictions is now created. 44 00:02:13,633 --> 00:02:15,200 So let's have a look at it. 45 00:02:15,200 --> 00:02:18,200 We're going to type y_pred here in the console. 46 00:02:18,900 --> 00:02:20,333 And here it is. 47 00:02:20,333 --> 00:02:25,500 These are all the salary predictions for our ten observations in our test set. 48 00:02:26,300 --> 00:02:26,666 Okay. 
49 00:02:26,666 --> 00:02:29,833 So for example let's have a look at the first observation. 50 00:02:30,266 --> 00:02:33,266 The real value, the real salary of the first 51 00:02:33,400 --> 00:02:38,300 employee of the test set was $46,205. 52 00:02:38,766 --> 00:02:40,633 And let's see what our model predicted. 53 00:02:40,633 --> 00:02:43,933 It predicted $37,000. 54 00:02:44,366 --> 00:02:46,466 So okay, that's not too close. 55 00:02:46,466 --> 00:02:48,833 But then if we look at the second observation, 56 00:02:48,833 --> 00:02:51,900 which is the fourth employee, let's call it the fourth employee. 57 00:02:52,366 --> 00:02:57,033 His real salary is $43,525. 58 00:02:57,333 --> 00:03:03,833 And our model predicted $44,322, which is this time much closer. 59 00:03:04,466 --> 00:03:06,900 Okay, so that's it for this tutorial. 60 00:03:06,900 --> 00:03:09,533 And now the next tutorial is my favorite part, 61 00:03:09,533 --> 00:03:14,400 the part where we visually see everything that we've been making. 62 00:03:14,800 --> 00:03:18,300 That is, we will be visualizing on a graph the training set results 63 00:03:18,566 --> 00:03:20,133 and the test set results. 64 00:03:20,133 --> 00:03:23,933 So we will make two graphs: a first graph showing the predictions 65 00:03:23,933 --> 00:03:26,500 of the training set observations, and the second graph 66 00:03:26,500 --> 00:03:28,900 showing the predictions of the test set observations. 67 00:03:28,900 --> 00:03:32,666 And we will see how the simple linear regression line approaches 68 00:03:32,666 --> 00:03:34,566 the real observations. 69 00:03:34,566 --> 00:03:36,833 So I look forward to seeing you in the next tutorial. 70 00:03:36,833 --> 00:03:38,566 And until then enjoy machine learning.
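The row-by-row comparison of real and predicted salaries done by eye in this tutorial could also be laid out as a small table. This is only a sketch under assumptions: the numbers are made up and are not the tutorial's data, the column names `YearsExperience` and `Salary` are assumed, and the `comparison` data frame is a hypothetical helper that does not appear in the tutorial.

```r
# Hypothetical toy data: these numbers are NOT the tutorial's dataset.
training_set <- data.frame(
  YearsExperience = c(1, 2, 3, 4, 5, 6),
  Salary = c(40000, 45000, 52000, 58000, 61000, 70000)
)
test_set <- data.frame(
  YearsExperience = c(1.5, 3.5, 5.5),
  Salary = c(43000, 54000, 66000)
)

# Assumed fitting step, then the prediction step from this tutorial.
regressor <- lm(formula = Salary ~ YearsExperience, data = training_set)
y_pred <- unname(predict(regressor, newdata = test_set))

# Hypothetical helper table: real salary next to predicted salary,
# plus the difference between the two for each test set observation.
comparison <- data.frame(
  YearsExperience = test_set$YearsExperience,
  actual = test_set$Salary,
  predicted = y_pred,
  error = test_set$Salary - y_pred
)
comparison
```

A small `error` value means the prediction is close to the real salary for that observation; a large one corresponds to a case like the first employee discussed above.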