1 00:00:00,200 --> 00:00:02,633 Hello and welcome to this R tutorial. 2 00:00:02,633 --> 00:00:04,666 Now that we have this curve, and now that we see 3 00:00:04,666 --> 00:00:07,333 that we have an excellent model for our problem, 4 00:00:07,333 --> 00:00:09,033 we are going to validate this model. 5 00:00:09,033 --> 00:00:10,066 And this is the model 6 00:00:10,066 --> 00:00:13,700 we are going to choose to make our final accurate prediction. 7 00:00:13,966 --> 00:00:15,666 So let's do it. 8 00:00:15,666 --> 00:00:19,000 I'm actually going to these new code sections that I prepared for you. 9 00:00:19,000 --> 00:00:22,200 So a first one to predict the salary associated 10 00:00:22,200 --> 00:00:25,700 with a 6.5 level according to a linear regression model. 11 00:00:25,800 --> 00:00:29,966 And then the second code section will predict the salary of the 6.5 level 12 00:00:30,266 --> 00:00:33,266 according to our polynomial regression. 13 00:00:33,266 --> 00:00:36,366 This one right here, of the fourth degree. 14 00:00:36,733 --> 00:00:37,000 Okay. 15 00:00:37,000 --> 00:00:40,000 So let's start with the linear regression prediction. 16 00:00:40,000 --> 00:00:43,100 We're going to do exactly the same as in simple linear regression. 17 00:00:43,100 --> 00:00:45,300 We're going to use the predict function. 18 00:00:45,300 --> 00:00:49,433 But actually something is going to be different and will interest many of you. 19 00:00:49,700 --> 00:00:52,266 This time we're not going to make predictions 20 00:00:52,266 --> 00:00:55,500 on a whole test set, that is, on a vector of observations. 21 00:00:55,800 --> 00:00:58,200 We are going to make a single prediction. 22 00:00:58,200 --> 00:00:59,300 That is, we're going to make 23 00:00:59,300 --> 00:01:03,866 a prediction for a single level, which is going to be the 6.5 level. 
24 00:01:04,200 --> 00:01:05,700 So you're going to see that the syntax 25 00:01:05,700 --> 00:01:09,733 is going to change; the technique to make this prediction is going to change. 26 00:01:09,933 --> 00:01:12,700 But it's actually just as easy. 27 00:01:12,700 --> 00:01:13,866 So let's do it. 28 00:01:13,866 --> 00:01:16,833 We're going to call this prediction y_pred as before. 29 00:01:16,833 --> 00:01:19,500 So be careful: y_pred was a vector before. 30 00:01:19,500 --> 00:01:22,233 And now it's only going to be a single prediction value. 31 00:01:22,233 --> 00:01:24,900 So let's still call it y_pred anyway. 32 00:01:24,900 --> 00:01:26,533 And same as before. 33 00:01:26,533 --> 00:01:28,900 We're going to take the predict function. 34 00:01:28,900 --> 00:01:31,900 So let's here take predict. Here we go. 35 00:01:32,000 --> 00:01:35,233 And now in this predict function, remember, we need to specify first 36 00:01:35,433 --> 00:01:37,533 the regressor we want to make the predictions with. 37 00:01:37,533 --> 00:01:39,233 So that's the first argument. 38 00:01:39,233 --> 00:01:42,833 And then the second argument is the new data on which 39 00:01:42,833 --> 00:01:44,900 we want to make the predictions. 40 00:01:44,900 --> 00:01:47,066 Okay. So let's input the first argument. 41 00:01:47,066 --> 00:01:48,566 The first argument is our regressor. 42 00:01:48,566 --> 00:01:53,000 And since we're predicting the result with the linear regression regressor, 43 00:01:53,000 --> 00:01:56,100 and since we called this regressor lin_reg, 44 00:01:56,866 --> 00:02:00,066 then the first argument we need to input here is lin_reg. 45 00:02:01,666 --> 00:02:02,033 All right. 46 00:02:02,033 --> 00:02:03,800 So that's our regressor. 47 00:02:03,800 --> 00:02:08,566 And now we need to input the new data on which we want to make the prediction. 48 00:02:08,766 --> 00:02:14,600 So this new data is, as you understood, a single element, a single observation point. 
49 00:02:15,033 --> 00:02:19,600 And this observation point is actually the 6.5 level here. 50 00:02:19,600 --> 00:02:23,500 So what we'll do now is to actually, you know, 51 00:02:23,500 --> 00:02:27,333 since this 6.5 level doesn't exist in our data set, 52 00:02:27,666 --> 00:02:32,366 we actually need to create a new data frame containing the 6.5 value. 53 00:02:32,400 --> 00:02:35,933 We're not going to add this 6.5 level to our data set. 54 00:02:36,166 --> 00:02:40,800 We're going to create a new data set of only one line and only one column. 55 00:02:41,100 --> 00:02:43,166 That is, of only one cell actually. 56 00:02:43,166 --> 00:02:46,633 And this cell will contain the 6.5 level. 57 00:02:46,933 --> 00:02:47,933 So let's do this. 58 00:02:47,933 --> 00:02:50,566 The syntax in order to do that is very simple. 59 00:02:50,566 --> 00:02:54,000 We just need to type here data and frame separated by a dot. 60 00:02:54,433 --> 00:02:57,366 So data.frame, here we go, data.frame. 61 00:02:57,366 --> 00:03:00,633 And now as you can see it automatically added some parentheses. 62 00:03:01,000 --> 00:03:04,633 And in these parentheses we need to input Level 63 00:03:06,500 --> 00:03:09,500 = 6.5. 64 00:03:09,533 --> 00:03:10,733 And that's it. 65 00:03:10,733 --> 00:03:14,000 That's actually all you need to do to make a single prediction 66 00:03:14,000 --> 00:03:16,800 using the predict function with a specific regressor. 67 00:03:16,800 --> 00:03:18,900 That is, here, the linear regressor. 68 00:03:18,900 --> 00:03:20,000 Let's check it out. 69 00:03:20,000 --> 00:03:24,333 Let's select this line and execute. Okay. 70 00:03:24,333 --> 00:03:26,600 y_pred correctly generated. 
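The single-prediction step described above can be sketched in R as follows. This is only a minimal sketch under stated assumptions: the transcript does not show the data set or how `lin_reg` was fitted, so the `dataset` values, the column names `Level` and `Salary`, and the `lm` call are illustrative stand-ins. Only `predict`, `data.frame`, `lin_reg`, `y_pred`, and the 6.5 level come from the narration.

```r
# ASSUMPTION: placeholder data standing in for the tutorial's data set,
# which is not shown in the transcript. The numbers are illustrative only.
dataset <- data.frame(
  Level  = 1:10,
  Salary = c(45, 50, 60, 80, 110, 150, 200, 300, 500, 1000) * 1000
)

# ASSUMPTION: the linear regressor is fitted with lm(); the transcript only
# tells us that it is named lin_reg.
lin_reg <- lm(formula = Salary ~ Level, data = dataset)

# Single prediction: the new data is a data frame with one row and one
# column. The column name must match the predictor used to fit the model.
y_pred <- predict(lin_reg, newdata = data.frame(Level = 6.5))

# y_pred now holds one value instead of a vector of predictions.
print(y_pred)
```

Wrapping 6.5 in `data.frame(Level = 6.5)` rather than passing the bare number matters because `predict` looks up predictors by column name in the new data.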
71 00:03:26,600 --> 00:03:30,000 So as you can see, y_pred is in the values here, and we can already see 72 00:03:30,166 --> 00:03:33,433 the predicted value of the 6.5 level salary, 73 00:03:33,966 --> 00:03:37,100 and it's actually $330,000, 74 00:03:37,500 --> 00:03:41,366 so much higher than what this employee said their salary was. 75 00:03:41,366 --> 00:03:43,500 So that's actually a good start for us. 76 00:03:43,500 --> 00:03:46,300 But remember, we don't want to keep the linear regression model. 77 00:03:46,300 --> 00:03:49,666 We want to keep the most accurate regression model. 78 00:03:49,933 --> 00:03:52,100 That is, of course, the polynomial regression model. 79 00:03:52,100 --> 00:03:55,333 So what we're going to do now is make this same prediction, 80 00:03:55,566 --> 00:03:58,633 but this time according to our polynomial regression model.