1 00:00:00,600 --> 00:00:04,210 We saw simple linear regression with only one predictor variable. 2 00:00:05,220 --> 00:00:12,750 However, in practice that is rarely the case, and we usually have more than one predictor variable. Even 3 00:00:12,750 --> 00:00:14,250 in our house pricing data, 4 00:00:14,460 --> 00:00:18,020 we have 16 variables left after data preprocessing. 5 00:00:19,560 --> 00:00:23,790 Let us now try to extend our analysis to accommodate multiple variables. 6 00:00:25,990 --> 00:00:31,630 Mathematically, for multiple linear regression, the equation can be written as follows. 7 00:00:33,290 --> 00:00:37,520 Y is equal to beta zero plus beta one times X1, the first predictor, 8 00:00:37,670 --> 00:00:41,000 plus beta two times X2, the second predictor, and so on, up to 9 00:00:41,010 --> 00:00:45,260 the last predictor. Here, p is giving us the number of predictors. 10 00:00:45,860 --> 00:00:54,080 And this last item is the error term, the error of this predicted Y from the actual Y. 11 00:00:56,240 --> 00:01:04,610 And for our dataset, this equation can be modelled like this: price is equal to beta zero, which is a 12 00:01:04,610 --> 00:01:08,540 constant, plus beta one times the first variable, 13 00:01:09,510 --> 00:01:15,000 plus beta two times the second variable, which is poor population, and so on, up to the 16th variable. 14 00:01:17,310 --> 00:01:25,710 So for any i-th predictor, beta of that i-th predictor is giving the average effect of a one-unit increase 15 00:01:25,710 --> 00:01:29,640 in the value of that predictor on Y, 16 00:01:30,560 --> 00:01:37,100 with the assumption that we are holding all other predictors fixed. For example, if we increase poor 17 00:01:37,100 --> 00:01:41,750 population by one unit, price will increase by beta two units.
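The equation read out above is shown on the slide rather than in the audio. A sketch of it in standard notation, assuming p stands for the number of predictors and epsilon for the error term (the symbols are assumptions, and only "poor population" is named in the audio, so the other predictors are left as X with an index):

```latex
% General multiple linear regression with p predictors
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p + \varepsilon

% The same form for the house pricing data, with p = 16
\text{price} = \beta_0 + \beta_1 X_1 + \beta_2 \,(\text{poor population}) + \dots + \beta_{16} X_{16} + \varepsilon
```

Each coefficient is read with all other predictors held fixed: raising X_j by one unit changes the expected value of Y by beta_j.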
18 00:01:42,650 --> 00:01:50,660 If we increase the 16th variable by one unit, price will increase by beta 16 units. So the coefficients are giving 19 00:01:50,660 --> 00:01:53,530 us the amount of increase in the response variable. 20 00:01:55,470 --> 00:02:03,150 So when we are estimating the regression coefficients in a multiple linear model, the logical flow remains 21 00:02:03,150 --> 00:02:03,630 the same. 22 00:02:05,020 --> 00:02:13,870 We first get the residual sum of squares, RSS, by this formula. So this is the square of the difference between the 23 00:02:13,870 --> 00:02:18,730 predicted and actual values, and for all the data points, it is summed. 24 00:02:19,360 --> 00:02:22,800 So we try to minimize this value, this RSS value. 25 00:02:23,980 --> 00:02:30,010 We will again use calculus and matrix algebra to get the coefficients, but the formulas for the multiple regression 26 00:02:30,010 --> 00:02:31,830 coefficients are a bit complicated, 27 00:02:32,170 --> 00:02:33,760 so we are not providing them here. 28 00:02:34,730 --> 00:02:39,890 Our software will be handling that part again for us. Below, 29 00:02:39,910 --> 00:02:44,440 I have given the result of the multiple linear regression model run on our dataset. 30 00:02:46,420 --> 00:02:55,870 It includes all the predictor variables, the beta estimates, that is, the coefficients, and then the standard 31 00:02:55,870 --> 00:03:02,380 error (we saw how to calculate standard error before, when we did simple linear regression), and the corresponding 32 00:03:02,380 --> 00:03:03,760 p-values. 33 00:03:05,400 --> 00:03:08,950 Against these p-values, depending on their significance, 34 00:03:08,970 --> 00:03:16,380 we have these star marks. So we said that we can have a threshold of one percent; if we have a threshold 35 00:03:16,380 --> 00:03:18,510 of one percent, we will get two stars. 36 00:03:18,510 --> 00:03:20,740 If we have a threshold of five percent, we will get one star.
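The RSS step described above (cues 22 to 24) can be sketched in plain Python. This is only an illustration on made-up numbers, not the house pricing data, and not necessarily how the software package does it. The helper names `add_intercept`, `rss`, `solve`, and `fit_ols` are hypothetical, and solving the normal equations by Gaussian elimination is an assumed stand-in for the calculus and matrix algebra mentioned in the audio.

```python
def add_intercept(rows):
    """Prepend a constant 1.0 to every row so that beta zero is estimated too."""
    return [[1.0] + list(r) for r in rows]


def rss(rows, y, beta):
    """Residual sum of squares: sum over all points of (actual - predicted)^2."""
    total = 0.0
    for r, actual in zip(add_intercept(rows), y):
        predicted = sum(b * x for b, x in zip(beta, r))
        total += (actual - predicted) ** 2
    return total


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(col + 1, n):
            factor = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= factor * M[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_ols(rows, y):
    """Coefficients that minimize RSS, from the normal equations (X'X) beta = X'y."""
    X = add_intercept(rows)
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)


# Made-up data: two predictors, five observations (assumption, for illustration only).
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [1.1, 2.9, 4.2, 5.8, 8.0]

beta_hat = fit_ols(rows, y)  # [beta zero, beta one, beta two]
print("estimated coefficients:", beta_hat)
print("RSS at the estimate:", rss(rows, y, beta_hat))
```

Nudging any one of the estimated coefficients up or down and calling `rss` again gives a larger value, which is what "minimizing the RSS" means in practice.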
37 00:03:20,760 --> 00:03:24,730 So these two are having a confidence level of 95 percent. 38 00:03:25,200 --> 00:03:29,280 And these three are having a confidence level of more than 99 percent. 39 00:03:33,080 --> 00:03:37,280 And in the end, these are the values for the complete model. 40 00:03:37,880 --> 00:03:43,650 So these are values for individual variables and these values are for the complete model. 41 00:03:44,240 --> 00:03:51,350 So here is the multiple R-squared, which is 0.72; that is, seventy-two percent of the variance in 42 00:03:51,350 --> 00:03:56,900 the values of Y, in the values of house price, is explained by our model. 43 00:03:57,830 --> 00:04:03,920 Adjusted R-squared is taking into account the number of variables, so the number of variables is 16. 44 00:04:03,930 --> 00:04:07,030 So it is taking into account that we have 16 variables. 45 00:04:07,040 --> 00:04:12,800 So correspondingly, the adjusted R-squared comes out slightly lower, which is nearly the same as R-squared. 46 00:04:14,380 --> 00:04:19,390 And we can say that this model is pretty good in explaining the variance of Y. 47 00:04:20,530 --> 00:04:26,160 Remember that this is just the result. In a separate video, I'll show you how to get this result, 48 00:04:26,160 --> 00:04:29,700 how to run a multiple linear regression model in the software package. 49 00:04:30,690 --> 00:04:36,310 And there we will be discussing the whole result again, and the meaning of each and every coefficient and 50 00:04:36,310 --> 00:04:39,480 the p-values and this R-squared and F-statistic. 51 00:04:40,450 --> 00:04:46,440 The aim of showing this here is that when we run a multiple linear regression, this is the type of result 52 00:04:46,440 --> 00:04:47,040 that we get. 53 00:04:47,220 --> 00:04:53,820 We get the beta coefficients and the corresponding p-values for each of the predictor variables, and 54 00:04:53,820 --> 00:04:57,270 we get some other values which are for the whole model.
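The two model-level figures discussed above can be computed from their standard definitions. A minimal sketch, assuming the usual formulas R-squared = 1 - RSS/TSS and adjusted R-squared = 1 - (1 - R-squared)(n - 1)/(n - p - 1). The function names are hypothetical, and the sample size of 500 is a made-up value, since the number of observations is not stated in the audio.

```python
def r_squared(actual, predicted):
    """Share of the variance in the actual values explained by the predictions."""
    mean_y = sum(actual) / len(actual)
    rss = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # unexplained part
    tss = sum((a - mean_y) ** 2 for a in actual)                # total variation
    return 1.0 - rss / tss


def adjusted_r_squared(r2, n_obs, n_predictors):
    """R-squared penalized for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# R-squared of 0.72 and 16 predictors come from the video;
# n_obs = 500 is an assumption made only so the formula can be evaluated.
adj = adjusted_r_squared(0.72, n_obs=500, n_predictors=16)
print("adjusted R-squared:", round(adj, 4))

# Tiny made-up example for r_squared itself.
print("R-squared:", r_squared([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8]))
```

Because the factor (n - 1)/(n - p - 1) is greater than 1 whenever the model has at least one predictor, the adjusted value can never exceed R-squared; with many observations and few predictors the two stay close, as in the result shown in the video.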