1 00:00:00,660 --> 00:00:04,260 We saw simple linear regression with only one predictor variable. 2 00:00:05,310 --> 00:00:07,740 However, in practice, that is rarely the case. 3 00:00:08,370 --> 00:00:11,160 And we usually have more than one predictor variable. 4 00:00:12,480 --> 00:00:17,880 Even in our house pricing data, we have 16 variables left after data processing. 5 00:00:19,620 --> 00:00:23,800 Let us now try to extend our analysis to accommodate multiple variables. 6 00:00:26,080 --> 00:00:31,660 Mathematically, for multiple linear regression, the equation can be transformed as below. 7 00:00:33,350 --> 00:00:34,910 Y is equal to beta zero, 8 00:00:34,940 --> 00:00:37,510 plus beta one times X1, the first predictor, 9 00:00:37,660 --> 00:00:39,320 plus beta two times X2, 10 00:00:39,440 --> 00:00:40,970 the second predictor, and so on 11 00:00:41,030 --> 00:00:47,690 to the last predictor. Here p is giving us the number of predictors, and this last term 12 00:00:47,720 --> 00:00:54,050 is the error term, the deviation of the predicted Y from the actual Y. 13 00:00:56,360 --> 00:00:59,330 And for our dataset, this equation 14 00:00:59,840 --> 00:01:01,100 can be modeled like this. 15 00:01:01,730 --> 00:01:07,340 Price is equal to beta zero, which is a constant, plus beta one times 16 00:01:07,400 --> 00:01:08,540 the first variable, 17 00:01:09,570 --> 00:01:13,500 plus beta two times the second predictor variable, which is poor population, and so on, 18 00:01:13,620 --> 00:01:15,000 until the 16th variable. 19 00:01:17,370 --> 00:01:25,680 So for any jth predictor, the beta of that jth predictor is giving the average effect of a one-unit increase 20 00:01:25,680 --> 00:01:29,640 in the value of that predictor on Y, 21 00:01:30,620 --> 00:01:37,070 with the assumption that we are holding all other predictors fixed. For example, if we increase poor 22 00:01:37,070 --> 00:01:41,750 population by one unit, price will increase by beta two units.
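Written out, the multiple linear regression equation described above takes roughly the following form. The symbol $\varepsilon$ for the error term is an assumed notation, since the narration only refers to it as the error term, and the first predictor of the house price model is left as $X_1$ because it is not named.

```latex
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p + \varepsilon

% House price model with p = 16 predictors
\text{price} = \beta_0 + \beta_1 X_1 + \beta_2 \,(\text{poor population}) + \dots + \beta_{16} \,(\text{average distance}) + \varepsilon
```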
23 00:01:42,760 --> 00:01:50,120 If we increase average distance by one unit, price will increase by beta 16 units. So the coefficients 24 00:01:50,210 --> 00:01:53,510 are giving us the amount of increase in the response variable. 25 00:01:55,530 --> 00:02:03,150 So when we are estimating the regression coefficients in a multiple linear model, the logical flow remains 26 00:02:03,150 --> 00:02:03,660 the same. 27 00:02:05,080 --> 00:02:09,550 We first get the residual sum of squares by this formula. 28 00:02:10,180 --> 00:02:15,340 So this is the square of the difference between the predicted and actual values. 29 00:02:15,880 --> 00:02:18,670 And for all the data points, it is summed. 30 00:02:19,400 --> 00:02:22,840 So we try to minimize this value, this RSS value. 31 00:02:24,070 --> 00:02:27,390 We will again use calculus and matrix algebra to get the coefficients. 32 00:02:28,870 --> 00:02:31,840 But the multiple regression coefficients are a bit complicated. 33 00:02:32,230 --> 00:02:33,760 So we are not providing them here. 34 00:02:34,840 --> 00:02:39,210 Our software will be handling that part again for us. Below, 35 00:02:39,910 --> 00:02:44,410 I have given the result of the multiple linear regression model run on our dataset. 36 00:02:46,510 --> 00:02:55,840 It includes all the predictor variables, the beta estimates, that is, the coefficients, then the standard 37 00:02:55,930 --> 00:02:56,290 error. 38 00:02:56,800 --> 00:03:02,380 We saw how to calculate standard error before, when we did simple linear regression, and the corresponding 39 00:03:02,380 --> 00:03:03,760 t and p-values. 40 00:03:05,520 --> 00:03:08,900 Against these p-values, depending on their significance, 41 00:03:09,060 --> 00:03:16,380 we have these star marks. So we said that we can have a threshold of one percent. If we have a threshold 42 00:03:16,380 --> 00:03:17,100 of one percent, 43 00:03:17,430 --> 00:03:19,880 we'll get two stars. If we have a threshold of five percent,
44 00:03:19,960 --> 00:03:20,730 We'll get one star. 45 00:03:20,760 --> 00:03:24,710 So these two are having a confidence level of 95 percent. 46 00:03:25,320 --> 00:03:29,220 And these three are significant at the point one percent threshold, a confidence level of more than 99.9 percent. 47 00:03:33,200 --> 00:03:37,280 And in the end, these are the values for the complete model. 48 00:03:37,920 --> 00:03:43,670 So these are values for individual variables and these values are for the complete model. 49 00:03:44,320 --> 00:03:47,750 So here is the multiple R-squared, which is point seven two. 50 00:03:48,350 --> 00:03:55,880 That is, seventy-two percent of the variance in the values of Y, in the values of house price, is explained 51 00:03:55,880 --> 00:03:56,870 by our model. 52 00:03:57,950 --> 00:04:01,550 Adjusted R-squared is taking into account the number of variables. 53 00:04:01,940 --> 00:04:03,870 So the number of variables is 16. 54 00:04:03,900 --> 00:04:06,950 So it is taking into account that we have 16 variables. 55 00:04:07,100 --> 00:04:12,770 So, correspondingly, the adjusted R-squared is point seven one, which is nearly the same as R-squared. 56 00:04:14,500 --> 00:04:19,360 And we can say that this model is pretty good in explaining the variance of Y. 57 00:04:20,640 --> 00:04:24,260 Remember that this is just the result. In a separate video, 58 00:04:24,350 --> 00:04:26,120 I will show you how to get this result, 59 00:04:26,160 --> 00:04:29,700 how to run a multiple linear regression model in the software package. 60 00:04:30,810 --> 00:04:36,270 And there we will be discussing the whole result again and the meaning of each and every coefficient and 61 00:04:36,270 --> 00:04:39,450 the p-values and this R-squared and F-statistic. 62 00:04:40,530 --> 00:04:45,990 The aim of showing it to you here is that when we run a multiple linear regression, this is the type of 63 00:04:46,050 --> 00:04:47,040 result that we get.
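For reference, the residual sum of squares that is minimized, and the R-squared and adjusted R-squared reported for the whole model, are usually defined as below. These are the standard textbook definitions rather than formulas read out in the narration; $n$ (the number of data points), $\hat{y}_i$ (the predicted value), $\bar{y}$ (the mean of the actual values) and TSS (the total sum of squares) are notation assumed here, while $p$ is the number of predictors.

```latex
\mathrm{RSS} = \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
\qquad
\mathrm{TSS} = \sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2

R^2 = 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}}
\qquad
R^2_{\text{adj}} = 1 - \left( 1 - R^2 \right) \frac{n - 1}{n - p - 1}
```

Because the factor $(n - 1)/(n - p - 1)$ is greater than one whenever $p \geq 1$, the adjusted R-squared is never larger than R-squared, which matches the point seven one versus point seven two reported here.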
64 00:04:47,310 --> 00:04:53,430 We get the beta coefficients and the corresponding p-values for each of the predictor variables. 65 00:04:53,640 --> 00:04:57,210 And we get some other values which are for the whole model.
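To make the workflow concrete, here is a minimal sketch of fitting a multiple linear regression by least squares and computing the whole-model values. It is not the software package or the house price dataset used in this course: it uses NumPy on synthetic data, and the function names, the three predictors, and the "true" coefficients are all made up for illustration. It also stops at the coefficients, RSS, R-squared and adjusted R-squared, and does not compute standard errors or p-values.

```python
import numpy as np


def fit_multiple_linear_regression(X, y):
    """Return least-squares estimates [beta_0, beta_1, ..., beta_p].

    Hypothetical helper: minimizes the residual sum of squares (RSS).
    """
    n = X.shape[0]
    # Prepend a column of ones so that beta_0 (the constant) is estimated too.
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


def model_summary(X, y, beta):
    """Compute RSS, R-squared and adjusted R-squared for fitted coefficients."""
    n, p = X.shape
    predicted = beta[0] + X @ beta[1:]
    rss = float(np.sum((y - predicted) ** 2))   # residual sum of squares
    tss = float(np.sum((y - y.mean()) ** 2))    # total sum of squares
    r_squared = 1.0 - rss / tss
    adjusted = 1.0 - (1.0 - r_squared) * (n - 1) / (n - p - 1)
    return {"rss": rss, "r_squared": r_squared, "adjusted_r_squared": adjusted}


# Synthetic data (an assumption for this sketch, not the course dataset):
# 200 data points, 3 predictors, known coefficients plus a small random error.
rng = np.random.default_rng(0)
n_points, n_predictors = 200, 3
X = rng.normal(size=(n_points, n_predictors))
true_beta = np.array([5.0, 2.0, -1.0, 0.5])  # [beta_0, beta_1, beta_2, beta_3]
y = true_beta[0] + X @ true_beta[1:] + rng.normal(scale=0.1, size=n_points)

beta_hat = fit_multiple_linear_regression(X, y)
summary = model_summary(X, y, beta_hat)

print("estimated coefficients:", beta_hat)
print("whole-model values:", summary)
```

Each entry of `beta_hat` after the first estimates the effect of a one-unit increase in that predictor with the others held fixed, and `summary` plays the role of the whole-model values at the bottom of the regression output.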