1
00:00:01,270 --> 00:00:04,750
In this video, we learn about other types of linear models.

2
00:00:05,930 --> 00:00:11,060
So far, we have discussed the standard linear model given by this equation.

3
00:00:13,260 --> 00:00:17,940
We devised a way to find the values of the coefficients beta 0, beta 1, and so on up to beta p,

4
00:00:18,300 --> 00:00:23,640
and from that we obtained the predicted values of y.

5
00:00:27,040 --> 00:00:33,520
The sum of squares of the differences between the predicted y and the actual y was the important quantity we

6
00:00:33,880 --> 00:00:34,360
defined,

7
00:00:34,510 --> 00:00:36,700
and we named it the residual sum of squares, or RSS.

8
00:00:38,490 --> 00:00:40,320
And we minimized the RSS,

9
00:00:42,140 --> 00:00:44,900
which is why the model was called ordinary least squares.

10
00:00:47,340 --> 00:00:51,890
Now we are going to explore some models other than the plain least squares model.

11
00:00:54,570 --> 00:00:59,710
There exist alternative fitting procedures, which give us two benefits.

12
00:01:00,880 --> 00:01:05,320
One is prediction accuracy and the other is model interpretability.

13
00:01:07,630 --> 00:01:11,190
The prediction accuracy of the least squares method is usually good

14
00:01:11,650 --> 00:01:15,250
if the true relationship between the predictors and the response is approximately linear

15
00:01:16,330 --> 00:01:18,640
and we have a lot of observations to regress on.

16
00:01:20,530 --> 00:01:25,600
In particular, if the number of observations n is much larger than the number of variables p,

17
00:01:26,200 --> 00:01:28,510
we may not need any alternative approach.

18
00:01:30,450 --> 00:01:36,060
However, if the number of observations is not much larger than p,

19
00:01:37,460 --> 00:01:42,210
then there will be a lot of variability, resulting in an overfit

20
00:01:42,560 --> 00:01:44,540
and thus poor predictions.

21
00:01:46,480 --> 00:01:49,040
And p can even be greater than n.
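As a supplement to the lecture, the least squares fit and the RSS described above can be sketched for a single predictor. This is a minimal illustration using the closed-form estimates; the data and variable names are made up for demonstration and are not from the lecture.

```python
# Minimal sketch of ordinary least squares with one predictor,
# using the closed-form estimates for slope and intercept.
def ols_fit(x, y):
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    # slope b1 = Sxy / Sxx, intercept b0 = y_bar - b1 * x_bar
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar
    return b0, b1

def rss(x, y, b0, b1):
    # residual sum of squares: squared gaps between actual and predicted y
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# made-up, roughly linear data
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = ols_fit(x, y)
print(b0, b1, rss(x, y, b0, b1))
```

Minimizing this RSS over b0 and b1 is exactly what the closed-form estimates do, which is why the method is called ordinary least squares.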
22
00:01:49,330 --> 00:01:53,110
That is, the number of variables can be more than the number of observations.

23
00:01:54,390 --> 00:02:00,000
Then there will be infinite variability; that is, there will be an infinite number of solutions available.

24
00:02:02,260 --> 00:02:09,700
In such a case, by reducing the number of variables which will be selected to run the model, or by shrinking

25
00:02:09,710 --> 00:02:15,220
the estimated coefficients towards zero, we can substantially reduce the variance.

26
00:02:17,810 --> 00:02:22,310
This small change will lead to a substantial improvement in the accuracy of the prediction.

27
00:02:25,970 --> 00:02:28,040
The second benefit is model interpretability.

28
00:02:29,120 --> 00:02:34,970
If we have irrelevant variables in the analysis, they will unnecessarily complicate the resulting model.

29
00:02:37,260 --> 00:02:41,640
If we remove these variables, the model will become more interpretable.

30
00:02:43,540 --> 00:02:44,970
And when do we drop a variable?

31
00:02:46,090 --> 00:02:52,180
If the coefficient beta j of that variable is zero, we say that the variable has no impact on the response,

32
00:02:52,960 --> 00:02:54,150
so we can drop it.

33
00:02:55,930 --> 00:02:59,950
But the ordinary least squares method rarely gives any beta

34
00:03:00,050 --> 00:03:00,730
that is exactly zero.

35
00:03:03,350 --> 00:03:08,840
If our model is able to shrink the coefficients of unimportant variables down to zero,

36
00:03:09,900 --> 00:03:11,210
we will be able to drop them,

37
00:03:12,550 --> 00:03:15,610
and the resulting model will make more sense to us then.

38
00:03:18,550 --> 00:03:24,700
So this process of excluding irrelevant variables and keeping only the relevant ones is called variable

39
00:03:24,700 --> 00:03:25,270
selection.

40
00:03:30,340 --> 00:03:33,310
In the coming videos, we will learn some important methods

41
00:03:33,760 --> 00:03:35,550
that give us these two benefits.
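The shrinkage idea described above can be illustrated with a one-predictor ridge-style estimate. This is a hedged sketch, not the lecture's own method: the penalty parameter `lam` and the data are assumptions added for demonstration.

```python
def ridge_slope(x, y, lam):
    # Shrinkage sketch for one predictor: the penalty lam in the
    # denominator pulls the least squares slope towards zero,
    # trading a little bias for a large reduction in variance.
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / (sxx + lam)  # lam = 0 recovers ordinary least squares

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
for lam in (0.0, 10.0, 100.0):
    print(lam, ridge_slope(x, y, lam))  # slope shrinks as lam grows
```

With lam set to zero the estimate is the ordinary least squares slope; increasing lam shrinks it steadily towards zero.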
42
00:03:36,620 --> 00:03:39,510
We will be discussing two types of methods primarily.

43
00:03:40,450 --> 00:03:42,460
One type is called subset selection.

44
00:03:43,890 --> 00:03:48,250
In these methods, we use a subset of the p variables in our model

45
00:03:48,280 --> 00:03:50,710
instead of using all the variables.

46
00:03:53,300 --> 00:03:55,730
The second type of method is called shrinkage methods.

47
00:03:58,350 --> 00:04:02,430
In these methods, we try to shrink the coefficients of the variables down towards zero.

48
00:04:04,140 --> 00:04:06,240
This is also known as regularisation.

49
00:04:08,740 --> 00:04:15,490
So in the coming videos, we will look at alternative models, which may increase model accuracy and

50
00:04:15,880 --> 00:04:16,630
interpretability.
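As a preview of how a shrinkage method can also perform variable selection, here is a sketch of the soft-thresholding operator used by lasso-style regularisation. The coefficient values and the penalty below are made-up numbers for illustration only.

```python
def soft_threshold(b, lam):
    # Soft-thresholding: shrinks a coefficient towards zero by lam,
    # and sets it exactly to zero when its size is at most lam.
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0

coefs = [2.5, -0.3, 0.1, -1.8]   # hypothetical estimated coefficients
shrunk = [soft_threshold(b, 0.5) for b in coefs]
print(shrunk)  # the two small coefficients become exactly zero
```

Because some coefficients land exactly on zero, the corresponding variables can be dropped from the model, which is the variable-selection behaviour the lecture describes.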