1
00:00:01,270 --> 00:00:04,750
In this video, we learn about other types of linear models.

2
00:00:05,930 --> 00:00:11,060
So far, we have discussed the standard linear model given by this equation.

3
00:00:13,260 --> 00:00:17,940
We devised a way to find the values of the coefficients beta 0, beta 1, and so on up to beta p,

4
00:00:18,300 --> 00:00:23,640
and from that we obtained the predicted values of y.

5
00:00:27,040 --> 00:00:33,520
The sum of squares of the differences between the predicted y and the actual y was the important quantity we

6
00:00:33,880 --> 00:00:34,360
defined,

7
00:00:34,510 --> 00:00:36,700
and we named it the residual sum of squares, or RSS.

8
00:00:38,490 --> 00:00:40,320
And we minimized the RSS,

9
00:00:42,140 --> 00:00:44,900
which is why the model was called ordinary least squares.

10
00:00:47,340 --> 00:00:51,890
Now we are going to explore some models other than the plain least squares model.

11
00:00:54,570 --> 00:00:59,710
There exist alternative fitting procedures, which give us two benefits.

12
00:01:00,880 --> 00:01:05,320
One is prediction accuracy and the other is model interpretability.

13
00:01:07,630 --> 00:01:11,190
The prediction accuracy of the least squares method is usually good

14
00:01:11,650 --> 00:01:15,250
if the true relationship between the predictors and the response is approximately linear

15
00:01:16,330 --> 00:01:18,640
and we have a lot of observations to regress on.

16
00:01:20,530 --> 00:01:25,600
In particular, if the number of observations n is much larger than the number of variables p,

17
00:01:26,200 --> 00:01:28,510
we may not need any alternative approach.

18
00:01:30,450 --> 00:01:36,060
However, if the number of observations is not much larger than p,

19
00:01:37,460 --> 00:01:42,210
then there will be a lot of variability, resulting in an overfit

20
00:01:42,560 --> 00:01:44,540
and thus poor predictions.

21
00:01:46,480 --> 00:01:49,040
And p can even be greater than n.
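As a supplement to the lecture, the least squares fit and the RSS described above can be sketched for a single predictor. This is a minimal illustration using the closed-form estimates; the data and variable names are made up for demonstration and are not from the lecture.

```python
# Minimal sketch of ordinary least squares with one predictor,
# using the closed-form estimates for slope and intercept.
def ols_fit(x, y):
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    # slope b1 = Sxy / Sxx, intercept b0 = y_bar - b1 * x_bar
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar
    return b0, b1

def rss(x, y, b0, b1):
    # residual sum of squares: squared gaps between actual and predicted y
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# made-up, roughly linear data
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = ols_fit(x, y)
print(b0, b1, rss(x, y, b0, b1))
```

Minimizing this RSS over b0 and b1 is exactly what the closed-form estimates do, which is why the method is called ordinary least squares.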
22
00:01:49,330 --> 00:01:53,110
That is, the number of variables can be more than the number of observations.

23
00:01:54,390 --> 00:02:00,000
Then there will be infinite variability; that is, there will be an infinite number of solutions available.

24
00:02:02,260 --> 00:02:09,700
In such a case, by reducing the number of variables which will be selected to run the model, or by shrinking

25
00:02:09,710 --> 00:02:15,220
the estimated coefficients towards zero, we can substantially reduce the variance.

26
00:02:17,810 --> 00:02:22,310
This small change will lead to a substantial improvement in the accuracy of the prediction.

27
00:02:25,970 --> 00:02:28,040
The second benefit is model interpretability.

28
00:02:29,120 --> 00:02:34,970
If we have irrelevant variables in the analysis, they will unnecessarily complicate the resulting model.

29
00:02:37,260 --> 00:02:41,640
If we remove these variables, the model will become more interpretable.

30
00:02:43,540 --> 00:02:44,970
And when do we drop a variable?

31
00:02:46,090 --> 00:02:52,180
If the coefficient beta j of that variable is zero, we say that the variable has no impact on the response,

32
00:02:52,960 --> 00:02:54,150
so we can drop it.

33
00:02:55,930 --> 00:02:59,950
But the ordinary least squares method rarely gives any beta

34
00:03:00,050 --> 00:03:00,730
that is exactly zero.

35
00:03:03,350 --> 00:03:08,840
If our model is able to shrink the coefficients of unimportant variables down to zero,

36
00:03:09,900 --> 00:03:11,210
we will be able to drop them,

37
00:03:12,550 --> 00:03:15,610
and the resulting model will make more sense to us then.

38
00:03:18,550 --> 00:03:24,700
So this process of excluding irrelevant variables and keeping only the relevant ones is called variable

39
00:03:24,700 --> 00:03:25,270
selection.

40
00:03:30,340 --> 00:03:33,310
In the coming videos, we will learn some important methods

41
00:03:33,760 --> 00:03:35,550
that give us these two benefits.
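The shrinkage idea described above can be illustrated with a one-predictor ridge-style estimate. This is a hedged sketch, not the lecture's own method: the penalty parameter `lam` and the data are assumptions added for demonstration.

```python
def ridge_slope(x, y, lam):
    # Shrinkage sketch for one predictor: the penalty lam in the
    # denominator pulls the least squares slope towards zero,
    # trading a little bias for a large reduction in variance.
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / (sxx + lam)  # lam = 0 recovers ordinary least squares

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
for lam in (0.0, 10.0, 100.0):
    print(lam, ridge_slope(x, y, lam))  # slope shrinks as lam grows
```

With lam set to zero the estimate is the ordinary least squares slope; increasing lam shrinks it steadily towards zero.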
42
00:03:36,620 --> 00:03:39,510
We will be discussing two types of methods primarily.

43
00:03:40,450 --> 00:03:42,460
One type is called subset selection.

44
00:03:43,890 --> 00:03:48,250
In these methods, we use a subset of the p variables in our model

45
00:03:48,280 --> 00:03:50,710
instead of using all the variables.

46
00:03:53,300 --> 00:03:55,730
The second type of method is called shrinkage methods.

47
00:03:58,350 --> 00:04:02,430
In these methods, we try to shrink the coefficients of the variables down towards zero.

48
00:04:04,140 --> 00:04:06,240
This is also known as regularisation.

49
00:04:08,740 --> 00:04:15,490
So in the coming videos, we will look at alternative models, which may increase model accuracy and

50
00:04:15,880 --> 00:04:16,630
interpretability.
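As a preview of how a shrinkage method can also perform variable selection, here is a sketch of the soft-thresholding operator used by lasso-style regularisation. The coefficient values and the penalty below are made-up numbers for illustration only.

```python
def soft_threshold(b, lam):
    # Soft-thresholding: shrinks a coefficient towards zero by lam,
    # and sets it exactly to zero when its size is at most lam.
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0

coefs = [2.5, -0.3, 0.1, -1.8]   # hypothetical estimated coefficients
shrunk = [soft_threshold(b, 0.5) for b in coefs]
print(shrunk)  # the two small coefficients become exactly zero
```

Because some coefficients land exactly on zero, the corresponding variables can be dropped from the model, which is the variable-selection behaviour the lecture describes.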