1 00:00:01,100 --> 00:00:04,520 In this video, we will discuss the bias-variance trade-off. 2 00:00:07,110 --> 00:00:13,720 So as I told you in the test-train split lecture, our agenda is to find the model with the lowest 3 00:00:13,770 --> 00:00:14,500 test error. 4 00:00:16,590 --> 00:00:20,730 Now, fundamentally, there are three contributors to the expected test error. 5 00:00:22,710 --> 00:00:29,730 These three contributors are called variance, bias, and the variance of the error term, which is represented 6 00:00:29,730 --> 00:00:30,360 by epsilon. 7 00:00:32,830 --> 00:00:33,650 This term 8 00:00:35,090 --> 00:00:42,260 comes from the fact that there is some inherent randomness in the process, and the given sample observations 9 00:00:42,860 --> 00:00:45,920 also do not follow the intended function. 10 00:00:48,300 --> 00:00:50,250 So this is an irreducible error. 11 00:00:50,790 --> 00:00:54,000 And since we cannot do much about it, we will not focus on it. 12 00:00:55,470 --> 00:01:00,030 We'll focus on the two other terms, so let's talk about them one by one. 13 00:01:01,910 --> 00:01:07,540 First, variance. Variance refers to the amount by which f-hat would change 14 00:01:07,970 --> 00:01:09,710 if we change our training dataset. 15 00:01:11,650 --> 00:01:18,580 And bias refers to that part of the error which is introduced by approximating a complicated real-life relationship 16 00:01:18,970 --> 00:01:19,930 with a simpler model. 17 00:01:21,990 --> 00:01:23,370 So let's look at them one by one. 18 00:01:25,750 --> 00:01:31,150 So as I told you, variance refers to the amount by which the predicted function would change 19 00:01:31,420 --> 00:01:33,040 if I change my training dataset.
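The three contributors described in the transcript can be written as a worked equation. The notation (f-hat for the fitted model, x_0 for a test point) is my own sketch of the standard decomposition, not something stated explicitly in the video:

```latex
% Expected test error at a point x_0 decomposes into three terms:
% variance of the fitted model, squared bias, and irreducible noise.
\mathbb{E}\!\left[\big(y_0 - \hat{f}(x_0)\big)^2\right]
  = \operatorname{Var}\!\big(\hat{f}(x_0)\big)
  + \Big[\operatorname{Bias}\!\big(\hat{f}(x_0)\big)\Big]^2
  + \operatorname{Var}(\varepsilon)
```

The last term, Var(epsilon), is the irreducible error the speaker sets aside; the other two are the focus of the rest of the lecture.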
20 00:01:34,690 --> 00:01:40,930 If you remember, when we talked about simple linear regression, I told you that there is this true 21 00:01:40,930 --> 00:01:49,690 population regression line, which is the best line if we were fitting the line on the whole 22 00:01:49,690 --> 00:01:50,290 population. 23 00:01:52,760 --> 00:01:57,950 But when we are fitting it on a sample, the sample regression line is different from the population 24 00:01:57,950 --> 00:01:58,670 regression line. 25 00:02:00,200 --> 00:02:04,130 And as the sample data changes, the sample regression line also changes. 26 00:02:05,310 --> 00:02:11,490 So basically, variance is capturing the part of the error which is coming from that particular sample. 27 00:02:13,930 --> 00:02:17,900 So if we have two models, one of them more flexible than the other, 28 00:02:18,580 --> 00:02:20,290 which one will have more variance? 29 00:02:22,260 --> 00:02:26,820 Well, since the more flexible method will be trying to touch each and every point, 30 00:02:28,720 --> 00:02:33,820 even if I change one or two points, it will give out a completely different predicted function 31 00:02:34,150 --> 00:02:35,620 to accommodate this small change. 32 00:02:36,910 --> 00:02:40,630 This means that more flexible methods have high variance. 33 00:02:43,260 --> 00:02:44,820 This is shown graphically as well. 34 00:02:45,420 --> 00:02:47,010 In this first graph on the left, 35 00:02:47,640 --> 00:02:52,950 we are trying to predict this relationship with a straight line. 36 00:02:54,190 --> 00:02:56,200 A straight line is a much less flexible method. 37 00:02:58,160 --> 00:03:02,270 Even if I change one or two data points, like this blue point, 38 00:03:03,260 --> 00:03:07,400 the slope and the intercept of this line will not change as much. 39 00:03:09,530 --> 00:03:16,630 However, if you look at the function on the right, if I change even one or two points on this curve,
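The point above, that a flexible method gives a very different fitted function on each new training sample, can be checked with a small simulation. This is my own sketch, not from the video: the data-generating curve sin(2*pi*x), the noise level, and the polynomial degrees 1 and 9 are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_and_predict(degree, x0=0.5, n=30):
    """Fit a polynomial of the given degree to a fresh noisy sample
    drawn from y = sin(2*pi*x) + noise, then predict at the point x0."""
    x = rng.uniform(0, 1, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, n)
    coeffs = np.polyfit(x, y, degree)
    return np.polyval(coeffs, x0)

# Refit each model on 200 different training sets and measure how much
# the prediction at the same fixed point jumps around between refits.
var_rigid = np.var([fit_and_predict(degree=1) for _ in range(200)])
var_flexible = np.var([fit_and_predict(degree=9) for _ in range(200)])
print(f"prediction variance, degree 1: {var_rigid:.4f}")
print(f"prediction variance, degree 9: {var_flexible:.4f}")
```

The straight line barely moves between training sets, while the flexible polynomial swings around to chase each particular sample, which is exactly the variance the speaker describes.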
40 00:03:17,450 --> 00:03:20,660 The predicted output function will be very different. 41 00:03:23,090 --> 00:03:26,810 So you can see that the variance is very high 42 00:03:27,200 --> 00:03:33,440 if the flexibility is high, and vice versa; so the more flexible the method, the higher the variance. 43 00:03:35,970 --> 00:03:40,860 This phenomenon of following the data too closely, as you see in the right graph, 44 00:03:41,890 --> 00:03:46,990 that is, even following the error in the observations, is called overfitting. 45 00:03:48,220 --> 00:03:50,830 When we overfit, we do get low training error, 46 00:03:51,520 --> 00:03:53,530 but the test error increases. 47 00:03:55,970 --> 00:04:03,250 Now, let's talk about bias. Bias refers to that part of the error which is introduced by approximating 48 00:04:03,250 --> 00:04:06,370 a complicated real-life relationship with a simpler model. 49 00:04:07,750 --> 00:04:08,500 For example, 50 00:04:09,920 --> 00:04:16,200 we may be trying to fit a linear model between dependent and independent variables where a linear relationship 51 00:04:16,290 --> 00:04:17,400 is highly unlikely. 52 00:04:18,820 --> 00:04:23,280 You can see in this graph that the points can never be fitted with a straight line, 53 00:04:23,600 --> 00:04:24,430 no matter what. 54 00:04:25,580 --> 00:04:30,390 But still, if we select a linear model, it is always going to have some error. 55 00:04:30,770 --> 00:04:33,110 And that part of the error is called the bias. 56 00:04:35,730 --> 00:04:38,450 And how is bias related to the flexibility of the model? 57 00:04:39,610 --> 00:04:43,930 You can see that the linear model, which is less flexible, is unable to fit this data. 58 00:04:45,250 --> 00:04:50,320 If I increase flexibility and allow it to curve, then it will better fit the points. 59 00:04:51,430 --> 00:04:55,870 So generally, if we increase flexibility, the bias error reduces. 60 00:04:57,250 --> 00:05:00,700 So you can see where the bias-variance trade-off is coming from.
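The overfitting claim above, low training error but high test error for a very flexible model, can also be demonstrated numerically. Again a sketch under assumptions of my own (the sin(2*pi*x) curve, the noise level, and degrees 1 and 15), not the video's exact figures:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample(n):
    """Draw n noisy observations from the curve y = sin(2*pi*x)."""
    x = rng.uniform(0, 1, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, n)
    return x, y

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

x_train, y_train = sample(30)
x_test, y_test = sample(300)

errors = {}
for degree in (1, 15):
    # Fit on the training set, then evaluate on both sets.
    coeffs = np.polyfit(x_train, y_train, degree)
    errors[degree] = (mse(y_train, np.polyval(coeffs, x_train)),  # train
                      mse(y_test, np.polyval(coeffs, x_test)))    # test
    print(f"degree {degree:2d}: train MSE {errors[degree][0]:.3f}, "
          f"test MSE {errors[degree][1]:.3f}")
```

The degree-15 polynomial drives the training error well below the noise level by chasing the errors in the observations, but its test error is far larger than its training error, which is the overfitting pattern the transcript describes.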
61 00:05:03,050 --> 00:05:09,860 As we increase flexibility, error due to variance increases and error due to bias decreases. 62 00:05:11,840 --> 00:05:18,410 Although we want to decrease both, when we try to decrease one, the other one starts to increase. 63 00:05:19,520 --> 00:05:23,870 So the challenge is to find that point where their sum is minimum. 64 00:05:25,900 --> 00:05:27,430 This is depicted graphically here. 65 00:05:28,350 --> 00:05:32,880 This orange line is showing us the variance, which is increasing with flexibility. 66 00:05:33,690 --> 00:05:37,380 And this blue line is for bias, which is decreasing with flexibility. 67 00:05:38,190 --> 00:05:42,120 And this red line is the sum of these two terms. 68 00:05:43,100 --> 00:05:46,760 We want to find the minimum point, where this sum is the minimum. 69 00:05:47,900 --> 00:05:52,070 Although we will not be able to compute bias and variance for our model explicitly, 70 00:05:52,700 --> 00:05:58,820 this concept will be used when we compare different models and their potential accuracy in 71 00:05:58,820 --> 00:06:00,410 predicting the dependent variable.
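The U-shaped curve the transcript describes, where the sum of bias and variance is minimized at an intermediate flexibility, can be estimated in a simulation where we know the true function. This is an illustrative sketch under my own assumptions (true curve sin(2*pi*x), noise standard deviation 0.3, polynomial degree as the flexibility knob):

```python
import numpy as np

rng = np.random.default_rng(2)

def expected_test_error(degree, n_train=30, n_reps=100):
    """Estimate the expected test MSE of a degree-d polynomial fit by
    averaging the error over many independently drawn training sets."""
    x_test = np.linspace(0.05, 0.95, 200)
    y_true = np.sin(2 * np.pi * x_test)
    errs = []
    for _ in range(n_reps):
        x = rng.uniform(0, 1, n_train)
        y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, n_train)
        coeffs = np.polyfit(x, y, degree)
        pred = np.polyval(coeffs, x_test)
        # Squared error against the true curve captures bias^2 + variance;
        # adding the noise variance (0.3^2) gives the full expected test error.
        errs.append(np.mean((pred - y_true) ** 2) + 0.3 ** 2)
    return float(np.mean(errs))

errors = {d: expected_test_error(d) for d in (1, 5, 15)}
for d, e in errors.items():
    print(f"degree {d:2d}: estimated expected test error {e:.3f}")
```

The rigid degree-1 model loses to high bias, the degree-15 model loses to high variance, and an intermediate degree sits near the minimum of the red curve the speaker points to.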