1
00:00:00,470 --> 00:00:04,710
In this video, we will assess the accuracy of the model that we have created.

2
00:00:07,020 --> 00:00:10,140
Now we have established the relationship between X and Y.

3
00:00:11,030 --> 00:00:12,240
What do we want to know?

4
00:00:12,870 --> 00:00:17,730
How well do the predicted Y values match the actual Y values?

5
00:00:19,400 --> 00:00:23,940
So to assess the quality of fit, we will look at two related quantities.

6
00:00:25,080 --> 00:00:30,020
One is the residual standard error and the other is called R-squared.

7
00:00:32,760 --> 00:00:35,080
Let us first look at residual standard error.

8
00:00:38,110 --> 00:00:45,760
We saw earlier that the residual standard error is the square root of RSS by n minus two, which

9
00:00:45,760 --> 00:00:53,050
can also be written like this, where RSS is the summation of the squared difference between actual value and

10
00:00:53,050 --> 00:00:53,860
predicted value.

11
00:00:56,200 --> 00:01:02,770
Roughly speaking, RSE is the average amount that the response will deviate from the true regression

12
00:01:02,770 --> 00:01:03,040
line.
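
The definition above can be sketched in Python. This is only a minimal illustration: the function names and the small x and y lists are hypothetical and are not the house price data used in the video. It assumes a simple linear regression with one predictor, so the degrees of freedom are n minus two.

```python
import math

def fit_simple_linear(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope

def residual_standard_error(y_actual, y_predicted):
    """RSE = sqrt(RSS / (n - 2)) for simple linear regression."""
    # RSS: sum of squared differences between actual and predicted values
    rss = sum((a - p) ** 2 for a, p in zip(y_actual, y_predicted))
    dof = len(y_actual) - 2  # n minus two degrees of freedom
    return math.sqrt(rss / dof)

# Hypothetical data, made up for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

intercept, slope = fit_simple_linear(x, y)
predicted = [intercept + slope * xi for xi in x]
print("RSE:", residual_standard_error(y, predicted))
```

The result is in the same units as Y, which is why it reads as "how far off, on average, a prediction is".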

13
00:01:06,060 --> 00:01:09,120
As you can see in the result shown below.

14
00:01:10,380 --> 00:01:13,290
This is the result we got when we ran the model in the software.

15
00:01:14,160 --> 00:01:15,790
It is directly giving us the residual

16
00:01:15,820 --> 00:01:18,280
standard error, which is six point five nine seven for

17
00:01:18,690 --> 00:01:21,750
this model on 504 degrees of freedom.

18
00:01:22,320 --> 00:01:23,200
This 504.

19
00:01:23,310 --> 00:01:24,980
We are getting from n minus two.

20
00:01:25,000 --> 00:01:26,080
n is 506.

21
00:01:26,930 --> 00:01:28,620
And n minus two is 504.

22
00:01:29,180 --> 00:01:31,020
And this is called the degrees of freedom.

23
00:01:32,610 --> 00:01:37,620
So for these many degrees of freedom, we are getting a residual standard error of six point five nine seven.

24
00:01:39,080 --> 00:01:45,930
In other words, even if the model were correct and the true values of beta zero and beta

25
00:01:45,930 --> 00:01:46,800
one were known

26
00:01:46,860 --> 00:01:47,610
exactly,

27
00:01:48,480 --> 00:01:52,450
the predicted value of house price from this model

28
00:01:53,380 --> 00:01:59,740
would still be off by six point five nine seven units on average.

29
00:02:01,790 --> 00:02:09,620
Therefore, RSE can also be considered a measure of the lack of fit of this model to the data.

30
00:02:10,550 --> 00:02:18,470
So this six point five nine seven value is telling you, on average, by how many units your predicted

31
00:02:18,470 --> 00:02:21,830
value is missing the actual value.

32
00:02:25,830 --> 00:02:27,240
Next is the R-squared statistic.

33
00:02:28,980 --> 00:02:35,460
The RSE provides an absolute measure of lack of fit, but since it is measured in the units of

34
00:02:35,460 --> 00:02:40,610
Y, it is not always clear what constitutes a good RSE.

35
00:02:42,880 --> 00:02:48,030
So R-squared provides us with an alternative measure of fit, as a proportion.

36
00:02:49,410 --> 00:02:53,010
The proportion of total variance explained by our model.

37
00:02:53,580 --> 00:02:56,160
So it always lies between zero and one.

38
00:02:57,310 --> 00:02:59,370
This is the mathematical formula for R-squared.

39
00:03:00,240 --> 00:03:06,690
R-squared is TSS minus RSS upon TSS, where TSS is the total sum of squares.

40
00:03:07,500 --> 00:03:10,170
And RSS is the residual sum of squares.

41
00:03:12,170 --> 00:03:17,730
TSS is measuring the amount of variability inherent in the response.

42
00:03:18,600 --> 00:03:21,360
That is, in our house price data,

43
00:03:21,810 --> 00:03:25,980
the price of each house itself is varying about the mean house price.

44
00:03:27,170 --> 00:03:34,460
So if you find the difference of actual house price from the mean of the house price.

45
00:03:35,580 --> 00:03:37,890
Square these values and add them up.

46
00:03:38,040 --> 00:03:39,900
You get the total sum of squares.

47
00:03:41,440 --> 00:03:48,060
So this total sum of squares value is giving you the total amount of variability in house price.

48
00:03:49,620 --> 00:03:57,420
How much of this is explained by the model that we constructed? For that, we will use RSS. RSS

49
00:03:57,450 --> 00:04:04,650
is measuring the amount of variability that is not explained by our model's predictions, and TSS

50
00:04:04,680 --> 00:04:05,070
minus

51
00:04:05,080 --> 00:04:09,150
RSS is giving us the variability of Y which is explained by our model.

52
00:04:10,740 --> 00:04:16,830
Therefore, R-squared measures the proportion of explained variance from the total variance.
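
The formula described above, R-squared equals TSS minus RSS upon TSS, can be sketched in Python as follows. The function name and the sample values are hypothetical and chosen only for illustration. They are not taken from the house price model.

```python
def r_squared(y_actual, y_predicted):
    """R-squared = (TSS - RSS) / TSS."""
    y_mean = sum(y_actual) / len(y_actual)
    # TSS: total variability of the response about its mean
    tss = sum((a - y_mean) ** 2 for a in y_actual)
    # RSS: variability left unexplained by the predictions
    rss = sum((a - p) ** 2 for a, p in zip(y_actual, y_predicted))
    return (tss - rss) / tss

# Hypothetical actual and predicted values, for illustration only
actual = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [2.0, 2.0, 3.0, 4.0, 4.0]
print("R-squared:", r_squared(actual, predicted))
```

If the predictions match the actual values exactly, RSS is zero and the function returns one. If the model does no better than always predicting the mean, RSS equals TSS and it returns zero.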

53
00:04:19,810 --> 00:04:25,630
An R-squared value close to one indicates that a large proportion of the variability in the response

54
00:04:25,630 --> 00:04:28,610
variable has been explained by the regression model.

55
00:04:30,070 --> 00:04:34,910
If it is close to zero, it indicates that the regression did not explain much of the variability.

56
00:04:35,890 --> 00:04:43,300
This can occur either because our linear model is wrong or because linear was not the right choice

57
00:04:43,300 --> 00:04:50,290
for this relationship between X and Y, or both of these reasons. For our model,

58
00:04:50,760 --> 00:04:56,020
the result given by the software package is that the residual standard error was six point five nine.

59
00:04:58,010 --> 00:05:00,240
The R-squared value is zero point four eight.

60
00:05:02,530 --> 00:05:04,390
So it is somewhere between zero and one.

61
00:05:05,860 --> 00:05:12,190
Nearly 50 percent of the variability of the response variable is explained by the model that we constructed.

62
00:05:15,550 --> 00:05:16,810
There is another value.

63
00:05:17,170 --> 00:05:21,170
It is called adjusted R-squared, which you can see in the model result.

64
00:05:22,630 --> 00:05:31,240
The difference between this R-squared and this adjusted R-squared is that an adjusted R-squared will

65
00:05:31,240 --> 00:05:36,730
be altered, taking into account the total number of variables which are actually impacting the model.

66
00:05:38,230 --> 00:05:46,210
The reason behind doing this is that if you keep on adding variables to your model, the R-squared value simply

67
00:05:46,210 --> 00:05:47,440
keeps on increasing.

68
00:05:48,980 --> 00:05:53,450
Even if the variable is not significantly related to the response variable,

69
00:05:54,470 --> 00:05:58,880
the R-squared value will still increase, at least by a small amount.

70
00:06:01,520 --> 00:06:07,750
So the adjusted R-squared is a modified version of R-squared that has been adjusted for the number of

71
00:06:07,750 --> 00:06:14,860
predictors in the model. The adjusted R-squared increases only if the new term improves the model

72
00:06:14,950 --> 00:06:16,900
more than would be expected by chance.

73
00:06:17,920 --> 00:06:22,060
It decreases when a predictor improves the model by less than expected by chance.

74
00:06:23,950 --> 00:06:26,410
So adjusted R-squared is a more preferred term

75
00:06:26,410 --> 00:06:26,710
over

76
00:06:26,800 --> 00:06:27,360
R-squared.
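
The video does not state the formula for adjusted R-squared. The sketch below assumes the commonly used definition, one minus (one minus R-squared) times (n minus one) upon (n minus p minus one), where p is the number of predictors. The function name is hypothetical. The inputs zero point four eight and 506 are the rounded values mentioned in this video, with one predictor assumed.

```python
def adjusted_r_squared(r2, n_observations, n_predictors):
    """Adjusted R-squared under the common definition (an assumption here):
    1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    n, p = n_observations, n_predictors
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Rounded R-squared and n from the video, assuming a single predictor
one_predictor = adjusted_r_squared(0.48, 506, 1)

# Same R-squared but with a second, uninformative predictor added:
# the penalty for the extra variable lowers the adjusted value
two_predictors = adjusted_r_squared(0.48, 506, 2)

print("Adjusted R-squared, 1 predictor:", one_predictor)
print("Adjusted R-squared, 2 predictors:", two_predictors)
```

With a large n and few predictors, the adjustment is small, so the adjusted value stays close to the plain R-squared.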

77
00:06:30,620 --> 00:06:37,000
We have a value of R-squared, but what value of R-squared will be considered as a good value of R-squared?

78
00:06:38,540 --> 00:06:41,870
This will generally depend on the type of application that you get.

79
00:06:42,710 --> 00:06:49,170
If the data is coming from a science experiment and the relationship is supposed to be actually linear

80
00:06:49,910 --> 00:06:52,930
in such a case, R-squared should be very close to one.

81
00:06:54,350 --> 00:07:01,730
But if it is marketing data and we are missing a lot of unmeasured factors, and the linear assumption

82
00:07:01,970 --> 00:07:04,880
is also a rough approximation of the relationship.

83
00:07:06,270 --> 00:07:08,330
The residual errors are going to be large.

84
00:07:09,850 --> 00:07:13,840
In such a case, even smaller R-squared values can be acceptable.

85
00:07:14,650 --> 00:07:19,270
Generally, an R-squared greater than zero point five can be considered a good-fit model.