1 00:00:00,930 --> 00:00:06,870 So, simple linear regression is a straightforward approach for predicting Y on the basis of a single 2 00:00:06,870 --> 00:00:08,150 predictor variable X. 3 00:00:08,610 --> 00:00:13,290 So if you take only one predictor variable, it is called simple linear regression. 4 00:00:14,700 --> 00:00:19,400 If we assume a linear relationship between X and Y, mathematically it can be denoted as: 5 00:00:20,400 --> 00:00:24,280 Y is approximately equal to beta zero plus beta one X. 6 00:00:26,490 --> 00:00:32,310 It is nearly equal because the value of Y that our model will give may not be exactly equal to the 7 00:00:32,310 --> 00:00:32,790 value 8 00:00:32,820 --> 00:00:39,330 in our observation. We'll come back to this topic later. 9 00:00:40,120 --> 00:00:44,820 Let us select only one variable from the dataset to predict the price of the house. 10 00:00:45,630 --> 00:00:48,960 Let's say I choose average number of rooms for this simple model. 11 00:00:50,610 --> 00:00:57,180 So we will regress house price onto the number of rooms by fitting the model: price is nearly equal to 12 00:00:57,380 --> 00:01:00,060 beta zero plus beta one times 13 00:01:00,180 --> 00:01:02,750 room number here. 14 00:01:03,170 --> 00:01:04,470 Beta zero and beta one 15 00:01:04,620 --> 00:01:12,060 are the unknown constants which are known as model coefficients or model parameters. For the particular 16 00:01:12,060 --> 00:01:17,850 case of simple linear regression, beta one is the slope and beta zero is the intercept. 17 00:01:20,520 --> 00:01:26,560 Once we use the training data to estimate these two parameters, beta zero hat and beta one hat, we'll 18 00:01:26,580 --> 00:01:32,010 be using this hat symbol to denote estimated parameters from our data. 19 00:01:32,280 --> 00:01:37,380 So we will write: price is equal to beta zero hat plus beta one hat times number of rooms.
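The model just described, price ≈ beta zero plus beta one times rooms, can be sketched in a few lines of Python. The room counts and prices below are made-up illustrative numbers, not the course's actual housing dataset, and `np.polyfit` is used here simply as one convenient least-squares fitter:

```python
import numpy as np

# Hypothetical data: average number of rooms (x) and house price (y).
# These values are made up for illustration only.
rooms = np.array([4.0, 5.0, 6.0, 7.0, 8.0])
price = np.array([10.0, 18.0, 27.0, 37.0, 44.0])

# np.polyfit with degree 1 fits price ≈ b0 + b1 * rooms by least squares;
# it returns the slope first, then the intercept.
b1, b0 = np.polyfit(rooms, price, 1)

print(f"intercept b0 = {b0:.2f}, slope b1 = {b1:.2f}")
```

Any least-squares routine would do here; the point is only that the software estimates the two parameters for us.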
20 00:01:39,170 --> 00:01:44,480 So for estimating the values of these parameters, we will be using the data points in our dataset. 21 00:01:45,360 --> 00:01:49,250 If you remember, our house pricing dataset has 506 observations. 22 00:01:50,700 --> 00:01:56,740 This number of data points, for general purposes, will be denoted by a small n. 23 00:01:58,020 --> 00:02:01,050 So small n is 506 for our dataset. 24 00:02:02,520 --> 00:02:06,570 What this means is we have five hundred six pairs of X and Y values. 25 00:02:07,620 --> 00:02:10,340 And our goal is to obtain coefficient estimates 26 00:02:10,630 --> 00:02:16,620 beta zero hat and beta one hat such that the linear model fits the available data, 27 00:02:17,220 --> 00:02:19,950 such that y one is nearly equal to beta 28 00:02:19,950 --> 00:02:23,260 zero hat plus beta one hat x one. 29 00:02:23,820 --> 00:02:30,150 And if you generalize it for any i, it is: y i is nearly equal to beta zero hat plus beta one 30 00:02:30,150 --> 00:02:31,070 hat x i. 31 00:02:31,770 --> 00:02:37,170 In other words, we want our estimated line to be as close to these points as possible. 32 00:02:38,880 --> 00:02:45,390 One method for measuring this closeness of our line is called the least squares method, which we'll 33 00:02:45,390 --> 00:02:46,110 discuss now. 34 00:02:47,530 --> 00:02:53,170 Once we run the model and get a line, the line will be predicting a value of Y at each point 35 00:02:53,530 --> 00:02:53,830 i. 36 00:02:55,420 --> 00:02:59,110 This predicted y value will be denoted by y i hat. 37 00:02:59,530 --> 00:03:02,680 Now we do have the actual values of Y at each of these points. 38 00:03:03,070 --> 00:03:08,290 The difference between these actual values and the predicted values is the miss. 39 00:03:08,650 --> 00:03:09,550 This is the residual. 40 00:03:10,540 --> 00:03:18,430 And it is denoted by e i, as you can see in the graph, using the training data that we had.
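The residuals just defined, e i = y i minus y i hat, can be computed directly once a line is fitted. A minimal sketch with made-up (x, y) points, again using `np.polyfit` as the fitter:

```python
import numpy as np

# Made-up (x, y) points for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.1, 5.9, 8.0])

# Fit y ≈ b0 + b1 * x by least squares.
b1, b0 = np.polyfit(x, y, 1)

y_hat = b0 + b1 * x          # predicted value y_i_hat at each point
residuals = y - y_hat        # e_i = actual minus predicted

print("residuals:", np.round(residuals, 3))
```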
41 00:03:18,850 --> 00:03:21,370 We have fitted a line using the beta zero 42 00:03:21,370 --> 00:03:23,380 and beta one that we calculated. 43 00:03:23,770 --> 00:03:26,610 And this line is drawn here in the blue color. 44 00:03:27,250 --> 00:03:28,930 Each of these points is also plotted. 45 00:03:29,290 --> 00:03:37,120 Some of these points are exactly on the line, but most of them are missing it. The distance of that point 46 00:03:37,300 --> 00:03:38,050 from the line 47 00:03:38,250 --> 00:03:43,520 is the residual e i. At some points this residual is positive, at some points 48 00:03:43,570 --> 00:03:49,030 it is negative. When we are taking the total residual of the sample, 49 00:03:49,600 --> 00:03:53,650 we cannot straightaway sum them up, because some are positive and some are negative. 50 00:03:54,310 --> 00:03:58,030 Therefore, we will define a new quantity called residual sum of squares. 51 00:03:59,020 --> 00:04:04,450 Now, since RSS is summing the squares of e i, as a total it is representing the total error. 52 00:04:05,050 --> 00:04:05,740 In this formula, 53 00:04:05,770 --> 00:04:13,030 you can see that for each of the points, we are subtracting the predicted value from the actual observed 54 00:04:13,030 --> 00:04:13,420 value, 55 00:04:13,480 --> 00:04:14,470 and then squaring it. 56 00:04:15,010 --> 00:04:17,500 And we are doing this for all of the points. 57 00:04:18,370 --> 00:04:24,760 Now we have the total error of our predicted line and we want to minimize this error. 58 00:04:26,110 --> 00:04:32,890 So using calculus and matrix algebra, we will get these formulas for beta zero and beta one for 59 00:04:32,890 --> 00:04:34,630 which this error is minimized. 60 00:04:35,770 --> 00:04:42,610 So this approach is called the least squares method, because we are minimizing the squared error, the 61 00:04:42,610 --> 00:04:43,360 sum of squared errors. 62 00:04:43,450 --> 00:04:46,660 So this RSS value we are trying to minimize.
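The cancellation point above can be checked numerically: for a least-squares line with an intercept, the raw residuals sum to (numerically) zero, while RSS = sum of e i squared captures the total error. A sketch with made-up numbers:

```python
import numpy as np

# Made-up (x, y) points for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])

b1, b0 = np.polyfit(x, y, 1)
residuals = y - (b0 + b1 * x)

raw_sum = residuals.sum()        # positives and negatives cancel: ~0
rss = (residuals ** 2).sum()     # residual sum of squares: always >= 0

print(f"sum of residuals = {raw_sum:.2e}, RSS = {rss:.4f}")
```

This is exactly why RSS, not the plain sum of residuals, is the quantity being minimized.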
63 00:04:47,410 --> 00:04:52,650 So by differentiating and setting it to zero, we'll get these values of beta one hat 64 00:04:52,690 --> 00:04:58,060 and beta zero hat. For these values of beta zero 65 00:04:58,060 --> 00:05:01,860 hat and beta one hat, the calculated sum of squares will be minimum. 66 00:05:03,070 --> 00:05:09,040 So beta one hat is the summation of x i minus x bar; if you remember, x bar is the mean of the x values. 67 00:05:09,730 --> 00:05:15,630 So for each data point, we will find out this difference of each point from its mean. 68 00:05:16,300 --> 00:05:22,900 And then we'll multiply it with the difference of each y value from the y mean. We'll sum this product 69 00:05:23,500 --> 00:05:24,580 for all the points. 70 00:05:25,030 --> 00:05:29,830 And we'll divide it by the squared difference of x from its mean, summed over 71 00:05:30,070 --> 00:05:31,990 all points. Similarly, 72 00:05:32,680 --> 00:05:33,290 beta zero hat: 73 00:05:33,420 --> 00:05:38,820 it is the mean value of y minus beta one hat times the mean value of x. 74 00:05:39,190 --> 00:05:41,280 So we have the mean value of x and the mean value of y. 75 00:05:41,440 --> 00:05:43,540 We first need to calculate the beta one hat value. 76 00:05:44,080 --> 00:05:48,370 When we put the beta one hat value into this formula, we'll get the beta zero hat value. 77 00:05:50,210 --> 00:05:53,550 So using these formulas for simple linear regression, 78 00:05:53,700 --> 00:05:56,700 you can get these beta zero and beta one values. 79 00:05:58,980 --> 00:06:07,710 So for our model, where we selected house price as Y and room number as X, if I run this model in the software, 80 00:06:07,750 --> 00:06:08,820 I get this result. 81 00:06:09,620 --> 00:06:13,220 I have highlighted the beta values in this blue box. 82 00:06:14,680 --> 00:06:16,640 This intercept is beta zero. 83 00:06:17,170 --> 00:06:19,450 And room number is the X variable. 84 00:06:19,840 --> 00:06:22,540 And this is giving the coefficient of this variable.
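The closed-form estimates just described, beta one hat = sum of (x i minus x bar)(y i minus y bar) divided by sum of (x i minus x bar) squared, and beta zero hat = y bar minus beta one hat times x bar, can be typed out directly and cross-checked against a library fit. The data points are again made up for illustration:

```python
import numpy as np

# Made-up (x, y) points for illustration only.
x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([3.0, 7.0, 8.0, 14.0])

x_bar, y_bar = x.mean(), y.mean()

# beta1_hat = sum((x_i - x_bar) * (y_i - y_bar)) / sum((x_i - x_bar)^2)
b1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
# beta0_hat = y_bar - beta1_hat * x_bar  (so beta1_hat must come first)
b0_hat = y_bar - b1_hat * x_bar

# Cross-check against NumPy's built-in least-squares fit.
b1_ref, b0_ref = np.polyfit(x, y, 1)
print(f"b1_hat = {b1_hat:.3f}, b0_hat = {b0_hat:.3f}")
```

Note the ordering the lecture points out: beta zero hat depends on beta one hat, so the slope is computed first.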
85 00:06:22,630 --> 00:06:23,710 So this is beta one. 86 00:06:24,970 --> 00:06:25,680 So beta one 87 00:06:25,750 --> 00:06:32,320 is coming out as nine point zero nine, and the intercept is coming out as minus thirty four point six nine. 88 00:06:33,310 --> 00:06:40,210 In other words, this means that if I increase the number of rooms by one unit, the price of houses 89 00:06:40,270 --> 00:06:42,010 will increase by nine units. 90 00:06:43,750 --> 00:06:47,990 What is the meaning of all these other values? That we'll be learning in the coming videos. 91 00:06:49,720 --> 00:06:56,050 One thing to note here is you do not need to remember these formulas, because these software packages 92 00:06:56,050 --> 00:06:57,250 will be doing it for you. 93 00:06:59,280 --> 00:07:04,650 As you saw in this video and you will see in the coming videos, we'll be telling you the mathematical 94 00:07:04,650 --> 00:07:09,720 concepts behind the theory and discussing those mathematical formulas. 95 00:07:09,720 --> 00:07:14,310 Also, keep in mind that you do not need to remember these formulas. 96 00:07:14,550 --> 00:07:17,220 You just need to understand the concept behind them. 97 00:07:17,730 --> 00:07:23,850 The intuition that I give you will help you interpret the results; that understanding of the results 98 00:07:24,030 --> 00:07:24,960 is very important. 99 00:07:26,100 --> 00:07:32,700 But you do not need to memorize these formulas, since you will be using a software package which 100 00:07:32,700 --> 00:07:38,040 will be applying all these formulas and getting the results for you. So preparing the data is important, 101 00:07:38,470 --> 00:07:44,210 running a model is important, and interpreting the results accurately is the most important. 102 00:07:45,750 --> 00:07:49,290 Remembering formulas is not important in machine learning now. 103 00:07:51,330 --> 00:07:56,170 Also, even if you do not understand the mathematical part of this, don't be worried.
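The interpretation of beta one hat above (one extra room raises the predicted price by about 9.09 units) can be seen by predicting at two room counts one unit apart; the difference between the predictions is exactly the slope. The two coefficients below are the ones read off the regression output shown in the video:

```python
# Coefficients reported in the lecture's regression output.
b0_hat = -34.69   # intercept (beta zero hat)
b1_hat = 9.09     # slope on number of rooms (beta one hat)

def predict_price(rooms: float) -> float:
    """Predicted price from the fitted simple linear regression line."""
    return b0_hat + b1_hat * rooms

# Adding one room changes the prediction by exactly the slope b1_hat.
diff = predict_price(7) - predict_price(6)
print(f"price change for one extra room: {diff:.2f}")
```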
104 00:07:56,730 --> 00:08:03,360 You can still run a machine learning model and you can use the results in your professional life. 105 00:08:04,600 --> 00:08:10,320 But I highly recommend that you go through all the lectures very carefully to understand the concepts 106 00:08:10,350 --> 00:08:12,450 behind all these machine learning methods.