1 00:00:00,930 --> 00:00:06,870 So, simple linear regression is a straightforward approach for predicting Y on the basis of a single 2 00:00:06,870 --> 00:00:08,150 predictor variable X. 3 00:00:08,610 --> 00:00:13,290 So if you take only one predictor variable, it is called simple linear regression. 4 00:00:14,700 --> 00:00:19,400 If we assume a linear relationship between X and Y, mathematically it can be denoted as: 5 00:00:20,400 --> 00:00:24,280 Y is approximately equal to beta zero plus beta one X. 6 00:00:26,490 --> 00:00:32,310 It is nearly equal because the value of Y that our model will give may not be exactly equal to the 7 00:00:32,310 --> 00:00:32,790 value 8 00:00:32,820 --> 00:00:39,330 in our observation. We'll come back to this topic later. 9 00:00:40,120 --> 00:00:44,820 Let us select only one variable from the dataset to predict the price of the house. 10 00:00:45,630 --> 00:00:48,960 Let's say I choose average number of rooms for this simple model. 11 00:00:50,610 --> 00:00:57,180 So we will regress house price onto the number of rooms by fitting the model: price is nearly equal to 12 00:00:57,380 --> 00:01:00,060 beta zero plus beta one times 13 00:01:00,180 --> 00:01:02,750 room number here. 14 00:01:03,170 --> 00:01:04,470 Beta zero and beta one 15 00:01:04,620 --> 00:01:12,060 are the unknown constants which are known as model coefficients or model parameters. For the particular 16 00:01:12,060 --> 00:01:17,850 case of simple linear regression, beta one is the slope and beta zero is the intercept. 17 00:01:20,520 --> 00:01:26,560 Once we use the training data to estimate these two parameters, beta zero hat and beta one hat, we'll 18 00:01:26,580 --> 00:01:32,010 be using this hat symbol to denote estimated parameters from our data. 19 00:01:32,280 --> 00:01:37,380 So we will write: price is equal to beta zero hat plus beta one hat times number of rooms.
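The model just described, price ≈ beta zero plus beta one times rooms, can be sketched in a few lines of Python. The room counts and prices below are made-up illustrative numbers, not the course's actual housing dataset, and `np.polyfit` is used here simply as one convenient least-squares fitter:

```python
import numpy as np

# Hypothetical data: average number of rooms (x) and house price (y).
# These values are made up for illustration only.
rooms = np.array([4.0, 5.0, 6.0, 7.0, 8.0])
price = np.array([10.0, 18.0, 27.0, 37.0, 44.0])

# np.polyfit with degree 1 fits price ≈ b0 + b1 * rooms by least squares;
# it returns the slope first, then the intercept.
b1, b0 = np.polyfit(rooms, price, 1)

print(f"intercept b0 = {b0:.2f}, slope b1 = {b1:.2f}")
```

Any least-squares routine would do here; the point is only that the software estimates the two parameters for us.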
20 00:01:39,170 --> 00:01:44,480 So for estimating the values of these parameters, we will be using the data points in our dataset. 21 00:01:45,360 --> 00:01:49,250 If you remember, our house pricing dataset has 506 observations. 22 00:01:50,700 --> 00:01:56,740 This number of data points, for general purposes, will be denoted by a small n. 23 00:01:58,020 --> 00:02:01,050 So small n is 506 for our dataset. 24 00:02:02,520 --> 00:02:06,570 What this means is we have five hundred six pairs of X and Y values. 25 00:02:07,620 --> 00:02:10,340 And our goal is to obtain coefficient estimates 26 00:02:10,630 --> 00:02:16,620 beta zero hat and beta one hat such that the linear model fits the available data, 27 00:02:17,220 --> 00:02:19,950 such that y one is nearly equal to beta 28 00:02:19,950 --> 00:02:23,260 zero hat plus beta one hat x one. 29 00:02:23,820 --> 00:02:30,150 And if you generalize it for any i, it is: y i is nearly equal to beta zero hat plus beta one 30 00:02:30,150 --> 00:02:31,070 hat x i. 31 00:02:31,770 --> 00:02:37,170 In other words, we want our estimated line to be as close to these points as possible. 32 00:02:38,880 --> 00:02:45,390 One method for measuring this closeness of our line is called the least squares method, which we'll 33 00:02:45,390 --> 00:02:46,110 discuss now. 34 00:02:47,530 --> 00:02:53,170 Once we run the model and get a line, the line will be predicting a value of Y at each point 35 00:02:53,530 --> 00:02:53,830 i. 36 00:02:55,420 --> 00:02:59,110 This predicted y value will be denoted by y i hat. 37 00:02:59,530 --> 00:03:02,680 Now we do have the actual values of Y at each of these points. 38 00:03:03,070 --> 00:03:08,290 The difference between these actual values and the predicted values is the miss. 39 00:03:08,650 --> 00:03:09,550 This is the residual. 40 00:03:10,540 --> 00:03:18,430 And it is denoted by e i, as you can see in the graph, using the training data that we had.
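The residuals just defined, e i = y i minus y i hat, can be computed directly once a line is fitted. A minimal sketch with made-up (x, y) points, again using `np.polyfit` as the fitter:

```python
import numpy as np

# Made-up (x, y) points for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.1, 5.9, 8.0])

# Fit y ≈ b0 + b1 * x by least squares.
b1, b0 = np.polyfit(x, y, 1)

y_hat = b0 + b1 * x          # predicted value y_i_hat at each point
residuals = y - y_hat        # e_i = actual minus predicted

print("residuals:", np.round(residuals, 3))
```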
41 00:03:18,850 --> 00:03:21,370 We have fitted a line using the beta zero 42 00:03:21,370 --> 00:03:23,380 and beta one that we calculated. 43 00:03:23,770 --> 00:03:26,610 And this line is drawn here in the blue color. 44 00:03:27,250 --> 00:03:28,930 Each of these points is also plotted. 45 00:03:29,290 --> 00:03:37,120 Some of these points are exactly on the line, but most of them are missing it. The distance of that point 46 00:03:37,300 --> 00:03:38,050 from the line 47 00:03:38,250 --> 00:03:43,520 is the residual e i. At some points this residual is positive, at some points 48 00:03:43,570 --> 00:03:49,030 it is negative. When we are taking the total residual of the sample, 49 00:03:49,600 --> 00:03:53,650 we cannot straightaway sum them up, because some are positive and some are negative. 50 00:03:54,310 --> 00:03:58,030 Therefore, we will define a new quantity called residual sum of squares. 51 00:03:59,020 --> 00:04:04,450 Now, since RSS is summing the squares of e i, as a total it is representing the total error. 52 00:04:05,050 --> 00:04:05,740 In this formula, 53 00:04:05,770 --> 00:04:13,030 you can see that for each of the points, we are subtracting the predicted value from the actual observed 54 00:04:13,030 --> 00:04:13,420 value, 55 00:04:13,480 --> 00:04:14,470 and then squaring it. 56 00:04:15,010 --> 00:04:17,500 And we are doing this for all of the points. 57 00:04:18,370 --> 00:04:24,760 Now we have the total error of our predicted line and we want to minimize this error. 58 00:04:26,110 --> 00:04:32,890 So using calculus and matrix algebra, we will get these formulas for beta zero and beta one for 59 00:04:32,890 --> 00:04:34,630 which this error is minimized. 60 00:04:35,770 --> 00:04:42,610 So this approach is called the least squares method, because we are minimizing the squared error, the 61 00:04:42,610 --> 00:04:43,360 sum of squared errors. 62 00:04:43,450 --> 00:04:46,660 So this RSS value we are trying to minimize.
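The cancellation point above can be checked numerically: for a least-squares line with an intercept, the raw residuals sum to (numerically) zero, while RSS = sum of e i squared captures the total error. A sketch with made-up numbers:

```python
import numpy as np

# Made-up (x, y) points for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])

b1, b0 = np.polyfit(x, y, 1)
residuals = y - (b0 + b1 * x)

raw_sum = residuals.sum()        # positives and negatives cancel: ~0
rss = (residuals ** 2).sum()     # residual sum of squares: always >= 0

print(f"sum of residuals = {raw_sum:.2e}, RSS = {rss:.4f}")
```

This is exactly why RSS, not the plain sum of residuals, is the quantity being minimized.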
63 00:04:47,410 --> 00:04:52,650 So by differentiating and setting it to zero, we'll get these values of beta one hat 64 00:04:52,690 --> 00:04:58,060 and beta zero hat. For these values of beta zero 65 00:04:58,060 --> 00:05:01,860 hat and beta one hat, the calculated sum of squares will be minimum. 66 00:05:03,070 --> 00:05:09,040 So beta one hat is the summation of x i minus x bar; if you remember, x bar is the mean of the x values. 67 00:05:09,730 --> 00:05:15,630 So for each data point, we will find out this difference of each point from its mean. 68 00:05:16,300 --> 00:05:22,900 And then we'll multiply it with the difference of each y value from the y mean. We'll sum this product 69 00:05:23,500 --> 00:05:24,580 for all the points. 70 00:05:25,030 --> 00:05:29,830 And we'll divide it by the squared difference of x from its mean, summed over 71 00:05:30,070 --> 00:05:31,990 all points. Similarly, 72 00:05:32,680 --> 00:05:33,290 beta zero hat: 73 00:05:33,420 --> 00:05:38,820 it is the mean value of y minus beta one hat times the mean value of x. 74 00:05:39,190 --> 00:05:41,280 So we have the mean value of x and the mean value of y. 75 00:05:41,440 --> 00:05:43,540 We first need to calculate the beta one hat value. 76 00:05:44,080 --> 00:05:48,370 When we put the beta one hat value into this formula, we'll get the beta zero hat value. 77 00:05:50,210 --> 00:05:53,550 So using these formulas for simple linear regression, 78 00:05:53,700 --> 00:05:56,700 you can get these beta zero and beta one values. 79 00:05:58,980 --> 00:06:07,710 So for our model, where we selected house price as Y and room number as X, if I run this model in the software, 80 00:06:07,750 --> 00:06:08,820 I get this result. 81 00:06:09,620 --> 00:06:13,220 I have highlighted the beta values in this blue box. 82 00:06:14,680 --> 00:06:16,640 This intercept is beta zero. 83 00:06:17,170 --> 00:06:19,450 And room number is the X variable. 84 00:06:19,840 --> 00:06:22,540 And this is giving the coefficient of this variable.
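The closed-form estimates just described, beta one hat = sum of (x i minus x bar)(y i minus y bar) divided by sum of (x i minus x bar) squared, and beta zero hat = y bar minus beta one hat times x bar, can be typed out directly and cross-checked against a library fit. The data points are again made up for illustration:

```python
import numpy as np

# Made-up (x, y) points for illustration only.
x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([3.0, 7.0, 8.0, 14.0])

x_bar, y_bar = x.mean(), y.mean()

# beta1_hat = sum((x_i - x_bar) * (y_i - y_bar)) / sum((x_i - x_bar)^2)
b1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
# beta0_hat = y_bar - beta1_hat * x_bar  (so beta1_hat must come first)
b0_hat = y_bar - b1_hat * x_bar

# Cross-check against NumPy's built-in least-squares fit.
b1_ref, b0_ref = np.polyfit(x, y, 1)
print(f"b1_hat = {b1_hat:.3f}, b0_hat = {b0_hat:.3f}")
```

Note the ordering the lecture points out: beta zero hat depends on beta one hat, so the slope is computed first.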
85 00:06:22,630 --> 00:06:23,710 So this is beta one. 86 00:06:24,970 --> 00:06:25,680 So beta one 87 00:06:25,750 --> 00:06:32,320 is coming out as nine point zero nine, and the intercept is coming out as minus thirty four point six nine. 88 00:06:33,310 --> 00:06:40,210 In other words, this means that if I increase the number of rooms by one unit, the price of houses 89 00:06:40,270 --> 00:06:42,010 will increase by nine units. 90 00:06:43,750 --> 00:06:47,990 What is the meaning of all these other values? That we'll be learning in the coming videos. 91 00:06:49,720 --> 00:06:56,050 One thing to note here is you do not need to remember these formulas, because these software packages 92 00:06:56,050 --> 00:06:57,250 will be doing it for you. 93 00:06:59,280 --> 00:07:04,650 As you saw in this video and you will see in the coming videos, we'll be telling you the mathematical 94 00:07:04,650 --> 00:07:09,720 concepts behind the theory and discussing those mathematical formulas. 95 00:07:09,720 --> 00:07:14,310 Also, keep in mind that you do not need to remember these formulas. 96 00:07:14,550 --> 00:07:17,220 You just need to understand the concept behind them. 97 00:07:17,730 --> 00:07:23,850 The intuition that I give you will help you interpret the results; that understanding of the results 98 00:07:24,030 --> 00:07:24,960 is very important. 99 00:07:26,100 --> 00:07:32,700 But you do not need to memorize these formulas, since you will be using a software package which 100 00:07:32,700 --> 00:07:38,040 will be applying all these formulas and getting the results for you. So preparing the data is important, 101 00:07:38,470 --> 00:07:44,210 running a model is important, and interpreting the results accurately is the most important. 102 00:07:45,750 --> 00:07:49,290 Remembering formulas is not important in machine learning now. 103 00:07:51,330 --> 00:07:56,170 Also, even if you do not understand the mathematical part of this, don't be worried.
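The interpretation of beta one hat above (one extra room raises the predicted price by about 9.09 units) can be seen by predicting at two room counts one unit apart; the difference between the predictions is exactly the slope. The two coefficients below are the ones read off the regression output shown in the video:

```python
# Coefficients reported in the lecture's regression output.
b0_hat = -34.69   # intercept (beta zero hat)
b1_hat = 9.09     # slope on number of rooms (beta one hat)

def predict_price(rooms: float) -> float:
    """Predicted price from the fitted simple linear regression line."""
    return b0_hat + b1_hat * rooms

# Adding one room changes the prediction by exactly the slope b1_hat.
diff = predict_price(7) - predict_price(6)
print(f"price change for one extra room: {diff:.2f}")
```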
104 00:07:56,730 --> 00:08:03,360 You can still run a machine learning model and you can use the results in your professional life. 105 00:08:04,600 --> 00:08:10,320 But I highly recommend that you go through all the lectures very carefully to understand the concepts 106 00:08:10,350 --> 00:08:12,450 behind all these machine learning methods.