1 00:00:00,100 --> 00:00:02,233 Ordinary least squares. 2 00:00:02,233 --> 00:00:04,500 So we have our data points. 3 00:00:04,500 --> 00:00:07,600 Now the question that we're answering in this tutorial is 4 00:00:07,933 --> 00:00:11,366 how do we know which of the sloped lines is the best one? 5 00:00:11,400 --> 00:00:13,866 Is it this one? Is it this one? Is it this one? 6 00:00:13,866 --> 00:00:16,000 Is it this one or is it this one? 7 00:00:16,000 --> 00:00:18,766 As we can see, there can be multiple sloped lines 8 00:00:18,766 --> 00:00:21,766 that we can draw through our data points. 9 00:00:21,866 --> 00:00:25,000 And how do we know which one is the best one? 10 00:00:25,000 --> 00:00:27,000 Which is the best linear regression? 11 00:00:27,000 --> 00:00:30,000 And in fact, how do we even define the best one? 12 00:00:30,466 --> 00:00:32,900 So in order to answer those questions, we need to look at a method 13 00:00:32,900 --> 00:00:35,000 called the ordinary least squares method. 14 00:00:35,000 --> 00:00:39,333 And the way it works in a visual sense is we need to take our data points 15 00:00:39,533 --> 00:00:42,533 and project them vertically onto 16 00:00:42,733 --> 00:00:45,366 our linear regression line. 17 00:00:45,366 --> 00:00:48,433 Now we would need to do this for every single linear regression line 18 00:00:48,433 --> 00:00:49,500 that we're considering. 19 00:00:49,500 --> 00:00:50,866 But for simplicity's sake, in 20 00:00:50,866 --> 00:00:53,866 this tutorial we're just going to do it with this line here in the middle. 21 00:00:54,700 --> 00:00:59,666 Now for each pair of points we have two values, y_i and y_i hat. 22 00:00:59,966 --> 00:01:01,266 So what are these values? 23 00:01:01,266 --> 00:01:04,833 y_i is the actual amount 24 00:01:05,200 --> 00:01:10,500 of potatoes, in our example, yielded from the farm 25 00:01:11,100 --> 00:01:15,700 when that specific amount of nitrogen fertilizer was used.
26 00:01:15,700 --> 00:01:18,800 So let's say 15 kg of nitrogen fertilizer were used 27 00:01:19,033 --> 00:01:22,033 and the farm yielded two tons of potatoes. 28 00:01:22,166 --> 00:01:27,000 y_i hat, on the other hand, is what the linear regression 29 00:01:27,000 --> 00:01:30,066 that we're considering predicts the yield 30 00:01:30,600 --> 00:01:33,800 to be, or would have been. 31 00:01:33,800 --> 00:01:38,233 So in this case, let's say, again, 15 kg of nitrogen were used. 32 00:01:38,233 --> 00:01:42,366 But the linear regression that we're looking at predicts that only 33 00:01:42,600 --> 00:01:47,866 one and a half tons of potatoes would have been yielded from the farm. 34 00:01:48,266 --> 00:01:52,400 As you can see, there's a slight difference between the 35 00:01:53,066 --> 00:01:56,700 actual yield and the predicted yield. 36 00:01:56,733 --> 00:01:57,633 And that's normal. 37 00:01:57,633 --> 00:01:59,233 It's never going to be perfect. 38 00:01:59,233 --> 00:02:02,133 This line's never going to go perfectly through every single data point. 39 00:02:02,133 --> 00:02:04,266 With real, noisy data that's practically impossible. 40 00:02:04,266 --> 00:02:07,433 But what we want to do is find the best line, 41 00:02:07,433 --> 00:02:11,133 and it will be related to how small 42 00:02:11,133 --> 00:02:14,133 these differences are, as we can imagine. 43 00:02:14,233 --> 00:02:15,866 So let's have a look here. 44 00:02:15,866 --> 00:02:17,733 There's our y_i and y_i hat. 45 00:02:17,733 --> 00:02:20,033 The difference between them is called the residual. 46 00:02:20,033 --> 00:02:21,566 Here's our equation.
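As a sketch of the notation, assuming the usual simple linear regression form in which b_0 is the intercept, b_1 is the slope, and x_i (a symbol not used in the narration) is the amount of nitrogen fertilizer for data point i, the predicted value and the residual can be written as:

```latex
\hat{y}_i = b_0 + b_1 x_i, \qquad e_i = y_i - \hat{y}_i
```

For the example above, with an actual yield of 2 tons and a predicted yield of 1.5 tons at 15 kg of nitrogen, the residual is 2 - 1.5 = 0.5 tons.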
47 00:02:21,566 --> 00:02:27,533 And the best equation is the one 48 00:02:27,633 --> 00:02:31,466 where the parameters b0 and b1 are such 49 00:02:31,466 --> 00:02:36,700 that the sum of the squares of the residuals 50 00:02:37,033 --> 00:02:40,900 is minimized, and that's why it's called the ordinary least squares method. 51 00:02:40,900 --> 00:02:44,866 So we need to take all of these residuals, these differences. 52 00:02:44,866 --> 00:02:48,900 We need to square them for every single data point. 53 00:02:49,400 --> 00:02:54,900 And then we need to add them up and find the smallest value. 54 00:02:54,900 --> 00:02:58,066 So for whichever regression line this sum is the smallest, 55 00:02:58,066 --> 00:02:59,500 that will be the best regression line. 56 00:02:59,500 --> 00:03:02,433 And that means the line passes as close to our data points as possible, in the squared-error sense. 57 00:03:02,433 --> 00:03:05,666 And it is the best line, or the best linear 58 00:03:05,666 --> 00:03:09,333 regression, to use for modeling our problem. 59 00:03:09,333 --> 00:03:10,000 So there you go. 60 00:03:10,000 --> 00:03:12,800 That's in a nutshell how the ordinary least squares method works. 61 00:03:12,800 --> 00:03:14,466 And I look forward to seeing you next time. 62 00:03:14,466 --> 00:03:16,100 And until then, enjoy machine learning.
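The procedure described in this video (compute each residual, square it, add the squares up, and pick the line with the smallest total) can be sketched in a few lines of Python. This is only an illustration under stated assumptions. The data set is made up, apart from the 15 kg / 2 tons point mentioned in the narration. The three candidate lines and their names are hypothetical. The closed-form slope and intercept at the end are the standard textbook formulas for simple linear regression and are not derived in the video.

```python
# Hypothetical data: nitrogen fertilizer (kg) and potato yield (tons).
# Only the (15 kg, 2.0 tons) point comes from the narration; the rest is invented.
nitrogen = [5.0, 10.0, 15.0, 20.0, 25.0]
potato_yield = [1.0, 1.4, 2.0, 2.3, 2.9]


def sum_squared_residuals(b0, b1, xs, ys):
    """Sum of squared residuals for the line y_hat = b0 + b1 * x."""
    total = 0.0
    for x, y in zip(xs, ys):
        y_hat = b0 + b1 * x      # predicted yield for this data point
        residual = y - y_hat     # actual minus predicted
        total += residual ** 2   # square it and accumulate
    return total


# Hypothetical candidate lines as (b0, b1) = (intercept, slope).
candidates = {
    "flat": (1.5, 0.02),
    "middle": (0.5, 0.095),
    "steep": (0.0, 0.15),
}

# Score every candidate, then keep the one with the smallest sum.
ssr = {
    name: sum_squared_residuals(b0, b1, nitrogen, potato_yield)
    for name, (b0, b1) in candidates.items()
}
best = min(ssr, key=ssr.get)
print("best candidate:", best, "with sum of squared residuals", ssr[best])

# Standard closed-form solution (not covered in the video): the line that
# minimizes the sum over ALL possible lines, not just the candidates above.
n = len(nitrogen)
x_mean = sum(nitrogen) / n
y_mean = sum(potato_yield) / n
s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(nitrogen, potato_yield))
s_xx = sum((x - x_mean) ** 2 for x in nitrogen)
b1_ols = s_xy / s_xx
b0_ols = y_mean - b1_ols * x_mean
ols_ssr = sum_squared_residuals(b0_ols, b1_ols, nitrogen, potato_yield)
print("closed-form line:", b0_ols, b1_ols, "with sum of squared residuals", ols_ssr)
```

Comparing a handful of hand-picked lines mirrors what the video does visually. In practice the closed-form solution (or a library routine) is used instead, because it finds the minimizing b0 and b1 directly rather than choosing among a few guesses.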