1 00:00:00,300 --> 00:00:01,300 Hello and welcome back. 2 00:00:01,300 --> 00:00:04,300 Let's have a look at simple linear regression. 3 00:00:04,533 --> 00:00:05,833 So here's the equation. 4 00:00:05,833 --> 00:00:10,100 And we will look at the parts of this equation one by one. 5 00:00:10,100 --> 00:00:14,000 So on the left we have our dependent variable, which we're trying to predict. 6 00:00:14,366 --> 00:00:17,966 On the right we have our independent variable, which is the predictor. 7 00:00:18,500 --> 00:00:23,233 Here we have b0, which is the y-intercept, also known as the constant. 8 00:00:23,566 --> 00:00:26,266 And b1 is the slope coefficient. 9 00:00:26,266 --> 00:00:30,233 Now, to make things more fun, we're going to use that example 10 00:00:30,233 --> 00:00:33,900 we mentioned about predicting the output of potatoes 11 00:00:33,900 --> 00:00:37,033 on a farm based on the amount of fertilizer that we use. 12 00:00:37,500 --> 00:00:38,766 So here's our equation. 13 00:00:38,766 --> 00:00:42,433 And if we modify it to fit our example, it will look like this. 14 00:00:43,000 --> 00:00:46,433 And let's say that we ran the simple linear regression algorithm. 15 00:00:46,433 --> 00:00:50,866 And it came up with the following values: b0 equals eight tons 16 00:00:50,866 --> 00:00:54,000 and b1 equals three tons per kilogram. 17 00:00:54,900 --> 00:00:58,900 So what does this mean in terms of the graphical representation? 18 00:00:58,900 --> 00:01:02,100 How can we better understand this on an intuitive level? 19 00:01:02,433 --> 00:01:06,200 So let's plot a simple scatter plot. 20 00:01:06,200 --> 00:01:11,133 So here we have on the x axis the nitrogen fertilizer used in kilograms 21 00:01:11,133 --> 00:01:14,533 as our x1 variable. 22 00:01:14,533 --> 00:01:18,500 And here we have the y variable, which is the potato yield in tons. 23 00:01:18,933 --> 00:01:23,066 And here on the scatter plot we have several data points. 
24 00:01:23,266 --> 00:01:24,266 What are these data points? 25 00:01:24,266 --> 00:01:28,533 Well, each one represents a separate harvest on the farm 26 00:01:28,533 --> 00:01:29,666 that we are talking about. 27 00:01:29,666 --> 00:01:34,666 So, multiple times, the potatoes were harvested over 28 00:01:34,666 --> 00:01:39,933 many years, and the farmer recorded how much fertilizer they used 29 00:01:39,933 --> 00:01:45,400 and also how many potatoes they were able to harvest in that specific season. 30 00:01:45,933 --> 00:01:48,000 So there is a scatter plot. 31 00:01:48,000 --> 00:01:51,600 And what this equation on the left represents 32 00:01:51,600 --> 00:01:54,966 is a sloped line that goes through a point. 33 00:01:55,400 --> 00:01:57,500 The y-intercept is over here. 34 00:01:57,500 --> 00:01:59,100 That's our eight tons. 35 00:01:59,100 --> 00:02:02,800 And what the slope coefficient means is that if you increase 36 00:02:02,800 --> 00:02:05,900 the amount of nitrogen fertilizer by one kilogram, 37 00:02:06,366 --> 00:02:09,100 then the amount of potato output 38 00:02:09,100 --> 00:02:12,100 will increase by three tons. 39 00:02:12,466 --> 00:02:16,300 And of course, these numbers are made up for illustrative purposes. 40 00:02:16,300 --> 00:02:18,833 So there we go. That's how simple linear regression works. 41 00:02:18,833 --> 00:02:20,133 I look forward to seeing you next time. 42 00:02:20,133 --> 00:02:22,133 Until then, enjoy machine learning.
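The on-screen equation is not captured in this transcript, so here is a minimal sketch of what the description suggests, assuming the line has the form y = b0 + b1 * x1 with the made-up values b0 = 8 tons and b1 = 3 tons per kilogram. The function and constant names are hypothetical, chosen only for this illustration.

```python
# Illustrative coefficients quoted in the lesson (made up, not real farm data).
B0_TONS = 8.0          # y-intercept: predicted yield with zero fertilizer
B1_TONS_PER_KG = 3.0   # slope: extra tons of potatoes per extra kg of fertilizer


def predict_potato_yield(fertilizer_kg: float) -> float:
    """Predicted potato yield in tons, assuming y = b0 + b1 * x1."""
    return B0_TONS + B1_TONS_PER_KG * fertilizer_kg


# Print the prediction for a few fertilizer amounts.
for kg in (0, 1, 2, 5):
    print(f"{kg} kg of fertilizer: {predict_potato_yield(kg):.1f} tons")
```

With zero fertilizer the line sits at the y-intercept of eight tons, and each additional kilogram raises the prediction by three tons, which is the slope coefficient.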