1 00:00:00,633 --> 00:00:02,966 Hello and welcome back to the course on Machine Learning. 2 00:00:02,966 --> 00:00:05,900 Today we're talking about polynomial regression. 3 00:00:05,900 --> 00:00:07,666 So let's get straight into it. 4 00:00:07,666 --> 00:00:09,866 We already know a couple of types of regression. 5 00:00:09,866 --> 00:00:12,600 We know the simple linear regression, which we can see over here. 6 00:00:12,600 --> 00:00:13,533 Then we've also discussed 7 00:00:13,533 --> 00:00:17,233 the multiple linear regression, which is written out over here. 8 00:00:17,866 --> 00:00:21,633 And finally we've got the polynomial linear regression, which is written out 9 00:00:21,633 --> 00:00:22,100 here. 10 00:00:22,100 --> 00:00:25,466 So notice how it's very similar to the multiple linear regression. 11 00:00:25,700 --> 00:00:28,933 But at the same time, instead of the different variables 12 00:00:28,933 --> 00:00:32,200 like x2, x3, x4 and so on up to xn, 13 00:00:32,533 --> 00:00:37,400 we have the same variable x1, but raised to different powers. 14 00:00:37,400 --> 00:00:42,233 So instead of x2 we have x1 squared, instead of x3 we have x1 cubed. 15 00:00:42,533 --> 00:00:45,866 And so instead of xn we would have x1 to the power of n. 16 00:00:46,233 --> 00:00:48,100 So basically we're using one variable. 17 00:00:48,100 --> 00:00:51,166 But we're using the different powers of that variable. 18 00:00:51,533 --> 00:00:52,266 So let's have a look at 19 00:00:52,266 --> 00:00:55,500 when you would use a polynomial regression, when it would come in handy. 20 00:00:56,100 --> 00:01:00,300 Let's say we've got a set of observations, and then 21 00:01:00,300 --> 00:01:05,333 the line that fits this data is obviously a simple linear regression. 22 00:01:05,333 --> 00:01:07,366 As you can see, it fits quite well. 23 00:01:07,366 --> 00:01:11,200 But let's, for a change, say that the data set looked something like this.
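For reference, the three model forms named at the start of this lesson can be written out as below. This is a sketch based on how the formulas are read aloud; the slides are not part of this transcript, so the notation (b for coefficients, x1 to xn for variables) is an assumption.

```latex
\begin{align*}
\text{Simple linear:} \quad & y = b_0 + b_1 x_1 \\
\text{Multiple linear:} \quad & y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n \\
\text{Polynomial linear:} \quad & y = b_0 + b_1 x_1 + b_2 x_1^2 + \dots + b_n x_1^n
\end{align*}
```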
24 00:01:11,800 --> 00:01:14,633 So if we try to use a simple linear regression here, 25 00:01:14,633 --> 00:01:16,466 which is expressed like that, 26 00:01:16,466 --> 00:01:18,866 you'll see that it doesn't fit very well. 27 00:01:18,866 --> 00:01:21,500 So in the middle you've got data underneath the line. 28 00:01:21,500 --> 00:01:24,600 And then as you go further out, the data is above the line. 29 00:01:24,900 --> 00:01:26,566 So how can we correct that? 30 00:01:26,566 --> 00:01:30,100 Well, we can try to correct that by using a polynomial regression. 31 00:01:30,100 --> 00:01:30,800 Let's have a look. 32 00:01:30,800 --> 00:01:34,800 So instead of the linear regression we're going to conduct a polynomial regression. 33 00:01:34,800 --> 00:01:37,800 And that in this case fits perfectly. 34 00:01:38,100 --> 00:01:39,533 And what is the formula? 35 00:01:39,533 --> 00:01:43,700 Well, the formula for this particular case is y equals b0. 36 00:01:43,700 --> 00:01:46,666 So that's a constant, plus b1 x1. So that's the simple linear regression part. 37 00:01:46,666 --> 00:01:49,766 But then we're adding the b2 x1 squared. 38 00:01:50,000 --> 00:01:55,200 And the b2 x1 squared gives it that parabolic effect so that the curve 39 00:01:55,200 --> 00:01:58,466 becomes parabolic and therefore it will fit this data better. 40 00:01:58,900 --> 00:01:59,433 As you can see, 41 00:01:59,433 --> 00:02:03,300 polynomial regression is a bit different from a simple linear regression. 42 00:02:03,600 --> 00:02:06,233 And at the same time it has its own use cases. 43 00:02:06,233 --> 00:02:09,233 So it all comes down to a case-by-case basis. 44 00:02:09,500 --> 00:02:10,566 You have a problem. 45 00:02:10,566 --> 00:02:14,466 And then you might try a simple linear regression, or a multiple linear regression 46 00:02:14,666 --> 00:02:15,600 if you have many variables. 47 00:02:15,600 --> 00:02:19,500 Or you might try a polynomial linear regression and see what happens.
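The comparison just described, a straight line versus a curve with an added x1 squared term, can be tried out in a few lines. This is a minimal sketch, not the course's own code: the data is made up for illustration (a noisy parabola), and the helper name fit_and_score is hypothetical. It uses NumPy's polyfit to fit both models by least squares and compares their sums of squared errors.

```python
import numpy as np

# Hypothetical data for illustration only: a noisy parabola.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 + 0.5 * x + 0.8 * x**2 + rng.normal(0.0, 1.0, size=x.size)


def fit_and_score(x, y, degree):
    """Fit a polynomial of the given degree by least squares.

    Returns the coefficients (highest power first) and the
    sum of squared errors on the same data.
    """
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    return coeffs, float(np.sum(residuals**2))


# Degree 1: y = b0 + b1*x1 (simple linear regression).
linear_coeffs, linear_sse = fit_and_score(x, y, 1)

# Degree 2: y = b0 + b1*x1 + b2*x1^2 (polynomial regression).
quad_coeffs, quad_sse = fit_and_score(x, y, 2)

print("linear SSE:   ", linear_sse)
print("quadratic SSE:", quad_sse)
```

On data curved like this, the straight line leaves the points below it in the middle and above it at both ends, which is the pattern pointed out on the slide. The quadratic fit removes that pattern.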
48 00:02:19,500 --> 00:02:22,566 And sometimes polynomial regressions do work better. 49 00:02:22,800 --> 00:02:27,033 For example, they're used to describe how diseases spread, 50 00:02:27,033 --> 00:02:32,466 or how pandemics and epidemics spread across a territory or across a population. 51 00:02:32,700 --> 00:02:35,400 Polynomial linear regressions can be handy there. 52 00:02:35,400 --> 00:02:37,200 And they also have other use cases. 53 00:02:37,200 --> 00:02:39,200 So it's a matter of what works best. 54 00:02:39,200 --> 00:02:42,200 So it's always good to have more tools in your arsenal. 55 00:02:42,533 --> 00:02:44,833 And we have one final question left. 56 00:02:44,833 --> 00:02:48,100 The question is: why is it still called linear? 57 00:02:48,133 --> 00:02:48,366 Right. 58 00:02:48,366 --> 00:02:53,033 So we saw those different powers: squared, cubed, to the power of n and so on. 59 00:02:53,066 --> 00:02:55,733 Why is it still called linear? And I'll show you what I mean. 60 00:02:55,733 --> 00:02:59,066 If you look on the left here it says polynomial linear regression. 61 00:02:59,800 --> 00:03:03,633 So why is it still called a linear regression if it's a polynomial 62 00:03:03,633 --> 00:03:04,466 regression? 63 00:03:04,466 --> 00:03:08,366 Well, the trick here is that when we're talking about linear and non-linear, 64 00:03:08,366 --> 00:03:11,566 we're not actually talking about the x variables. 65 00:03:11,766 --> 00:03:12,000 Right. 66 00:03:12,000 --> 00:03:15,800 So even though the relationship between y and x is non-linear here, 67 00:03:16,733 --> 00:03:20,600 when you're talking about the class of a regression, 68 00:03:20,600 --> 00:03:25,000 so whether it's linear or non-linear, you're talking about the coefficients here. 69 00:03:25,033 --> 00:03:26,433 So that's the interesting part. 70 00:03:26,433 --> 00:03:29,900 So consider this function which we have here.
71 00:03:29,900 --> 00:03:32,400 So y is a function of x, right? 72 00:03:32,400 --> 00:03:37,133 And so the question is: can this function be expressed as a linear 73 00:03:37,133 --> 00:03:42,400 combination of these coefficients? Because ultimately they are the unknowns. 74 00:03:42,400 --> 00:03:42,600 Right. 75 00:03:42,600 --> 00:03:46,500 So your goal when you're building a regression is to find these coefficients, 76 00:03:46,733 --> 00:03:49,733 find out their actual values so that then further down the track 77 00:03:49,800 --> 00:03:54,166 you can use those coefficients to then plug in x and predict y, 78 00:03:54,166 --> 00:03:58,166 whether it's a simple linear, multiple linear 79 00:03:58,166 --> 00:03:59,766 or polynomial linear regression. 80 00:03:59,766 --> 00:04:02,200 That's your goal: to find these b coefficients. 81 00:04:02,200 --> 00:04:06,000 And that's why linear or non-linear refers to the coefficients. 82 00:04:06,633 --> 00:04:09,033 So an example of a non-linear regression 83 00:04:09,033 --> 00:04:14,666 would be if the equation was y equals b0 plus b1 x1 84 00:04:14,666 --> 00:04:18,333 divided by b2 plus x2 or something, 85 00:04:18,566 --> 00:04:22,033 or b0 divided by b1 plus x1. 86 00:04:22,033 --> 00:04:26,300 So a situation where you really cannot replace the coefficients 87 00:04:26,300 --> 00:04:29,400 with other coefficients to turn the equation into a linear one 88 00:04:29,700 --> 00:04:33,266 with regard to the coefficients, not the x values. 89 00:04:33,866 --> 00:04:34,500 So there you go. 90 00:04:34,500 --> 00:04:38,366 That's why a polynomial regression is still called a linear regression. 91 00:04:38,366 --> 00:04:39,866 And that's your fun fact for the day. 92 00:04:39,866 --> 00:04:41,700 And maybe you can show off to your colleagues.
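Written out, the distinction looks roughly like this. The first equation is the quadratic case from earlier in the lesson, with each coefficient multiplying a fixed function of x1. The two non-linear examples are only spoken aloud, so the grouping of the fractions shown here is an assumed reading.

```latex
\begin{align*}
\text{Linear in the coefficients:} \quad
  & y = b_0 \cdot 1 + b_1 \cdot x_1 + b_2 \cdot x_1^2 \\
\text{Not linear in the coefficients:} \quad
  & y = b_0 + \frac{b_1 x_1}{b_2 + x_2}
  \qquad \text{or} \qquad
  y = \frac{b_0}{b_1 + x_1}
\end{align*}
```

In the first form, y is a weighted sum of the unknowns b0, b1 and b2. In the other two, an unknown sits inside a denominator, so y is no longer a weighted sum of the coefficients.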
93 00:04:41,700 --> 00:04:45,733 And also because of that, the polynomial linear regression 94 00:04:45,733 --> 00:04:50,633 is actually a special case of the multiple linear regression. 95 00:04:51,066 --> 00:04:54,666 So that's just something to also note: 96 00:04:54,666 --> 00:04:57,900 this is a version of the multiple linear regression 97 00:04:58,233 --> 00:05:01,533 rather than a standalone, absolutely new type of regression. 98 00:05:01,900 --> 00:05:06,233 So I hope you enjoyed today's tutorial and I look forward to seeing you next time. 99 00:05:06,266 --> 00:05:08,200 Until then, enjoy machine learning.
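The point that polynomial linear regression is a special case of multiple linear regression can be checked numerically. This is a minimal sketch under stated assumptions, not the course's own code: the data is made up for illustration, and x1 and x1 squared are simply placed in separate columns as if they were two different variables. An ordinary least-squares solve (NumPy's linalg.lstsq) then recovers the same coefficients as a direct polynomial fit (NumPy's polyfit).

```python
import numpy as np

# Hypothetical data for illustration only.
rng = np.random.default_rng(1)
x1 = np.linspace(-3.0, 3.0, 40)
y = 1.0 - 2.0 * x1 + 0.5 * x1**2 + rng.normal(0.0, 0.2, size=x1.size)

# Treat x1 and x1 squared as two separate "variables", the way
# x1 and x2 would appear in a multiple linear regression.
# Columns: constant term, x1, x1^2.
X = np.column_stack([np.ones_like(x1), x1, x1**2])

# Ordinary least squares: solve for b = [b0, b1, b2].
b, *_ = np.linalg.lstsq(X, y, rcond=None)

# Cross-check against a direct degree-2 polynomial fit.
# polyfit returns the highest power first, so reverse it to compare.
poly = np.polyfit(x1, y, 2)

print("multiple-linear-style fit:", b)
print("polyfit (reversed):       ", poly[::-1])
```

Both routes solve the same least-squares problem, so the two printed coefficient vectors should agree up to floating-point rounding.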