1 00:00:00,133 --> 00:00:00,500 All right. 2 00:00:00,500 --> 00:00:03,033 So let's do this. Let's create a new code cell. 3 00:00:03,033 --> 00:00:07,133 And now the first thing that we have to do is call the matplotlib 4 00:00:07,133 --> 00:00:10,133 dot pyplot module, which has the shortcut plt. 5 00:00:10,800 --> 00:00:15,266 Then from this module we're going to call first the scatter function 6 00:00:15,700 --> 00:00:19,800 which will allow us to first, you know, display on a 2D plot 7 00:00:20,066 --> 00:00:24,166 the different points of coordinates containing the real results. 8 00:00:24,166 --> 00:00:29,466 You know, the real position levels going from 1 to 10 and the real salaries. 9 00:00:29,466 --> 00:00:30,933 So that's what we are plotting first. 10 00:00:30,933 --> 00:00:33,066 And then we'll plot the predictions. 11 00:00:33,066 --> 00:00:36,400 And so, well, here we have to input the coordinates 12 00:00:36,400 --> 00:00:40,300 of these real points containing the real position levels and real salaries. 13 00:00:40,600 --> 00:00:43,233 And on the x axis we have of course the position levels. 14 00:00:43,233 --> 00:00:45,833 Therefore the x coordinates will be X. 15 00:00:45,833 --> 00:00:48,533 You know, all the position levels contained in X. 16 00:00:48,533 --> 00:00:51,666 And then on the y axis we'll have the salaries, the real ones. 17 00:00:51,900 --> 00:00:54,300 And therefore the y coordinates will be 18 00:00:54,300 --> 00:00:56,266 y. All right. 19 00:00:56,266 --> 00:00:58,766 And then remember we need to add a color 20 00:00:58,766 --> 00:01:01,766 and we'll choose, like last time, red. 21 00:01:01,800 --> 00:01:03,833 Perfect. So that's for the real results. 22 00:01:03,833 --> 00:01:05,900 And now we're going to plot the predictions. 23 00:01:05,900 --> 00:01:09,466 And to do that, well, we're going to call plt again, 24 00:01:09,466 --> 00:01:14,200 matplotlib.pyplot, from which we're going to call the plot function. 
25 00:01:14,200 --> 00:01:17,833 Because this time we're actually going to plot that regression 26 00:01:17,833 --> 00:01:20,833 line, you know, in blue like in simple linear regression. 27 00:01:20,866 --> 00:01:22,400 And then for polynomial regression, 28 00:01:22,400 --> 00:01:25,866 you will see that it's not a line, but it will be actually a curve. 29 00:01:25,866 --> 00:01:28,700 And still we will use the plot function for that. 30 00:01:28,700 --> 00:01:33,900 So again we have to enter the coordinates of the different points of this line. 31 00:01:34,133 --> 00:01:35,533 And for these coordinates, 32 00:01:35,533 --> 00:01:39,600 well, first the x coordinates are still X, the position levels. 33 00:01:39,866 --> 00:01:43,666 But then the y coordinates will be of course the predicted salaries. 34 00:01:43,700 --> 00:01:45,900 You know, instead of the real salaries. 35 00:01:45,900 --> 00:01:48,933 And to get them, well, we simply need to call our lin_reg 36 00:01:49,366 --> 00:01:52,000 object, not lin_reg_2. 37 00:01:52,000 --> 00:01:54,700 You know, lin_reg_2 is for the polynomial regression model. 38 00:01:54,700 --> 00:01:57,200 lin_reg is for the linear regression model. 39 00:01:57,200 --> 00:02:00,566 And so lin_reg, from which we're going to call that predict 40 00:02:00,866 --> 00:02:04,566 method, applied of course to X, right. 41 00:02:04,733 --> 00:02:07,900 Containing the position levels of the matrix of features X. 42 00:02:08,433 --> 00:02:10,633 All right. And then we add a color. 43 00:02:10,633 --> 00:02:13,633 And like last time we'll choose the color blue. 44 00:02:14,033 --> 00:02:15,333 All right, blue. 45 00:02:15,333 --> 00:02:18,333 And then we'll just improve or enhance 46 00:02:18,333 --> 00:02:21,566 our graph by adding a title, an x label and a y label. 47 00:02:21,566 --> 00:02:24,733 And finally display it. So you know how to do this: 48 00:02:24,733 --> 00:02:28,600 we call first plt, then the title function. 
49 00:02:28,966 --> 00:02:32,366 So here we're going to add a funny title like, in quotes, 50 00:02:32,400 --> 00:02:37,266 of course, Truth or Bluff, right. 51 00:02:37,266 --> 00:02:40,266 This is the simple scenario I invented for this case study. 52 00:02:40,633 --> 00:02:44,433 And then we're going to specify that it's linear regression. 53 00:02:44,433 --> 00:02:48,633 You know, the linear regression model. Okay, good, good title. 54 00:02:48,833 --> 00:02:50,466 Now an x label. 55 00:02:50,466 --> 00:02:53,533 So here again plt.xlabel. 56 00:02:53,966 --> 00:02:55,166 And we'll choose, 57 00:02:55,166 --> 00:03:00,633 you know, something simple like Position Level. Okay, perfect. 58 00:03:00,633 --> 00:03:02,600 Then a y label. 59 00:03:02,600 --> 00:03:05,400 So I'm calling the ylabel function. 60 00:03:05,400 --> 00:03:08,400 And we're going to enter Salary because on the 61 00:03:08,400 --> 00:03:12,066 y axis are all the salaries, the real ones or the predicted ones. 62 00:03:12,633 --> 00:03:19,233 And finally remember that we have to show the graph by using this show function. 63 00:03:19,566 --> 00:03:20,733 Perfect. 64 00:03:20,733 --> 00:03:21,466 And there we go. 65 00:03:21,466 --> 00:03:24,400 We're already ready to visualize the linear regression results. 66 00:03:24,400 --> 00:03:25,133 Are you ready? 67 00:03:25,133 --> 00:03:28,800 I'm going to close this so that we can see it very well. 68 00:03:29,300 --> 00:03:31,566 All right, let's make sure I didn't make any mistake. 69 00:03:31,566 --> 00:03:33,266 Yeah, it seems all good. 70 00:03:33,266 --> 00:03:36,266 So now let's press play and here we go. 71 00:03:36,366 --> 00:03:39,300 These are the linear regression results. 72 00:03:39,300 --> 00:03:41,400 Let's scroll down a bit. 73 00:03:41,400 --> 00:03:42,033 All right. 74 00:03:42,033 --> 00:03:47,533 So first of all, to recap, the red points here are the real salaries. 
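The plotting cell dictated above can be sketched roughly as follows. This is a minimal sketch, not the course notebook itself: the data values are illustrative placeholders (an assumption, not the course dataset), `draw_linear_results` is a hypothetical helper added only so the steps are reusable, and the exact title punctuation and label capitalization are guesses from the spoken description. The names `X`, `y`, and `lin_reg` follow the video.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Illustrative placeholder data (assumption, not the course dataset):
# ten position levels and salaries that climb steeply toward 1,000,000.
X = np.arange(1, 11).reshape(-1, 1)   # matrix of features (position levels)
y = np.array([45_000, 50_000, 60_000, 80_000, 110_000,
              150_000, 200_000, 300_000, 500_000, 1_000_000])

# The linear regression model, named lin_reg as in the video.
lin_reg = LinearRegression()
lin_reg.fit(X, y)


def draw_linear_results(X, y, model):
    """Hypothetical helper: real salaries in red, predictions in blue."""
    plt.figure()
    plt.scatter(X, y, color='red')                 # real results
    plt.plot(X, model.predict(X), color='blue')    # regression line
    plt.title('Truth or Bluff (Linear Regression)')
    plt.xlabel('Position Level')
    plt.ylabel('Salary')
    return plt.gca()


if __name__ == "__main__":
    draw_linear_results(X, y, lin_reg)
    plt.show()   # display the graph
```

In a notebook, the six `plt` calls would normally sit directly in one cell; wrapping them in a function here only keeps the sketch easy to rerun with a different model.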
75 00:03:47,533 --> 00:03:52,500 You know, going from zero to, I think, 1 million for the CEO. 76 00:03:52,500 --> 00:03:53,600 Yes, okay. 77 00:03:53,600 --> 00:03:57,133 So these are all the real salaries of this column, okay. 78 00:03:57,333 --> 00:03:59,766 And then the blue line is of course 79 00:03:59,766 --> 00:04:02,900 the regression line containing all our predictions. 80 00:04:02,900 --> 00:04:05,900 But the predictions of the linear regression model. 81 00:04:06,266 --> 00:04:09,600 So first of all we can see that indeed the linear 82 00:04:09,600 --> 00:04:13,933 regression model is not well adapted to this data set. 83 00:04:13,933 --> 00:04:17,500 Because indeed, remember that in simple linear regression, 84 00:04:17,500 --> 00:04:20,900 it was well adapted because each time, you know, for each of the values 85 00:04:20,900 --> 00:04:25,100 of the features, well, the prediction was close to the real result. 86 00:04:25,500 --> 00:04:29,366 But here for many position levels, you know, for many values 87 00:04:29,366 --> 00:04:33,466 of the feature, well, the prediction is far from the real result. Here, 88 00:04:33,466 --> 00:04:36,166 for example, it is quite far from the real salary. 89 00:04:36,166 --> 00:04:40,666 Imagine if we wanted to use that model to predict if someone is telling 90 00:04:40,666 --> 00:04:43,733 the truth, or bluffing about a salary that is contained here. 91 00:04:43,866 --> 00:04:44,700 Well, you know, 92 00:04:44,700 --> 00:04:48,500 we would have offered a way higher salary than we would be supposed to. 93 00:04:48,500 --> 00:04:50,833 So that would not be the best negotiation. 94 00:04:50,833 --> 00:04:51,200 Okay. 95 00:04:51,200 --> 00:04:54,866 So that linear regression model is not well adapted, and same here. 96 00:04:54,866 --> 00:04:56,400 It is far from the prediction. 97 00:04:56,400 --> 00:04:59,100 Same here, far from the prediction. Here it's okay. 
98 00:04:59,100 --> 00:05:02,433 But you know, that's only for two points, two salaries. 99 00:05:02,733 --> 00:05:04,933 But then here it's super far from the prediction. 100 00:05:04,933 --> 00:05:06,033 And here as well. 101 00:05:06,033 --> 00:05:08,033 So clearly not well adapted. 102 00:05:08,033 --> 00:05:10,166 And that's why I wanted to show you this. 103 00:05:10,166 --> 00:05:11,566 Because now you're going to see 104 00:05:11,566 --> 00:05:15,733 that the polynomial regression results are going to be much better. 105 00:05:15,900 --> 00:05:17,966 And now I'm going to show this to you right away. 106 00:05:17,966 --> 00:05:19,400 So I'm going to close this. 107 00:05:19,400 --> 00:05:22,133 And we're going to visualize the polynomial regression results 108 00:05:22,133 --> 00:05:25,133 efficiently by taking this code. 109 00:05:25,366 --> 00:05:28,200 Because you're going to see that it's going to be almost the same. 110 00:05:28,200 --> 00:05:32,500 So I'm going to scroll down, and I'll scroll back up to compare. 111 00:05:32,500 --> 00:05:36,366 Create a new code cell, paste that here. And now, 112 00:05:36,366 --> 00:05:39,600 according to you, what do we have to change here? 113 00:05:40,066 --> 00:05:41,700 Actually something important. 114 00:05:41,700 --> 00:05:46,233 Please press pause on the video now and try to figure out, before 115 00:05:46,233 --> 00:05:47,233 I give you the solution, 116 00:05:47,233 --> 00:05:50,533 what you have to replace here in order to visualize the polynomial 117 00:05:50,533 --> 00:05:51,600 regression results. 118 00:05:51,600 --> 00:05:54,666 So please press pause on the video and make that change in order 119 00:05:54,666 --> 00:05:56,900 to make it work for polynomial regression results.
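While the video is paused, the earlier observation that the straight line sits far from many of the real salaries can also be checked with numbers instead of by eye. The sketch below is an illustration under assumptions: the salary values are placeholders rather than the course dataset, and a plain NumPy least-squares line (`np.polyfit` with `deg=1`) stands in for the fitted `lin_reg` model. It deliberately does not show the polynomial change the video asks you to work out.

```python
import numpy as np

# Illustrative placeholder data (assumption, not the course dataset).
levels = np.arange(1, 11)
salaries = np.array([45_000, 50_000, 60_000, 80_000, 110_000,
                     150_000, 200_000, 300_000, 500_000, 1_000_000],
                    dtype=float)

# Degree-1 least-squares fit: a stand-in for the linear regression model.
slope, intercept = np.polyfit(levels, salaries, deg=1)
predicted = slope * levels + intercept

# Gap between each real salary and the straight-line prediction.
residuals = salaries - predicted

for level, real, pred in zip(levels, salaries, predicted):
    print(f"level {level:2d}: real {real:>10,.0f}  "
          f"predicted {pred:>10,.0f}  gap {real - pred:>+11,.0f}")
```

With data shaped like this, the gaps at the lowest and highest position levels are large compared with the salaries themselves, which is the numeric counterpart of the red points sitting far from the blue line.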