1 00:00:00,233 --> 00:00:02,600 Hello and welcome to this R tutorial. 2 00:00:02,600 --> 00:00:06,600 So right now let's write the code to visualize those results. 3 00:00:07,066 --> 00:00:10,166 So we're going to do this with ggplot2. 4 00:00:10,500 --> 00:00:14,100 In case you didn't follow the linear regression tutorials on R, 5 00:00:14,100 --> 00:00:17,466 I'm just going to write a line to install the ggplot2 package. 6 00:00:17,466 --> 00:00:20,100 I'm not going to install it, but it's just in case you don't have it. 7 00:00:20,100 --> 00:00:23,100 And to check that, you need to go to the packages here 8 00:00:23,466 --> 00:00:26,500 and check if you have the ggplot2 package here. 9 00:00:26,533 --> 00:00:29,066 And so if you don't find it in this list here, 10 00:00:29,066 --> 00:00:30,800 you need to install it. And to install it, 11 00:00:30,800 --> 00:00:31,833 it's really simple. 12 00:00:31,833 --> 00:00:35,633 You just need to type here install.packages. 13 00:00:35,633 --> 00:00:36,833 You can press Enter here. 14 00:00:36,833 --> 00:00:42,233 And then in quotes you enter the name of the package, which is ggplot2. 15 00:00:42,700 --> 00:00:46,166 And then you just need to select this line and execute it 16 00:00:46,166 --> 00:00:48,200 by pressing Command or Control plus Enter. 17 00:00:48,200 --> 00:00:51,900 And this will install the ggplot2 package without any issue. 18 00:00:52,133 --> 00:00:55,566 So I'm going to put that as a comment by pressing Command + Shift + C. 19 00:00:55,900 --> 00:00:57,033 All right. 20 00:00:57,033 --> 00:01:00,866 And now we can start to visualize the linear regression results. 21 00:01:01,533 --> 00:01:07,000 So as you can see in the packages here, the ggplot2 package is not selected. 22 00:01:07,000 --> 00:01:09,300 So I'm going to need to select it here. 23 00:01:09,300 --> 00:01:10,666 So either I can click here. 
24 00:01:10,666 --> 00:01:14,566 Or the better way is to automate the selection of this package 25 00:01:14,766 --> 00:01:16,166 thanks to this script. 26 00:01:16,166 --> 00:01:19,166 And to do this we just need to add the line library 27 00:01:19,666 --> 00:01:23,066 and in parentheses the name of the package with no quotes. 28 00:01:23,233 --> 00:01:28,066 So just ggplot2 here, this way, and then this line will be executed. 29 00:01:28,066 --> 00:01:29,666 Actually we can check it right now. 30 00:01:29,666 --> 00:01:32,066 As you can see, ggplot2 is not selected. 31 00:01:32,066 --> 00:01:34,933 I'm executing this line and now it's selected. 32 00:01:34,933 --> 00:01:39,700 Okay, so now that our ggplot2 library is selected, let's start building the plot. 33 00:01:40,266 --> 00:01:42,833 So we're going to do exactly like in simple linear regression. 34 00:01:42,833 --> 00:01:45,533 We're going to use the ggplot function. 35 00:01:45,533 --> 00:01:48,433 And then we're going to add the different components to the graph. 36 00:01:48,433 --> 00:01:52,900 So first we will add the real observation points thanks to the geom_point function. 37 00:01:53,400 --> 00:01:54,433 And then we're going to add 38 00:01:54,433 --> 00:01:58,233 the predictions component thanks to the geom_line function. 39 00:01:58,700 --> 00:02:01,700 Then we're going to add a title to the plot thanks to ggtitle, 40 00:02:01,700 --> 00:02:06,800 and then a label to the x axis with xlab and a label to the y axis with ylab. 41 00:02:07,033 --> 00:02:08,200 So very simple. 42 00:02:08,200 --> 00:02:12,033 Exactly like before, we are going to make this graph step by step. 43 00:02:12,766 --> 00:02:15,200 Okay. So let's start with the first step. 44 00:02:15,200 --> 00:02:18,633 The first step is just to type here ggplot 45 00:02:19,733 --> 00:02:21,333 with parentheses. 46 00:02:21,333 --> 00:02:24,300 And that will initiate the plot. 
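The setup described so far can be sketched roughly as follows. The variable name `p` is only an illustration and is not used in the video, and the install line is left commented out on the assumption that ggplot2 is already installed.

```r
# Run this line once if ggplot2 is missing from the packages list.
# install.packages("ggplot2")

# Load (select) the package from the script instead of ticking the box by hand.
library(ggplot2)

# Initiate an empty plot; components are added to it afterwards with `+`.
# `p` is a hypothetical name chosen for this sketch.
p <- ggplot()
```

At this point `p` holds a plot with no layers yet, which is why nothing interesting would be drawn until the first component is added.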
47 00:02:24,300 --> 00:02:27,166 And now remember we need to add a plus here. 48 00:02:27,166 --> 00:02:30,166 And that's where we start to add the different components. 49 00:02:30,500 --> 00:02:33,600 So the first component is the real observation points. 50 00:02:33,800 --> 00:02:37,100 And we are plotting them with the geom_point function. 51 00:02:37,100 --> 00:02:40,966 So let's add here geom_point, geom underscore point. 52 00:02:40,966 --> 00:02:42,900 Actually here it is. 53 00:02:42,900 --> 00:02:46,100 And now we input the arguments in the parentheses, okay. 54 00:02:46,100 --> 00:02:49,333 So the first argument, remember, is the aesthetic function. 55 00:02:49,600 --> 00:02:53,766 So it's in this function that we will input the x coordinates 56 00:02:53,766 --> 00:02:56,933 of our observation points as well as the y coordinates. 57 00:02:57,300 --> 00:03:00,866 So let's do this: aes, for the aesthetic function. 58 00:03:01,333 --> 00:03:04,900 So in this aesthetic function we need to input the x coordinate 59 00:03:05,366 --> 00:03:09,633 and the y coordinate of all our ten observation points. 60 00:03:09,933 --> 00:03:11,933 So let's see what these coordinates are. 61 00:03:11,933 --> 00:03:13,766 Let's have a look at our data set. 62 00:03:13,766 --> 00:03:16,733 So our ten observation points are characterized 63 00:03:16,733 --> 00:03:19,200 by their levels and their salaries. 64 00:03:19,200 --> 00:03:22,533 The x coordinates correspond to the independent variable, 65 00:03:22,533 --> 00:03:24,433 that is the level column here. 66 00:03:24,433 --> 00:03:27,700 And our y coordinates will be our ten salaries 67 00:03:27,700 --> 00:03:30,700 here associated to these ten levels here. 68 00:03:30,900 --> 00:03:34,800 So in short the x coordinates are the independent variable values. 69 00:03:34,800 --> 00:03:36,466 That is the ten levels here. 70 00:03:36,466 --> 00:03:39,533 And the y coordinates are the dependent variable values. 
71 00:03:39,533 --> 00:03:42,133 That is the ten salaries here, okay. 72 00:03:42,133 --> 00:03:43,966 So let's input that. 73 00:03:43,966 --> 00:03:49,700 So x here is level and y is salary. 74 00:03:50,366 --> 00:03:55,433 However, since we did not specify that the data is dataset here, our data set, 75 00:03:55,766 --> 00:04:00,066 we need to specify where we are taking the levels and the salaries from. 76 00:04:00,300 --> 00:04:03,466 And to specify that we need to add here dataset 77 00:04:04,933 --> 00:04:05,966 dollar. 78 00:04:05,966 --> 00:04:08,733 So that R understands that we are taking 79 00:04:08,733 --> 00:04:11,833 the levels of our data set, dataset here. 80 00:04:12,266 --> 00:04:13,666 So it's going to be the same for y. 81 00:04:13,666 --> 00:04:17,633 We're going to specify that we're taking the salaries of the column 82 00:04:17,633 --> 00:04:20,633 salary in our data set this way. 83 00:04:20,700 --> 00:04:21,733 Perfect. 84 00:04:21,733 --> 00:04:25,366 And now we can add another argument, because this 85 00:04:25,366 --> 00:04:29,233 aesthetic function was the first argument of this geom_point function. 86 00:04:29,533 --> 00:04:33,033 And remember we can add a second argument, which we will add 87 00:04:33,033 --> 00:04:37,200 because we want to add a color to our points to then make the distinction 88 00:04:37,200 --> 00:04:40,266 between the real observation points and the predictions. 89 00:04:40,633 --> 00:04:42,766 And so we're going to pick the red color. 90 00:04:42,766 --> 00:04:45,766 So here I'm going to add the color argument 91 00:04:45,900 --> 00:04:49,033 and set it equal to red in quotes. 92 00:04:49,533 --> 00:04:50,000 All right. 93 00:04:50,000 --> 00:04:53,066 Now let's not forget to close the parentheses of the geom_point function, 94 00:04:53,066 --> 00:04:56,466 because this parenthesis was for the aesthetic function. 95 00:04:56,900 --> 00:04:58,800 And now we are all fine.
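Put together, the first component built in this part might look like the sketch below. The salary values are placeholders invented for illustration and are not the tutorial's data. The capitalized column names `Level` and `Salary` and the variable name `p` are also assumptions, since the spoken transcript does not show the exact spelling.

```r
library(ggplot2)

# Placeholder data set: ten levels with made-up salaries (illustration only).
dataset <- data.frame(
  Level  = 1:10,
  Salary = seq(10000, 100000, by = 10000)
)

# Initiate the plot, then add the real observation points in red.
# Because no data argument is given, each column is reached with dataset$...
p <- ggplot() +
  geom_point(aes(x = dataset$Level, y = dataset$Salary),
             color = "red")

print(p)
```

Recent ggplot2 versions may print a notice discouraging `dataset$...` inside `aes()`; the plot is still drawn. An equivalent form passes `data = dataset` to `geom_point()` and writes `aes(x = Level, y = Salary)`.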