1 00:00:00,300 --> 00:00:04,500 So for those of you who are interested in knowing how the code works, 2 00:00:04,666 --> 00:00:06,633 stay with me. I will explain that. 3 00:00:06,633 --> 00:00:09,533 So I will reduce this. 4 00:00:09,533 --> 00:00:11,300 Okay. 5 00:00:11,300 --> 00:00:14,300 And let's explain this. 6 00:00:14,700 --> 00:00:15,633 Okay. 7 00:00:15,633 --> 00:00:19,700 So the idea is that we consider each of these pixel observation points 8 00:00:20,100 --> 00:00:24,000 as a user in the social network, like an imaginary user. 9 00:00:24,700 --> 00:00:28,100 And so, you know, for example, this pixel point here is not a user 10 00:00:28,100 --> 00:00:29,266 in the data set. 11 00:00:29,266 --> 00:00:35,133 But we imagine this pixel point here as a user who has this salary and this age. 12 00:00:36,300 --> 00:00:36,666 And we 13 00:00:36,666 --> 00:00:39,666 apply our classifier on this pixel observation point, 14 00:00:39,666 --> 00:00:43,100 on this user here, so that the classifier predicts 15 00:00:43,100 --> 00:00:46,100 whether the user is going to buy the SUV, yes or no. 16 00:00:46,800 --> 00:00:49,800 Then once the classifier makes its prediction, 17 00:00:50,133 --> 00:00:54,300 it colorizes the pixel observation point according to the prediction. 18 00:00:54,700 --> 00:00:57,933 So if the prediction is no, this pixel user 19 00:00:57,933 --> 00:01:01,033 won't buy the SUV, then it will colorize the point in red. 20 00:01:01,300 --> 00:01:05,766 And if the prediction is yes, this pixel user will buy the SUV, 21 00:01:06,166 --> 00:01:09,000 then it will colorize the point in green. 22 00:01:09,000 --> 00:01:12,333 And so we apply that idea on all the pixels in this frame, 23 00:01:12,666 --> 00:01:16,133 so that eventually our classifier will colorize 24 00:01:16,133 --> 00:01:19,400 all the points that it predicted as zero in red, 25 00:01:19,600 --> 00:01:22,966 and all the points that it predicted as one in green. 
26 00:01:23,766 --> 00:01:26,200 So now that you get the idea, let's look at our code 27 00:01:26,200 --> 00:01:29,333 here and go through the steps of implementing this idea. 28 00:01:30,466 --> 00:01:31,066 Okay. 29 00:01:31,066 --> 00:01:34,466 So first I declare set equals training_set. 30 00:01:35,000 --> 00:01:36,166 That's because, you know, 31 00:01:36,166 --> 00:01:39,166 I want to plot this graph for the training set and the test set. 32 00:01:39,366 --> 00:01:40,666 And since we're using training 33 00:01:40,666 --> 00:01:44,233 set many times in the code, I just replaced it by set here, 34 00:01:44,566 --> 00:01:48,500 so that when I copy-paste the same code for the test set, 35 00:01:48,833 --> 00:01:49,966 I just need to replace training 36 00:01:49,966 --> 00:01:52,966 set by test set only here and not in the whole code. 37 00:01:53,533 --> 00:01:55,966 So basically that's just a shortcut. 38 00:01:55,966 --> 00:01:59,400 And so first we build a grid with X1 and X2. 39 00:02:00,000 --> 00:02:03,400 So we take the minimum of the values of the training set minus one, 40 00:02:03,400 --> 00:02:06,300 because we don't want the points to be squeezed in the graph. 41 00:02:06,300 --> 00:02:08,833 And same for the maximum plus one. 42 00:02:08,833 --> 00:02:13,266 So by doing this we're taking the range of our training set observation points 43 00:02:13,533 --> 00:02:17,166 minus one and plus one, so that our points are not squished in the graph. 44 00:02:17,866 --> 00:02:20,700 And so we're doing this for the age column 45 00:02:20,700 --> 00:02:23,966 of the training set and the salary column of the training set. 46 00:02:25,233 --> 00:02:27,933 Then by writing this line we are building the grid. 47 00:02:27,933 --> 00:02:30,700 So basically with these three lines we're making the grid here 48 00:02:30,700 --> 00:02:35,100 with all the pixel observation points that are imaginary social network users. 
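The grid-building step described above can be sketched in R roughly as follows. This is a minimal illustration and not necessarily the exact code shown on screen: the synthetic `training_set`, the step size `by = 0.1`, and the use of `expand.grid` are assumptions, and the features are assumed to be already scaled so that a margin of one unit is reasonable.

```r
# Hypothetical stand-in for the course's training set (assumed already
# feature-scaled): column 1 = Age, column 2 = EstimatedSalary, column 3 = Purchased.
set.seed(42)
training_set <- data.frame(
  Age = rnorm(50),
  EstimatedSalary = rnorm(50),
  Purchased = rbinom(50, 1, 0.5)
)

# Alias, so only this one line changes when the code is reused for the test set.
set <- training_set

# Range of each feature, padded by 1 on both sides so that points do not
# sit on the edge of the plot. The step size 0.1 is an assumed value.
X1 <- seq(min(set[, 1]) - 1, max(set[, 1]) + 1, by = 0.1)
X2 <- seq(min(set[, 2]) - 1, max(set[, 2]) + 1, by = 0.1)

# Every (X1, X2) combination: one row per "pixel" observation point.
grid_set <- expand.grid(X1, X2)

dim(grid_set)  # number of pixel points, and 2 columns
```

A smaller step gives a finer grid and a smoother-looking boundary, at the cost of more points to predict and draw.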
49 00:02:35,933 --> 00:02:40,533 Then, since this grid set is actually a matrix of the two columns age and salary, 50 00:02:40,700 --> 00:02:44,666 but for all the imaginary users that are the pixel observation points, 51 00:02:44,966 --> 00:02:46,200 this is actually a matrix. 52 00:02:46,200 --> 00:02:49,733 So with this line, we just give a name to the columns of this matrix, 53 00:02:49,733 --> 00:02:52,733 which are age and estimated salary. 54 00:02:52,833 --> 00:02:55,833 And then that's where all the magic happens, 55 00:02:55,866 --> 00:02:59,000 because here, that's where we use our classifier 56 00:02:59,366 --> 00:03:02,033 to predict the result 57 00:03:02,033 --> 00:03:06,900 of each of the pixel observation points that are the imaginary pixel users. 58 00:03:07,200 --> 00:03:08,966 So we predict this. 59 00:03:08,966 --> 00:03:12,800 And then, you know, since the predict function returns the probabilities, 60 00:03:13,000 --> 00:03:16,566 we transform them into a 1 or 0 result. 61 00:03:17,200 --> 00:03:20,166 So that's why I call it y_grid, because it's the predictions 62 00:03:20,166 --> 00:03:22,200 of all the points in the grid. 63 00:03:22,200 --> 00:03:25,833 So that returns the vector of predictions of all the points in the grid. 64 00:03:26,233 --> 00:03:29,233 And then finally we plot the whole graph. 65 00:03:29,433 --> 00:03:31,166 So in this plot we include 66 00:03:32,266 --> 00:03:35,666 all our real users and their real actions, 67 00:03:36,066 --> 00:03:39,166 red if they didn't buy the car, green if they bought the car, 68 00:03:39,566 --> 00:03:44,100 and we plot the predicted results of all the pixel observation points 69 00:03:44,333 --> 00:03:47,333 that we created when we created the grid. 70 00:03:48,166 --> 00:03:49,500 So that's here. 71 00:03:49,500 --> 00:03:51,900 And so, you know, I'm using the color spring green. 72 00:03:51,900 --> 00:03:54,900 So that is the spring green color, and tomato. 
73 00:03:54,900 --> 00:03:57,366 That is the tomato color. 74 00:03:57,366 --> 00:04:00,200 And for the real points I just use green4, which is a color 75 00:04:00,200 --> 00:04:01,866 in R, and red3. 76 00:04:01,866 --> 00:04:05,566 So this color is green4 and this color is red3. 77 00:04:06,233 --> 00:04:07,066 And that's it. 78 00:04:07,066 --> 00:04:09,100 That's the idea behind this code. 79 00:04:09,100 --> 00:04:12,100 It's totally fine if you didn't understand some things in this code, 80 00:04:12,533 --> 00:04:15,866 because anyway, we're going to use this code as a template for our 81 00:04:15,866 --> 00:04:17,333 next classifier. 82 00:04:17,333 --> 00:04:19,600 And so we'll just have to copy-paste this code. 83 00:04:19,600 --> 00:04:20,700 But that's all. 84 00:04:20,700 --> 00:04:23,700 So that was just for those of you who are interested in coding, 85 00:04:23,966 --> 00:04:27,133 who are interested in how we can use code to make such a plot. 86 00:04:27,500 --> 00:04:31,766 But the important thing to understand here is that logistic regression 87 00:04:31,766 --> 00:04:33,500 is a linear classifier, 88 00:04:33,500 --> 00:04:37,466 which in two dimensions means that it's a linear separator, 89 00:04:37,466 --> 00:04:40,800 and therefore it can make some predictions, like the predictions here, 90 00:04:41,033 --> 00:04:42,133 that are incorrect. 91 00:04:42,133 --> 00:04:42,566 All right. 92 00:04:42,566 --> 00:04:44,966 So thank you for watching this R tutorial. 93 00:04:44,966 --> 00:04:46,200 And congratulations. 94 00:04:46,200 --> 00:04:50,100 Now you know how to implement a logistic regression model using R. 95 00:04:50,433 --> 00:04:51,133 You're going to see 96 00:04:51,133 --> 00:04:54,766 that we will need some more powerful classifiers later in our journey. 97 00:04:55,000 --> 00:04:58,866 So I look forward to continuing this journey, and until then, enjoy machine learning.
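For readers who want to try the whole procedure from this walkthrough, here is a self-contained R sketch of the steps: build the grid, name its columns, predict probabilities, convert them to 0 or 1, and plot. It is a hedged reconstruction from the narration, not the exact on-screen code. The synthetic data, the `glm` model standing in for the fitted `classifier`, the 0.5 cutoff, the grid step, and the exact color name `springgreen` are all assumptions.

```r
# Synthetic, already-scaled data standing in for the real training set (assumption).
set.seed(1)
n <- 200
training_set <- data.frame(Age = rnorm(n), EstimatedSalary = rnorm(n))
training_set$Purchased <- rbinom(
  n, 1, plogis(2 * training_set$Age + 1.5 * training_set$EstimatedSalary)
)

# Logistic regression model, standing in for the classifier fitted earlier.
classifier <- glm(Purchased ~ ., family = binomial, data = training_set)

# Grid of "pixel" users covering the padded range of both features.
set <- training_set
X1 <- seq(min(set[, 1]) - 1, max(set[, 1]) + 1, by = 0.05)  # step is an assumed value
X2 <- seq(min(set[, 2]) - 1, max(set[, 2]) + 1, by = 0.05)
grid_set <- expand.grid(X1, X2)

# The column names must match the model's predictors so predict() can find them.
colnames(grid_set) <- c("Age", "EstimatedSalary")

# type = "response" returns probabilities; the 0.5 cutoff is an assumed threshold.
prob_set <- predict(classifier, type = "response", newdata = grid_set)
y_grid <- ifelse(prob_set > 0.5, 1, 0)

# Empty frame first, then the colored pixel points, then the real users on top.
plot(set[, -3],
     main = "Logistic Regression (synthetic training set)",
     xlab = "Age", ylab = "Estimated Salary",
     xlim = range(X1), ylim = range(X2))
points(grid_set, pch = ".", col = ifelse(y_grid == 1, "springgreen", "tomato"))
points(set[, 1:2], pch = 21, bg = ifelse(set[, 3] == 1, "green4", "red3"))
```

Because the predicted probability exceeds 0.5 exactly when the model's linear combination of age and salary is positive, the green and red regions meet along a straight line, which is what the narration means by a linear separator. Real points that fall on the wrong side of that line are the incorrect predictions.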