1 00:00:00,166 --> 00:00:02,700 Hello and welcome to this R tutorial. 2 00:00:02,700 --> 00:00:06,566 So in the previous tutorial, we pre-processed our data set to make 3 00:00:06,566 --> 00:00:10,833 it ready for the logistic regression model that we will fit on the training set. 4 00:00:11,266 --> 00:00:13,366 So that's what we're going to do in this tutorial. 5 00:00:13,366 --> 00:00:14,866 And let's do it right now. 6 00:00:14,866 --> 00:00:16,333 So it's going to be very simple. 7 00:00:16,333 --> 00:00:18,933 We're just going to create our classifier 8 00:00:20,466 --> 00:00:21,466 equals. 9 00:00:21,466 --> 00:00:25,766 And then we're going to use an existing function of R called the glm function. 10 00:00:26,000 --> 00:00:29,100 That is going to build this logistic regression classifier. 11 00:00:29,666 --> 00:00:31,800 So let's type here glm. 12 00:00:31,800 --> 00:00:35,700 And then let's press F1 to see what parameters we have to input. 13 00:00:36,433 --> 00:00:36,733 Okay. 14 00:00:36,733 --> 00:00:41,233 So the first thing we see is that GLM stands for generalized linear models. 15 00:00:41,233 --> 00:00:45,000 And that's because logistic regression is a linear classifier. 16 00:00:45,100 --> 00:00:47,466 For those of you who didn't follow the Python tutorial, 17 00:00:47,466 --> 00:00:50,133 you will see that our logistic regression classifier 18 00:00:50,133 --> 00:00:53,700 will linearly separate our two classes of users. 19 00:00:54,366 --> 00:00:57,366 And then let's look at the arguments to see what we have to input. 20 00:00:58,800 --> 00:01:01,233 All right, so the first argument is formula. 21 00:01:01,233 --> 00:01:03,900 So that's the same as what we did in the regression section. 22 00:01:03,900 --> 00:01:05,966 Here we're going to write formula 23 00:01:07,166 --> 00:01:08,266 equals. 24 00:01:08,266 --> 00:01:11,933 And then we take the dependent variable, which is purchased. 25 00:01:13,533 --> 00:01:15,600 Then a tilde, typed here with Alt plus N.
26 00:01:15,600 --> 00:01:18,900 And then here a dot to specify that we want to take all 27 00:01:18,900 --> 00:01:20,600 the independent variables. 28 00:01:20,600 --> 00:01:23,900 So that's the way R understands that we want to predict 29 00:01:24,100 --> 00:01:28,033 the purchased result based on all the independent variables, 30 00:01:28,033 --> 00:01:31,633 which are the age and the estimated salary. 31 00:01:33,000 --> 00:01:33,533 All right. 32 00:01:33,533 --> 00:01:35,333 Now what is the next argument? 33 00:01:35,333 --> 00:01:37,966 So let's add a comma here. Enter. 34 00:01:37,966 --> 00:01:40,066 And let's add the second argument. 35 00:01:40,066 --> 00:01:41,900 So the second argument is family. 36 00:01:41,900 --> 00:01:46,400 And here, since we are doing logistic regression, we're going to add family 37 00:01:47,500 --> 00:01:50,500 equals binomial. 38 00:01:51,700 --> 00:01:52,900 So at this stage it's 39 00:01:52,900 --> 00:01:56,066 not important to understand what binomial means exactly. 40 00:01:56,433 --> 00:01:57,800 But you just have to remember that 41 00:01:57,800 --> 00:02:01,666 for logistic regression you have to specify the binomial family 42 00:02:01,666 --> 00:02:01,900 here. 43 00:02:02,900 --> 00:02:05,966 And then we just need to input one last argument. 44 00:02:06,166 --> 00:02:07,600 It's data here, 45 00:02:07,600 --> 00:02:11,100 which is the data on which you want to train your logistic regression model. 46 00:02:11,500 --> 00:02:13,800 So of course that's our training set. 47 00:02:13,800 --> 00:02:17,900 So we're going to add here data equals training set. 48 00:02:17,933 --> 00:02:20,933 Here we go. Here it is. Okay. 49 00:02:21,533 --> 00:02:22,266 And that's it.
50 00:02:22,266 --> 00:02:27,133 By only using this glm function, and specifically the formula, the family, 51 00:02:27,133 --> 00:02:30,133 and the fact that we're building the model on the training set, 52 00:02:30,366 --> 00:02:33,333 we built our logistic regression classifier. 53 00:02:33,333 --> 00:02:36,033 So now let's select this 54 00:02:36,033 --> 00:02:39,033 and press Command or Control plus Enter to execute. 55 00:02:39,566 --> 00:02:40,433 And here we go. 56 00:02:40,433 --> 00:02:42,900 The classifier is built. 57 00:02:42,900 --> 00:02:43,666 Okay, so that's it 58 00:02:43,666 --> 00:02:47,000 for this tutorial. We fitted logistic regression to the training set. 59 00:02:47,100 --> 00:02:48,366 And in the next tutorial 60 00:02:48,366 --> 00:02:52,633 we will be predicting the test results using this classifier that we just built. 61 00:02:53,200 --> 00:02:55,000 All right, thank you for watching this tutorial. 62 00:02:55,000 --> 00:02:56,800 I look forward to seeing you in the next one. 63 00:02:56,800 --> 00:02:58,700 And until then, enjoy machine learning.
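
For reference, the call dictated in this tutorial can be sketched roughly as follows. This is a minimal sketch, not the exact script from the video. The column names Purchased, Age and EstimatedSalary are assumed from the narration, and because the real training set comes from the previous tutorial, a small simulated data frame stands in for it here so that the snippet runs on its own.

```r
# Hypothetical stand-in for the training set built in the previous tutorial.
# The column names are assumptions; the values are simulated, not real data.
set.seed(123)
n <- 200
training_set <- data.frame(
  Age = rnorm(n),
  EstimatedSalary = rnorm(n)
)

# Simulate a 0/1 outcome whose probability depends on both predictors.
p <- plogis(1.5 * training_set$Age + 1.0 * training_set$EstimatedSalary)
training_set$Purchased <- rbinom(n, size = 1, prob = p)

# Fit logistic regression: "Purchased ~ ." means Purchased explained by
# every other column of the data frame; family = binomial selects the
# logistic regression case of a generalized linear model.
classifier <- glm(formula = Purchased ~ .,
                  family = binomial,
                  data = training_set)

# Print the fitted coefficients (intercept plus one per predictor).
print(coef(classifier))
```

With the real training set already loaded, only the glm call itself would be needed. The dot in the formula saves listing each independent variable by name, at the cost of also pulling in any extra column that happens to be in the data frame.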