1 00:00:00,300 --> 00:00:01,066 Perfect. 2 00:00:01,066 --> 00:00:02,566 And now, the final step. 3 00:00:02,566 --> 00:00:04,633 You know what you have to do, right? 4 00:00:04,633 --> 00:00:07,700 We first have to take our classifier, and then... 5 00:00:07,700 --> 00:00:12,066 Well, now what we want to do is train our classifier on the training set. 6 00:00:12,300 --> 00:00:15,966 Because remember that this line of code only builds 7 00:00:16,166 --> 00:00:19,166 the logistic regression model but doesn't train it yet. 8 00:00:19,200 --> 00:00:22,333 And this next line of code is indeed the final step, 9 00:00:22,333 --> 00:00:25,633 where we train our classifier on the training set. 10 00:00:25,966 --> 00:00:30,533 And remember, to do this, we have to call the fit method, 11 00:00:30,733 --> 00:00:34,566 which takes as input two entities, you know, two sets of data. 12 00:00:34,566 --> 00:00:37,933 The first one is the matrix of features of the training set, 13 00:00:38,266 --> 00:00:41,700 and the second one is the dependent variable vector of the training set, where, 14 00:00:41,700 --> 00:00:45,300 of course, all the purchase decisions of the dependent variable vector 15 00:00:45,633 --> 00:00:49,800 correspond to the same customers as in the matrix of features. 16 00:00:49,900 --> 00:00:50,700 All right. 17 00:00:50,700 --> 00:00:53,800 So what we had to input here was simply X_train 18 00:00:54,400 --> 00:00:57,333 for the matrix of features of the training set, 19 00:00:57,333 --> 00:01:00,333 and then y_train for 20 00:01:00,333 --> 00:01:03,333 the dependent variable vector of the same training set. 21 00:01:03,833 --> 00:01:06,266 And there you go, my friends. Congratulations 22 00:01:06,266 --> 00:01:10,200 if you obtained this, and also congratulations if you tried, 23 00:01:10,500 --> 00:01:14,266 because indeed that's how you build the logistic regression model. 24 00:01:14,266 --> 00:01:15,000 So there you go.
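As a minimal sketch of the two steps just described (building the classifier, then training it with the fit method), assuming scikit-learn's LogisticRegression is the class used in the notebook: the toy arrays below are hypothetical stand-ins for X_train and y_train, not the course data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical, already-scaled toy training set (NOT the course dataset):
# each row is one customer, the two columns stand in for age and salary.
X_train = np.array([
    [-1.2, -0.9],
    [-0.8, -1.1],
    [-0.5,  0.2],
    [ 0.3, -0.4],
    [ 0.9,  1.0],
    [ 1.3,  0.8],
])
# Purchase decisions (0 = no, 1 = yes), in the same customer order as X_train.
y_train = np.array([0, 0, 0, 1, 1, 1])

# This line only builds the model; nothing is learned yet.
classifier = LogisticRegression()

# This line trains the model on the training set.
classifier.fit(X_train, y_train)

# After fitting, the learned classes and coefficients are available.
print(classifier.classes_)
print(classifier.coef_)
```

The rows of X_train and the entries of y_train must line up customer by customer, which is why both come from the same training split.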
25 00:01:15,000 --> 00:01:18,000 That's one extra machine learning model in your toolkit. 26 00:01:18,166 --> 00:01:20,333 And you're going to have many more. 27 00:01:20,333 --> 00:01:23,800 And don't worry, I will train you on how to select 28 00:01:23,800 --> 00:01:26,800 the best one for any data set. 29 00:01:27,066 --> 00:01:29,500 Okay. Great. So let's run this cell. 30 00:01:29,500 --> 00:01:32,766 This will indeed build and train the logistic regression 31 00:01:32,766 --> 00:01:36,600 model on your training set, composed of X_train and y_train. 32 00:01:36,866 --> 00:01:39,100 And here, by the way, you see all the parameters 33 00:01:39,100 --> 00:01:41,800 of the logistic regression model, which you can tune. 34 00:01:41,800 --> 00:01:45,366 The most famous one is C, which is the inverse of the regularization 35 00:01:45,366 --> 00:01:50,133 strength, meaning that the smaller C is, the stronger the regularization will be, 36 00:01:50,133 --> 00:01:53,700 and therefore the more it will protect you from overfitting. 37 00:01:54,100 --> 00:01:56,833 Right, here we'll just keep the default value of 1. 38 00:01:56,833 --> 00:01:57,700 And there you go. 39 00:01:57,700 --> 00:02:01,433 We're ready to move on to the next step: predicting a new result. 40 00:02:01,633 --> 00:02:06,600 And so now, a new exercise for you, because we already did it with regression. 41 00:02:06,600 --> 00:02:10,700 I taught you before how to, you know, take your regressor and then call 42 00:02:10,700 --> 00:02:14,133 the predict method to predict the result of a single observation. 43 00:02:14,500 --> 00:02:17,633 And so here the exercise that I would like you to do 44 00:02:18,066 --> 00:02:22,000 is to predict the purchase decision of a single result, 45 00:02:22,200 --> 00:02:25,033 you know, the purchase decision of a single customer.
46 00:02:25,033 --> 00:02:27,766 And I will tell you which one: I would like you to predict 47 00:02:27,766 --> 00:02:32,600 the purchase decision of the first customer in the test set, 48 00:02:32,900 --> 00:02:36,900 who, you will see, is 30 years old and earns 49 00:02:36,900 --> 00:02:40,633 an estimated salary of $87,000. 50 00:02:40,833 --> 00:02:42,900 So I would like you to take these two inputs, 51 00:02:42,900 --> 00:02:46,400 you know, these two features with these values, 30 years old 52 00:02:46,400 --> 00:02:49,500 and $87,000 as the estimated salary. 53 00:02:49,700 --> 00:02:55,066 And I would like you to predict whether this customer has bought, yes or no, 54 00:02:55,200 --> 00:02:58,566 the SUV. You have the answer in y_test. 55 00:02:58,566 --> 00:03:01,233 You know, you take the first result of y_test, 56 00:03:01,233 --> 00:03:05,166 and thus you'll be able to compare your prediction to the real result 57 00:03:05,166 --> 00:03:08,166 to figure out whether your prediction was correct. 58 00:03:08,466 --> 00:03:11,100 And so there you go. Please do the exercise. 59 00:03:11,100 --> 00:03:14,400 Please try to do this on your own first before we do it together 60 00:03:14,400 --> 00:03:17,800 in the next tutorial. Please predict the purchase decision 61 00:03:17,800 --> 00:03:21,900 of that first customer of the test set, and we will, 62 00:03:22,200 --> 00:03:26,800 of course, implement the solution in the next tutorial. 63 00:03:27,433 --> 00:03:29,700 Until then, enjoy machine learning.
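For anyone who wants to check their attempt at the exercise, here is one possible sketch of a single-customer prediction with the predict method. It rests on assumptions not stated in this video: that the features were standardized with scikit-learn's StandardScaler (the name sc is hypothetical) and that the classifier was trained on the scaled values. The training rows are made-up toy numbers, not the course data, so the printed class says nothing about the real answer in y_test.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Hypothetical toy training data (NOT the course dataset):
# columns are age and estimated salary, labels are purchase decisions.
X_train_raw = np.array([
    [22,  25000],
    [25,  40000],
    [28,  30000],
    [32,  45000],
    [35,  60000],
    [40, 110000],
    [45,  95000],
    [50, 120000],
], dtype=float)
y_train = np.array([0, 0, 0, 0, 0, 1, 1, 1])

# Assumption: features are standardized before training.
sc = StandardScaler()
X_train = sc.fit_transform(X_train_raw)

classifier = LogisticRegression()
classifier.fit(X_train, y_train)

# predict expects a 2D array: one row per customer, one column per feature.
# The new customer must be scaled with the SAME scaler used for training.
new_customer = np.array([[30, 87000]], dtype=float)
prediction = classifier.predict(sc.transform(new_customer))

# The result is a 1-element array holding the predicted class (0 or 1).
print(prediction)
```

In the actual notebook, the printed value would then be compared with the first entry of y_test to see whether the prediction was correct.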