1 00:00:00,066 --> 00:00:01,200 All right, so that's the problem. 2 00:00:01,200 --> 00:00:01,900 I hope you like it. 3 00:00:01,900 --> 00:00:04,233 I hope you're excited to work on it. 4 00:00:04,233 --> 00:00:07,800 And so now we're gonna, without further ado, start 5 00:00:07,800 --> 00:00:12,566 our logistic regression implementation on your favorite IDE. 6 00:00:12,966 --> 00:00:17,466 Whether it is Google Colaboratory or Jupyter Notebook, you have the choice. 7 00:00:17,700 --> 00:00:20,900 But my favorite is by far Google Colaboratory. 8 00:00:20,900 --> 00:00:26,366 So if you'd love to, follow me here, and now let's re-implement 9 00:00:26,366 --> 00:00:29,900 this logistic regression implementation step by step. 10 00:00:30,033 --> 00:00:33,100 Right now it is loading the notebook and we are about to have it in a second. 11 00:00:33,100 --> 00:00:34,100 There we go. 12 00:00:34,100 --> 00:00:34,433 All right. 13 00:00:34,433 --> 00:00:36,333 So that's the whole notebook. 14 00:00:36,333 --> 00:00:37,933 It is in read-only mode. 15 00:00:37,933 --> 00:00:42,600 So right now what we have to do is to create a copy of this notebook. 16 00:00:42,866 --> 00:00:46,766 And to do this we just have to click "Save a copy in Drive". 17 00:00:47,066 --> 00:00:48,633 And this will create a copy, 18 00:00:48,633 --> 00:00:52,200 as you can see, of this notebook in which we will be able 19 00:00:52,200 --> 00:00:55,200 to re-implement the whole model from scratch. 20 00:00:55,800 --> 00:00:56,733 All right. Great. 21 00:00:56,733 --> 00:01:00,600 So as usual, the first thing we're going to do is to delete all the code cells. 22 00:01:00,600 --> 00:01:02,833 Right. Because I want you to take action. 23 00:01:02,833 --> 00:01:04,666 I want you to learn by doing, so 24 00:01:04,666 --> 00:01:08,500 I really, really want you to re-implement all these code cells from scratch. 25 00:01:08,500 --> 00:01:10,466 So we're going to delete all of them. 
26 00:01:10,466 --> 00:01:13,733 To do this we just have to click them and then click the trash button 27 00:01:13,733 --> 00:01:16,500 here. Just do as I do. 28 00:01:16,500 --> 00:01:16,800 All right. 29 00:01:16,800 --> 00:01:18,900 And make sure not to delete the text cells 30 00:01:18,900 --> 00:01:22,433 because we want to keep that well-highlighted structure. 31 00:01:22,933 --> 00:01:24,833 All right, feature scaling. 32 00:01:24,833 --> 00:01:29,000 So yes, there will be feature scaling for logistic regression. 33 00:01:29,000 --> 00:01:30,933 And I will explain why. All right. 34 00:01:30,933 --> 00:01:35,033 So now we train the logistic regression model, predict a new result. 35 00:01:35,100 --> 00:01:35,666 All right. 36 00:01:35,666 --> 00:01:38,400 And you really have everything in this implementation. 37 00:01:38,400 --> 00:01:41,700 You'll see that you will learn how to predict an ensemble of results, 38 00:01:41,700 --> 00:01:42,533 you know, in the test set. 39 00:01:42,533 --> 00:01:45,033 You will also learn how to predict a single result. 40 00:01:45,033 --> 00:01:45,533 Like, you know, 41 00:01:45,533 --> 00:01:47,233 when you deploy your model in production, 42 00:01:47,233 --> 00:01:49,833 when you want to predict a single observation. 43 00:01:49,833 --> 00:01:51,366 So now, confusion matrix, 44 00:01:51,366 --> 00:01:54,966 that's to evaluate your model, and of course the visualizations at the end. 45 00:01:55,133 --> 00:01:59,033 And once again I chose a data set of only two features, right? 46 00:01:59,033 --> 00:02:03,433 The age and the estimated salary, so that we can indeed visualize 47 00:02:03,600 --> 00:02:06,433 the results in the end on the training set and on the test set. 48 00:02:06,433 --> 00:02:11,200 Because remember, in the plot, each dimension corresponds to one feature, 49 00:02:11,200 --> 00:02:14,366 and therefore there are as many dimensions as there are features. 
50 00:02:14,633 --> 00:02:17,933 And so since we have two features, we'll have a nice 2D plot. 51 00:02:17,933 --> 00:02:20,866 And that's exactly the reason why I needed to take two features. 52 00:02:20,866 --> 00:02:24,300 But no worries, the implementation we're about to make works 53 00:02:24,300 --> 00:02:27,366 for any data set regardless of the number of features. 54 00:02:27,533 --> 00:02:30,100 And I will prove this to you at the end of this part, 55 00:02:30,100 --> 00:02:32,966 when deploying all our classification models 56 00:02:32,966 --> 00:02:36,900 on a brand new generic data set with more features. 57 00:02:37,233 --> 00:02:41,700 And this is how I will also teach you how to select the best model. 58 00:02:41,800 --> 00:02:43,800 All right, so there you go. 59 00:02:43,800 --> 00:02:44,933 I hope you're excited, 60 00:02:44,933 --> 00:02:48,600 you know, both by the problem case study and this implementation. 61 00:02:48,966 --> 00:02:51,966 And now before we finish and move on to the next tutorial, 62 00:02:52,066 --> 00:02:55,133 well, I would like you to do a little exercise. 63 00:02:55,433 --> 00:02:58,066 Now that you saw the data set and understand it, 64 00:02:58,066 --> 00:03:02,400 and since you also have your data preprocessing template, well, there you go. 65 00:03:02,400 --> 00:03:05,500 The exercise is: I would like you to implement 66 00:03:05,500 --> 00:03:09,066 on your own the data preprocessing phase up to this step, 67 00:03:09,066 --> 00:03:10,900 you know, feature scaling. 68 00:03:10,900 --> 00:03:11,600 So basically 69 00:03:11,600 --> 00:03:15,533 I would like you to implement on your own this step, importing the libraries, 70 00:03:15,666 --> 00:03:17,700 then this step, importing the data set, 71 00:03:17,700 --> 00:03:20,766 and this step, splitting the data set into the training set and test set. 72 00:03:21,033 --> 00:03:25,200 And finally this last step of the data preprocessing phase, feature scaling. 
73 00:03:25,266 --> 00:03:25,900 All right. 74 00:03:25,900 --> 00:03:30,033 So please try. Use of course your data preprocessing template 75 00:03:30,333 --> 00:03:33,133 and of course your data preprocessing toolkit. 76 00:03:33,133 --> 00:03:37,500 Because indeed in order to implement that step you will need to grab a tool 77 00:03:37,533 --> 00:03:40,766 from your data preprocessing toolkit and I'm sure you will find it. 78 00:03:41,033 --> 00:03:43,300 So you can totally do this on your own. 79 00:03:43,300 --> 00:03:45,866 There is no trap. It's actually super easy. 80 00:03:45,866 --> 00:03:48,900 And of course we will implement the solution together 81 00:03:49,200 --> 00:03:52,766 in the next tutorial, so I can't wait to see what you end up with. 82 00:03:52,833 --> 00:03:56,100 And I'm sure we will end up with the same thing. 83 00:03:56,100 --> 00:03:58,966 So let's see. And until then, enjoy machine learning.
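
For reference, the four preprocessing steps named in the exercise could look roughly like the sketch below. It is not the instructor's solution, which comes in the next tutorial. It assumes the "toolkit" is scikit-learn, and because the course's data file is not named here, it builds a small made-up data set of random ages and estimated salaries with a hypothetical label. The 25% test size and the variable names are also assumptions.

```python
# Step 1: importing the libraries
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Step 2: "importing" the data set.
# Assumption: a made-up stand-in with the two features mentioned in the
# video (age, estimated salary); the real course file is not used here.
rng = np.random.default_rng(0)
age = rng.integers(18, 60, size=40)
salary = rng.integers(15_000, 150_000, size=40)
X = np.column_stack([age, salary]).astype(float)  # feature matrix, 2 columns
y = (age > 40).astype(int)                        # hypothetical 0/1 label

# Step 3: splitting the data set into the training set and test set
# (test_size=0.25 is an assumed choice, not taken from the video)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Step 4: feature scaling.
# The scaler learns mean and standard deviation from the training set only,
# then applies that same transformation to the test set.
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

print(X_train.shape, X_test.shape)
```

Fitting the scaler on the training set alone, and only transforming the test set, keeps information about the test observations out of the preprocessing.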