1 00:00:00,100 --> 00:00:01,100 Hello my friends! 2 00:00:01,100 --> 00:00:03,466 I'm super excited to start now. 3 00:00:03,466 --> 00:00:07,566 A part of this course that we've been excitedly waiting for. 4 00:00:07,766 --> 00:00:10,300 I'm talking, of course, about deep learning. 5 00:00:10,300 --> 00:00:13,200 You just got the amazing intuition lectures by Kirill. 6 00:00:13,200 --> 00:00:16,366 And so now you understand that what deep learning consists 7 00:00:16,366 --> 00:00:19,366 of is building an artificial brain. 8 00:00:19,400 --> 00:00:20,666 So there you go, my friends. 9 00:00:20,666 --> 00:00:26,633 Now we're about to build an artificial brain using the brand new TensorFlow 2.0. 10 00:00:26,966 --> 00:00:28,400 This will be super exciting. 11 00:00:28,400 --> 00:00:32,266 We will really build a deep neural network with neurons 12 00:00:32,266 --> 00:00:34,933 and fully connected layers connecting these neurons. 13 00:00:34,933 --> 00:00:38,333 And we will apply all this to a business problem as usual. 14 00:00:38,333 --> 00:00:40,233 And now that we are, you know, quite advanced. 15 00:00:40,233 --> 00:00:41,000 And of course 16 00:00:41,000 --> 00:00:44,100 the data set we'll be working on looks more like a real world 17 00:00:44,100 --> 00:00:47,400 data set with many observations and many features. 18 00:00:47,400 --> 00:00:48,066 And you will see that 19 00:00:48,066 --> 00:00:51,766 we will actually have to not only use our data preprocessing template, 20 00:00:51,933 --> 00:00:55,500 but also use some of the tools of our data preprocessing toolkit. 21 00:00:56,000 --> 00:00:57,600 So are you ready? 22 00:00:57,600 --> 00:00:59,333 Are you ready to go next level? 23 00:00:59,333 --> 00:00:59,833 Indeed. 24 00:00:59,833 --> 00:01:02,833 Now we're about to do some more advanced machine learning. 
25 00:01:02,900 --> 00:01:04,233 And so I'm super excited 26 00:01:04,233 --> 00:01:07,833 that you're pushing your expertise in machine learning even further. 27 00:01:08,300 --> 00:01:09,300 All right, so let's do this. 28 00:01:09,300 --> 00:01:12,700 But first let's just make sure everyone here is on the same page. 29 00:01:12,700 --> 00:01:15,400 So this is the folder of the whole Machine Learning 30 00:01:15,400 --> 00:01:17,666 A-Z course with all the codes and data sets. 31 00:01:17,666 --> 00:01:20,266 I gave you the link to this folder right before this tutorial. 32 00:01:20,266 --> 00:01:21,900 So make sure to connect to it. 33 00:01:21,900 --> 00:01:23,066 And now there we go. 34 00:01:23,066 --> 00:01:26,433 Let's enter Part 8, Deep Learning, with 35 00:01:26,433 --> 00:01:29,933 first the classic artificial neural network, 36 00:01:29,933 --> 00:01:34,700 meaning the fully connected neural network with only fully connected layers. 37 00:01:34,700 --> 00:01:38,566 You know, with no convolutional layers or other types of layers. 38 00:01:38,600 --> 00:01:42,166 Here we will just have an input vector containing different features, 39 00:01:42,166 --> 00:01:45,833 and we will predict an outcome which will be a binary variable, 40 00:01:45,866 --> 00:01:48,066 because you have to know that artificial neural 41 00:01:48,066 --> 00:01:51,733 networks can be used for both regression and classification. 42 00:01:51,966 --> 00:01:54,800 And here we're going to do it for classification. 43 00:01:54,800 --> 00:01:58,500 However, note that we have a free course on artificial neural networks, 44 00:01:58,666 --> 00:02:02,600 in which this time we build an artificial neural network for regression. 45 00:02:02,600 --> 00:02:03,800 So make sure to check it out. 46 00:02:03,800 --> 00:02:06,000 I will think of including the links somewhere 47 00:02:06,000 --> 00:02:08,100 so that you can get it as well, but it's really good. 
48 00:02:08,100 --> 00:02:09,600 You will have both cases. 49 00:02:09,600 --> 00:02:13,433 You know, the classification case with this course and the regression case 50 00:02:13,433 --> 00:02:15,500 with the other free course. All right. 51 00:02:15,500 --> 00:02:18,600 So now as usual we're going to start with Python. 52 00:02:18,600 --> 00:02:19,733 And there we go. 53 00:02:19,733 --> 00:02:24,666 This folder contains the implementation first, artificial_neural_network.ipynb, 54 00:02:24,900 --> 00:02:28,666 which you can either open with Jupyter Notebook or Google Colaboratory. 55 00:02:29,000 --> 00:02:33,500 And we have of course our data set, which I'm going to explain right now. 56 00:02:34,133 --> 00:02:34,500 All right. 57 00:02:34,500 --> 00:02:35,500 So as you can see, 58 00:02:35,500 --> 00:02:40,133 as I told you, it looks indeed more like a real world data set, right? 59 00:02:40,133 --> 00:02:40,866 For the first time, 60 00:02:40,866 --> 00:02:44,200 our data set takes the full screen here because indeed, this time 61 00:02:44,200 --> 00:02:48,433 we have many features, you know, starting from here up to this one. 62 00:02:48,433 --> 00:02:51,033 And the dependent variable right here. 63 00:02:51,033 --> 00:02:51,400 All right. 64 00:02:51,400 --> 00:02:53,600 So let me explain what this is about. 65 00:02:53,600 --> 00:02:56,600 This is the data set of a bank 66 00:02:56,666 --> 00:03:00,533 which collected some information about their customers. 67 00:03:00,866 --> 00:03:03,166 This information is, well, row number. 68 00:03:03,166 --> 00:03:05,100 That's just a non-relevant feature. 69 00:03:05,100 --> 00:03:06,266 We will get rid of it. 70 00:03:06,266 --> 00:03:07,466 Then the customer ID. 71 00:03:07,466 --> 00:03:10,566 That's just the identification key of each customer. 
72 00:03:10,866 --> 00:03:15,000 The surname, the credit score, the geography meaning the country 73 00:03:15,000 --> 00:03:18,233 the customer lives in, the gender, the age, 74 00:03:18,233 --> 00:03:21,600 the tenure meaning the number of years they have been in the bank, 75 00:03:21,900 --> 00:03:25,433 the balance meaning the amount of money they have on their account. 76 00:03:25,733 --> 00:03:27,833 The number of products they use from the bank. 77 00:03:27,833 --> 00:03:29,400 You know, like a credit card 78 00:03:29,400 --> 00:03:34,133 or a checkbook or a Mastercard or even a loan or home loan. 79 00:03:34,133 --> 00:03:36,000 You know, any banking products. 80 00:03:36,000 --> 00:03:36,400 Okay. 81 00:03:36,400 --> 00:03:39,266 So that's the number of products each customer has. 82 00:03:39,266 --> 00:03:43,366 Then has credit card, yes or no, meaning that this variable is equal to one 83 00:03:43,366 --> 00:03:48,600 if the customer has a credit card and zero otherwise. Is active member, 84 00:03:48,600 --> 00:03:52,866 meaning is the customer active, is he or she using the bank, 85 00:03:52,866 --> 00:03:57,266 you know, connecting to their account or using their credit card or any other card? 86 00:03:57,266 --> 00:03:58,133 You know, let's say 87 00:03:58,133 --> 00:04:01,866 they have some measurement system to measure if a customer is active or not. 88 00:04:01,866 --> 00:04:02,566 And one means, 89 00:04:02,566 --> 00:04:06,600 of course, that the customer is active and zero means that they are inactive. 90 00:04:06,666 --> 00:04:09,300 Okay, then estimated salary. 91 00:04:09,300 --> 00:04:13,400 Well, you know, the salary of the customer estimated by the bank and that's it. 92 00:04:13,400 --> 00:04:14,800 That's the last feature. 93 00:04:14,800 --> 00:04:18,066 And then the last column here, it is 94 00:04:18,066 --> 00:04:21,566 the dependent variable and it tells if, yes or no, 
95 00:04:21,866 --> 00:04:26,100 well, the customer stayed in the bank or left the bank. 96 00:04:26,366 --> 00:04:30,833 One means that the customer left the bank, as in exited equals yes. 97 00:04:31,000 --> 00:04:34,666 And zero means that the customer stayed in the bank, as in exited 98 00:04:34,666 --> 00:04:35,766 equals no. 99 00:04:35,766 --> 00:04:40,433 So what happened in reality is that this bank actually observed 100 00:04:40,466 --> 00:04:43,966 their customers for a certain period of time, let's say six months. 101 00:04:44,266 --> 00:04:48,600 They observed if during the six months they left the bank or stayed in the bank, 102 00:04:48,733 --> 00:04:52,533 and they gathered these outcomes in this last dependent variable, 103 00:04:52,800 --> 00:04:55,733 and at the same time, you know, they got all these features. 104 00:04:55,733 --> 00:05:01,200 Well, guess what, to understand the correlations between these features 105 00:05:01,400 --> 00:05:06,266 and whether or not the customer stays in the bank or leaves the bank. 106 00:05:06,500 --> 00:05:07,966 And that makes sense, right? 107 00:05:07,966 --> 00:05:11,400 Because a bank wants to have the maximum number of customers, right? 108 00:05:11,400 --> 00:05:12,300 That's how they make money. 109 00:05:12,300 --> 00:05:15,800 The more customers they have, the more money they have in the bank, and the more 110 00:05:15,800 --> 00:05:19,566 they make money from the diverse products they offer to their customers. 111 00:05:19,733 --> 00:05:23,533 So of course, their interest is to keep the maximum number of customers, 112 00:05:23,733 --> 00:05:27,366 and therefore they made this data set to understand 113 00:05:27,400 --> 00:05:31,033 the reasons somehow why customers leave the bank. 114 00:05:31,300 --> 00:05:34,800 And mostly, once they manage to build a predictive model 115 00:05:34,800 --> 00:05:38,100 that can predict if any new customer leaves the bank, 
116 00:05:38,100 --> 00:05:41,100 you know, a model that was trained, of course, on this data set, 117 00:05:41,133 --> 00:05:44,833 well, they will deploy this model on new customers 118 00:05:44,833 --> 00:05:47,933 and for all the customers where the model predicts that 119 00:05:47,933 --> 00:05:51,866 the customer leaves the bank, well, they will be prepared and they might make 120 00:05:51,866 --> 00:05:56,666 some special offer to the customer so that they will stay in the bank, you see. 121 00:05:57,000 --> 00:06:02,633 So all this is to prevent the maximum number of customers from leaving the bank. 122 00:06:02,633 --> 00:06:04,733 And why is this called churn modeling? 123 00:06:04,733 --> 00:06:07,900 Because customer churn means exactly the situation 124 00:06:07,900 --> 00:06:11,833 where some customers exit, you know, become no longer customers. 125 00:06:12,300 --> 00:06:12,900 All right. 126 00:06:12,900 --> 00:06:17,200 And of course, the bank has asked you, the most talented data scientist, 127 00:06:17,200 --> 00:06:21,166 to make this predictive model, to first train it on this data 128 00:06:21,166 --> 00:06:24,300 set to understand the correlations between all these features 129 00:06:24,300 --> 00:06:29,300 and the dependent variable, and then to deploy this model on future customers. 130 00:06:29,533 --> 00:06:32,600 And you will see that in the implementation, we will actually deploy 131 00:06:32,866 --> 00:06:36,533 the future machine learning model we'll get on a different customer, 132 00:06:36,533 --> 00:06:39,300 you know, not part of this data set, so as to predict 133 00:06:39,300 --> 00:06:43,333 if this new customer will stay in the bank or leave the bank. 134 00:06:43,333 --> 00:06:44,700 And even better than this, 135 00:06:44,700 --> 00:06:49,500 we will actually predict the probability that this customer leaves the bank. 136 00:06:49,833 --> 00:06:51,666 So we have a lot to do ahead of us. 
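The column layout described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the lecture names the columns verbally, so the exact header spellings here are guesses, the single sample row is made up, and dropping the customer ID and surname along with the row number is an assumption (the lecture only says explicitly that the row number will be removed).

```python
import csv
import io

# Assumed header spellings; the lecture only names the columns verbally.
COLUMNS = [
    "RowNumber", "CustomerId", "Surname", "CreditScore", "Geography",
    "Gender", "Age", "Tenure", "Balance", "NumOfProducts",
    "HasCrCard", "IsActiveMember", "EstimatedSalary", "Exited",
]

# Assumption: identifier-like columns carry no predictive signal.
IRRELEVANT = {"RowNumber", "CustomerId", "Surname"}

# Dependent variable: 1 = customer left the bank, 0 = customer stayed.
TARGET = "Exited"

# Hypothetical row, invented purely for illustration (not real data).
SAMPLE_CSV = (
    ",".join(COLUMNS) + "\n"
    "1,10000001,Doe,600,France,Female,40,3,60000.0,2,1,1,50000.0,0\n"
)


def split_features_target(row):
    """Separate one CSV row into its feature dict and its 0/1 target."""
    features = {
        name: value
        for name, value in row.items()
        if name not in IRRELEVANT and name != TARGET
    }
    return features, int(row[TARGET])


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
features, exited = split_features_target(rows[0])
print(list(features))      # the ten remaining feature names
print("exited:", exited)   # the dependent variable for this row
```

In the actual notebook this step would presumably be done with the course's data preprocessing template rather than the csv module; the point here is only which columns are features and which one is the outcome.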
137 00:06:51,666 --> 00:06:53,033 But I'm super excited 138 00:06:53,033 --> 00:06:56,333 because deep learning is a fascinating branch of machine learning. 139 00:06:56,466 --> 00:07:00,966 So let's start right away and let's open our implementation 140 00:07:00,966 --> 00:07:05,366 artificial_neural_network.ipynb, which you can feel free to open 141 00:07:05,366 --> 00:07:07,200 with either Google Colaboratory 142 00:07:07,200 --> 00:07:10,966 as I'm about to do, or Jupyter Notebook as you want. 143 00:07:11,133 --> 00:07:15,333 Just make sure you feel comfortable coding on your favorite IDE. 144 00:07:16,366 --> 00:07:16,800 All right, 145 00:07:16,800 --> 00:07:20,700 so now the implementation is opening, loading the notebook. 146 00:07:20,700 --> 00:07:21,633 Perfect. 147 00:07:21,633 --> 00:07:24,600 That's the implementation, once again in read-only mode. 148 00:07:24,600 --> 00:07:29,633 So right now we will click File here to create a copy of this implementation. 149 00:07:29,900 --> 00:07:33,900 And as usual we will re-implement all this from scratch 150 00:07:34,166 --> 00:07:37,366 so that we can really learn by doing. 151 00:07:37,900 --> 00:07:38,433 All right. 152 00:07:38,433 --> 00:07:39,300 So that's the copy. 153 00:07:39,300 --> 00:07:42,833 Now let's remove all the cells, right, 154 00:07:42,966 --> 00:07:47,333 like this one. Not the text cells of course, because we want to keep the well 155 00:07:47,333 --> 00:07:49,833 highlighted structure of this implementation. 156 00:07:49,833 --> 00:07:52,533 But let's definitely remove the code cells. 157 00:07:52,533 --> 00:07:53,533 So there are many of them. 158 00:07:53,533 --> 00:07:56,133 So you will actually spend a bit of time doing it. 159 00:07:56,133 --> 00:07:58,666 But you know it's not that long. It's fine. 160 00:07:58,666 --> 00:07:59,766 There is a long data 161 00:07:59,766 --> 00:08:04,100 preprocessing phase and then a long phase of building the neural network. 
162 00:08:04,366 --> 00:08:07,733 And I really gave the details of the implementation. 163 00:08:07,733 --> 00:08:10,233 So you will actually see many steps. 164 00:08:10,233 --> 00:08:11,066 All right. 165 00:08:11,066 --> 00:08:14,700 And there are so many steps that I actually added a level of 166 00:08:14,700 --> 00:08:18,666 structure because as you can see the full implementation is divided 167 00:08:18,666 --> 00:08:20,433 into four parts. Right. 168 00:08:20,433 --> 00:08:22,500 There is a homework at the end. The solution. 169 00:08:22,500 --> 00:08:24,966 So let me not show it to you. 170 00:08:24,966 --> 00:08:29,433 And that's almost the end, confusion matrix, and perfect. 171 00:08:29,600 --> 00:08:33,900 All right, okay, so as you saw this is a long implementation. 172 00:08:33,900 --> 00:08:35,400 This will be a long part. 173 00:08:35,400 --> 00:08:37,766 But it is absolutely worth it. 174 00:08:37,766 --> 00:08:42,500 Deep learning is one of the most powerful branches of machine learning, okay. 175 00:08:42,500 --> 00:08:43,466 So let's see. 176 00:08:43,466 --> 00:08:46,466 We start first by importing the libraries as usual. 177 00:08:46,566 --> 00:08:49,433 And then we enter Part 1, data preprocessing. 178 00:08:49,433 --> 00:08:53,633 So that's the first part of the whole implementation, structured in four parts 179 00:08:53,633 --> 00:08:56,733 which are the following: Part 1 data preprocessing, Part 180 00:08:56,733 --> 00:08:59,833 2 building the ANN, Part 3 training the ANN, 181 00:09:00,133 --> 00:09:03,933 and Part 4 making the predictions and evaluating the model. 182 00:09:04,300 --> 00:09:07,000 And then in each part, well, we have different steps. 183 00:09:07,000 --> 00:09:09,966 In Part 1, data preprocessing, we will first import the data set. 184 00:09:09,966 --> 00:09:12,900 Of course then we'll have some data preprocessing to do. 
185 00:09:12,900 --> 00:09:17,233 You know, not only the classic steps of the template but also some extra tools. 186 00:09:17,233 --> 00:09:18,866 And we will see that together. 187 00:09:18,866 --> 00:09:23,800 Then in Part 2 we will first initialize the ANN and then we'll add an input layer 188 00:09:23,800 --> 00:09:26,533 and the first hidden layer to our artificial brain. 189 00:09:26,533 --> 00:09:29,700 Then we'll add the second hidden layer, then the output layer. 190 00:09:30,133 --> 00:09:33,833 Then in the next part, Part 3, training the ANN, we will first start 191 00:09:33,833 --> 00:09:37,833 by compiling the ANN with, you know, an optimizer and a loss function. 192 00:09:38,166 --> 00:09:40,900 Then we will train the ANN on the training set. 193 00:09:40,900 --> 00:09:45,166 And finally in Part 4, well, we will deploy our model in production 194 00:09:45,166 --> 00:09:48,166 to predict the result of a single observation, meaning 195 00:09:48,233 --> 00:09:51,900 to predict if a new customer will stay in or leave the bank, 196 00:09:52,333 --> 00:09:56,600 then we will predict the test results to, you know, get that y vector 197 00:09:56,700 --> 00:10:01,800 to eventually make the confusion matrix with the final accuracy. 198 00:10:02,600 --> 00:10:03,133 All right. 199 00:10:03,133 --> 00:10:06,333 So once again as you can see this implementation is pretty long. 200 00:10:06,333 --> 00:10:09,766 So make sure to have full energy and full motivation for this 201 00:10:10,000 --> 00:10:13,733 because we have a long but yet exciting journey in front of us. 202 00:10:13,900 --> 00:10:15,200 So as soon as you're ready, 203 00:10:15,200 --> 00:10:18,900 well, meet me in the next tutorial to smash this implementation. 204 00:10:19,066 --> 00:10:20,900 And until then, enjoy machine learning.
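The last step of Part 4, turning predicted probabilities of leaving into a confusion matrix and an accuracy, can be sketched in plain Python as below. This is a minimal illustration, not the notebook's code: the 0.5 threshold, the function names, and the five label/probability pairs are all assumptions made up for the example, and the notebook itself would presumably rely on library helpers instead.

```python
def confusion_matrix(y_true, y_prob, threshold=0.5):
    """Return [[TN, FP], [FN, TP]] for binary labels (1 = left the bank).

    Assumption: a probability strictly above the threshold counts as
    a prediction that the customer leaves.
    """
    tn = fp = fn = tp = 0
    for actual, prob in zip(y_true, y_prob):
        predicted = 1 if prob > threshold else 0
        if actual == 0 and predicted == 0:
            tn += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        elif actual == 1 and predicted == 0:
            fn += 1
        else:
            tp += 1
    return [[tn, fp], [fn, tp]]


def accuracy(cm):
    """Share of correct predictions: (TN + TP) / all observations."""
    (tn, fp), (fn, tp) = cm
    return (tn + tp) / (tn + fp + fn + tp)


# Hypothetical true labels and predicted probabilities, for illustration only.
y_true = [0, 0, 1, 1, 0]
y_prob = [0.1, 0.7, 0.8, 0.3, 0.2]

cm = confusion_matrix(y_true, y_prob)
print(cm)            # [[2, 1], [1, 1]]
print(accuracy(cm))  # 0.6
```

The diagonal of the matrix holds the correct predictions (customers correctly predicted to stay or to leave), which is why the accuracy is the diagonal sum divided by the total.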