1 00:00:00,780 --> 00:00:07,960 In the last lecture, we saw how to create, how to compile and how to train our classification model 2 00:00:08,220 --> 00:00:09,780 in TensorFlow Keras. 3 00:00:11,590 --> 00:00:17,170 In this video, we will be creating and training a regression model using Keras. 4 00:00:18,860 --> 00:00:26,510 For this, we will be using the very popular regression dataset, that is, the California housing dataset. 5 00:00:28,560 --> 00:00:33,300 This dataset is available in the scikit-learn datasets library. 6 00:00:35,140 --> 00:00:42,460 The objective here is to predict the prices of homes using eight different independent variables. 7 00:00:43,540 --> 00:00:44,980 So let's get started. 8 00:00:45,790 --> 00:00:52,080 First, we are importing some basic libraries, such as pandas and matplotlib. 9 00:00:54,920 --> 00:00:57,770 Then we are importing TensorFlow and Keras. 10 00:01:00,250 --> 00:01:08,050 And then, since this data is available in the scikit-learn datasets, we are also importing fetch_california_housing 11 00:01:08,230 --> 00:01:09,570 from the sklearn package. 12 00:01:10,930 --> 00:01:15,760 And we are saving this dataset into another variable called housing. 13 00:01:18,090 --> 00:01:23,670 I also want to share one small shortcut to access the help of any function. 14 00:01:23,910 --> 00:01:32,310 So if you just click in between the parentheses of any function and then hold the Shift key and press Tab. 15 00:01:32,670 --> 00:01:39,900 So if you hit Shift plus Tab, it will open the help or documentation of that function. 16 00:01:41,260 --> 00:01:46,530 If you hit Shift plus Tab two times, it will expand the documentation. 17 00:01:48,050 --> 00:01:50,990 You can see here this function will return 18 00:01:51,080 --> 00:01:54,380 different attributes, such as dot data, 19 00:01:54,780 --> 00:01:57,260 which will give us the independent variables. 20 00:01:57,490 --> 00:02:00,600 Dot target will give us the dependent variable.
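The imports and dataset loading narrated above can be sketched as follows. This is a minimal reconstruction of the notebook cell; the variable name `housing` comes from the narration, while the exact import aliases are assumptions:

```python
# Basic libraries the lecture imports: pandas and matplotlib,
# then TensorFlow/Keras, and the dataset loader from scikit-learn.
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from sklearn.datasets import fetch_california_housing

# Save the dataset into a variable called housing
housing = fetch_california_housing()

print(housing.data.shape)    # .data   -> independent variables, (20640, 8)
print(housing.target.shape)  # .target -> dependent variable (house value in $100,000s)
```

As the narration notes, `housing.data`, `housing.target`, and `housing.feature_names` are the attributes returned by this loader.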
21 00:02:01,760 --> 00:02:05,270 And the feature_names attribute will give us the details of the features. 22 00:02:07,130 --> 00:02:08,810 Let's just close this. 23 00:02:09,590 --> 00:02:16,850 So if you have any doubt regarding any function, just click between the parentheses and hit Shift plus Tab. 24 00:02:19,410 --> 00:02:24,330 So I have already listed here some of the information about this dataset. 25 00:02:25,920 --> 00:02:29,840 So in this dataset, there are around 20,000 records. 26 00:02:30,780 --> 00:02:33,240 There are eight independent variables. 27 00:02:34,230 --> 00:02:37,930 We have the first variable as MedInc. 28 00:02:38,730 --> 00:02:41,910 This is the median income in that particular block 29 00:02:41,970 --> 00:02:43,090 where the house is located. 30 00:02:44,100 --> 00:02:46,940 Then we have a second variable, that is, HouseAge. 31 00:02:47,430 --> 00:02:50,550 This is the median house age in that block. 32 00:02:51,570 --> 00:02:55,560 Then we have AveRooms, which is the average number of rooms. 33 00:02:56,010 --> 00:03:01,050 And then we have AveBedrms for the average number of bedrooms. 34 00:03:01,710 --> 00:03:03,660 Next, we have the Population variable. 35 00:03:04,080 --> 00:03:05,700 That is the block population. 36 00:03:07,680 --> 00:03:10,260 Then we have AveOccup. 37 00:03:10,470 --> 00:03:12,650 That is the average house occupancy. 38 00:03:13,260 --> 00:03:21,090 And then the latitude and longitude of that house's block. Using these eight independent variables, 39 00:03:21,510 --> 00:03:24,540 we want to predict the value of the house. 40 00:03:25,020 --> 00:03:27,810 The values are in hundreds of thousands. 41 00:03:28,080 --> 00:03:30,660 So suppose for one record the y value is five. 42 00:03:30,720 --> 00:03:32,850 That means the value of that house is 43 00:03:33,360 --> 00:03:34,980 five hundred thousand dollars. 44 00:03:37,370 --> 00:03:38,960 So this is our dataset.
45 00:03:39,440 --> 00:03:44,150 We have these eight independent variables and one target variable, the price. 46 00:03:45,650 --> 00:03:50,960 If you want some more detail about this dataset, you can click on this documentation link. 47 00:03:52,700 --> 00:03:57,550 This will open the official scikit-learn documentation of this dataset. 48 00:03:58,090 --> 00:04:01,220 Here, you will get to know about all the parameters that you can give 49 00:04:02,040 --> 00:04:05,360 and what is included in this dataset. 50 00:04:06,620 --> 00:04:07,910 Let's go back. 51 00:04:10,990 --> 00:04:15,160 So this housing object is in the form of a dictionary. 52 00:04:17,650 --> 00:04:20,950 We have one key-value pair as feature_names. 53 00:04:21,860 --> 00:04:25,100 So let's just look at the feature names first. 54 00:04:26,680 --> 00:04:31,120 You can see these are the eight variable names that we have discussed already. 55 00:04:33,720 --> 00:04:41,950 Now, to access the independent data we have to use housing dot data, and to access the dependent dataset, 56 00:04:42,330 --> 00:04:44,540 we have to use housing dot target. 57 00:04:47,430 --> 00:04:56,190 So in this line of code, we are splitting our data first into a train full and a test dataset. 58 00:04:57,450 --> 00:05:04,890 Then we are further dividing this train full dataset into X train and X validation datasets. 59 00:05:08,200 --> 00:05:16,240 We will be importing train_test_split from sklearn model_selection, and then we will use this train 60 00:05:16,330 --> 00:05:19,240 test split method to divide our data. 61 00:05:20,860 --> 00:05:24,760 We are not giving any additional parameter for test size. 62 00:05:25,330 --> 00:05:31,080 That's because, by default, the test size is 25 percent of the total data. 63 00:05:32,200 --> 00:05:40,120 So 25 percent of the total data, that is around 20,000 records, will go into the test set. 64 00:05:40,630 --> 00:05:45,780 And then the remaining 75 percent will go into the training set.
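The two splits described above can be sketched like this. A minimal reconstruction of the cell, assuming the variable names implied by the narration (`X_train_full`, `X_train`, `X_valid`, and so on):

```python
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split

housing = fetch_california_housing()

# First split: full training set vs. test set.
# No test_size given, so the default of 25% of the data goes to the test set.
X_train_full, X_test, y_train_full, y_test = train_test_split(
    housing.data, housing.target)

# Second split: carve a validation set (again 25% by default)
# out of the full training set.
X_train, X_valid, y_train, y_valid = train_test_split(
    X_train_full, y_train_full)

print(X_train.shape, X_valid.shape, X_test.shape)
```

With 20,640 total records this gives roughly 11,600 training, 3,900 validation, and 5,200 test rows, matching the shapes quoted later in the lecture.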
65 00:05:47,160 --> 00:05:49,320 Out of that 75 percent, 66 00:05:49,920 --> 00:05:56,890 again, 25 percent will go into the validation set and the rest of the 75 percent will go into the training set. 67 00:05:58,270 --> 00:05:59,800 Let's run this. 68 00:06:02,700 --> 00:06:11,430 The next step is to preprocess our data, and we will be using the StandardScaler from sklearn to standardize 69 00:06:11,430 --> 00:06:11,930 our data. 70 00:06:14,180 --> 00:06:20,490 In standardizing, we subtract the mean of each variable from its individual values, 71 00:06:21,020 --> 00:06:30,080 and then we also divide it by the standard deviation, because at the end we want all the variables with mean as 72 00:06:30,080 --> 00:06:33,080 zero and their variance as one. 73 00:06:34,340 --> 00:06:42,230 This is a standard procedure for creating any machine learning model. The steps here are very simple. 74 00:06:42,530 --> 00:06:47,270 First, we are importing StandardScaler from sklearn preprocessing. 75 00:06:48,320 --> 00:06:53,900 Then we are creating the scaler object using the StandardScaler method. 76 00:06:55,150 --> 00:06:56,780 Then we are training 77 00:06:56,780 --> 00:07:00,140 this scaler object using our X train data. 78 00:07:01,920 --> 00:07:09,230 So on our X train data, this scaler will find the values to subtract, as the mean, and to divide 79 00:07:09,230 --> 00:07:10,020 by, as the scale. 80 00:07:10,620 --> 00:07:12,480 Then we will use those values, 81 00:07:12,660 --> 00:07:18,120 or this scaler object, to standardize our validation and test sets as well. 82 00:07:19,550 --> 00:07:22,200 Just to repeat, we are fitting 83 00:07:22,260 --> 00:07:24,780 this scaler object on our training data, 84 00:07:25,320 --> 00:07:29,070 and we are transforming our validation and test sets using 85 00:07:29,070 --> 00:07:33,150 this scaler object that we have fitted on our X train data. 86 00:07:35,840 --> 00:07:40,190 First, we will be using the fit underscore transform method.
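The standardization steps just described can be sketched as follows: fit the scaler on the training data only, then reuse the learned mean and scale on the validation and test sets. The split is repeated here so the snippet is self-contained:

```python
import numpy as np
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

housing = fetch_california_housing()
X_train_full, X_test, y_train_full, y_test = train_test_split(
    housing.data, housing.target)
X_train, X_valid, y_train, y_valid = train_test_split(
    X_train_full, y_train_full)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)  # fit on X_train, then transform it
X_valid = scaler.transform(X_valid)      # transform only, using X_train's statistics
X_test = scaler.transform(X_test)        # same for the test set

print(X_train.mean(axis=0).round(6))  # roughly 0 for every column
print(X_train.std(axis=0).round(6))   # roughly 1 for every column
```

Fitting only on the training data and reusing that scaler everywhere else avoids leaking validation/test statistics into preprocessing.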
87 00:07:40,440 --> 00:07:44,700 And we will be using X train as the parameter. To transform, 88 00:07:44,760 --> 00:07:51,990 we will be using the dot transform method of this scaler object and we will be using the relevant datasets 89 00:07:51,990 --> 00:07:52,220 here. 90 00:07:55,720 --> 00:07:59,210 So let's create the standardized datasets. 91 00:08:00,430 --> 00:08:03,400 We are saving these objects under their original names only. 92 00:08:03,760 --> 00:08:08,110 So we are replacing the original X train with the standardized version of X train, 93 00:08:09,280 --> 00:08:14,370 the original X validation set with the standardized version of the X validation set, 94 00:08:14,470 --> 00:08:17,420 and the same for the test set as well. 95 00:08:17,830 --> 00:08:22,340 And if you want some more information on this scaler, 96 00:08:22,690 --> 00:08:25,900 you can always refer to the scikit-learn documentation 97 00:08:26,100 --> 00:08:27,640 for StandardScaler. 98 00:08:29,380 --> 00:08:31,990 The next step is to set random seeds. 99 00:08:33,830 --> 00:08:37,760 This is to generate the same result every time we run this model. 100 00:08:42,220 --> 00:08:49,630 Now, as I said earlier, our initial dataset was of around 20,000-plus rows or records. 101 00:08:50,440 --> 00:08:55,060 Now let us see the shape of the X train set. 102 00:08:57,560 --> 00:09:04,580 Here you can see we have eight columns and around eleven thousand six hundred records 103 00:09:05,480 --> 00:09:05,960 in our 104 00:09:06,020 --> 00:09:06,980 X train dataset. 105 00:09:08,950 --> 00:09:17,710 We have around five thousand records in our X test dataset and around 4,000 in the validation set. 106 00:09:20,820 --> 00:09:26,220 Now, let's create the structure for our regression neural network. 107 00:09:28,890 --> 00:09:31,590 Here we will be first having an input layer. 108 00:09:32,960 --> 00:09:37,520 Then we will be having the first dense layer with thirty neurons.
109 00:09:38,180 --> 00:09:42,920 Then we want to create a second dense layer with another thirty neurons. 110 00:09:43,760 --> 00:09:51,620 And then, since this is a regression problem, we will be having a single output neuron without any 111 00:09:51,620 --> 00:09:52,730 activation function. 112 00:09:54,620 --> 00:09:59,840 A single neuron, since we want a continuous value as our output. 113 00:10:01,370 --> 00:10:04,060 Again, we will be using the Sequential API. 114 00:10:05,060 --> 00:10:10,040 We are saving this structure of our model as model. 115 00:10:12,550 --> 00:10:14,800 And then for the first layer, 116 00:10:16,850 --> 00:10:17,630 we are writing 117 00:10:17,690 --> 00:10:19,730 keras dot layers dot Dense. 118 00:10:20,360 --> 00:10:25,950 Here in the parentheses, we have to provide the number of neurons, which is thirty. 119 00:10:26,280 --> 00:10:31,710 Then, as discussed in our theory lectures, we will be using the activation function as ReLU. 120 00:10:32,450 --> 00:10:40,010 And then, since this is our first hidden layer, we need to provide the input shape. Since the number 121 00:10:40,010 --> 00:10:43,630 of independent variables in our data is eight, 122 00:10:44,400 --> 00:10:47,240 we will be using input shape equal to eight. 123 00:10:52,130 --> 00:10:54,860 You can also write the input shape like this: 124 00:10:55,220 --> 00:11:02,680 X train dot shape, and then taking the second and onward elements of our input shape. 125 00:11:03,860 --> 00:11:09,140 This way you don't have to worry about changing this number every time you change your dataset. 126 00:11:09,650 --> 00:11:17,870 You can just write X train dot shape and it will automatically get the number of variables from the 127 00:11:17,870 --> 00:11:20,900 shape attribute of our X train object. 128 00:11:23,680 --> 00:11:29,050 So this is the structure of our first dense layer. We will create the 129 00:11:29,140 --> 00:11:31,330 second dense layer in a similar fashion.
130 00:11:31,780 --> 00:11:34,950 We will be using keras dot layers dot Dense 131 00:11:35,380 --> 00:11:39,580 and then the number of neurons in the parentheses, which is 30, 132 00:11:40,030 --> 00:11:42,300 and the activation function as ReLU. 133 00:11:44,070 --> 00:11:49,700 Similarly, for the output layer, we will be using keras dot layers dot Dense. 134 00:11:50,130 --> 00:11:56,550 And since this is a regression problem, we will be using a single neuron without any activation function. 135 00:11:58,670 --> 00:11:59,570 Let's just run this. 136 00:12:01,580 --> 00:12:08,000 And again, one important thing: you can comment using the hash symbol inside the cells. 137 00:12:09,560 --> 00:12:16,980 So Python will execute only this part of the code and will not be executing any code which starts with 138 00:12:16,980 --> 00:12:20,050 hash, as hash is meant for starting a comment. 139 00:12:21,670 --> 00:12:30,700 So now we have created the structure or architecture of our neural network. To confirm and view 140 00:12:30,700 --> 00:12:31,310 this structure, 141 00:12:31,670 --> 00:12:35,860 we can call the dot summary method. Just write model dot 142 00:12:40,890 --> 00:12:47,490 summary. Here, you will get the information about the structure that we have created. 143 00:12:47,640 --> 00:12:50,280 So we have the first dense layer with thirty neurons. 144 00:12:51,330 --> 00:12:53,460 We have the second dense layer with thirty neurons. 145 00:12:53,940 --> 00:12:58,350 And lastly, we have a single output layer with one neuron. 146 00:13:00,450 --> 00:13:02,030 This is what we wanted. 147 00:13:03,180 --> 00:13:06,680 The next step should be to compile this model. 148 00:13:09,280 --> 00:13:14,830 Again, the compile method works similarly for both classification and regression models. 149 00:13:15,970 --> 00:13:19,900 First, we have to mention the loss. In classification, 150 00:13:19,930 --> 00:13:21,380 we were using cross-entropy.
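The architecture just described (two 30-neuron ReLU layers and a single linear output neuron), including the seed-setting mentioned earlier, can be sketched as below. The literal `(8,)` input shape is used here; as the narration notes, `X_train.shape[1:]` is an equivalent, dataset-independent way to write it:

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Random seeds, so the run is reproducible (as set earlier in the notebook)
np.random.seed(42)
tf.random.set_seed(42)

model = keras.models.Sequential([
    # First hidden layer: 30 neurons, ReLU, input shape = 8 features
    keras.layers.Dense(30, activation="relu", input_shape=(8,)),  # or X_train.shape[1:]
    # Second hidden layer: another 30 ReLU neurons
    keras.layers.Dense(30, activation="relu"),
    # Output layer: a single neuron, no activation (continuous output)
    keras.layers.Dense(1),
])

model.summary()  # confirm and view the structure
```

The summary lists the three Dense layers (30, 30, and 1 units) and their parameter counts.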
151 00:13:22,360 --> 00:13:29,970 But here, since we are performing regression, we have to use mean squared error, also known as MSE. 152 00:13:31,840 --> 00:13:34,140 The second parameter is optimizer. 153 00:13:36,010 --> 00:13:41,200 Again, here also we are using SGD, stochastic gradient descent. 154 00:13:42,370 --> 00:13:49,870 And here we are also providing the learning rate. By default, the value of the learning rate is zero point 155 00:13:49,870 --> 00:13:52,740 zero one, and to change it, 156 00:13:53,740 --> 00:13:54,850 you can just write 157 00:13:56,150 --> 00:13:58,330 the new value in the parentheses. 158 00:14:00,540 --> 00:14:04,090 We have already discussed what the learning rate is in our theory lecture. 159 00:14:05,080 --> 00:14:08,330 So if you have any doubts, just revisit that lecture. 160 00:14:09,940 --> 00:14:13,270 And then the next parameter that we are passing is metrics. 161 00:14:13,600 --> 00:14:15,250 This is an optional parameter. 162 00:14:17,490 --> 00:14:20,060 In classification, we were using accuracy. 163 00:14:21,080 --> 00:14:31,490 But in regression, we can use mean absolute error, or MAE. Absolute error is the difference between 164 00:14:31,490 --> 00:14:34,040 the predicted value and the actual value, 165 00:14:35,060 --> 00:14:39,960 whereas the squared error is the square of that difference. 166 00:14:41,390 --> 00:14:49,160 So we are including both: mean squared error as the loss function and MAE as the metric we additionally 167 00:14:49,160 --> 00:14:50,060 want to calculate. 168 00:14:51,480 --> 00:14:59,370 Again, just remember, if you want to look at the documentation or help, just click inside any of 169 00:14:59,370 --> 00:15:00,840 the parentheses 170 00:15:01,940 --> 00:15:07,550 and press Shift plus Tab. You will get this kind of documentation. 171 00:15:07,880 --> 00:15:12,850 And here you can see that by default the learning rate value is 0.01.
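The compile step described above can be sketched like this (the model is rebuilt here so the snippet is self-contained; 0.01 is both the value passed explicitly and the SGD default the narration mentions):

```python
from tensorflow import keras

# Same structure as before, rebuilt for a self-contained snippet
model = keras.models.Sequential([
    keras.layers.Dense(30, activation="relu", input_shape=(8,)),
    keras.layers.Dense(30, activation="relu"),
    keras.layers.Dense(1),
])

# Loss: mean squared error (regression), optimizer: stochastic gradient
# descent with an explicit learning rate, plus MAE as an extra metric
# to report alongside the loss.
model.compile(loss="mean_squared_error",
              optimizer=keras.optimizers.SGD(learning_rate=0.01),
              metrics=["mean_absolute_error"])
```

Passing the optimizer as an object rather than the string `"sgd"` is what lets you set the learning rate in the parentheses.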
172 00:15:13,970 --> 00:15:17,420 So in classification, we did not provide any learning rate. 173 00:15:17,750 --> 00:15:22,530 So the learning rate that was used there was zero point zero one. 174 00:15:24,260 --> 00:15:27,430 But you can always tweak these values according to your needs. 175 00:15:29,450 --> 00:15:30,550 Let's run this. 176 00:15:31,360 --> 00:15:33,140 So we have compiled our model. 177 00:15:35,640 --> 00:15:40,740 The next step is to train our model using the training data. 178 00:15:42,240 --> 00:15:44,340 The method or the process is the same. 179 00:15:44,940 --> 00:15:48,760 We are creating another object, model underscore history, for training. 180 00:15:49,530 --> 00:15:57,930 Then we are using model dot fit, and we are passing our training dataset, the number of epochs, and the validation 181 00:15:57,980 --> 00:15:59,730 dataset that we have created. 182 00:16:00,890 --> 00:16:02,950 Let's just run this statement. 183 00:16:05,860 --> 00:16:06,250 Again, 184 00:16:08,740 --> 00:16:14,980 just like the classification model, you will get the loss value, which is the mean squared error. 185 00:16:15,910 --> 00:16:20,380 You will get the MAE value, the mean absolute error. 186 00:16:21,340 --> 00:16:25,960 And similarly, you will get these two values for your validation set 187 00:16:26,000 --> 00:16:26,380 as well. 188 00:16:30,070 --> 00:16:37,150 And you can see that the loss on both the training set and validation set is decreasing with each epoch. 189 00:16:39,550 --> 00:16:43,900 Now we have these values for our training and validation set. 190 00:16:44,680 --> 00:16:49,600 We can also evaluate the performance of this trained model on our test set. 191 00:16:50,860 --> 00:16:55,120 And we are going to use the same method as we did with the classification model. 192 00:16:55,540 --> 00:16:58,140 We'll call model dot evaluate, 193 00:16:58,450 --> 00:17:00,860 and then we will pass our test set.
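The training and evaluation cells described above can be sketched end-to-end as follows. Variable names follow the narration; `epochs=20` is an assumption inferred from the later remark that 20 more epochs give "a total of 40":

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Data preparation, as in the earlier cells
housing = fetch_california_housing()
X_train_full, X_test, y_train_full, y_test = train_test_split(
    housing.data, housing.target)
X_train, X_valid, y_train, y_valid = train_test_split(
    X_train_full, y_train_full)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_valid = scaler.transform(X_valid)
X_test = scaler.transform(X_test)

np.random.seed(42)
tf.random.set_seed(42)

model = keras.models.Sequential([
    keras.layers.Dense(30, activation="relu", input_shape=(8,)),
    keras.layers.Dense(30, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(loss="mean_squared_error",
              optimizer=keras.optimizers.SGD(learning_rate=0.01),
              metrics=["mean_absolute_error"])

# Train, passing the validation data so val_loss / val_mae are reported per epoch
model_history = model.fit(X_train, y_train, epochs=20,
                          validation_data=(X_valid, y_valid))

# Evaluate the trained model on the held-out test set: returns [mse, mae]
test_mse, test_mae = model.evaluate(X_test, y_test)
```

Each epoch prints the training loss (MSE) and MAE followed by the validation loss and MAE, and `evaluate` reports the same two numbers for the test set.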
194 00:17:01,860 --> 00:17:02,500 Let's run this. 195 00:17:04,990 --> 00:17:10,180 You can see on our test set the loss is zero point three. 196 00:17:10,420 --> 00:17:19,870 That is the MSE, or mean squared error, and the MAE, mean absolute error, is zero point four four nine three. 197 00:17:22,420 --> 00:17:29,200 Now, just like in classification, we can call model history dot history. That will give us the 198 00:17:29,320 --> 00:17:32,860 values of all these metrics in the form of a dictionary. 199 00:17:33,800 --> 00:17:40,210 And here you will get the loss and MAE on the training 200 00:17:40,300 --> 00:17:40,990 dataset, 201 00:17:41,130 --> 00:17:43,200 and the validation loss and validation MAE. 202 00:17:43,890 --> 00:17:53,410 The beauty of this is we can plot this dictionary on a graph, just like we did for classification, and 203 00:17:53,410 --> 00:17:55,630 that will show us how our training 204 00:17:55,630 --> 00:18:03,160 loss and validation loss are changing with each epoch, and whether we have achieved convergence or 205 00:18:03,160 --> 00:18:03,540 not. 206 00:18:04,770 --> 00:18:05,830 Let's run this. 207 00:18:07,970 --> 00:18:14,480 So you can see we have the loss values and the MAE values for both the training and validation 208 00:18:14,480 --> 00:18:16,100 sets on this graph. 209 00:18:16,880 --> 00:18:24,920 And one thing to notice is this graph is still going down, meaning that if we run some more epochs, 210 00:18:25,700 --> 00:18:31,190 this will further decrease the losses and improve the accuracy of our model. 211 00:18:33,220 --> 00:18:40,140 So this is one way to tell whether you have achieved convergence or not, or whether you have to increase 212 00:18:40,140 --> 00:18:41,650 your epoch value or not. 213 00:18:42,700 --> 00:18:46,440 You have to look at this validation loss and validation MAE value. 214 00:18:47,620 --> 00:18:49,090 So this is the validation loss.
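The learning-curve plot described above can be sketched as below. This is a self-contained reconstruction, so the model is trained again here; a short 5-epoch run is used for brevity (an assumption, not the lecture's epoch count), since the plotting idiom is the point:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs anywhere
import matplotlib.pyplot as plt
import pandas as pd
from tensorflow import keras
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

housing = fetch_california_housing()
X_train_full, X_test, y_train_full, y_test = train_test_split(
    housing.data, housing.target)
X_train, X_valid, y_train, y_valid = train_test_split(
    X_train_full, y_train_full)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_valid = scaler.transform(X_valid)

model = keras.models.Sequential([
    keras.layers.Dense(30, activation="relu", input_shape=(8,)),
    keras.layers.Dense(30, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(loss="mean_squared_error",
              optimizer=keras.optimizers.SGD(learning_rate=0.01),
              metrics=["mean_absolute_error"])
model_history = model.fit(X_train, y_train, epochs=5,  # short run for brevity
                          validation_data=(X_valid, y_valid))

# history.history is a dict of per-epoch lists: training loss and MAE,
# plus val_loss and validation MAE. Loading it into a DataFrame and
# calling .plot() draws all the curves against the epoch number.
pd.DataFrame(model_history.history).plot()
plt.xlabel("epoch")
plt.savefig("learning_curves.png")
```

If the validation-loss curve is still sloping downward at the last epoch, the model has not yet converged and more epochs are worth running.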
215 00:18:49,420 --> 00:18:53,410 And you can clearly see that it is going down. 216 00:18:55,490 --> 00:18:59,630 So to improve the accuracy, we can rerun this code. 217 00:19:02,060 --> 00:19:04,210 Run it for twenty more epochs. 218 00:19:09,140 --> 00:19:11,240 So let's do that. 219 00:19:13,310 --> 00:19:20,780 Now, one important thing about Keras: Keras keeps the weights and biases values in the memory. 220 00:19:21,020 --> 00:19:27,850 So if you just rerun this whole fit statement again, this will not train the model from scratch, 221 00:19:28,190 --> 00:19:32,720 but it will start training the model from this position. 222 00:19:34,790 --> 00:19:40,960 So if we run this fit statement two times, that is similar to running the statement with double the epochs. 223 00:19:42,660 --> 00:19:45,420 Let's just rerun it one more time. 224 00:19:48,020 --> 00:19:50,450 You can see earlier the loss values were 225 00:19:51,600 --> 00:19:55,160 around point seven or point eight for the first epoch, 226 00:19:55,660 --> 00:19:57,730 and then gradually decreasing. 227 00:19:58,240 --> 00:20:02,170 But now we have started from after the 20th epoch. 228 00:20:07,620 --> 00:20:11,190 Last time, the loss value on our test set was zero point three zero. 229 00:20:11,790 --> 00:20:16,020 Let's see whether we have improved this loss value or not. 230 00:20:18,000 --> 00:20:23,310 You can see the loss has decreased from zero point three to zero point two five. 231 00:20:25,230 --> 00:20:32,260 So our hypothesis was correct that the model had not converged in 20 epochs. 232 00:20:33,000 --> 00:20:38,360 There was room for improvement, and we trained the whole model for 20 more epochs. 233 00:20:38,460 --> 00:20:40,370 That is a total of 40 epochs. 234 00:20:43,630 --> 00:20:45,300 And we can see this graph. 235 00:20:46,850 --> 00:20:48,160 You can just 236 00:20:49,050 --> 00:20:51,240 focus on this validation loss line.
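The "Keras keeps the weights in memory" behaviour described above can be demonstrated with a toy sketch (synthetic data and epoch counts are illustrative assumptions, not the lecture's setup): calling `fit` a second time continues from the current weights rather than starting from scratch, so two 5-epoch runs behave like one 10-epoch run.

```python
import numpy as np
from tensorflow import keras

# A simple synthetic regression problem: predict the sum of 8 features
np.random.seed(0)
X = np.random.rand(1024, 8).astype("float32")
y = X.sum(axis=1, keepdims=True)

model = keras.models.Sequential([
    keras.layers.Dense(30, activation="relu", input_shape=(8,)),
    keras.layers.Dense(1),
])
model.compile(loss="mean_squared_error", optimizer="sgd")

h1 = model.fit(X, y, epochs=5, verbose=0)  # first run, from random weights
h2 = model.fit(X, y, epochs=5, verbose=0)  # second run: continues training

# The second run's first epoch starts roughly where the first run left off,
# far below the loss of the very first epoch.
print(h1.history["loss"][0], h2.history["loss"][0])
```

To instead retrain from scratch, you would have to rebuild (or re-initialize) and recompile the model before calling `fit` again.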
237 00:20:51,870 --> 00:20:56,910 Earlier, it was still going down. For a long period after this, 238 00:20:59,300 --> 00:21:02,780 there is only a slight decrease in the validation loss. 239 00:21:03,020 --> 00:21:06,320 Now you can see that the line has flattened out. 240 00:21:06,890 --> 00:21:11,060 This means we have achieved convergence on this model. 241 00:21:12,440 --> 00:21:14,360 So not just with regression: 242 00:21:14,780 --> 00:21:16,880 if you are running a classification model as well, 243 00:21:18,050 --> 00:21:22,640 just look at this graph to identify whether you have achieved convergence or not. 244 00:21:24,250 --> 00:21:30,740 Now, to predict the values on a new dataset, you can always use the dot predict method. 245 00:21:31,400 --> 00:21:34,310 So your object name and the dot predict method, 246 00:21:34,760 --> 00:21:36,330 and then the new dataset. 247 00:21:37,310 --> 00:21:38,830 I don't have any new dataset, 248 00:21:38,960 --> 00:21:44,000 so I'm just taking a sample of the first three rows of my X test dataset 249 00:21:44,120 --> 00:21:53,420 and considering it as my new dataset, and then saving the predictions in y pred, using the 250 00:21:53,520 --> 00:21:54,070 model dot 251 00:21:54,080 --> 00:21:59,870 predict method to predict the values using this model. 252 00:22:01,760 --> 00:22:04,580 That's all for this lecture. In the next lecture, 253 00:22:04,640 --> 00:22:08,820 we will be looking at the functional API of Keras. 254 00:22:09,830 --> 00:22:10,220 Thank you.
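The prediction step described above can be sketched as follows. The model is rebuilt and given a minimal one-epoch fit here just so the snippet is self-contained (the lecture, of course, predicts with the fully trained model); `X_new` and `y_pred` follow the narration's names:

```python
import numpy as np
from tensorflow import keras
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

housing = fetch_california_housing()
X_train_full, X_test, y_train_full, y_test = train_test_split(
    housing.data, housing.target)
X_train, X_valid, y_train, y_valid = train_test_split(
    X_train_full, y_train_full)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

model = keras.models.Sequential([
    keras.layers.Dense(30, activation="relu", input_shape=(8,)),
    keras.layers.Dense(30, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(loss="mean_squared_error", optimizer="sgd")
model.fit(X_train, y_train, epochs=1, verbose=0)  # minimal fit for illustration

# Pretend the first three rows of the (already standardized) test set are new data
X_new = X_test[:3]
y_pred = model.predict(X_new)
print(y_pred)  # one predicted house value (in $100,000s) per input row
```

Note that any genuinely new data would need to be transformed with the same fitted scaler before being passed to `predict`.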