1 00:00:02,930 --> 00:00:03,590 Hello, everyone. 2 00:00:03,980 --> 00:00:04,640 Welcome back. 3 00:00:05,690 --> 00:00:08,840 In this lecture, we are going to learn about two concepts. 4 00:00:09,320 --> 00:00:13,640 One is how to build a neural network for regression problems. 5 00:00:14,330 --> 00:00:18,170 And the second is how to do it using the functional API. 6 00:00:19,840 --> 00:00:23,840 Now, we have been using the sequential API, and in this lecture we will see 7 00:00:23,900 --> 00:00:33,770 how to use the functional API. The functional API is basically used for defining complex models such as multi-input 8 00:00:33,920 --> 00:00:37,640 or multi-output models, or models which have shared layers. 9 00:00:39,450 --> 00:00:46,990 So first, we will make a normal model using the functional API, which could also be done using the sequential API. 10 00:00:48,290 --> 00:00:55,410 But then we will create a complex neural network structure which can only be handled by the functional 11 00:00:55,410 --> 00:00:55,710 API. 12 00:00:57,480 --> 00:01:04,400 Also, we'll be solving a regression problem, which means our output variable is a continuous measurement. 13 00:01:04,860 --> 00:01:06,670 That is, it is not a zero-or-one type. 14 00:01:06,960 --> 00:01:09,240 It can have any value without any boundaries. 15 00:01:11,340 --> 00:01:14,580 For this problem, we'll be using the Boston housing data. 16 00:01:14,650 --> 00:01:20,340 It is a very standard dataset in which we have 14 variables. 17 00:01:23,550 --> 00:01:31,420 Thirteen of them are predictor variables and the 14th one is the value of the house. Basically, using the values 18 00:01:31,510 --> 00:01:32,250 of the 13 19 00:01:32,400 --> 00:01:36,040 predictor variables, we want to predict the value of the house. 20 00:01:39,270 --> 00:01:46,740 This is also an inbuilt dataset in the Keras library, and we can import it using this line. 21 00:01:50,020 --> 00:01:54,310 If you want to know more about the Boston housing dataset, you can visit this link.
22 00:01:54,970 --> 00:01:57,760 It has details of all the 13 predictor variables. 23 00:01:58,310 --> 00:02:03,340 The predictor variables include variables like crime rate, number of rooms per house, and so on. 24 00:02:05,350 --> 00:02:09,300 You can see that the Boston housing dataset is now imported. 25 00:02:11,230 --> 00:02:13,210 You can look at this by clicking on it. 26 00:02:14,590 --> 00:02:20,960 The Boston housing dataset has two parts: the train part and the test part. Within train, 27 00:02:21,250 --> 00:02:26,260 we have 404 observations of 13 predictor variables. 28 00:02:27,160 --> 00:02:30,640 That is in the x, and we have the labels, 29 00:02:31,240 --> 00:02:33,170 that is, the value of housing 30 00:02:33,200 --> 00:02:37,640 we want to predict, in the y. In test, 31 00:02:37,930 --> 00:02:40,420 we have a set of one hundred and two observations, 32 00:02:40,690 --> 00:02:43,870 again having the predictor variables in x, and in y 33 00:02:43,900 --> 00:02:45,620 we have the output values. 34 00:02:49,240 --> 00:02:56,620 Now, as we did earlier, we'll be importing the training part into the train data and train labels variables, 35 00:02:57,430 --> 00:03:02,820 and the testing part of this dataset into test data and test labels. 36 00:03:03,120 --> 00:03:06,460 So next, we run these two lines of code. 37 00:03:10,000 --> 00:03:15,160 And now we have these new variables: test data and train data. 38 00:03:15,800 --> 00:03:17,960 These are the predictor part of the data. 39 00:03:19,580 --> 00:03:24,800 And test labels and train labels, these are the output part of the data. 40 00:03:27,530 --> 00:03:29,930 Next comes preparing the data. 41 00:03:30,350 --> 00:03:37,340 One of the important steps that we saw earlier was normalizing the data. In the previous problem, 42 00:03:37,700 --> 00:03:40,970 we had only pixel data, which was homogeneous. 43 00:03:41,240 --> 00:03:45,620 So we simply divided it by 255 to get the scaled version of that data.
44 00:03:47,280 --> 00:03:54,260 But now we have heterogeneous data, that is, all these 13 variables represent 13 different things. 45 00:03:56,180 --> 00:03:58,760 It is not as easy to scale such kind of data. 46 00:04:02,120 --> 00:04:03,350 To normalize this data, 47 00:04:03,830 --> 00:04:05,180 we use this function, scale. 48 00:04:07,650 --> 00:04:13,650 This scale function automatically finds out the mean of every variable and the standard deviation of 49 00:04:13,650 --> 00:04:14,080 every variable. 50 00:04:15,450 --> 00:04:18,180 And it uses the formula that I showed you earlier. 51 00:04:18,960 --> 00:04:21,270 It subtracts the mean from each value 52 00:04:21,690 --> 00:04:24,390 and divides that value by the standard deviation. 53 00:04:25,320 --> 00:04:33,530 So simply using the scale function, we can normalize our training data. To normalize the test data, 54 00:04:34,680 --> 00:04:38,520 we do not use the mean and standard deviation of the test data. 55 00:04:39,100 --> 00:04:42,210 We use the mean and standard deviation of the training data. 56 00:04:43,680 --> 00:04:47,610 The concept is: we know only the training part of the data. 57 00:04:48,270 --> 00:04:51,120 Our model does not know any other detail of the world. 58 00:04:52,590 --> 00:04:58,890 We have only the training part, and from that we find out the mean and standard deviation of each variable. 59 00:04:59,640 --> 00:05:06,540 We assume that the standard deviation and mean of each variable apply to the entire dataset of the 60 00:05:06,540 --> 00:05:06,810 world. 61 00:05:08,400 --> 00:05:14,490 So using those means and standard deviations, we will be scaling our test data 62 00:05:14,600 --> 00:05:14,970 also.
63 00:05:18,070 --> 00:05:26,830 So in this line, we will scale our training data using the scale function, and in this line we will 64 00:05:27,160 --> 00:05:35,530 find out the column means in the training data, and we'll be storing that information in this variable 65 00:05:35,530 --> 00:05:36,220 here. 66 00:05:39,230 --> 00:05:45,990 In this line, we are finding out the standard deviations of these variables in the training data and storing 67 00:05:45,990 --> 00:05:48,930 them in this variable. 68 00:05:51,300 --> 00:05:55,260 Now, using the means and standard deviations of the training data, 69 00:05:56,430 --> 00:05:57,990 we use the scale function. 70 00:05:58,770 --> 00:06:00,030 It is the same scale function, 71 00:06:00,210 --> 00:06:07,770 but here we are specifying the mean and the standard deviation to be used for scaling this test data. 72 00:06:10,410 --> 00:06:11,710 Now our data is ready. 73 00:06:12,340 --> 00:06:13,890 Our train data is normalized. 74 00:06:14,040 --> 00:06:16,080 Our test data is also normalized. 75 00:06:18,480 --> 00:06:23,190 For any new data on which you want to predict the outcome of the model, 76 00:06:24,120 --> 00:06:25,910 you have to scale it in the same way, 77 00:06:26,100 --> 00:06:27,780 using this scale function. 78 00:06:31,330 --> 00:06:34,270 Now comes the part where we define the neural network. 79 00:06:35,890 --> 00:06:41,980 This time we'll be using the functional API. The functional API has two different parts. 80 00:06:42,670 --> 00:06:51,190 One is the input and one is the output. The input layer tells the model about all the variables that we are inputting 81 00:06:51,250 --> 00:06:52,150 into the model. 82 00:06:53,560 --> 00:07:01,330 So basically, in the input layer, we tell the model that we have an input layer of shape equal to 83 00:07:01,420 --> 00:07:02,490 the number of variables.
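The lecture performs this scaling with R's scale() function. As a hypothetical sketch of the same idea in Python with numpy (the array values here are made-up stand-ins, not the real Boston housing data), the key point is that the test set is standardized with the training set's means and standard deviations:

```python
import numpy as np

# Made-up stand-ins for the 404 x 13 training and 102 x 13 test matrices.
rng = np.random.default_rng(0)
train_data = rng.normal(loc=10.0, scale=3.0, size=(404, 13))
test_data = rng.normal(loc=10.0, scale=3.0, size=(102, 13))

# Column means and standard deviations of the TRAINING data only.
mean_train = train_data.mean(axis=0)
sd_train = train_data.std(axis=0)

# (value - mean) / standard deviation, the formula described above.
train_scaled = (train_data - mean_train) / sd_train

# The test data is scaled with the TRAIN statistics, never its own,
# because the model is assumed to know only the training part.
test_scaled = (test_data - mean_train) / sd_train
```

After this, every training column has mean 0 and standard deviation 1, while the test columns are only approximately centered, since they were shifted by the training statistics.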
84 00:07:03,820 --> 00:07:10,060 I could have written 13 here because I know that the number of variables is 13 in this particular dataset. 85 00:07:11,560 --> 00:07:16,690 But even if you change your train data, you need not update your model. 86 00:07:17,050 --> 00:07:23,710 If you write it like this, it means that you want to get the second dimension 87 00:07:23,830 --> 00:07:24,700 of the training data. 88 00:07:25,930 --> 00:07:29,920 So basically, the training data has these two dimensions. 89 00:07:31,060 --> 00:07:33,880 It has 404 rows and 13 columns. 90 00:07:35,860 --> 00:07:41,470 We want this dimension because this represents the number of variables in this training data. 91 00:07:41,740 --> 00:07:44,130 So that is why we have written 2 here. 92 00:07:45,220 --> 00:07:52,120 So using this, even if you change your train data to any other dataset, you need not update the shape 93 00:07:52,120 --> 00:07:53,500 for this neural network. 94 00:07:53,740 --> 00:07:55,630 It will automatically get updated. 95 00:07:57,580 --> 00:08:02,170 The second part is the output layer. In this layer, 96 00:08:02,680 --> 00:08:10,030 we first include the input layer in this output layer, which is the same as the input layer that we 97 00:08:10,210 --> 00:08:11,010 created earlier. 98 00:08:13,060 --> 00:08:19,330 This is important because this creates the connection between the input layer and the output layer. 99 00:08:20,470 --> 00:08:26,770 If we do not specify that this output layer has this input layer, then there would be no connection 100 00:08:26,770 --> 00:08:27,790 between these two. 101 00:08:29,680 --> 00:08:34,480 So in the output layer, the first thing is always the input layer that it will take. 102 00:08:36,190 --> 00:08:43,510 Then come the hidden layers, which are specified in a way similar to the sequential API. 103 00:08:44,860 --> 00:08:50,910 In this scenario, we are using two layers, both with 64 neurons.
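The idea of reading the input shape from the data's second dimension (dim(train_data)[2] in R) can be sketched in Python with numpy; the array below is a hypothetical stand-in for the 404 x 13 training matrix:

```python
import numpy as np

# Made-up stand-in with the same shape as the lecture's training data.
train_data = np.zeros((404, 13))

# shape is (rows, columns); index 1 is the second dimension, i.e. the
# number of predictor variables, so the input layer need not hard-code 13.
n_features = train_data.shape[1]
print(n_features)  # 13
```

If the training data is swapped for a dataset with a different number of predictors, n_features updates automatically, which is exactly why the lecture reads the dimension instead of writing 13.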
104 00:08:52,330 --> 00:08:55,200 The activation function for both of these is relu. 105 00:08:57,470 --> 00:09:03,730 The last layer, that is the output layer, has only one neuron, and it has no activation function because it is 106 00:09:03,730 --> 00:09:04,900 a regression problem. 107 00:09:06,430 --> 00:09:08,440 So let's run these two lines of code. 108 00:09:11,110 --> 00:09:14,650 This creates one part, the inputs. 109 00:09:16,610 --> 00:09:20,830 Now, this will create the other part, which is predictions. 110 00:09:24,190 --> 00:09:25,500 Now, in the functional API, 111 00:09:25,960 --> 00:09:29,710 we create the model using the keras_model function. 112 00:09:31,270 --> 00:09:33,190 It takes two parameters. 113 00:09:33,370 --> 00:09:37,650 One is the inputs and one is the outputs. The inputs 114 00:09:37,750 --> 00:09:42,700 we have named as inputs only, and the outputs have been named as predictions. 115 00:09:43,780 --> 00:09:47,170 So here, inputs is equal to inputs and outputs is equal to predictions. 116 00:09:47,620 --> 00:09:49,880 And this defines the model's architecture. 117 00:09:52,120 --> 00:09:59,980 So our model's architecture is: we have 13 variables which are coming in as input. In the first hidden layer, 118 00:10:00,370 --> 00:10:03,140 we have 64 neurons; in the second layer, 119 00:10:03,250 --> 00:10:04,900 we have another 64 neurons. 120 00:10:05,220 --> 00:10:08,560 And in the output layer we have one output neuron. 121 00:10:10,480 --> 00:10:17,110 So when I run this, a model is created and its architecture is specified. 122 00:10:18,420 --> 00:10:21,100 Now we configure this model. In the configuration, 123 00:10:21,280 --> 00:10:23,090 we specify the optimizer. 124 00:10:23,740 --> 00:10:27,100 We can use rmsprop, or whichever optimizer you like. 125 00:10:28,920 --> 00:10:34,120 The loss function for regression problems is MSE, mean squared 126 00:10:34,320 --> 00:10:37,690 error. A metric is not a necessity.
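One way to sanity-check the architecture described above (13 inputs, two dense hidden layers of 64 ReLU neurons, one linear output neuron) is to count its trainable parameters by hand; this is what a model summary would tally for densely connected layers. A small Python sketch of that arithmetic:

```python
# Layer widths for the architecture described: 13 -> 64 -> 64 -> 1.
layer_sizes = [13, 64, 64, 1]

total = 0
for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
    # A dense layer has fan_in * fan_out weights plus fan_out biases.
    total += fan_in * fan_out + fan_out

# 13*64+64 = 896, 64*64+64 = 4160, 64*1+1 = 65
print(total)  # 5121
```

The single output neuron with no activation simply emits the weighted sum, which is what lets the network predict an unbounded continuous house value.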
127 00:10:38,000 --> 00:10:40,790 However, we have used mean absolute error. 128 00:10:43,800 --> 00:10:47,740 Running this line, our model is configured. 129 00:10:50,890 --> 00:10:54,110 Now we train our model using the fit function. 130 00:10:55,170 --> 00:11:02,060 Again, we input the training data, training labels, epochs, and the batch size. 131 00:11:13,750 --> 00:11:19,240 You can see that the model is running for the epochs, and the loss, 132 00:11:19,420 --> 00:11:21,920 that is the MSE, is steadily decreasing. 133 00:11:23,140 --> 00:11:30,060 The mean absolute error is also decreasing, and now the model has run for all the epochs. 134 00:11:32,320 --> 00:11:36,100 We can check the performance of this model on our test data. 135 00:11:37,930 --> 00:11:41,110 This is similar to what we have done when we were using the sequential API. 136 00:11:41,120 --> 00:11:52,180 We use the evaluate function and input the test data and test labels, and we get a list containing the test loss 137 00:11:52,210 --> 00:11:53,710 and the test absolute error. 138 00:11:54,150 --> 00:12:00,760 We can run these two commands, and we can see that the test loss is twenty-two point five six 139 00:12:01,780 --> 00:12:04,210 and the test absolute error is four point four. 140 00:12:06,650 --> 00:12:12,040 So in this video we saw how to use the functional API to build a neural network model. 141 00:12:13,390 --> 00:12:17,350 This model could have been built using the sequential API as well. 142 00:12:18,700 --> 00:12:21,850 And in fact, it would have been easier to use the sequential API. 143 00:12:22,920 --> 00:12:29,980 But in the next lecture, you will see, if we have a complex neural network architecture, how the functional 144 00:12:29,980 --> 00:12:33,640 API helps us in building that. See you in the next one.
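The two numbers reported by the evaluate step are the MSE loss and the MAE metric. A short Python sketch of what each measures, using made-up house values and predictions (in the same spirit as the lecture's data, not taken from it):

```python
import numpy as np

# Hypothetical true house values and model predictions.
y_true = np.array([24.0, 21.6, 34.7, 33.4])
y_pred = np.array([26.0, 20.0, 30.0, 35.0])

# Mean squared error: the regression loss being minimized during fit.
mse = np.mean((y_true - y_pred) ** 2)

# Mean absolute error: the metric reported alongside the loss,
# in the same units as the house values themselves.
mae = np.mean(np.abs(y_true - y_pred))
```

Note that MAE can never exceed the square root of MSE, which is why a reported loss of 22.56 with an absolute error of 4.4 is internally consistent.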