1 00:00:02,930 --> 00:00:03,590 Hello, everyone. 2 00:00:03,980 --> 00:00:04,640 Welcome back. 3 00:00:05,690 --> 00:00:08,840 In this lecture, we are going to learn about two concepts. 4 00:00:09,320 --> 00:00:13,640 One is how to build a neural network for regression problems. 5 00:00:14,330 --> 00:00:18,170 And the second is how to do it using the functional API. 6 00:00:19,840 --> 00:00:23,840 Now, we have been using the sequential API, and in this lecture we will see 7 00:00:23,900 --> 00:00:33,770 how to use the functional API. The functional API is basically used for defining complex models such as multi-input 8 00:00:33,920 --> 00:00:37,640 or multi-output models, or models which have shared layers. 9 00:00:39,450 --> 00:00:46,990 So first, we will make a normal model using the functional API, which could also be done using the sequential API. 10 00:00:48,290 --> 00:00:55,410 But then we will create a complex neural network structure which can only be handled by the functional 11 00:00:55,410 --> 00:00:55,710 API. 12 00:00:57,480 --> 00:01:04,400 Also, we'll be solving a regression problem, which means our output variable is a continuous measurement. 13 00:01:04,860 --> 00:01:06,670 That is, it is not a zero-or-one type. 14 00:01:06,960 --> 00:01:09,240 It can have any value without any boundaries. 15 00:01:11,340 --> 00:01:14,580 For this problem, we'll be using the Boston housing data. 16 00:01:14,650 --> 00:01:20,340 It is a very standard dataset in which we have 14 variables. 17 00:01:23,550 --> 00:01:31,420 Thirteen of them are predictor variables and the 14th one is the value of the house. Basically, using the values 18 00:01:31,510 --> 00:01:32,250 of the 13 19 00:01:32,400 --> 00:01:36,040 predictor variables, we want to predict the value of the house. 20 00:01:39,270 --> 00:01:46,740 This is also an inbuilt dataset in the Keras library, and we can import it using this line. 21 00:01:50,020 --> 00:01:54,310 If you want to know more about the Boston housing dataset, you can visit this link.
22 00:01:54,970 --> 00:01:57,760 It has details of all the 13 predictor variables. 23 00:01:58,310 --> 00:02:03,340 The predictor variables include variables like crime rate, number of rooms per house, and so on. 24 00:02:05,350 --> 00:02:09,300 You can see that the Boston housing dataset is now imported. 25 00:02:11,230 --> 00:02:13,210 You can look at this by clicking on it. 26 00:02:14,590 --> 00:02:20,960 The Boston housing dataset has two parts: the train part and the test part. Within train, 27 00:02:21,250 --> 00:02:26,260 we have 404 observations of 13 predictor variables. 28 00:02:27,160 --> 00:02:30,640 That is in the x, and we have the labels, 29 00:02:31,240 --> 00:02:33,170 that is, the value of housing 30 00:02:33,200 --> 00:02:37,640 we want to predict, in the y. In test, 31 00:02:37,930 --> 00:02:40,420 we have a set of one hundred and two observations, 32 00:02:40,690 --> 00:02:43,870 again having the predictor variables in x, and in y 33 00:02:43,900 --> 00:02:45,620 we have the output values. 34 00:02:49,240 --> 00:02:56,620 Now, as we did earlier, we'll be importing the training part into the train data and train labels variables, 35 00:02:57,430 --> 00:03:02,820 and the testing part of this dataset into test data and test labels. 36 00:03:03,120 --> 00:03:06,460 So next, we run these two lines of code. 37 00:03:10,000 --> 00:03:15,160 And now we have these new variables: test data and train data. 38 00:03:15,800 --> 00:03:17,960 These are the predictor part of the data. 39 00:03:19,580 --> 00:03:24,800 And test labels and train labels, these are the output part of the data. 40 00:03:27,530 --> 00:03:29,930 Next comes preparing the data. 41 00:03:30,350 --> 00:03:37,340 One of the important steps that we saw earlier was normalizing the data. In the previous problem, 42 00:03:37,700 --> 00:03:40,970 we had only pixel data, which was homogeneous. 43 00:03:41,240 --> 00:03:45,620 So we simply divided it by 255 to get the scaled version of that data.
44 00:03:47,280 --> 00:03:54,260 But now we have heterogeneous data, that is, all these 13 variables represent 13 different things. 45 00:03:56,180 --> 00:03:58,760 It is not as easy to scale such kind of data. 46 00:04:02,120 --> 00:04:03,350 To normalize this data, 47 00:04:03,830 --> 00:04:05,180 we use this function, scale. 48 00:04:07,650 --> 00:04:13,650 This scale function automatically finds out the mean of every variable and the standard deviation of 49 00:04:13,650 --> 00:04:14,080 every variable. 50 00:04:15,450 --> 00:04:18,180 And it uses the formula that I showed you earlier. 51 00:04:18,960 --> 00:04:21,270 It subtracts the mean from each value 52 00:04:21,690 --> 00:04:24,390 and divides that value by the standard deviation. 53 00:04:25,320 --> 00:04:33,530 So simply using the scale function, we can normalize our training data. To normalize the test data, 54 00:04:34,680 --> 00:04:38,520 we do not use the mean and standard deviation of the test data. 55 00:04:39,100 --> 00:04:42,210 We use the mean and standard deviation of the training data. 56 00:04:43,680 --> 00:04:47,610 The concept is: we know only the training part of the data. 57 00:04:48,270 --> 00:04:51,120 Our model does not know any other detail of the world. 58 00:04:52,590 --> 00:04:58,890 We have only the training part, and from that we find out the mean and standard deviation of each variable. 59 00:04:59,640 --> 00:05:06,540 We assume that the standard deviation and mean of each variable apply to the entire dataset of the 60 00:05:06,540 --> 00:05:06,810 world. 61 00:05:08,400 --> 00:05:14,490 So using those means and standard deviations, we will be scaling our test data 62 00:05:14,600 --> 00:05:14,970 also.
63 00:05:18,070 --> 00:05:26,830 So in this line, we will scale our training data using the scale function, and in this line we will 64 00:05:27,160 --> 00:05:35,530 find out the column means in the training data, and we'll be storing that information in this variable 65 00:05:35,530 --> 00:05:36,220 here. 66 00:05:39,230 --> 00:05:45,990 In this line, we are finding out the standard deviations of these variables in the training data and storing 67 00:05:45,990 --> 00:05:48,930 them in this variable. 68 00:05:51,300 --> 00:05:55,260 Now, using the means and standard deviations of the training data, 69 00:05:56,430 --> 00:05:57,990 we use the scale function. 70 00:05:58,770 --> 00:06:00,030 It is the same scale function, 71 00:06:00,210 --> 00:06:07,770 but here we are specifying the mean and the standard deviation to be used for scaling this test data. 72 00:06:10,410 --> 00:06:11,710 Now our data is ready. 73 00:06:12,340 --> 00:06:13,890 Our train data is normalized. 74 00:06:14,040 --> 00:06:16,080 Our test data is also normalized. 75 00:06:18,480 --> 00:06:23,190 For any new data on which you want to predict the outcome of the model, 76 00:06:24,120 --> 00:06:25,910 you have to scale it in the same way, 77 00:06:26,100 --> 00:06:27,780 using this scale function. 78 00:06:31,330 --> 00:06:34,270 Now comes the part where we define the neural network. 79 00:06:35,890 --> 00:06:41,980 This time we'll be using the functional API. The functional API has two different parts. 80 00:06:42,670 --> 00:06:51,190 One is the input and one is the output. The input layer tells the model about all the variables that we are inputting 81 00:06:51,250 --> 00:06:52,150 into the model. 82 00:06:53,560 --> 00:07:01,330 So basically, in the input layer, we tell the model that we have an input layer of shape equal to 83 00:07:01,420 --> 00:07:02,490 the number of variables.
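The lecture performs this scaling with R's scale() function. As a hypothetical sketch of the same idea in Python with numpy (the array values here are made-up stand-ins, not the real Boston housing data), the key point is that the test set is standardized with the training set's means and standard deviations:

```python
import numpy as np

# Made-up stand-ins for the 404 x 13 training and 102 x 13 test matrices.
rng = np.random.default_rng(0)
train_data = rng.normal(loc=10.0, scale=3.0, size=(404, 13))
test_data = rng.normal(loc=10.0, scale=3.0, size=(102, 13))

# Column means and standard deviations of the TRAINING data only.
mean_train = train_data.mean(axis=0)
sd_train = train_data.std(axis=0)

# (value - mean) / standard deviation, the formula described above.
train_scaled = (train_data - mean_train) / sd_train

# The test data is scaled with the TRAIN statistics, never its own,
# because the model is assumed to know only the training part.
test_scaled = (test_data - mean_train) / sd_train
```

After this, every training column has mean 0 and standard deviation 1, while the test columns are only approximately centered, since they were shifted by the training statistics.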
84 00:07:03,820 --> 00:07:10,060 I could have written 13 here because I know that the number of variables is 13 in this particular dataset. 85 00:07:11,560 --> 00:07:16,690 But even if you change your train data, you need not update your model. 86 00:07:17,050 --> 00:07:23,710 If you write it like this, it means that you want to get the second dimension 87 00:07:23,830 --> 00:07:24,700 of the training data. 88 00:07:25,930 --> 00:07:29,920 So basically, the training data has these two dimensions. 89 00:07:31,060 --> 00:07:33,880 It has 404 rows and 13 columns. 90 00:07:35,860 --> 00:07:41,470 We want this dimension because this represents the number of variables in this training data. 91 00:07:41,740 --> 00:07:44,130 So that is why we have written 2 here. 92 00:07:45,220 --> 00:07:52,120 So using this, even if you change your train data to any other dataset, you need not update the shape 93 00:07:52,120 --> 00:07:53,500 for this neural network. 94 00:07:53,740 --> 00:07:55,630 It will automatically get updated. 95 00:07:57,580 --> 00:08:02,170 The second part is the output layer. In this layer, 96 00:08:02,680 --> 00:08:10,030 we first include the input layer in this output layer, which is the same as the input layer that we 97 00:08:10,210 --> 00:08:11,010 created earlier. 98 00:08:13,060 --> 00:08:19,330 This is important because this creates the connection between the input layer and the output layer. 99 00:08:20,470 --> 00:08:26,770 If we do not specify that this output layer has this input layer, then there would be no connection 100 00:08:26,770 --> 00:08:27,790 between these two. 101 00:08:29,680 --> 00:08:34,480 So in the output layer, the first thing is always the input layer that it will take. 102 00:08:36,190 --> 00:08:43,510 Then come the hidden layers, which are specified in a way similar to the sequential API. 103 00:08:44,860 --> 00:08:50,910 In this scenario, we are using two layers, both with 64 neurons.
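The idea of reading the input shape from the data's second dimension (dim(train_data)[2] in R) can be sketched in Python with numpy; the array below is a hypothetical stand-in for the 404 x 13 training matrix:

```python
import numpy as np

# Made-up stand-in with the same shape as the lecture's training data.
train_data = np.zeros((404, 13))

# shape is (rows, columns); index 1 is the second dimension, i.e. the
# number of predictor variables, so the input layer need not hard-code 13.
n_features = train_data.shape[1]
print(n_features)  # 13
```

If the training data is swapped for a dataset with a different number of predictors, n_features updates automatically, which is exactly why the lecture reads the dimension instead of writing 13.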
104 00:08:52,330 --> 00:08:55,200 The activation function for both of these is relu. 105 00:08:57,470 --> 00:09:03,730 The last layer, that is the output layer, has only one neuron, and it has no activation function because it is 106 00:09:03,730 --> 00:09:04,900 a regression problem. 107 00:09:06,430 --> 00:09:08,440 So let's run these two lines of code. 108 00:09:11,110 --> 00:09:14,650 This creates one part, the inputs. 109 00:09:16,610 --> 00:09:20,830 Now, this will create the other part, which is predictions. 110 00:09:24,190 --> 00:09:25,500 Now, in the functional API, 111 00:09:25,960 --> 00:09:29,710 we create the model using the keras_model function. 112 00:09:31,270 --> 00:09:33,190 It takes two parameters. 113 00:09:33,370 --> 00:09:37,650 One is the inputs and one is the outputs. The inputs 114 00:09:37,750 --> 00:09:42,700 we have named as inputs only, and the outputs have been named as predictions. 115 00:09:43,780 --> 00:09:47,170 So here, inputs is equal to inputs and outputs is equal to predictions. 116 00:09:47,620 --> 00:09:49,880 And this defines the model's architecture. 117 00:09:52,120 --> 00:09:59,980 So our model's architecture is: we have 13 variables which are coming in as input. In the first hidden layer, 118 00:10:00,370 --> 00:10:03,140 we have 64 neurons; in the second layer, 119 00:10:03,250 --> 00:10:04,900 we have another 64 neurons. 120 00:10:05,220 --> 00:10:08,560 And in the output layer we have one output neuron. 121 00:10:10,480 --> 00:10:17,110 So when I run this, a model is created and its architecture is specified. 122 00:10:18,420 --> 00:10:21,100 Now we configure this model. In the configuration, 123 00:10:21,280 --> 00:10:23,090 we specify the optimizer. 124 00:10:23,740 --> 00:10:27,100 We can use rmsprop, or whichever optimizer you like. 125 00:10:28,920 --> 00:10:34,120 The loss function for regression problems is MSE, mean squared 126 00:10:34,320 --> 00:10:37,690 error. A metric is not a necessity.
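One way to sanity-check the architecture described above (13 inputs, two dense hidden layers of 64 ReLU neurons, one linear output neuron) is to count its trainable parameters by hand; this is what a model summary would tally for densely connected layers. A small Python sketch of that arithmetic:

```python
# Layer widths for the architecture described: 13 -> 64 -> 64 -> 1.
layer_sizes = [13, 64, 64, 1]

total = 0
for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
    # A dense layer has fan_in * fan_out weights plus fan_out biases.
    total += fan_in * fan_out + fan_out

# 13*64+64 = 896, 64*64+64 = 4160, 64*1+1 = 65
print(total)  # 5121
```

The single output neuron with no activation simply emits the weighted sum, which is what lets the network predict an unbounded continuous house value.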
127 00:10:38,000 --> 00:10:40,790 However, we have used mean absolute error. 128 00:10:43,800 --> 00:10:47,740 Running this line, our model is configured. 129 00:10:50,890 --> 00:10:54,110 Now we train our model using the fit function. 130 00:10:55,170 --> 00:11:02,060 Again, we input the training data, training labels, epochs, and the batch size. 131 00:11:13,750 --> 00:11:19,240 You can see that the model is running for the epochs, and the loss, 132 00:11:19,420 --> 00:11:21,920 that is the MSE, is steadily decreasing. 133 00:11:23,140 --> 00:11:30,060 The mean absolute error is also decreasing, and now the model has run for all the epochs. 134 00:11:32,320 --> 00:11:36,100 We can check the performance of this model on our test data. 135 00:11:37,930 --> 00:11:41,110 This is similar to what we have done when we were using the sequential API. 136 00:11:41,120 --> 00:11:52,180 We use the evaluate function and input the test data and test labels, and we get a list containing the test loss 137 00:11:52,210 --> 00:11:53,710 and the test absolute error. 138 00:11:54,150 --> 00:12:00,760 We can run these two commands, and we can see that the test loss is twenty-two point five six 139 00:12:01,780 --> 00:12:04,210 and the test absolute error is four point four. 140 00:12:06,650 --> 00:12:12,040 So in this video we saw how to use the functional API to build a neural network model. 141 00:12:13,390 --> 00:12:17,350 This model could have been built using the sequential API as well. 142 00:12:18,700 --> 00:12:21,850 And in fact, it would have been easier to use the sequential API. 143 00:12:22,920 --> 00:12:29,980 But in the next lecture, you will see, if we have a complex neural network architecture, how the functional 144 00:12:29,980 --> 00:12:33,640 API helps us in building that. See you in the next one.
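The two numbers reported by the evaluate step are the MSE loss and the MAE metric. A short Python sketch of what each measures, using made-up house values and predictions (in the same spirit as the lecture's data, not taken from it):

```python
import numpy as np

# Hypothetical true house values and model predictions.
y_true = np.array([24.0, 21.6, 34.7, 33.4])
y_pred = np.array([26.0, 20.0, 30.0, 35.0])

# Mean squared error: the regression loss being minimized during fit.
mse = np.mean((y_true - y_pred) ** 2)

# Mean absolute error: the metric reported alongside the loss,
# in the same units as the house values themselves.
mae = np.mean(np.abs(y_true - y_pred))
```

Note that MAE can never exceed the square root of MSE, which is why a reported loss of 22.56 with an absolute error of 4.4 is internally consistent.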