1 00:00:00,720 --> 00:00:05,350 In this lesson we're going to talk about the TensorFlow graph in detail. 2 00:00:05,460 --> 00:00:09,770 This will give us a chance to review some of the code that we've written so far. 3 00:00:09,810 --> 00:00:16,320 The goal for this lesson is really to connect all the puzzle pieces that we've come across and come 4 00:00:16,320 --> 00:00:20,910 away with an understanding of how everything fits together in TensorFlow. 5 00:00:20,910 --> 00:00:24,340 I know that's quite ambitious but we can do it. 6 00:00:24,360 --> 00:00:30,870 So far we've seen a couple of different elements crop up in our code, for example TensorFlow placeholders, 7 00:00:31,140 --> 00:00:38,380 TensorFlow constants, TensorFlow variables, the TensorFlow session and something called a feed dictionary. 8 00:00:38,430 --> 00:00:43,350 These are a lot of little components to get your head around, but all of these things link back to the 9 00:00:43,350 --> 00:00:51,460 central idea of the TensorFlow graph. Head over to TensorBoard and click Graphs at the very top. 10 00:00:51,720 --> 00:00:54,870 What you should then see is something like this. 11 00:00:54,870 --> 00:00:58,060 This monstrosity here is our TensorFlow graph. 12 00:00:58,180 --> 00:01:02,090 That's pretty intimidating the very first time you see it, and 13 00:01:02,490 --> 00:01:05,860 I know that I was super confused when I first looked at it. 14 00:01:06,090 --> 00:01:12,720 But let's demystify what the graph is, how it works, and how all these little components that we have 15 00:01:12,720 --> 00:01:13,640 in our code 16 00:01:13,650 --> 00:01:14,560 link back to it. 17 00:01:14,910 --> 00:01:16,650 So what is the graph? 18 00:01:16,740 --> 00:01:21,760 The best way to understand what the graph is is to think about how we were working with TensorFlow in 19 00:01:21,780 --> 00:01:26,580 the first place, namely this idea of a two step process. 20 00:01:26,580 --> 00:01:31,080 Step 1 was defining all the calculations and all the variables. 21 00:01:31,080 --> 00:01:35,740 And Step 2 was running and evaluating our calculations. 22 00:01:35,880 --> 00:01:43,470 The reason we have Step 1 is because TensorFlow takes all your code and compiles this thing: it compiles 23 00:01:43,590 --> 00:01:44,450 the graph. 24 00:01:44,460 --> 00:01:49,070 In other words it lays down the pipes before it allows us to pump any water through them. 25 00:01:49,170 --> 00:01:57,780 And if I zoom in here, what we see is that this graph is composed of nodes and of edges, so a node is 26 00:01:57,780 --> 00:02:05,240 something like this or something like this, and an edge is the arrow connecting two nodes. 27 00:02:05,340 --> 00:02:12,030 So what does a node represent in this graph? A node represents a mathematical operation. 28 00:02:12,570 --> 00:02:20,820 So addition would be an example of a node, but also subtraction and multiplication and even some of the 29 00:02:20,820 --> 00:02:26,340 fancy operations that we've done, like calculating the activation values through our ReLU activation 30 00:02:26,340 --> 00:02:26,990 function. 31 00:02:27,000 --> 00:02:30,960 This is why you can find a ReLU node on our graph as well. 32 00:02:30,960 --> 00:02:33,520 A lot of the other calculations you'll find as well. 33 00:02:33,600 --> 00:02:39,390 For example we can go to our graph and find our softmax calculation, and we can also go to our graph 34 00:02:39,720 --> 00:02:42,400 and find our cross entropy calculation.
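A minimal sketch of the two-step process described above, assuming the TensorFlow 1.x API used in this course; the tensors a, b and c are illustrative and not taken from the lesson's notebook:

import tensorflow as tf

# Step 1: define the calculations. Nothing is computed yet; each call just
# adds a node (an operation) to the graph.
a = tf.constant(2.0)
b = tf.constant(3.0)
c = a + b            # an "add" node; c is a handle to a tensor, not the value 5.0

print(c)             # prints a Tensor object, not 5.0 -- the graph hasn't run

# Step 2: run and evaluate the calculations inside a session.
with tf.Session() as sess:
    print(sess.run(c))   # now the graph is executed and we get 5.0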
35 00:02:42,450 --> 00:02:48,060 If I check in TensorBoard I can see my softmax here and my cross entropy calculation here. 36 00:02:48,540 --> 00:02:52,830 So every time we do a calculation with TensorFlow it shows up on the graph. 37 00:02:52,920 --> 00:02:59,460 When we first set up our neural network we drew some random numbers from a distribution. Going down all 38 00:02:59,460 --> 00:03:00,660 the way on our graph, 39 00:03:00,660 --> 00:03:06,510 we can find this calculation right here as a node, truncated_normal. 40 00:03:06,570 --> 00:03:11,350 Now if we look closely we can see that all the nodes are connected by these lines. 41 00:03:11,430 --> 00:03:18,030 They're all connected by these little arrows. These arrows are called the edges of the graph, and the 42 00:03:18,030 --> 00:03:25,220 edges are the things that carry the data, meaning the data flows along these edges. 43 00:03:25,230 --> 00:03:28,960 Say we have this add operation here in our Python code. 44 00:03:29,010 --> 00:03:34,510 This addition corresponds to this node right here in our TensorFlow graph. 45 00:03:34,590 --> 00:03:41,730 We can see that the two edges that go into this addition are the inputs to the addition, and the arrow 46 00:03:41,730 --> 00:03:46,410 that comes out of the addition is the result of this addition. 47 00:03:46,410 --> 00:03:53,010 So we've got the result of the matrix multiplication going into our addition, and then we've got the 48 00:03:53,010 --> 00:03:56,970 result of the addition going into our ReLU activation function. 49 00:03:57,360 --> 00:04:03,130 So in our Python code that corresponds to the result of this being added to this. 50 00:04:03,330 --> 00:04:09,250 And then this result feeding into our ReLU activation function. 51 00:04:09,270 --> 00:04:16,770 This is why you can interpret the edges as the inputs and outputs for these operations. 52 00:04:16,770 --> 00:04:22,440 Now since we're adding our biases to our weights here after the multiplication, we can take a closer 53 00:04:22,440 --> 00:04:23,920 look at these numbers here. 54 00:04:23,940 --> 00:04:30,450 This one here is upside down, which makes it a little bit hard to read, but it does read 512. 55 00:04:30,450 --> 00:04:32,880 Same as this one here: it reads question mark 56 00:04:32,880 --> 00:04:36,120 times five hundred and twelve. 57 00:04:36,180 --> 00:04:38,420 So what do these numbers represent? 58 00:04:38,550 --> 00:04:43,980 These numbers represent the shape of these two tensors. 59 00:04:44,010 --> 00:04:48,360 So not only can you think of these little arrows as inputs to the add function. 60 00:04:48,360 --> 00:04:55,440 This little arrow, this edge, actually represents a tensor, and the tensor has a shape. In the case of our 61 00:04:55,440 --> 00:04:56,420 biases, 62 00:04:56,430 --> 00:05:01,330 we have five hundred and twelve biases for that first layer. 63 00:05:01,400 --> 00:05:01,680 Right. 64 00:05:01,730 --> 00:05:04,340 Because we've got five hundred and twelve neurons. 65 00:05:04,340 --> 00:05:11,840 And when we created our biases we created five hundred and twelve of them, one for each and every single 66 00:05:11,840 --> 00:05:13,480 neuron in the layer. 67 00:05:13,910 --> 00:05:17,240 And this is why there is the number 512 here. 68 00:05:17,390 --> 00:05:21,990 On the other side we also have five hundred and twelve different weights.
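To make the shape labels on the edges concrete, here is a small sketch of one such layer, again assuming TensorFlow 1.x; the dimensions 784 and 512 match the lesson, but the variable names are just illustrative:

import tensorflow as tf

X  = tf.placeholder(tf.float32, shape=[None, 784])              # ? x 784
w1 = tf.Variable(tf.truncated_normal([784, 512], stddev=0.1))   # the truncated_normal node
b1 = tf.Variable(tf.constant(0.0, shape=[512]))                 # 512 biases

layer1_in  = tf.add(tf.matmul(X, w1), b1)   # the matmul edge and the bias edge meet at "add"
layer1_out = tf.nn.relu(layer1_in)          # the add result flows on to the ReLU node

# The shapes printed here are the numbers written on the edges in TensorBoard:
print(layer1_in.shape)    # (?, 512) -- the number of samples is still unknown
print(layer1_out.shape)   # (?, 512)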
69 00:05:22,040 --> 00:05:28,970 The reason we have a question mark here is because we don't know yet how many samples we have in this 70 00:05:28,970 --> 00:05:35,690 tensor, because we left this open for later, and we did that in this cell here where we created the place 71 00:05:35,690 --> 00:05:39,770 holder and we left one part of the shape blank. 72 00:05:39,770 --> 00:05:46,550 So if the numbers on these edges represent the shapes of our tensors and we have data flowing between 73 00:05:46,550 --> 00:05:52,960 the nodes, now we know where the flow part of the name in TensorFlow comes from. 74 00:05:53,120 --> 00:05:56,240 But how do we add things to our graph in the first place? 75 00:05:56,240 --> 00:06:02,840 So for example, where did this variable come from, and also, what is a variable in TensorFlow? The way you 76 00:06:02,840 --> 00:06:07,980 can think of a variable is something that maintains the state of the graph. 77 00:06:08,020 --> 00:06:15,740 I know that sounds really abstract, but variables can be updated and can change. In our case the variables 78 00:06:15,740 --> 00:06:21,140 that we care about are the weights and the biases that we're going to be updating 79 00:06:21,140 --> 00:06:25,800 as we're training our neural network. So let's look at where we were creating our variables. 80 00:06:25,850 --> 00:06:29,960 We created our first TensorFlow variables with this line of code here. 81 00:06:30,020 --> 00:06:34,540 The key thing that we had to specify when creating these variables was their shape. 82 00:06:34,550 --> 00:06:39,500 That's one of the reasons why this argument on the line above was so important. 83 00:06:39,500 --> 00:06:44,330 This is where we specified the shape of the variable that we're creating on the line below. 84 00:06:44,810 --> 00:06:49,730 Now if variables are the things that can change over time, then the things that don't change over time 85 00:06:50,120 --> 00:06:51,640 are called constants. 86 00:06:51,650 --> 00:06:58,430 If you want to add a constant to the graph, then we do so with tf.constant. Now here, the constant 87 00:06:58,430 --> 00:07:04,640 values are only for the initialization; the biases themselves are actually variables. 88 00:07:04,880 --> 00:07:08,810 It's only their initial starting values which we've created as a constant. 89 00:07:08,810 --> 00:07:13,910 Now all of this was still part of the setup process, but at some point we actually wanted to do some 90 00:07:13,910 --> 00:07:14,990 calculations. 91 00:07:15,110 --> 00:07:21,860 We wanted to launch this graph, and as a prerequisite we had to initialize all the variables that we 92 00:07:21,860 --> 00:07:23,000 created. 93 00:07:23,120 --> 00:07:24,260 Where did we do that? 94 00:07:24,290 --> 00:07:27,250 We did it right here on this line of code. 95 00:07:27,260 --> 00:07:34,190 This is the line of code that evaluated all the initialization operations that we had above and then 96 00:07:34,280 --> 00:07:37,040 allowed us to start our session. 97 00:07:37,040 --> 00:07:43,220 The reason we have to run this line of code is because prior to this line being executed, none of the 98 00:07:43,220 --> 00:07:46,220 variables actually hold any value. 99 00:07:46,220 --> 00:07:53,930 The TensorFlow variables only get their values after the initializer is evaluated, and that happens 100 00:07:53,960 --> 00:07:54,980 on this line here.
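A hedged sketch of the variable, constant and initializer relationship described above, assuming TensorFlow 1.x; the 512 zero-valued biases mirror the lesson, but the exact names are assumptions:

import tensorflow as tf

# The constant only supplies the starting values; the bias itself is a
# Variable, i.e. state in the graph that training can update.
initial_b1 = tf.constant(0.0, shape=[512])
b1 = tf.Variable(initial_value=initial_b1)

# Before the initializer is evaluated, b1 holds no value at all.
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)            # only now do the variables receive their values
    print(sess.run(b1)[:5])   # [0. 0. 0. 0. 0.]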
101 00:07:55,430 --> 00:08:01,310 It's from this point onwards, when we're inside a session, that we can actually evaluate a particular 102 00:08:01,310 --> 00:08:05,720 tensor and look at its values. Take b3: 103 00:08:05,750 --> 00:08:07,690 remember, it was one of our biases. 104 00:08:07,730 --> 00:08:12,610 These were the biases in the output layer, which all had a starting value of zero. 105 00:08:12,620 --> 00:08:17,900 And this is the part of the discussion where we can start talking about this thing called a session. 106 00:08:18,490 --> 00:08:23,930 A TensorFlow session is when our placeholders can start getting their values. 107 00:08:23,930 --> 00:08:30,560 Remember our placeholders? We created those at the very top, and we had a placeholder for our data, for 108 00:08:30,560 --> 00:08:35,180 our features, and we had a placeholder for our labels. 109 00:08:35,180 --> 00:08:38,600 We created these two tensors at the very beginning. 110 00:08:38,600 --> 00:08:41,500 So let's find them on the TensorFlow graph. 111 00:08:41,510 --> 00:08:44,740 This here is the placeholder for our X's. 112 00:08:44,990 --> 00:08:49,790 And the reason I know this is because after this creation operation we end up with a little tensor here 113 00:08:50,210 --> 00:08:56,810 that has a shape of question mark by seven hundred and eighty four, and that corresponds to the shape 114 00:08:56,900 --> 00:08:58,630 that we've got right here: 115 00:08:58,790 --> 00:09:04,750 None for the question mark and seven hundred and eighty four for the total number of inputs. 116 00:09:04,760 --> 00:09:11,500 Now, the thing about placeholders is that placeholders must be fed, and they must be fed during the session 117 00:09:11,570 --> 00:09:13,700 if you want TensorFlow to do some work. 118 00:09:13,700 --> 00:09:19,520 This is my mental bridge: the placeholders are hungry and you have to feed them. Placeholders will do 119 00:09:19,520 --> 00:09:20,860 your work for food. 120 00:09:21,110 --> 00:09:22,630 How do we feed them? 121 00:09:22,630 --> 00:09:26,710 Well, we feed them with a feed dictionary. 122 00:09:26,710 --> 00:09:29,520 The feed dictionary is what we supply 123 00:09:29,540 --> 00:09:31,220 when we run a session. 124 00:09:31,280 --> 00:09:34,010 You can see that we do this right here. 125 00:09:34,010 --> 00:09:40,220 The reason why the feed dictionary is important is because it maps the placeholder, our X, and it maps 126 00:09:40,310 --> 00:09:44,280 our other placeholder, our y, to the actual data. 127 00:09:44,420 --> 00:09:50,000 In this case it's our batch of training features, and on this line it's the features that are part of 128 00:09:50,000 --> 00:09:51,750 our evaluation dataset. 129 00:09:51,770 --> 00:09:54,920 The important thing is that the shapes match, right? 130 00:09:54,950 --> 00:09:57,870 This is the kind of check that TensorFlow will actually do: 131 00:09:58,040 --> 00:10:01,880 does the placeholder shape match what we're mapping it to? 132 00:10:01,940 --> 00:10:10,460 So if you recall, the shape of this batch_x was 1000, because we had 1000 samples in our batch, 133 00:10:10,940 --> 00:10:15,380 by seven hundred and eighty four features. Down here 134 00:10:15,410 --> 00:10:16,900 we had a different shape. 135 00:10:17,000 --> 00:10:20,990 We had 10000 samples instead of 1000. 136 00:10:20,990 --> 00:10:22,980 But we had the same number of features: 137 00:10:23,000 --> 00:10:25,430 seven hundred and eighty four.
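A small sketch of the feeding mechanism just described, assuming TensorFlow 1.x; the batch sizes 1000 and 10000 come from the lesson, while the dummy arrays and the reduce_sum operation are only there to give the placeholder something to do:

import numpy as np
import tensorflow as tf

X = tf.placeholder(tf.float32, shape=[None, 784], name='X')
total = tf.reduce_sum(X)   # some operation that depends on the placeholder

batch_x = np.zeros((1000, 784), dtype=np.float32)    # a training batch
x_test  = np.zeros((10000, 784), dtype=np.float32)   # the evaluation features

with tf.Session() as sess:
    # Both arrays fit the [None, 784] placeholder: the batch dimension was
    # left open, but the 784 features have to match exactly.
    sess.run(total, feed_dict={X: batch_x})
    sess.run(total, feed_dict={X: x_test})
    # Feeding, say, a (1000, 100) array here would raise a shape error.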
138 00:10:25,430 --> 00:10:32,420 The reason we can map both of these to the same placeholder is because our placeholder is willing 139 00:10:32,420 --> 00:10:38,520 to accept a different number of samples as long as the number of features is consistent. 140 00:10:38,540 --> 00:10:43,580 So by leaving this part of the shape blank we're able to feed it a different number of samples, 141 00:10:43,670 --> 00:10:46,740 as long as we're consistent on the other dimension. 142 00:10:46,940 --> 00:10:52,430 So this is something to bear in mind when you're creating your feed dictionary: TensorFlow will actually 143 00:10:52,430 --> 00:10:58,160 check that the shape of the data matches the shape that you specified for the placeholder. 144 00:10:58,670 --> 00:11:05,500 Once we're done running our session, the last important thing that we did was release our resources. 145 00:11:05,660 --> 00:11:07,340 So we started our session, 146 00:11:07,340 --> 00:11:10,540 we ran our session, did a bunch of calculations, 147 00:11:10,670 --> 00:11:16,880 and at the end we closed our session. Closing our session frees up all the resources that we used. 148 00:11:16,880 --> 00:11:22,010 So I know that we've written a lot of code and we didn't really get to see it in action until we ran 149 00:11:22,010 --> 00:11:23,730 our session at the very end. 150 00:11:23,750 --> 00:11:29,750 So I hope this quick review tied things together and allowed us to see the connection between the 151 00:11:29,750 --> 00:11:33,130 graph and our Python code. 152 00:11:33,170 --> 00:11:38,750 What I'd like to do now is actually clean this graph up, because one thing that you'll notice is that 153 00:11:38,750 --> 00:11:43,800 the names in it, Variable, Variable_1, Variable_2, 154 00:11:43,880 --> 00:11:45,770 they're not terribly helpful, right? 155 00:11:46,190 --> 00:11:52,760 So we can actually clean our graph up and give these parts of the graph a better name, which will make 156 00:11:52,760 --> 00:11:55,180 it a lot more clear as to what's going on. 157 00:11:55,190 --> 00:12:01,850 Let's come to the very top, where we're creating our placeholders, and I'll hit shift tab on my keyboard 158 00:12:02,480 --> 00:12:05,120 and I'll bring up the quick documentation. 159 00:12:05,120 --> 00:12:10,130 If you take a look here you see that there is an additional parameter that we can supply, and that is 160 00:12:10,310 --> 00:12:16,670 the name. The description for this name parameter is the name for the operation, and they say this is 161 00:12:16,700 --> 00:12:19,840 optional, which is why we haven't used it so far. 162 00:12:20,510 --> 00:12:26,840 But I do think naming the different operations is quite helpful, because it makes the graph at the end 163 00:12:26,930 --> 00:12:28,910 in TensorBoard a lot more clear. 164 00:12:29,480 --> 00:12:37,830 So I'll come here and then I'll type name is equal to, and I'll give it a capital X as the name. For 165 00:12:37,830 --> 00:12:38,470 the y 166 00:12:38,550 --> 00:12:40,290 I'm going to add a name as well. 167 00:12:40,440 --> 00:12:47,300 And in this case I'll pick something else: I might pick labels as the name. 168 00:12:47,400 --> 00:12:49,030 But let's not stop there.
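Roughly what the renaming step looks like, assuming TensorFlow 1.x; 'X' and 'labels' are the names picked in the lesson, while the shapes and the rest of the snippet are a simplified stand-in for the notebook code:

import tensorflow as tf

# The optional name argument is what shows up on the TensorBoard graph.
X = tf.placeholder(tf.float32, shape=[None, 784], name='X')
Y = tf.placeholder(tf.float32, shape=[None, 10], name='labels')

sess = tf.Session()
# ... run the training and evaluation steps here ...
sess.close()   # releases the resources the session was holding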
169 00:12:49,080 --> 00:12:55,530 Let's come down where we're setting up our layers our weights and our biases and if we hit shift tab 170 00:12:55,650 --> 00:13:01,770 here again and bring up the quick documentation for the tensor flow variable we also see that it is 171 00:13:01,770 --> 00:13:04,530 able to accept this name parameter. 172 00:13:04,530 --> 00:13:14,220 So let's give it one name is equal to single quotes w 1 and this bias will stick with the same variable 173 00:13:14,220 --> 00:13:14,860 names. 174 00:13:14,940 --> 00:13:22,460 So name is equal to single quotes B one we can do the same thing for all our other layers. 175 00:13:22,470 --> 00:13:24,430 So this one will call. 176 00:13:24,480 --> 00:13:26,780 Name is equal to W2. 177 00:13:26,850 --> 00:13:29,760 This one will call names equal to single quotes. 178 00:13:29,770 --> 00:13:39,870 B 2 and we've got name is equal to w 3 and here name is equal to be three. 179 00:13:39,960 --> 00:13:43,960 Now what I'll do is I'll delete all the sub directories that I've got here. 180 00:13:44,070 --> 00:13:46,900 Intensive board amnesty digital logs. 181 00:13:47,220 --> 00:13:55,010 And I'm going to come back into Jupiter and I'm going to run all the cells below my setup markdown cell. 182 00:13:55,050 --> 00:14:00,500 At this point I should have a brand new sub folder here and I can go into my tensor board. 183 00:14:00,900 --> 00:14:04,190 Let's check our graph and see if we can see our new names. 184 00:14:04,200 --> 00:14:09,870 The first thing I see is a message about no data set being found but I'm going to ignore that and I'm 185 00:14:09,870 --> 00:14:17,160 going to zoom in here and I can see now that my place holder was created here and here I've created 186 00:14:17,160 --> 00:14:23,040 the weights for the first hidden layer here I've created the biases for my first hidden layer and they're 187 00:14:23,040 --> 00:14:25,080 flowing to my add function. 188 00:14:25,140 --> 00:14:27,020 This is my second hidden layer. 189 00:14:27,120 --> 00:14:29,130 This here is my output layer. 190 00:14:29,370 --> 00:14:33,030 My output layer flows to my soft Max activation function. 191 00:14:33,030 --> 00:14:39,660 And then I calculate my cross entropy loss using the labels that I've created here. 192 00:14:39,660 --> 00:14:48,270 So these are the actual labels and those alongside with whatever comes out of my output layer gets used. 193 00:14:48,270 --> 00:14:52,810 When we calculate the loss with our cross entropy loss function. 194 00:14:52,920 --> 00:14:53,750 Fantastic. 195 00:14:54,030 --> 00:14:57,880 So that already makes things a lot more clear than they were before. 196 00:14:58,080 --> 00:15:00,830 But I think we can do better than that still. 197 00:15:00,840 --> 00:15:06,480 One of the things that we can do is we can start grouping some of these calculations to make our graph 198 00:15:06,510 --> 00:15:13,170 even easier to understand and the things that when a group are actually all the calculations that belong 199 00:15:13,170 --> 00:15:14,250 to the same layer. 200 00:15:14,700 --> 00:15:21,690 So for example I want to group the creation of my weights and the matrix multiplication and this addition 201 00:15:21,690 --> 00:15:23,940 of the biases and this activation. 202 00:15:24,090 --> 00:15:30,300 I want to group all of this into a single layer and after that I want to do the same thing for Layer 203 00:15:30,300 --> 00:15:33,240 number two and my output layer. 
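A sketch of the named weights and biases plus writing the graph out so TensorBoard can pick up the new names, assuming TensorFlow 1.x; the hidden-layer sizes and the log directory here are assumptions rather than values copied from the notebook:

import tensorflow as tf

w1 = tf.Variable(tf.truncated_normal([784, 512], stddev=0.1), name='w1')
b1 = tf.Variable(tf.constant(0.0, shape=[512]), name='b1')
w2 = tf.Variable(tf.truncated_normal([512, 64], stddev=0.1), name='w2')
b2 = tf.Variable(tf.constant(0.0, shape=[64]), name='b2')
w3 = tf.Variable(tf.truncated_normal([64, 10], stddev=0.1), name='w3')
b3 = tf.Variable(tf.constant(0.0, shape=[10]), name='b3')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Writing the graph to a fresh log folder is what makes the renamed
    # nodes show up when TensorBoard is pointed at that directory.
    writer = tf.summary.FileWriter('logs/renamed_run', sess.graph)
    writer.close()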
204 00:15:33,240 --> 00:15:36,550 And that grouping is exactly what we're going to talk about in the next lesson. 205 00:15:36,570 --> 00:15:37,500 I'll see you there.