1 00:00:00,720 --> 00:00:05,350 In this lesson we're going to talk about the TensorFlow graph in detail. 2 00:00:05,460 --> 00:00:09,770 This will give us a chance to review some of the code that we've written so far. 3 00:00:09,810 --> 00:00:16,320 The goal for this lesson is really to connect all the puzzle pieces that we've come across and come 4 00:00:16,320 --> 00:00:20,910 away with an understanding of how everything fits together in TensorFlow. 5 00:00:20,910 --> 00:00:24,340 I know that's quite ambitious but we can do it. 6 00:00:24,360 --> 00:00:30,870 So far we've seen a couple of different elements crop up in our code, for example TensorFlow placeholders, 7 00:00:31,140 --> 00:00:38,380 TensorFlow constants, TensorFlow variables, the TensorFlow session and something called a feed dictionary. 8 00:00:38,430 --> 00:00:43,350 These are a lot of little components to get your head around, but all of these things link back to the 9 00:00:43,350 --> 00:00:51,460 central idea of the TensorFlow graph. Head over to TensorBoard and click Graphs at the very top. 10 00:00:51,720 --> 00:00:54,870 What you should then see is something like this. 11 00:00:54,870 --> 00:00:58,060 This monstrosity here is our TensorFlow graph. 12 00:00:58,180 --> 00:01:02,090 That's pretty intimidating the very first time you see it, and 13 00:01:02,490 --> 00:01:05,860 I know that I was super confused when I first looked at it. 14 00:01:06,090 --> 00:01:12,720 But let's demystify what the graph is, how it works, and how all these little components that we have 15 00:01:12,720 --> 00:01:13,640 in our code 16 00:01:13,650 --> 00:01:14,560 link back to it. 17 00:01:14,910 --> 00:01:16,650 So what is the graph? 18 00:01:16,740 --> 00:01:21,760 The best way to understand what the graph is is to think about how we were working with TensorFlow in 19 00:01:21,780 --> 00:01:26,580 the first place, namely this idea of a two step process. 20 00:01:26,580 --> 00:01:31,080 Step 1 was defining all the calculations and all the variables. 21 00:01:31,080 --> 00:01:35,740 And Step 2 was running and evaluating our calculations. 22 00:01:35,880 --> 00:01:43,470 The reason we have Step 1 is because TensorFlow takes all your code and compiles this thing: it compiles 23 00:01:43,590 --> 00:01:44,450 the graph. 24 00:01:44,460 --> 00:01:49,070 In other words it lays down the pipes before it allows us to pump any water through them. 25 00:01:49,170 --> 00:01:57,780 And if I zoom in here, what we see is that this graph is composed of nodes and of edges, so a node is 26 00:01:57,780 --> 00:02:05,240 something like this or something like this, and an edge is the arrow connecting two nodes. 27 00:02:05,340 --> 00:02:12,030 So what does a node represent in this graph? A node represents a mathematical operation. 28 00:02:12,570 --> 00:02:20,820 So addition would be an example of a node, but also subtraction and multiplication and even some of the 29 00:02:20,820 --> 00:02:26,340 fancy operations that we've done, like calculating the activation values through our ReLU activation 30 00:02:26,340 --> 00:02:26,990 function. 31 00:02:27,000 --> 00:02:30,960 This is why you can find a ReLU node on our graph as well. 32 00:02:30,960 --> 00:02:33,520 A lot of the other calculations you'll find as well. 33 00:02:33,600 --> 00:02:39,390 For example we can go to our graph and find our softmax calculation, and we can also go to our graph 34 00:02:39,720 --> 00:02:42,400 and find our cross entropy calculation.
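A minimal sketch of the two-step process described above, assuming the TensorFlow 1.x API used in this course; the tensors a, b and c are illustrative and not taken from the lesson's notebook:

import tensorflow as tf

# Step 1: define the calculations. Nothing is computed yet; each call just
# adds a node (an operation) to the graph.
a = tf.constant(2.0)
b = tf.constant(3.0)
c = a + b            # an "add" node; c is a handle to a tensor, not the value 5.0

print(c)             # prints a Tensor object, not 5.0 -- the graph hasn't run

# Step 2: run and evaluate the calculations inside a session.
with tf.Session() as sess:
    print(sess.run(c))   # now the graph is executed and we get 5.0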
35 00:02:42,450 --> 00:02:48,060 If I check in TensorBoard I can see my softmax here and my cross entropy calculation here. 36 00:02:48,540 --> 00:02:52,830 So every time we do a calculation with TensorFlow it shows up on the graph. 37 00:02:52,920 --> 00:02:59,460 When we first set up our neural network we drew some random numbers from a distribution. Going down all 38 00:02:59,460 --> 00:03:00,660 the way on our graph, 39 00:03:00,660 --> 00:03:06,510 we can find this calculation right here as a node, truncated_normal. 40 00:03:06,570 --> 00:03:11,350 Now if we look closely we can see that all the nodes are connected by these lines. 41 00:03:11,430 --> 00:03:18,030 They're all connected by these little arrows. These arrows are called the edges of the graph, and the 42 00:03:18,030 --> 00:03:25,220 edges are the things that carry the data, meaning the data flows along these edges. 43 00:03:25,230 --> 00:03:28,960 Say we have this add operation here in our Python code. 44 00:03:29,010 --> 00:03:34,510 This addition corresponds to this node right here in our TensorFlow graph. 45 00:03:34,590 --> 00:03:41,730 We can see that the two edges that go into this addition are the inputs to the addition, and the arrow 46 00:03:41,730 --> 00:03:46,410 that comes out of the addition is the result of this addition. 47 00:03:46,410 --> 00:03:53,010 So we've got the result of the matrix multiplication going into our addition, and then we've got the 48 00:03:53,010 --> 00:03:56,970 result of the addition going into our ReLU activation function. 49 00:03:57,360 --> 00:04:03,130 So in our Python code that corresponds to the result of this being added to this. 50 00:04:03,330 --> 00:04:09,250 And then this result feeding into our ReLU activation function. 51 00:04:09,270 --> 00:04:16,770 This is why you can interpret the edges as the inputs and outputs for these operations. 52 00:04:16,770 --> 00:04:22,440 Now since we're adding our biases to our weights here after the multiplication, we can take a closer 53 00:04:22,440 --> 00:04:23,920 look at these numbers here. 54 00:04:23,940 --> 00:04:30,450 This one here is upside down, which makes it a little bit hard to read, but it does read 512. 55 00:04:30,450 --> 00:04:32,880 Same as this one here: it reads question mark 56 00:04:32,880 --> 00:04:36,120 times five hundred and twelve. 57 00:04:36,180 --> 00:04:38,420 So what do these numbers represent? 58 00:04:38,550 --> 00:04:43,980 These numbers represent the shape of these two tensors. 59 00:04:44,010 --> 00:04:48,360 So not only can you think of these little arrows as inputs to the add function. 60 00:04:48,360 --> 00:04:55,440 This little arrow, this edge, actually represents a tensor, and the tensor has a shape. In the case of our 61 00:04:55,440 --> 00:04:56,420 biases, 62 00:04:56,430 --> 00:05:01,330 we have five hundred and twelve biases for that first layer. 63 00:05:01,400 --> 00:05:01,680 Right. 64 00:05:01,730 --> 00:05:04,340 Because we've got five hundred and twelve neurons. 65 00:05:04,340 --> 00:05:11,840 And when we created our biases we created five hundred and twelve of them, one for each and every single 66 00:05:11,840 --> 00:05:13,480 neuron in the layer. 67 00:05:13,910 --> 00:05:17,240 And this is why there is the number 512 here. 68 00:05:17,390 --> 00:05:21,990 On the other side we also have five hundred and twelve different weights.
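To make the shape labels on the edges concrete, here is a small sketch of one such layer, again assuming TensorFlow 1.x; the dimensions 784 and 512 match the lesson, but the variable names are just illustrative:

import tensorflow as tf

X  = tf.placeholder(tf.float32, shape=[None, 784])              # ? x 784
w1 = tf.Variable(tf.truncated_normal([784, 512], stddev=0.1))   # the truncated_normal node
b1 = tf.Variable(tf.constant(0.0, shape=[512]))                 # 512 biases

layer1_in  = tf.add(tf.matmul(X, w1), b1)   # the matmul edge and the bias edge meet at "add"
layer1_out = tf.nn.relu(layer1_in)          # the add result flows on to the ReLU node

# The shapes printed here are the numbers written on the edges in TensorBoard:
print(layer1_in.shape)    # (?, 512) -- the number of samples is still unknown
print(layer1_out.shape)   # (?, 512)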
69 00:05:22,040 --> 00:05:28,970 The reason we have a question mark here is because we don't know yet how many samples we have in this 70 00:05:28,970 --> 00:05:35,690 tensor, because we left this open for later, and we did that in this cell here where we created the place 71 00:05:35,690 --> 00:05:39,770 holder and we left one part of the shape blank. 72 00:05:39,770 --> 00:05:46,550 So if the numbers on these edges represent the shapes of our tensors and we have data flowing between 73 00:05:46,550 --> 00:05:52,960 the nodes, now we know where the flow part of the name in TensorFlow comes from. 74 00:05:53,120 --> 00:05:56,240 But how do we add things to our graph in the first place? 75 00:05:56,240 --> 00:06:02,840 So for example, where did this variable come from, and also, what is a variable in TensorFlow? The way you 76 00:06:02,840 --> 00:06:07,980 can think of a variable is something that maintains the state of the graph. 77 00:06:08,020 --> 00:06:15,740 I know that sounds really abstract, but variables can be updated and can change. In our case the variables 78 00:06:15,740 --> 00:06:21,140 that we care about are the weights and the biases that we're going to be updating 79 00:06:21,140 --> 00:06:25,800 as we're training our neural network. So let's look at where we were creating our variables. 80 00:06:25,850 --> 00:06:29,960 We created our first TensorFlow variables with this line of code here. 81 00:06:30,020 --> 00:06:34,540 The key thing that we had to specify when creating these variables was their shape. 82 00:06:34,550 --> 00:06:39,500 That's one of the reasons why this argument on the line above was so important. 83 00:06:39,500 --> 00:06:44,330 This is where we specified the shape of the variable that we're creating on the line below. 84 00:06:44,810 --> 00:06:49,730 Now if variables are the things that can change over time, then the things that don't change over time 85 00:06:50,120 --> 00:06:51,640 are called constants. 86 00:06:51,650 --> 00:06:58,430 If you want to add a constant to the graph, then we do so with tf.constant. Now here, the constant 87 00:06:58,430 --> 00:07:04,640 values are only for the initialization; the biases themselves are actually variables. 88 00:07:04,880 --> 00:07:08,810 It's only their initial starting values which we've created as a constant. 89 00:07:08,810 --> 00:07:13,910 Now all of this was still part of the setup process, but at some point we actually wanted to do some 90 00:07:13,910 --> 00:07:14,990 calculations. 91 00:07:15,110 --> 00:07:21,860 We wanted to launch this graph, and as a prerequisite we had to initialize all the variables that we 92 00:07:21,860 --> 00:07:23,000 created. 93 00:07:23,120 --> 00:07:24,260 Where did we do that? 94 00:07:24,290 --> 00:07:27,250 We did it right here on this line of code. 95 00:07:27,260 --> 00:07:34,190 This is the line of code that evaluated all the initialization operations that we had above and then 96 00:07:34,280 --> 00:07:37,040 allowed us to start our session. 97 00:07:37,040 --> 00:07:43,220 The reason we have to run this line of code is because prior to this line being executed, none of the 98 00:07:43,220 --> 00:07:46,220 variables actually hold any value. 99 00:07:46,220 --> 00:07:53,930 The TensorFlow variables only get their values after the initializer is evaluated, and that happens 100 00:07:53,960 --> 00:07:54,980 on this line here.
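A hedged sketch of the variable, constant and initializer relationship described above, assuming TensorFlow 1.x; the 512 zero-valued biases mirror the lesson, but the exact names are assumptions:

import tensorflow as tf

# The constant only supplies the starting values; the bias itself is a
# Variable, i.e. state in the graph that training can update.
initial_b1 = tf.constant(0.0, shape=[512])
b1 = tf.Variable(initial_value=initial_b1)

# Before the initializer is evaluated, b1 holds no value at all.
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)            # only now do the variables receive their values
    print(sess.run(b1)[:5])   # [0. 0. 0. 0. 0.]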
101 00:07:55,430 --> 00:08:01,310 It's from this point onwards, when we're inside a session, that we can actually evaluate a particular 102 00:08:01,310 --> 00:08:05,720 tensor and look at its values. Take b3: 103 00:08:05,750 --> 00:08:07,690 remember, it was one of our biases. 104 00:08:07,730 --> 00:08:12,610 These were the biases in the output layer, which all had a starting value of zero. 105 00:08:12,620 --> 00:08:17,900 And this is the part of the discussion where we can start talking about this thing called a session. 106 00:08:18,490 --> 00:08:23,930 A TensorFlow session is when our placeholders can start getting their values. 107 00:08:23,930 --> 00:08:30,560 Remember our placeholders? We created those at the very top, and we had a placeholder for our data, for 108 00:08:30,560 --> 00:08:35,180 our features, and we had a placeholder for our labels. 109 00:08:35,180 --> 00:08:38,600 We created these two tensors at the very beginning. 110 00:08:38,600 --> 00:08:41,500 So let's find them on the TensorFlow graph. 111 00:08:41,510 --> 00:08:44,740 This here is the placeholder for our X's. 112 00:08:44,990 --> 00:08:49,790 And the reason I know this is because after this creation operation we end up with a little tensor here 113 00:08:50,210 --> 00:08:56,810 that has a shape of question mark by seven hundred and eighty four, and that corresponds to the shape 114 00:08:56,900 --> 00:08:58,630 that we've got right here: 115 00:08:58,790 --> 00:09:04,750 None for the question mark and seven hundred and eighty four for the total number of inputs. 116 00:09:04,760 --> 00:09:11,500 Now, the thing about placeholders is that placeholders must be fed, and they must be fed during the session 117 00:09:11,570 --> 00:09:13,700 if you want TensorFlow to do some work. 118 00:09:13,700 --> 00:09:19,520 This is my mental bridge: the placeholders are hungry and you have to feed them. Placeholders will do 119 00:09:19,520 --> 00:09:20,860 your work for food. 120 00:09:21,110 --> 00:09:22,630 How do we feed them? 121 00:09:22,630 --> 00:09:26,710 Well, we feed them with a feed dictionary. 122 00:09:26,710 --> 00:09:29,520 The feed dictionary is what we supply 123 00:09:29,540 --> 00:09:31,220 when we run a session. 124 00:09:31,280 --> 00:09:34,010 You can see that we do this right here. 125 00:09:34,010 --> 00:09:40,220 The reason why the feed dictionary is important is because it maps the placeholder, our X, and it maps 126 00:09:40,310 --> 00:09:44,280 our other placeholder, our y, to the actual data. 127 00:09:44,420 --> 00:09:50,000 In this case it's our batch of training features, and on this line it's the features that are part of 128 00:09:50,000 --> 00:09:51,750 our evaluation dataset. 129 00:09:51,770 --> 00:09:54,920 The important thing is that the shapes match, right? 130 00:09:54,950 --> 00:09:57,870 This is the kind of check that TensorFlow will actually do: 131 00:09:58,040 --> 00:10:01,880 does the placeholder shape match what we're mapping it to? 132 00:10:01,940 --> 00:10:10,460 So if you recall, the shape of this batch_x was 1000, because we had 1000 samples in our batch, 133 00:10:10,940 --> 00:10:15,380 by seven hundred and eighty four features. Down here 134 00:10:15,410 --> 00:10:16,900 we had a different shape. 135 00:10:17,000 --> 00:10:20,990 We had 10000 samples instead of 1000. 136 00:10:20,990 --> 00:10:22,980 But we had the same number of features: 137 00:10:23,000 --> 00:10:25,430 seven hundred and eighty four.
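A small sketch of the feeding mechanism just described, assuming TensorFlow 1.x; the batch sizes 1000 and 10000 come from the lesson, while the dummy arrays and the reduce_sum operation are only there to give the placeholder something to do:

import numpy as np
import tensorflow as tf

X = tf.placeholder(tf.float32, shape=[None, 784], name='X')
total = tf.reduce_sum(X)   # some operation that depends on the placeholder

batch_x = np.zeros((1000, 784), dtype=np.float32)    # a training batch
x_test  = np.zeros((10000, 784), dtype=np.float32)   # the evaluation features

with tf.Session() as sess:
    # Both arrays fit the [None, 784] placeholder: the batch dimension was
    # left open, but the 784 features have to match exactly.
    sess.run(total, feed_dict={X: batch_x})
    sess.run(total, feed_dict={X: x_test})
    # Feeding, say, a (1000, 100) array here would raise a shape error.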
138 00:10:25,430 --> 00:10:32,420 The reason we can map both of these to the same placeholder is because our placeholder is willing 139 00:10:32,420 --> 00:10:38,520 to accept a different number of samples as long as the number of features is consistent. 140 00:10:38,540 --> 00:10:43,580 So by leaving this part of the shape blank we're able to feed it a different number of samples, 141 00:10:43,670 --> 00:10:46,740 as long as we're consistent on the other dimension. 142 00:10:46,940 --> 00:10:52,430 So this is something to bear in mind when you're creating your feed dictionary: TensorFlow will actually 143 00:10:52,430 --> 00:10:58,160 check that the shape of the data matches the shape that you specified for the placeholder. 144 00:10:58,670 --> 00:11:05,500 Once we're done running our session, the last important thing that we did was release our resources. 145 00:11:05,660 --> 00:11:07,340 So we started our session, 146 00:11:07,340 --> 00:11:10,540 we ran our session, did a bunch of calculations, 147 00:11:10,670 --> 00:11:16,880 and at the end we closed our session. Closing our session frees up all the resources that we used. 148 00:11:16,880 --> 00:11:22,010 So I know that we've written a lot of code and we didn't really get to see it in action until we ran 149 00:11:22,010 --> 00:11:23,730 our session at the very end. 150 00:11:23,750 --> 00:11:29,750 So I hope this quick review tied things together and allowed us to see the connection between the 151 00:11:29,750 --> 00:11:33,130 graph and our Python code. 152 00:11:33,170 --> 00:11:38,750 What I'd like to do now is actually clean this graph up, because one thing that you'll notice is that 153 00:11:38,750 --> 00:11:43,800 the names in it, Variable, Variable_1, Variable_2, 154 00:11:43,880 --> 00:11:45,770 they're not terribly helpful, right? 155 00:11:46,190 --> 00:11:52,760 So we can actually clean our graph up and give these parts of the graph a better name, which will make 156 00:11:52,760 --> 00:11:55,180 it a lot more clear as to what's going on. 157 00:11:55,190 --> 00:12:01,850 Let's come to the very top, where we're creating our placeholders, and I'll hit shift tab on my keyboard 158 00:12:02,480 --> 00:12:05,120 and I'll bring up the quick documentation. 159 00:12:05,120 --> 00:12:10,130 If you take a look here you see that there is an additional parameter that we can supply, and that is 160 00:12:10,310 --> 00:12:16,670 the name. The description for this name parameter is the name for the operation, and they say this is 161 00:12:16,700 --> 00:12:19,840 optional, which is why we haven't used it so far. 162 00:12:20,510 --> 00:12:26,840 But I do think naming the different operations is quite helpful, because it makes the graph at the end 163 00:12:26,930 --> 00:12:28,910 in TensorBoard a lot more clear. 164 00:12:29,480 --> 00:12:37,830 So I'll come here and then I'll type name is equal to, and I'll give it a capital X as the name. For 165 00:12:37,830 --> 00:12:38,470 the y 166 00:12:38,550 --> 00:12:40,290 I'm going to add a name as well. 167 00:12:40,440 --> 00:12:47,300 And in this case I'll pick something else: I might pick labels as the name. 168 00:12:47,400 --> 00:12:49,030 But let's not stop there.
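Roughly what the renaming step looks like, assuming TensorFlow 1.x; 'X' and 'labels' are the names picked in the lesson, while the shapes and the rest of the snippet are a simplified stand-in for the notebook code:

import tensorflow as tf

# The optional name argument is what shows up on the TensorBoard graph.
X = tf.placeholder(tf.float32, shape=[None, 784], name='X')
Y = tf.placeholder(tf.float32, shape=[None, 10], name='labels')

sess = tf.Session()
# ... run the training and evaluation steps here ...
sess.close()   # releases the resources the session was holding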
169 00:12:49,080 --> 00:12:55,530 Let's come down where we're setting up our layers our weights and our biases and if we hit shift tab 170 00:12:55,650 --> 00:13:01,770 here again and bring up the quick documentation for the tensor flow variable we also see that it is 171 00:13:01,770 --> 00:13:04,530 able to accept this name parameter. 172 00:13:04,530 --> 00:13:14,220 So let's give it one name is equal to single quotes w 1 and this bias will stick with the same variable 173 00:13:14,220 --> 00:13:14,860 names. 174 00:13:14,940 --> 00:13:22,460 So name is equal to single quotes B one we can do the same thing for all our other layers. 175 00:13:22,470 --> 00:13:24,430 So this one will call. 176 00:13:24,480 --> 00:13:26,780 Name is equal to W2. 177 00:13:26,850 --> 00:13:29,760 This one will call names equal to single quotes. 178 00:13:29,770 --> 00:13:39,870 B 2 and we've got name is equal to w 3 and here name is equal to be three. 179 00:13:39,960 --> 00:13:43,960 Now what I'll do is I'll delete all the sub directories that I've got here. 180 00:13:44,070 --> 00:13:46,900 Intensive board amnesty digital logs. 181 00:13:47,220 --> 00:13:55,010 And I'm going to come back into Jupiter and I'm going to run all the cells below my setup markdown cell. 182 00:13:55,050 --> 00:14:00,500 At this point I should have a brand new sub folder here and I can go into my tensor board. 183 00:14:00,900 --> 00:14:04,190 Let's check our graph and see if we can see our new names. 184 00:14:04,200 --> 00:14:09,870 The first thing I see is a message about no data set being found but I'm going to ignore that and I'm 185 00:14:09,870 --> 00:14:17,160 going to zoom in here and I can see now that my place holder was created here and here I've created 186 00:14:17,160 --> 00:14:23,040 the weights for the first hidden layer here I've created the biases for my first hidden layer and they're 187 00:14:23,040 --> 00:14:25,080 flowing to my add function. 188 00:14:25,140 --> 00:14:27,020 This is my second hidden layer. 189 00:14:27,120 --> 00:14:29,130 This here is my output layer. 190 00:14:29,370 --> 00:14:33,030 My output layer flows to my soft Max activation function. 191 00:14:33,030 --> 00:14:39,660 And then I calculate my cross entropy loss using the labels that I've created here. 192 00:14:39,660 --> 00:14:48,270 So these are the actual labels and those alongside with whatever comes out of my output layer gets used. 193 00:14:48,270 --> 00:14:52,810 When we calculate the loss with our cross entropy loss function. 194 00:14:52,920 --> 00:14:53,750 Fantastic. 195 00:14:54,030 --> 00:14:57,880 So that already makes things a lot more clear than they were before. 196 00:14:58,080 --> 00:15:00,830 But I think we can do better than that still. 197 00:15:00,840 --> 00:15:06,480 One of the things that we can do is we can start grouping some of these calculations to make our graph 198 00:15:06,510 --> 00:15:13,170 even easier to understand and the things that when a group are actually all the calculations that belong 199 00:15:13,170 --> 00:15:14,250 to the same layer. 200 00:15:14,700 --> 00:15:21,690 So for example I want to group the creation of my weights and the matrix multiplication and this addition 201 00:15:21,690 --> 00:15:23,940 of the biases and this activation. 202 00:15:24,090 --> 00:15:30,300 I want to group all of this into a single layer and after that I want to do the same thing for Layer 203 00:15:30,300 --> 00:15:33,240 number two and my output layer. 
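A sketch of the named weights and biases plus writing the graph out so TensorBoard can pick up the new names, assuming TensorFlow 1.x; the hidden-layer sizes and the log directory here are assumptions rather than values copied from the notebook:

import tensorflow as tf

w1 = tf.Variable(tf.truncated_normal([784, 512], stddev=0.1), name='w1')
b1 = tf.Variable(tf.constant(0.0, shape=[512]), name='b1')
w2 = tf.Variable(tf.truncated_normal([512, 64], stddev=0.1), name='w2')
b2 = tf.Variable(tf.constant(0.0, shape=[64]), name='b2')
w3 = tf.Variable(tf.truncated_normal([64, 10], stddev=0.1), name='w3')
b3 = tf.Variable(tf.constant(0.0, shape=[10]), name='b3')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Writing the graph to a fresh log folder is what makes the renamed
    # nodes show up when TensorBoard is pointed at that directory.
    writer = tf.summary.FileWriter('logs/renamed_run', sess.graph)
    writer.close()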
204 00:15:33,240 --> 00:15:36,550 And that grouping is exactly what we're going to talk about in the next lesson. 205 00:15:36,570 --> 00:15:37,500 I'll see you there.