So we've seen how we can evaluate our model's performance, or at least its initial performance, on things like epoch accuracy and epoch loss, and how we can track different experiments. But that's only on the model's training and validation sets, and it only really reflects what the model has learned from the dataset.

Another really good way to evaluate what our model has learned is to use it for what we're building it for. We're building Dog Vision, and we always want to keep in mind what the end goal is for whatever project we're working on: we're building Dog Vision to see if we can build a machine learning model that can identify a dog's breed from a photo. So it's going to make some predictions given a photo; let's do that.

If we come back and have a look at our workflow, we've fit the model to the data, and now it's time to make a prediction. We're still in step 3, and we've touched a little on step 4. And remember, these aren't really linear steps; I've just laid them out like this because it's easier to understand, but they aren't necessarily linear, so you can do them out of order.

We'll come back and make a little heading, "Making and evaluating predictions using a trained model" (Cmd+M then M to turn the cell into Markdown, then Shift+Enter).

Okay, so making predictions with a trained model is very similar to how we did it in scikit-learn: we can call the predict function and pass it data in the same form that the model was trained on. That's the important thing. Let's see the code first and then we'll discuss what's happening: make predictions on the validation data.

Remember, we created a validation data batch, and it was not used for training; that's the important point. If we come back to our three sets, the way we evaluate a model is that it trains on the training set, and then we check its initial results on the validation set. Both of those exams, the practice exam (validation set) and the final exam (test set), are datasets the model hasn't seen before.

So coming back, let's go... actually, we'll save it to a variable called predictions: predictions = model.predict(val_data). And let's just remind ourselves of what val_data is: a batched dataset. There we go.
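Here's a minimal sketch of what that cell looks like, assuming `model` is the trained Keras model and `val_data` is the batched validation dataset we built earlier in the notebook:

```python
# Make predictions on the validation data (which the model has never trained on).
# verbose=1 prints a progress bar while the predictions are being made.
predictions = model.predict(val_data, verbose=1)

predictions  # an array of prediction probabilities, one row per validation image
```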
We've got some images and some labels, so we know what the true labels of the validation set are. When we pass this batched dataset to model.predict, because of the nature of the predict function, it's only going to look at the images in our data and make predictions based on those images. Then we can compare those predictions, which are going to be in the form of labels (well, not exactly; we'll see what they look like in a second), to the actual labels from the validation images.

Now, that was a lot of talking; it's much better to see this in code. The verbose argument just says, "Hey, when you're making predictions, show me your progress."

Wow, that was really quick, because we've only got about 200 images. That's the beauty of working on a GPU: predictions are really fast as well. That only took about 187 milliseconds, so not even a whole second. And if we didn't set verbose equal to 1, we wouldn't get this little progress bar output.

You might be looking at this and going, "What is going on here? Lots of different numbers." So let's look at the shape: predictions.shape. Remember, where does this line up? If we go len(y_val), we get 200, so there's the first dimension: 200 images. Where do you think the second dimension, 120, is coming from? Where have we seen that before? len(unique_breeds) is 120.

So this means we've got an array of 200 by 120. Let's see the first element, predictions[0]: we have 200 arrays of 120 different numbers that are all really small, and we're wondering what's going on here. Well, let's find out the length of this: 120. Wonderful.

So what this actually is, is an associated probability for each label; basically, what our model thinks a certain image is. There's a probability value here for every single label, and the predictions[0] array holds the probabilities for one image.
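As a quick sketch of those checks (assuming `predictions`, `y_val` and `unique_breeds` were all created earlier in the notebook):

```python
# One row per validation image, one column per dog breed.
print(predictions.shape)       # (200, 120) in our case

print(len(y_val))              # 200 validation labels
print(len(unique_breeds))      # 120 unique dog breeds

# A single row: 120 prediction probabilities for the first validation image.
print(predictions[0])
print(len(predictions[0]))     # 120
```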
That's the first image in the validation data, and the highest value in that array is going to correspond to the index of the label that the model thinks is most likely. Again, this is a lot of talking; it'll make a bit more sense once we start to put it together with code, but bear with me for a second.

If we sum up all of these values, they're going to equal 1, or at least very close to 1. I say very close because of the way computers store numbers: it isn't exact. If you keep going to more and more decimal places, the sum gets very close to 1, but it might not be exactly 1. And you might be saying, "Daniel, what are you telling me, the way computers store numbers isn't exact?" Well, I don't fully understand all the details myself, but you just have to know that when a computer stores a number with lots of decimals, because of the way it's most efficient to store it on a computer chip, it's not going to be perfectly exact. That's why I say that when you sum up a single prediction array, it's going to be very close to 1.

Now I'm going to tie this back to our model. This is why we've used a softmax activation. You might be saying, "Daniel, far out, that was a few videos ago. You've already told us about softmax, it didn't really make sense then and it's really not making sense now." But if we search "what is softmax" and come back to our friendly Wikipedia page, it tells us that after applying softmax, each component will be in the interval between 0 and 1, and the components will add up to 1. And that's exactly what we're getting here.

We're getting these components because we've used a softmax activation in the last layer of our network. It outputs a prediction array that is 120 in length, and all of the values in predictions[0], and in every other prediction array, will add up to 1 or very close to 1. So predictions[1] adds up to 1, or very close to it... there we go. See, this one is just over 1, so very close: the more decimals there are, the less exact a number is on a computer.

So let's put this all together, because right now it's just been a lot of talking. I want to show you a concrete example. Let's take predictions[0].
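If you want to check that summing behaviour for yourself, here's a quick sketch using NumPy (assuming `predictions` is the array from model.predict above):

```python
import numpy as np

# Each row of predictions comes out of a softmax layer, so its values
# should sum to 1, give or take a tiny floating-point error.
print(np.sum(predictions[0]))
print(np.sum(predictions[1]))
```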
So we're going to remove these cells, and one little tidbit as well: remember our MobileNetV2 architecture? All our softmax layer is doing is taking those 1,280 outputs from the base model, that array of numbers, and turning it into an array like this one, of length 120, using a softmax activation. Remember, for multi-class classification problems we want a softmax activation, but if we were doing binary classification we'd use sigmoid.

So, coming back here, let's check it out. For the first prediction, I'm going to type out a few little indexing tricks, and I just want you to watch along. I'll talk through it a little as we go, but I want you to start thinking about how these things all tie together. Let's do it.

Our model has made some predictions, but our goal now is to understand those predictions further. Our computer already understands them, right, because they're all numeric. Remember at the start how we spent a lot of time turning our data into numbers? The same kind of thing happens at the output of a machine learning model. We spent a lot of time preparing our inputs; now we're going to spend some time making sense of our outputs, which is what we're doing now.

So let's print, for predictions[0], the max value, the prediction probability. Remember the function in scikit-learn called predict_proba, which outputs a prediction probability? This is what predict does by default in TensorFlow. So when we write np.max(predictions[0]), and let's remind ourselves what predictions[0] looks like, we're finding the max value in that array. That's all that line does.

Then what are we going to do next? We also want the sum. Actually, we might set this up with index = 0 so we can run it on multiple examples. So the sum is np.sum(predictions[index]); nice and simple. Then (wow, that was a terrible typo, Daniel, come on) we want the max index: find the index in this array which holds the max value, so np.argmax(predictions[index]). And of course, what we're doing now is a little bit tedious.
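Putting those pieces together, the cell looks something like this (a sketch; the exact wording of the print statements is up to you):

```python
import numpy as np

index = 0  # change this to inspect a different validation image

print(predictions[index])
print(f"Max value (probability of prediction): {np.max(predictions[index])}")
print(f"Sum: {np.sum(predictions[index])}")
print(f"Max index: {np.argmax(predictions[index])}")
```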
We're going to functionize this later on. You could imagine if we just kept writing cells like this; that's why we functionize as much of our code as we can, so we don't have to continually write these kinds of print statements to explore our data or our predictions.

So, predictions[index]... wonderful. Now, the predicted label. Before we run this cell, we're going against our cardinal rule of "if in doubt, run the code", so just have a think about what this line is doing. Let me remind you of what unique_breeds is: these are all of our dog breeds. So if we take our predictions, find the index of the max value, and then find where that index occurs in unique_breeds, what do you think that will return? Hint: we've typed it out right here.

All right, let's delete this bad boy and then Shift+Enter. Wonderful. So here's index 0. This is that same array, the output of our model when we call predict. The max value, the probability of the prediction: the maximum this can be is 1, because remember the output of a softmax function is between 0 and 1, so the highest a single value can be is 1. This is saying we're predicting the highest-probability label here with about a 21.6 percent prediction probability. The sum of all of these values is very close to 1, again in line with the softmax rule (each value between 0 and 1, all adding up to 1), but because of how computers store numbers, not exactly 1. And the max index, where that value occurs, is 17. So if we counted through, we'd find 17 somewhere... maybe there. That's it. The predicted label is border terrier. So for index 0, or for validation image 0, the predicted label is Border Terrier.

Now let's change this up. What if we used our favourite number, index 42? So we get a max prediction probability that's a lot higher, about 75 percent, so the model is pretty confident with this one. You could say that the closer this value is to 1, the more confident the model is about a certain prediction. The sum is very close to 1, the max index is 113, and the predicted label is walker hound. So if we go to unique_breeds and look up index 113, we should find walker hound.
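The predicted-label line from that cell looks something like this, and it's the seed of the helper we'll build next (the function name here is just a sketch of where we're heading):

```python
import numpy as np

# Look up the breed name sitting at the index of the highest probability.
print(f"Predicted label: {unique_breeds[np.argmax(predictions[index])]}")

# The same idea wrapped in a small helper (a preview of the kind of function we'll write):
def get_pred_label(prediction_probabilities):
    """Turn an array of prediction probabilities into a breed label."""
    return unique_breeds[np.argmax(prediction_probabilities)]

get_pred_label(predictions[42])  # the walker hound example from above
```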
There we go: index 113, walker hound.

So this is how we convert our prediction probabilities from numeric form, something our computer understands, into something that we understand: a predicted label. Now that we've got a way to turn our predictions into something we can understand, what we'll do next is create a function that finds the predicted label for every sample in the validation set, so about 200 or so images. Then we're going to unbatch the validation dataset and compare the predictions to the truth labels in a visual way.

Again, that's a lot of talking, but in the next video we'll start to build out that functionality and see what it looks like in practice. And I'm so excited, because one of the most fun parts is comparing your model's predictions with what's actually going on. Dog Vision is coming to life!
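As a rough preview of that unbatching step, here's a sketch assuming `val_data` is a tf.data.Dataset of (image, label) batches and the labels are still one-hot boolean arrays, as we set them up earlier:

```python
import numpy as np

# Unbatch the validation dataset so we can access individual images and labels,
# then turn each one-hot label back into a breed name for comparison.
val_images, val_labels = [], []
for image, label in val_data.unbatch().as_numpy_iterator():
    val_images.append(image)
    val_labels.append(unique_breeds[np.argmax(label)])  # assumes one-hot labels

val_labels[:5]  # first few truth labels, ready to compare against the predictions
```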