1 00:00:00,700 --> 00:00:01,120 All right. 2 00:00:01,150 --> 00:00:07,990 So in this lesson we're gonna be using our inception resonant model and making some predictions with 3 00:00:07,990 --> 00:00:08,890 it. 4 00:00:08,920 --> 00:00:12,450 So let's dive straight in so I'll add a markdown cell here. 5 00:00:12,880 --> 00:00:18,430 And this one going to read making predictions. 6 00:00:18,430 --> 00:00:26,530 So the question is how do we make a prediction with our inception model going to the carris documentation. 7 00:00:26,530 --> 00:00:29,690 We can click on model functional API. 8 00:00:30,220 --> 00:00:38,410 And here we will find a predictor method the predict method will require some input data as a num pi 9 00:00:38,470 --> 00:00:42,070 array or a list of an umpire race. 10 00:00:42,070 --> 00:00:44,980 We're only going to predict one image at a time. 11 00:00:44,980 --> 00:00:50,320 So the rest of the arguments we can leave as the defaults and we're not really going to be too concerned 12 00:00:50,320 --> 00:00:51,680 with them. 13 00:00:51,700 --> 00:00:54,120 All right so let's try and make a prediction. 14 00:00:54,430 --> 00:01:02,940 So I'll say Inception model not predict parentheses and then I have to pass in the array. 15 00:01:03,000 --> 00:01:06,180 Now let's go up here see what our array is called. 16 00:01:06,220 --> 00:01:08,070 Call it pick array. 17 00:01:08,440 --> 00:01:10,470 Very creative I know. 18 00:01:10,570 --> 00:01:15,610 So let's go down here and we'll just supply this and we'll see what happens. 19 00:01:15,850 --> 00:01:18,660 We in fact will get an error if we try this. 20 00:01:18,760 --> 00:01:24,860 So when I move this over slightly scroll down and read the error message. 21 00:01:25,120 --> 00:01:33,520 Error when checking input expected the input to have four dimensions but got an array with three dimensions. 22 00:01:33,520 --> 00:01:36,460 This is what the error message is telling us right here. 23 00:01:37,540 --> 00:01:42,100 So it seems like we can't do just this. 24 00:01:42,100 --> 00:01:48,300 And the reason is is that this predict method will require an extra dimension. 25 00:01:48,370 --> 00:01:53,590 But what would be in this extra dimensional what kind of information would be in this extra dimension. 26 00:01:53,590 --> 00:01:59,650 Well imagine that you were looking to predict more than one image then what you would do is you would 27 00:01:59,650 --> 00:02:01,450 have a whole list that you would supply. 28 00:02:02,410 --> 00:02:05,110 So this essentially would be that extra dimension. 29 00:02:05,530 --> 00:02:09,730 So it's not going to just be the height of the pixels the width of the pixels and the colors of the 30 00:02:09,730 --> 00:02:10,630 pixels. 31 00:02:10,630 --> 00:02:13,930 It's also gonna be which image you're actually using. 32 00:02:14,080 --> 00:02:17,410 That's going to be that fourth piece of information. 33 00:02:17,410 --> 00:02:21,670 So the question is how do I get that extra dimension for our picture array. 34 00:02:22,480 --> 00:02:24,350 Well I you use num PI for this. 35 00:02:24,370 --> 00:02:33,970 So up here under our pre processing I'll add another cell and I'll say expanded expanded is equal to 36 00:02:34,190 --> 00:02:34,820 NPR. 37 00:02:34,840 --> 00:02:44,740 Don't expand underscore DIMIA use parentheses and then I'll provide our picture array here and I'll 38 00:02:44,740 --> 00:02:54,250 say axis is equal to zero and then we'll see expanded dot shape. 39 00:02:54,250 --> 00:02:58,330 And this will show us that we now have this extra dimension here. 40 00:02:58,870 --> 00:02:59,820 It's just the one. 41 00:03:00,010 --> 00:03:03,830 But this would allow us to meet that requirement of the predicate method. 42 00:03:04,390 --> 00:03:13,870 So let's try to let's come down here and instead of pick underscore array I'll put an expanded and it 43 00:03:13,870 --> 00:03:18,440 shift enter at this point I'll get a different era. 44 00:03:18,730 --> 00:03:27,890 The arrow reads the expected input is expected to have a shape of 2 9 9 2 9 9 and then 3. 45 00:03:27,970 --> 00:03:36,840 But instead I'll picture had the following shape 2 5 6 by 2 5 6 by 3. 46 00:03:36,880 --> 00:03:38,290 Why is that. 47 00:03:38,290 --> 00:03:46,180 Well coming back to the documentation and going back to our inception resonant version to model I can 48 00:03:46,180 --> 00:03:54,330 see that the default size that this model expects is 2 9 9 by 2 9 9. 49 00:03:54,340 --> 00:04:02,790 Now again it looks like we have to change our input data in order to run this predict method. 50 00:04:02,890 --> 00:04:10,180 So let's get back up here and we've done the first step which was expanded our array. 51 00:04:10,180 --> 00:04:16,690 Now all we need to do is change this resolution of the picture and we can actually do this with our 52 00:04:16,690 --> 00:04:18,500 load image method. 53 00:04:18,700 --> 00:04:26,980 Check it out if I put my curse on here and hit tab then I can bring up the documentation inside the 54 00:04:26,980 --> 00:04:29,320 Google collab notebook. 55 00:04:29,320 --> 00:04:34,110 And one thing that you'll see here is this target on this course size input. 56 00:04:34,600 --> 00:04:36,450 At the moment we've left it blank. 57 00:04:36,520 --> 00:04:44,500 So we've got our original image size but we can also supply a height and a width for our desired dimensions. 58 00:04:44,680 --> 00:04:46,030 So let's try it out. 59 00:04:46,030 --> 00:04:56,050 So I want to put a comma here and I'll say Target underscore size and then parentheses 2 9 9 comma 2 60 00:04:56,230 --> 00:05:03,600 9 9 limit shift into we see that our image has just been scaled up. 61 00:05:03,730 --> 00:05:07,120 It's become a little bit more pixilated but that's fine. 62 00:05:07,120 --> 00:05:10,600 We've reached our target resolution. 63 00:05:10,600 --> 00:05:12,610 We've scaled up our image slightly. 64 00:05:12,970 --> 00:05:20,200 And then what we'll do is refresh this cell just to check its dimensions and indeed it's 2 9 9 by 2 65 00:05:20,200 --> 00:05:23,190 9 9 by 3 because we still have those three color channels. 66 00:05:23,860 --> 00:05:30,590 And then when we expand dimensions we go from three dimensions to four dimensions. 67 00:05:30,670 --> 00:05:31,180 Brilliant. 68 00:05:31,330 --> 00:05:38,950 So let's come down here and try again see if our predict method gives us another era. 69 00:05:38,950 --> 00:05:47,970 Let's try and so in this case we actually get some output and it looks really nonsensical. 70 00:05:47,970 --> 00:05:48,470 Right. 71 00:05:48,560 --> 00:05:55,800 In scientific notation but this here is the raw output for our prediction. 72 00:05:55,870 --> 00:06:03,790 All we need to do now is decode this prediction and carers actually has a very handy method for us going 73 00:06:03,790 --> 00:06:06,430 back up to our import statements. 74 00:06:06,430 --> 00:06:14,800 We can go to the same line where we imported our model Inception resonant version to put a comma there 75 00:06:15,070 --> 00:06:19,200 and say decode underscore predictions. 76 00:06:19,270 --> 00:06:29,350 So let's shift enter on that scroll all the way back down and then let's store this output in a variable 77 00:06:29,350 --> 00:06:38,140 called prediction then we can start a new line and see decode underscore predictions and provide this 78 00:06:38,140 --> 00:06:40,810 prediction as an input. 79 00:06:40,810 --> 00:06:41,380 There we go. 80 00:06:42,190 --> 00:06:42,840 That's it. 81 00:06:42,880 --> 00:06:51,400 Shift enter and see what we get what we see now are the top five predictions we see something called 82 00:06:51,400 --> 00:06:52,840 the class name. 83 00:06:52,900 --> 00:06:56,850 So this is the category name that's internal to the model. 84 00:06:57,100 --> 00:07:00,040 Then we see the class description. 85 00:07:00,130 --> 00:07:02,620 Basically what this means in English. 86 00:07:02,620 --> 00:07:05,530 And in this case it's comic book. 87 00:07:05,530 --> 00:07:07,780 And then we see the probability here. 88 00:07:07,840 --> 00:07:11,190 This is the third line item for this very first prediction. 89 00:07:11,680 --> 00:07:21,240 And here the probability is equal to 1 that this is a comic book even though we know it's an umbrella. 90 00:07:21,280 --> 00:07:25,240 The other predictions all have a probability of zero. 91 00:07:26,110 --> 00:07:29,180 So something is still going wrong. 92 00:07:29,500 --> 00:07:31,440 And I can tell you what it is. 93 00:07:31,690 --> 00:07:40,020 It's the fact that our data isn't formatted in the way that our model expected to be we're actually 94 00:07:40,020 --> 00:07:42,430 missing another step. 95 00:07:42,450 --> 00:07:47,650 So scrolling back up here I'm going to quickly close this. 96 00:07:47,820 --> 00:07:56,620 Put my cursor here at a comma and then I'll add pre process underscore input. 97 00:07:56,880 --> 00:08:04,440 I'm going to import the pre process input function from the caris applications Inception underscore 98 00:08:04,440 --> 00:08:06,700 resonant underscore V2. 99 00:08:06,870 --> 00:08:14,820 Now to that I've done this I can shift enter on the cell scroll down here and under pre processing images 100 00:08:14,950 --> 00:08:26,790 I'm going to say pre processed is equal to pre process underscore input and I provide the expanded num 101 00:08:26,790 --> 00:08:34,860 pi array this will format the data in the way that our model expects it to be formatted in so late shift 102 00:08:34,890 --> 00:08:44,550 enter I'll scroll down here and I want to make my prediction not using our expanded number hiring. 103 00:08:44,670 --> 00:08:53,820 I'm going to make my prediction using our pre processed pie array and if I refresh this now we should 104 00:08:53,820 --> 00:08:55,640 get some more sensible output. 105 00:08:55,770 --> 00:09:04,260 And indeed we do umbrella with a 3 percent probability and then mountain tent with almost zero percent 106 00:09:04,260 --> 00:09:07,550 probability on place two. 107 00:09:07,710 --> 00:09:10,670 So that worked pretty well right. 108 00:09:10,680 --> 00:09:17,760 The important takeaway here is that every single model that you use out of the box expects the data 109 00:09:17,790 --> 00:09:20,560 to be formatted in a particular way. 110 00:09:20,640 --> 00:09:24,330 And what we've seen is that to an extent the documentation helps us. 111 00:09:24,480 --> 00:09:31,500 It will provide some information on this front but also the error messages in the call up notebook will 112 00:09:31,680 --> 00:09:37,930 also give us some more feedback and some more hints as to what to change and what we did wrong. 113 00:09:37,950 --> 00:09:45,690 Perhaps the most difficult thing to discover is what happens when the neural network does its calculations 114 00:09:45,900 --> 00:09:50,520 gives us an output but the output actually doesn't make any sense. 115 00:09:50,670 --> 00:09:54,930 And this is what happened when we forgot to pre process our data and the right way. 116 00:09:56,060 --> 00:10:02,550 So in order that this doesn't happen again I'd like to pose a challenge to you as a challenge. 117 00:10:02,720 --> 00:10:09,330 Can you create a function called format underscore image and a score Inception resonate. 118 00:10:09,470 --> 00:10:16,670 That takes a final name as an argument and then this function will need to load the image and the default 119 00:10:16,700 --> 00:10:19,220 resolution for Inception resonant. 120 00:10:19,220 --> 00:10:27,290 Convert the image to an array and then return to pre processed image for the Inception resonant version 121 00:10:27,300 --> 00:10:29,780 to a model essentially. 122 00:10:29,840 --> 00:10:36,680 Can we summarize all the pre processing steps in a simple function so that we can call this function 123 00:10:36,800 --> 00:10:43,130 over and over again for different file names and try and predict what the classification is for other 124 00:10:43,130 --> 00:10:44,650 images. 125 00:10:44,750 --> 00:10:48,470 I'll give you a few seconds to pause the video before I show you the solution 126 00:10:51,870 --> 00:10:53,610 here we go. 127 00:10:53,610 --> 00:10:57,070 We'll go deaf if the key word for defining a function. 128 00:10:57,210 --> 00:11:00,480 And then we'll say format underscore image underscore. 129 00:11:00,810 --> 00:11:10,110 Inception resonant will open the parentheses and then file name shall be our argument semicolon there 130 00:11:10,950 --> 00:11:18,660 and then inside the function will divide up the work we'll see the picture that's gonna be equal to 131 00:11:19,110 --> 00:11:21,030 load underscore image. 132 00:11:21,030 --> 00:11:27,330 So this is the caris function and it shall load the image based on the file name which is effectively 133 00:11:27,360 --> 00:11:33,530 the relative path and then the target size. 134 00:11:33,750 --> 00:11:39,870 It's gonna be equal to a tuple 29 buying 2 9 9. 135 00:11:39,870 --> 00:11:49,530 Then we're going to save all of this information in an array Pick underscore a are r equal to image 136 00:11:49,750 --> 00:11:54,660 on a score to a score array which is going to take our picture. 137 00:11:55,230 --> 00:11:58,880 But remember we have to expand it by one dimension. 138 00:11:59,010 --> 00:12:01,650 So expanded. 139 00:12:01,950 --> 00:12:14,750 It's gonna be num PI's job MP dot expand on the score dims parentheses pick the score a r comma. 140 00:12:15,090 --> 00:12:29,070 Axis is equal to zero and then pre processed input shall be equal to pre process underscore input parentheses 141 00:12:29,580 --> 00:12:42,170 expanded and finally we can return the pre processed on a score input this value right here. 142 00:12:42,160 --> 00:12:47,520 Now if we want to make this a little bit more succinct we can even take this bit of code here. 143 00:12:47,520 --> 00:12:48,720 Cut it. 144 00:12:48,720 --> 00:12:53,860 Take this one here replace it and delete the rest of this line. 145 00:12:53,880 --> 00:12:55,080 There we go. 146 00:12:55,080 --> 00:12:56,530 That does it. 147 00:12:56,730 --> 00:13:04,140 And this pretty much summarizes all the pre processing steps that we've done in separate cells above. 148 00:13:04,140 --> 00:13:10,360 Now we've got them all in one place and we can easily call this function for the other images. 149 00:13:10,530 --> 00:13:13,040 So let's try this out right now. 150 00:13:13,110 --> 00:13:20,790 I'll store the pre processed input in a variable called data and then I'll say format underscore image 151 00:13:20,840 --> 00:13:21,260 on a score. 152 00:13:21,270 --> 00:13:31,410 Inception resonate and I'll provide a file number to our prediction store in a variable called prediction 153 00:13:31,920 --> 00:13:41,910 which will be equal to Inception underscore model thought predict parentheses data and just so that 154 00:13:41,910 --> 00:13:46,470 we can see what the images which the predictions pertain to. 155 00:13:46,470 --> 00:13:57,150 I'll say display parentheses load on the square image parentheses file underscore 2 and just below that 156 00:13:57,300 --> 00:14:06,390 I'll decode the predictions so decode underscore predictions parentheses prediction. 157 00:14:06,390 --> 00:14:06,980 There we go. 158 00:14:07,440 --> 00:14:10,760 Let's run this and see what happens. 159 00:14:10,770 --> 00:14:18,570 So here we see a typical wedding photograph and we've got this at the seaside but the number one prediction 160 00:14:18,660 --> 00:14:26,610 seems to be groom with 70 percent followed by gown with 11 percent I think rapeseed is a little bit 161 00:14:26,610 --> 00:14:33,140 a weird one but Inception resident seems to detect something about the picture that suggests a sandbar 162 00:14:33,140 --> 00:14:39,500 bar which you often find by the beach which I think is pretty good too how I leave it to you to try 163 00:14:39,540 --> 00:14:42,210 the predictions and a couple of other of these files. 164 00:14:42,210 --> 00:14:43,760 Some of them work better. 165 00:14:43,760 --> 00:14:49,710 Some of them work less well and I'll leave it to you to find out which ones those are. 166 00:14:50,430 --> 00:14:58,650 But I think what would be more interesting to do right now is to go back to the carer's home page and 167 00:14:58,950 --> 00:15:04,480 look at what other models are available because we've got one of them working and for good measure. 168 00:15:04,530 --> 00:15:07,320 I think we should get another one working as well. 169 00:15:07,350 --> 00:15:11,390 In particular I'd be interested to set this as a challenge to you. 170 00:15:11,490 --> 00:15:18,930 I'd like you to load this veggie 19 model and make a prediction on some of the sample images that we've 171 00:15:18,930 --> 00:15:25,130 already uploaded to our collab notebook so pack in our collab notebook. 172 00:15:25,350 --> 00:15:35,260 I'll add another markdown cell and this one is gonna read testing the veggie 19 model. 173 00:15:35,520 --> 00:15:43,650 Now Fiji actually stands for Visual geometry group and this model was created by a couple of researchers 174 00:15:43,980 --> 00:15:50,190 out of Oxford University in the UK and it was also trained on image net hand. 175 00:15:50,220 --> 00:15:54,000 We can try it out with the weights and see how we get on. 176 00:15:54,230 --> 00:16:01,580 What I'd like you to do as a challenge is use this g 19 model from Kara's loaded with the image net 177 00:16:01,580 --> 00:16:06,350 weights and I'd like you to make a prediction on some of these files. 178 00:16:06,380 --> 00:16:11,390 Some of these sample images that we've uploaded to our collab notebook. 179 00:16:11,570 --> 00:16:17,380 Now the way to figure out how to do this is to take a close look at the official documentation. 180 00:16:17,570 --> 00:16:22,970 You're going to have to process the data for this neural network in order to get a prediction that makes 181 00:16:22,970 --> 00:16:24,290 sense. 182 00:16:24,290 --> 00:16:31,340 So have a look at this documentation and see what we've done previously with our inception resonant 183 00:16:31,340 --> 00:16:34,640 model because the steps are going to be very very similar. 184 00:16:35,450 --> 00:16:39,440 There's a few tricks to be aware of to get this right. 185 00:16:39,510 --> 00:16:40,760 I'll see you in a bit.