In this lesson we're getting to the final part of this module: the evaluation stage of our model.

It's been quite a journey. We formulated our question, we gathered our data, we pre-processed and cleaned that data, and we explored and visualized it as well. Then we spent quite a bit of time training three different versions of our model, looking at dropout regularization and early stopping and examining the performance of our neural networks. Now it's time to move on to the evaluation stage and analyze our favorite neural network in a bit more detail.

In the module where we covered our Naive Bayes classifier, we looked at three metrics in addition to the accuracy for evaluating our classifier. The first was the recall score, the second was the precision, and the third was a combination of the two, namely the F-score. So let's tackle each of these in turn in our Jupyter notebook. First we'll take a look at the accuracy, then we'll take a look at our false positives and false negatives in a confusion matrix, and finally we'll calculate our precision, recall and F-score.

I'll create a subsection here with a markdown cell, and the first thing we'll do is select the model that we're going to look at. Looking at TensorBoard, it's a pretty close call. Our most accurate model on our validation dataset was actually Model 1, with about 49 percent. Model 1, if you recall, did not use any regularization; it had no dropout layers. As such, it ended up with the largest difference between the validation accuracy and the training accuracy: on the validation set Model 1 got about 49 percent, but on the training dataset it got around 60 percent. Model 2, with one dropout layer, was much closer. It had a bit of a rocky start and probably could have performed a little better on another training run, but its training accuracy and its validation accuracy are both pretty close to 50 percent, and looking at the stats it's not far behind Model 1. So this is the one I'm going to go with.

As you can see from TensorBoard, there are already two metrics that were calculated as part of the training: one was the loss and one was the accuracy. We can reconfirm what these metrics were by typing our model name, then a dot, and then metrics_names.
Here's the list of metrics that our model can calculate for us. If we wanted to get the loss and the accuracy on the test dataset, we would use the evaluate method. Looking back at the Keras documentation, we can see the evaluate method listed here on our Model (functional API). The description reads: "Returns the loss value and metrics values for the model in test mode." And again, if you supply a lot of data, it will do this computation in batches automatically, which is quite nice. If we scroll down a little bit, we can see what the return values are. Here we see that the attribute model.metrics_names will give us the display labels for the scalar outputs, and we actually get more than one output, right? We get the test loss if there are no other metrics, but if there are, then we get a list of scalars. "Scalar" is the same word that you saw in TensorBoard; things like accuracy and loss are scalars.

Since we get two return values from our evaluate method, let's store them in two separate variables. I'll call the first one test_loss, put a comma, and then I'll write test_accuracy, and I'll set that equal to model_2.evaluate. Between the parentheses I'll supply our test dataset, x_test, and our test labels, y_test. Next I'll print this out, so I'll add a print statement that reads "Test loss is {test_loss} and test accuracy is {test_accuracy}". Now let me hit Shift+Enter and see what we get.

Keras will run this evaluation on the entire test dataset which, if you recall, was 10,000 different samples. This calculation took me about 3 seconds to run, and here's our output: we've got an accuracy of around 49 percent on our testing dataset. This is also what we should have expected, given that we had about 49 percent on our validation dataset. If you think this print statement is a little bit hard to read, then of course you can format these numbers. With a colon and 0.3 I can format my loss so that it only shows three significant digits, and if I want to show my test accuracy as a percentage, then I can say 0.1%, and I'll get my accuracy formatted as a percentage with one decimal place. Let me show you what I mean. That's a lot easier on the eyes.
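Putting those pieces together, here is a minimal sketch of this evaluation cell, assuming the variable names model_2, x_test and y_test from earlier in the notebook:

```python
# A minimal sketch of the evaluation cell, assuming model_2, x_test and y_test
# were defined earlier in the notebook.
print(model_2.metrics_names)  # typically something like ['loss', 'acc']

# evaluate() returns one scalar per entry in metrics_names
test_loss, test_accuracy = model_2.evaluate(x_test, y_test)

# :0.3 keeps three significant digits, :0.1% formats as a percentage with one decimal
print(f'Test loss is {test_loss:0.3} and test accuracy is {test_accuracy:0.1%}')
```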
Right, now let's take a look at our false positives and false negatives. If you remember our boy-who-cried-wolf story, a false positive is when the boy cried wolf and there was no wolf, and a false negative would be if the boy had cried "there is no wolf" when there was indeed a wolf. That would have made for a very confusing story, but this is where the confusion matrix comes in to make things a lot clearer.

I'll add a small subheading here that reads "Confusion Matrix", and then we'll go to the very top and add another import statement. We'll import the confusion matrix from scikit-learn: from sklearn.metrics import confusion_matrix. Let me hit Shift+Enter here, scroll back down, and now we can create our confusion matrix. I'm going to store this under conf_matrix and set that equal to confusion_matrix, and here I have to supply two things: my actual labels, or actual classes, so y_test, and my predictions.

How do I get all of my predictions? Well, if we scroll back up, we can see that we can take our model, put a dot after it, call predict_classes and supply our entire testing dataset. So this is exactly what I want to do: I'll say model_2.predict_classes(x_test). Now, if this is proving hard to read, then what I'll do instead is take this out, create a variable called predictions, set it equal to my predicted classes, and put predictions here. I'll also add the argument names, so y_true is equal to y_test and y_pred is equal to predictions; these are the argument names for the confusion_matrix function. Let me hit Shift+Enter on the cell and take a look at what we've got.

The interesting thing about this confusion matrix is that it has a shape, right? It's a 10 by 10 matrix, so the number of rows in this matrix, nr_rows, would be equal to conf_matrix.shape[0], and the number of columns would be conf_matrix.shape[1]. The other thing we can do is look at the largest value in this matrix: conf_matrix.max() will give us the largest value, and that's 645. In contrast, the smallest value in the matrix is 5, and we can pull this out of the confusion matrix with .min().
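Here is a sketch of that cell, again assuming model_2, x_test and y_test from before. Note that predict_classes is the shortcut available in the older Keras versions used in this course; on newer versions you would take the argmax of model.predict instead.

```python
from sklearn.metrics import confusion_matrix

# A sketch of the confusion matrix cell, assuming model_2, x_test and y_test
# from earlier. predict_classes() is the older Keras shortcut used here; on
# newer versions use model_2.predict(x_test).argmax(axis=1) instead.
predictions = model_2.predict_classes(x_test)
conf_matrix = confusion_matrix(y_true=y_test, y_pred=predictions)

nr_rows = conf_matrix.shape[0]  # 10 rows, one per actual class
nr_cols = conf_matrix.shape[1]  # 10 columns, one per predicted class

print(conf_matrix.max())  # largest cell value (645 in the lesson's run)
print(conf_matrix.min())  # smallest cell value (5 in the lesson's run)
```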
But what I'm actually interested in is creating a visualization. I want to create a chart so that we can see this confusion matrix a lot more clearly. We'll use Matplotlib for this: plt.figure with figsize set equal to 7 by 7, which I think will look pretty good on screen. The way we can show the confusion matrix is with plt.imshow(conf_matrix), which will plot our confusion matrix on a chart, and we can display it with plt.show(). Let's see what this looks like. Ta-da: it looks absolutely horrific, completely unintelligible, so we're going to have to do some work on formatting.

The very first thing I'm going to do is fix the title and the labels. With plt.title I'll give this thing a title so that we can see what it actually is, and set its fontsize equal to 16. There we go: here is a confusion matrix which, at the moment, is very confusing. The next thing I'll do is add a y label, a label for our y axis, because on the y axis we're going to have our actual labels, our actual categories. So here we go, here's our y label, and the x label should of course be added as well, reading "predicted labels".

Now, the worst things at the moment are still these little tick marks: 6, 4, 2, 0. These tick marks are actually meant to correspond to our classes, right, and those range from 0 to 9; this is why we've got these funny numbers here. So what we actually need to do is format our tick marks. tick_marks should be the numbers from 0 to 9, and we can create this with NumPy: np.arange will start at zero and end at nine if I supply our constant for the number of classes, which is equal to 10. Then I can take these ticks, call plt.yticks, and in the parentheses supply my tick marks. I'll hit Shift+Enter, and now I've got a tick mark for each and every one of my classes.

But even though I really like numbers, what I actually want to see are the names of our classes, and I've stored these as a list at the very top, with plane, car, bird, cat and so on. That's going to make our axes a lot less confusing. So what I want to do is supply another argument to this yticks method, and that's going to be my label names constant.
If I hit Shift+Enter on this, I can see that the tick marks now correspond to the items in my list of labels: instead of 0 up here I have plane, and instead of 1 here I have car. Since I've done this on the y axis, I'm also going to do it on the x axis, so with plt.xticks I can copy and paste this line, hit Shift+Enter, and I get my labels here as well.

Now, what to tackle next? The first thing is that I want to change these colors, and the easiest way to do that is with something called a colormap. Matplotlib actually gives us some sample colormaps to pick from, and the kind that works rather well for a confusion matrix are these single-color maps: Greys, Purples, Blues, Greens, Oranges. You can take your pick. This is going to be a tough choice: Facebook blue perhaps, or purple. Maybe I'll just go for green. Coming back here and going to this imshow line, we can supply a colormap with the cmap argument. So let's try this out: cmap is equal to, and then colormaps I can get through plt, which is our Matplotlib, then cm, which stands for colormap, and then a colormap of our choice. So I'll go for Greens; the name has to correspond to what we see in the reference here. If I hit Shift+Enter, the color of my confusion matrix changes.

Now, why are some of these fields darker than others? That's because the color is supposed to signify a value: a light color means a low value and a dark color means a high value. To make this a lot clearer, we can add a so-called colorbar on the right-hand side, and with plt.colorbar we can do just that; don't forget the parentheses at the end. With Shift+Enter we can refresh, and there we see this beautiful colorbar next to our confusion matrix. One thing that we checked earlier was the maximum value in the confusion matrix, which was around 645, and the minimum value, which was around 5.
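Here is a rough sketch of the plotting cell as it stands at this point, assuming conf_matrix from above and a LABEL_NAMES list holding the ten class names defined at the top of the notebook (the exact constant names in the notebook may differ):

```python
import numpy as np
import matplotlib.pyplot as plt

plt.figure(figsize=(7, 7))
plt.imshow(conf_matrix, cmap=plt.cm.Greens)  # single-colour map: light = low, dark = high

plt.title('Confusion Matrix', fontsize=16)
plt.ylabel('Actual Labels')
plt.xlabel('Predicted Labels')

tick_marks = np.arange(10)           # one tick per class, 0 through 9
plt.yticks(tick_marks, LABEL_NAMES)  # show the class names instead of the numbers
plt.xticks(tick_marks, LABEL_NAMES)

plt.colorbar()
plt.show()
```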
Looking at our confusion matrix again, this darkest square here should correspond to 645, and any of the very white squares should correspond to very low values, like 5 or 10. But instead of using our imagination to interpret what's going on, let's actually print the individual values onto each of these squares. To do that, we're going to write a for loop: we have to iterate along each row and along each column to print the value onto this matrix. So what we would need is a nested for loop. That's one way of doing it, and we've worked with nested for loops before, so in this lesson I want to show you an efficient alternative to what we've done previously. And when I say efficient, I mean computationally efficient.

Python has something called itertools, iteration tools. These are functions for creating iterators for efficient looping. That's quite a mouthful, but if we scroll down here and look at these tables, we can see that there are all sorts of problems for which the iteration tools provide a solution. One of these is the nested for loop: the product method here is equivalent to a nested for loop, and it will give us a way to loop through our confusion matrix using these iteration tools. So let's try it out.

I'll scroll to the very top to our import statements and, as always, import my module before I can use it: import itertools, and hit Shift+Enter. Now I can scroll back down to my confusion matrix and add my code. Here's how we're going to write our nested for loop using these iteration tools. I'll say "for i, j", and remember, I've got two dimensions, rows and columns, so i and j, "in itertools.product", open parentheses, range(10) comma range(10).

Why are we using range? Well, just like with a normal for loop, we're going to start at zero and end at 10 minus one, or nine. And because we've got i and j, we want two ranges here as arguments for this product method. Now, if we didn't want to use this magic number 10, we could pull the dimensions directly out of the confusion matrix, which we've actually done up above: conf_matrix.shape[0] and conf_matrix.shape[1].
We stored those in nr_rows and nr_cols, so instead of the 10 here we could have nr_rows and nr_cols. So that's how we're going to set up our loop. Now, what are we doing inside the body of the loop? Well, the goal of this whole thing was to print out the actual value in each of these cells, and we can do that with plt.text.

So what are the values we actually want printed in this first row? Let's take a look at the confusion matrix and pull them out: conf_matrix[0] will pull out that first row, and it should have the values 581, 33, 71, 17 and so on. How do we get these printed here? Well, we have to iterate through our confusion matrix, and with plt.text we have to supply an x and a y on the coordinate system of the plot, and a string. This is going to be j for the x, i for the y, and for the string, for now, I'll just write a lowercase "o".

Now let's hit Shift+Enter, and what we see is that a lowercase "o" is printed in all of these cells. This lowercase "o" is not what we want; that first row should actually have these values, so we need to go back into our confusion matrix and pull them out. The way I'm going to do this is with conf_matrix, square brackets, i comma j, to pull out the individual values from this two-dimensional matrix. Let me hit Shift+Enter and see what we get.

So this is good news, right? We've got the values we expected in that first row: it starts with 581 and ends with 58, exactly what we've got here. The only thing is, this is really hard to read. First of all, it's shifted, so it would be nice if we could center it so that it actually shows up in the middle of each square. There is an argument called horizontalalignment that we can add to plt.text, and we can set that equal to 'center'. Hitting Shift+Enter will center these numbers, but of course it will only do that if you've actually spelled this correctly: horizontalalignment. Let's try again. Here we go: now our numbers are centered, but there's still one slight problem.
The higher the number, the darker the cell, and the darker the cell, the more difficult it is to read. So ideally we want the color of this text to be black if the cell is white or very light, and white if the cell is very dark. We can do that by supplying a color argument to this text method. I'll put a comma, hit Enter, and say color is equal to, and now I can include a little bit of logic here, which is very, very cool. I can say that the color should be white if the value in this cell, so conf_matrix[i, j], is greater than some number. Which numbers are hard to read? Maybe all the numbers above, I don't know, 450. So if the number in the cell is greater than 450, then the color will be white; otherwise, else, the color should be black. Let's try this. Perfect, right?

If we wanted to make this number a little bit less arbitrary and use a cutoff point in the middle of this confusion matrix, so that it depends on the maximum value, then we could replace it with conf_matrix.max() divided by two. Hitting Shift+Enter on this gives us a cutoff at around 320. Brilliant.

So what are we looking at here? We've spent quite a bit of time creating this visualization, but we actually haven't talked about how to interpret it yet. To make it a little easier on the eyes, I might scale it up a little, so up here inside plt.figure I'll add a comma and scale it up with dpi equal to 227, which is the resolution of my screen. Now the whole thing has a much higher resolution and should be much easier to read for you watching this video.

What I'd like to do at this stage is pose a challenge to you. I'd like you to have a think about the interpretation of this confusion matrix. For example, what do the numbers on the diagonal represent, and what do the numbers in a single row that are not on the diagonal represent, so this 33, 71, 17, 29 and so on? My challenge to you is to try to identify the false positives, the false negatives, and the true positives in the confusion matrix.
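While you give that some thought, here is a sketch of the annotation loop we just built up. It lives in the same cell as the plotting code above, just before plt.show(); nr_rows and nr_cols are the dimensions we pulled out of conf_matrix.shape earlier.

```python
import itertools

# Annotation loop: goes in the plotting cell, just before plt.show().
# itertools.product(range(a), range(b)) is equivalent to a nested for loop.
for i, j in itertools.product(range(nr_rows), range(nr_cols)):
    plt.text(j, i, conf_matrix[i, j],
             horizontalalignment='center',
             color='white' if conf_matrix[i, j] > conf_matrix.max() / 2 else 'black')
```

To get the crisper version shown on screen, the plt.figure call at the top of the cell would also take dpi=227 as an extra argument.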
All right, here's the solution. I've scaled the matrix down a little bit so that you can see both axes on the entire screen. Let's tackle the true positives first. This is the case where our model predicted the correct outcome: for example, it predicted a plane when there was in fact a picture of a plane, and it predicted a car when there was in fact a picture of a car. So the values along the diagonal are the true positives.

Now, what about the values down a column? In this case our model said there was a plane when in fact there was a car, and here, 106 times, our model said there was a plane when it was in fact a bird. The definition of a false positive is a false alarm: crying wolf when there is no wolf, crying plane when there is no plane. So this number, 39, represents the number of times our model cried plane when in fact there was no plane. In other words, the values down this column are false positives, and if we sum all those values, excluding the value on the diagonal, we get all the false positives for one particular category.

Now, what about the false negatives? In this case our model is saying there is no plane, but in fact there is a plane. Where would we find that value? Here we have to look at a row: in 33 cases there was a picture of a plane, but our model predicted a car; it said there was no plane, there was something else. So all these values represent the false negatives. Summing up a row, excluding the diagonal, gives us the false negatives for a particular category, and summing up a column, apart from the diagonal, gives us the false positives.

One thing that's really interesting about the confusion matrix is looking at the categories that were most often classified incorrectly. For example, our model confused trucks and cars with each other more than any other categories. Similarly, dogs and cats were very difficult for our model to tell apart, as were ships and planes, and even, for some reason, birds and deer. Armed with this knowledge, we should be able to calculate both our precision and our recall.
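Just to make that row-and-column reading concrete, here is a quick sketch for a single category, the plane class at index 0. This is my own illustration rather than a cell from the lesson's notebook; in the next step we compute the metrics for all ten classes at once.

```python
# Illustration for a single class (plane, index 0). My own addition; the notebook
# computes everything for all classes at once in the next step.
plane = 0

true_positives_plane = conf_matrix[plane, plane]                             # value on the diagonal
false_positives_plane = conf_matrix[:, plane].sum() - true_positives_plane   # rest of the column
false_negatives_plane = conf_matrix[plane, :].sum() - true_positives_plane   # rest of the row
```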
For both precision and recall we need the true positives. How do we get hold of the true positives in our confusion matrix? That's actually fairly straightforward: the true positives are along the diagonal, and we can use NumPy's np.diag and supply our confusion matrix to get hold of these values. Check it out: 581, 565, 309 and so on; you can see that these are the true positive values along the diagonal.

Now, looking at our recall score, we need the true positives plus the false negatives in the denominator. How do we get that? Well, that would be the value on the diagonal plus all the values in the row that are not on the diagonal. But easier yet: if we sum up all the values in a row, which includes the true positive, we get the denominator for the recall score. So our recall is actually equal to np.diag(conf_matrix) divided by np.sum(conf_matrix, axis=1); axis equals one sums along the rows. Let's check it out: our recall is an array of these values, one recall score for every single category.

Now let's calculate the precision for each and every category. Once again we need the diagonal values, but this time the denominator is the true positives plus the false positives. The false positives, we said, were the values down an entire column, excluding the value on the diagonal, but since we need to add the true positive back in anyway, we can simply sum all the values in a column. So let's go for it: precision is going to be equal to np.diag(conf_matrix) divided by np.sum(conf_matrix, axis=0); this is how we sum along a column. Our precision for each and every category comes back as an array of 10 values, like so.

So now we've got 10 recall values and 10 precision values. How do we calculate the precision or the recall of the model overall? Well, the easiest thing to do is simply to average these values. Averaging the recall scores across every category gives us the average recall score for the model as a whole: the average recall is equal to np.mean(recall). We can print this out and say "Model 2 recall score is {avg_recall}", formatted as a percentage. Hit Shift+Enter and see what we get: it's about 49.14 percent.
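Here is a sketch of those cells, assuming conf_matrix from above (the variable names are my best guess at what is typed in the notebook):

```python
# Per-class recall: true positives on the diagonal divided by the row sums (TP + FN)
recall = np.diag(conf_matrix) / np.sum(conf_matrix, axis=1)

# Per-class precision: true positives divided by the column sums (TP + FP)
precision = np.diag(conf_matrix) / np.sum(conf_matrix, axis=0)

# Average recall across the 10 categories, for the model as a whole
avg_recall = np.mean(recall)
print(f'Model 2 recall score is {avg_recall:0.2%}')
```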
Now, as a challenge, can you calculate the average precision for the model as a whole and print out this value below? Afterwards, calculate the F-score for Model 2. I'll give you a few seconds to pause the video and give this a go.

Ready? Here's the solution. The average precision is going to be equal to np.mean of that array of all the precision values, so that's np.mean(precision), and we can print it out using an f-string again, formatted the same way as before. Looking back at the definition of how we calculate the F-score, we see that it is equal to two times precision times recall, divided by precision plus recall. Having already calculated the average recall score and the average precision score, we can calculate the F-score, or F1 score, simply by using these values: two times average precision times average recall, divided by average precision plus average recall. We can print this out, and there we go: our F-score for this model is 49.04 percent.
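Here is a sketch of that solution, continuing from the recall and precision arrays above (again, the exact variable names are assumptions):

```python
# Average precision across the 10 categories
avg_precision = np.mean(precision)
print(f'Model 2 precision score is {avg_precision:0.2%}')

# F-score (F1): combines average precision and average recall into a single number
f_score = 2 * (avg_precision * avg_recall) / (avg_precision + avg_recall)
print(f'Model 2 f-score is {f_score:0.2%}')
```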
So now we've calculated all our metrics: our accuracy, our F-score, our recall, and our precision. How are we stacking up? Well, considering that we're getting about 50 percent correct, I'd say this is not bad for a very first try and for such a simple model. If we take a look at the website that hosts our dataset, we can see that the baseline result is closer to an 18 percent test error, whereas we got close to 50 percent incorrect. At this point you might ask: why is our error so high? One thing that we saw was that the more data we had, the more accurate our model became, but there are other ways to improve accuracy as well. A more specialized neural network for computer vision would definitely fare better with the same amount of data than our multilayer perceptron. In fact, a model structure closer to the Inception-ResNet that we used with pre-trained weights, which is a convolutional neural network, or CNN, would achieve a much higher accuracy. However, I think we should give the multilayer perceptron another chance. It's a very, very simple model after all, and it will be interesting to see how it stacks up on a different dataset.

Another reason I want to do this is that these past few lessons in this module were very theory-heavy and we've covered a lot of new concepts. In the next module, I want to focus more on TensorFlow itself and take you on a deep dive into how to use TensorFlow without Keras. So what we're going to do next is continue building on our understanding of the multilayer perceptron for the time being, but we'll focus more on how TensorFlow actually works: what a tensor actually is, how to set up your layers and your weights in TensorFlow, and how to batch your data during your training session. Plus, we're going to be exploring TensorBoard on a whole new level. So well done for persevering through this challenging module, and I'm looking forward to seeing you in the next one. Until then, take care.