In this lesson we're getting to the final part of this module: the evaluation stage of our model.

It's been quite a journey. We formulated our question, we gathered our data, we pre-processed and cleaned that data, and we explored and visualized it as well. Then we spent quite a bit of time training three different versions of our model, looking at dropout regularization and early stopping and examining the performance of our neural networks. Now it's time to move on to the evaluation stage and analyze our favorite neural network in a bit more detail.

In the module where we covered our Naive Bayes classifier, we looked at three metrics in addition to the accuracy for evaluating our classifier. The first was the recall score, the second was the precision, and the third was a combination of the two, namely the F-score. So let's tackle each of these in turn in our Jupyter notebook. First we'll take a look at the accuracy, then we'll take a look at our false positives and false negatives in a confusion matrix, and finally we'll calculate our precision, recall and F-score.

I'll create a subsection here with a markdown cell, and the first thing we'll do is select the model that we're going to look at. Looking at TensorBoard, it's a pretty close call. Our most accurate model on our validation dataset was actually Model 1, with about 49 percent. Model 1, if you recall, did not use any regularization; it had no dropout layers. As such, it ended up with the largest difference between the validation accuracy and the training accuracy: on the validation set Model 1 got about 49 percent, but on the training dataset it got around 60 percent. Model 2, with one dropout layer, was much closer. It had a bit of a rocky start and probably could have performed a little better on another training run, but its training accuracy and its validation accuracy are both pretty close to 50 percent, and looking at the stats it's not far behind Model 1. So this is the one I'm going to go with.

As you can see from TensorBoard, there are already two metrics that were calculated as part of the training: one was the loss and one was the accuracy. We can reconfirm what these metrics were by typing our model name, then a dot, and then metrics_names.
Here's the list of metrics that our model can calculate for us. If we wanted to get the loss and the accuracy on the test dataset, we would use the evaluate method. Looking back at the Keras documentation, we can see the evaluate method listed here on our Model (functional API). The description reads: "Returns the loss value and metrics values for the model in test mode." And again, if you supply a lot of data, it will do this computation in batches automatically, which is quite nice. If we scroll down a little bit, we can see what the return values are. Here we see that the attribute model.metrics_names will give us the display labels for the scalar outputs, and we actually get more than one output, right? We get the test loss if there are no other metrics, but if there are, then we get a list of scalars. "Scalar" is the same word that you saw in TensorBoard; things like accuracy and loss are scalars.

Since we get two return values from our evaluate method, let's store them in two separate variables. I'll call the first one test_loss, put a comma, and then I'll write test_accuracy, and I'll set that equal to model_2.evaluate. Between the parentheses I'll supply our test dataset, x_test, and our test labels, y_test. Next I'll print this out, so I'll add a print statement that reads "Test loss is {test_loss} and test accuracy is {test_accuracy}". Now let me hit Shift+Enter and see what we get.

Keras will run this evaluation on the entire test dataset which, if you recall, was 10,000 different samples. This calculation took me about 3 seconds to run, and here's our output: we've got an accuracy of around 49 percent on our testing dataset. This is also what we should have expected, given that we had about 49 percent on our validation dataset. If you think this print statement is a little bit hard to read, then of course you can format these numbers. With a colon and 0.3 I can format my loss so that it only shows three significant digits, and if I want to show my test accuracy as a percentage, then I can say 0.1%, and I'll get my accuracy formatted as a percentage with one decimal place. Let me show you what I mean. That's a lot easier on the eyes.
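Putting those pieces together, here is a minimal sketch of this evaluation cell, assuming the variable names model_2, x_test and y_test from earlier in the notebook:

```python
# A minimal sketch of the evaluation cell, assuming model_2, x_test and y_test
# were defined earlier in the notebook.
print(model_2.metrics_names)  # typically something like ['loss', 'acc']

# evaluate() returns one scalar per entry in metrics_names
test_loss, test_accuracy = model_2.evaluate(x_test, y_test)

# :0.3 keeps three significant digits, :0.1% formats as a percentage with one decimal
print(f'Test loss is {test_loss:0.3} and test accuracy is {test_accuracy:0.1%}')
```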
Right, now let's take a look at our false positives and false negatives. If you remember our boy-who-cried-wolf story, a false positive is when the boy cried wolf and there was no wolf, and a false negative would be if the boy had cried "there is no wolf" when there was indeed a wolf. That would have made for a very confusing story, but this is where the confusion matrix comes in to make things a lot clearer.

I'll add a small subheading here that reads "Confusion Matrix", and then we'll go to the very top and add another import statement. We'll import the confusion matrix from scikit-learn: from sklearn.metrics import confusion_matrix. Let me hit Shift+Enter here, scroll back down, and now we can create our confusion matrix. I'm going to store this under conf_matrix and set that equal to confusion_matrix, and here I have to supply two things: my actual labels, or actual classes, so y_test, and my predictions.

How do I get all of my predictions? Well, if we scroll back up, we can see that we can take our model, put a dot after it, call predict_classes and supply our entire testing dataset. So this is exactly what I want to do: I'll say model_2.predict_classes(x_test). Now, if this is proving hard to read, then what I'll do instead is take this out, create a variable called predictions, set it equal to my predicted classes, and put predictions here. I'll also add the argument names, so y_true is equal to y_test and y_pred is equal to predictions; these are the argument names for the confusion_matrix function. Let me hit Shift+Enter on the cell and take a look at what we've got.

The interesting thing about this confusion matrix is that it has a shape, right? It's a 10 by 10 matrix, so the number of rows in this matrix, nr_rows, would be equal to conf_matrix.shape[0], and the number of columns would be conf_matrix.shape[1]. The other thing we can do is look at the largest value in this matrix: conf_matrix.max() will give us the largest value, and that's 645. In contrast, the smallest value in the matrix is 5, and we can pull this out of the confusion matrix with .min().
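Here is a sketch of that cell, again assuming model_2, x_test and y_test from before. Note that predict_classes is the shortcut available in the older Keras versions used in this course; on newer versions you would take the argmax of model.predict instead.

```python
from sklearn.metrics import confusion_matrix

# A sketch of the confusion matrix cell, assuming model_2, x_test and y_test
# from earlier. predict_classes() is the older Keras shortcut used here; on
# newer versions use model_2.predict(x_test).argmax(axis=1) instead.
predictions = model_2.predict_classes(x_test)
conf_matrix = confusion_matrix(y_true=y_test, y_pred=predictions)

nr_rows = conf_matrix.shape[0]  # 10 rows, one per actual class
nr_cols = conf_matrix.shape[1]  # 10 columns, one per predicted class

print(conf_matrix.max())  # largest cell value (645 in the lesson's run)
print(conf_matrix.min())  # smallest cell value (5 in the lesson's run)
```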
But what I'm actually interested in is creating a visualization. I want to create a chart so that we can see this confusion matrix a lot more clearly. We'll use Matplotlib for this: plt.figure with figsize set equal to 7 by 7, which I think will look pretty good on screen. The way we can show the confusion matrix is with plt.imshow(conf_matrix), which will plot our confusion matrix on a chart, and we can display it with plt.show(). Let's see what this looks like. Ta-da: it looks absolutely horrific, completely unintelligible, so we're going to have to do some work on formatting.

The very first thing I'm going to do is fix the title and the labels. With plt.title I'll give this thing a title so that we can see what it actually is, and set its fontsize equal to 16. There we go: here is a confusion matrix which, at the moment, is very confusing. The next thing I'll do is add a y label, a label for our y axis, because on the y axis we're going to have our actual labels, our actual categories. So here we go, here's our y label, and the x label should of course be added as well, reading "predicted labels".

Now, the worst things at the moment are still these little tick marks: 6, 4, 2, 0. These tick marks are actually meant to correspond to our classes, right, and those range from 0 to 9; this is why we've got these funny numbers here. So what we actually need to do is format our tick marks. tick_marks should be the numbers from 0 to 9, and we can create this with NumPy: np.arange will start at zero and end at nine if I supply our constant for the number of classes, which is equal to 10. Then I can take these ticks, call plt.yticks, and in the parentheses supply my tick marks. I'll hit Shift+Enter, and now I've got a tick mark for each and every one of my classes.

But even though I really like numbers, what I actually want to see are the names of our classes, and I've stored these as a list at the very top, with plane, car, bird, cat and so on. That's going to make our axes a lot less confusing. So what I want to do is supply another argument to this yticks method, and that's going to be my label names constant.
If I hit Shift+Enter on this, I can see that the tick marks now correspond to the items in my list of labels: instead of 0 up here I have plane, and instead of 1 here I have car. Since I've done this on the y axis, I'm also going to do it on the x axis, so with plt.xticks I can copy and paste this line, hit Shift+Enter, and I get my labels here as well.

Now, what to tackle next? The first thing is that I want to change these colors, and the easiest way to do that is with something called a colormap. Matplotlib actually gives us some sample colormaps to pick from, and the kind that works rather well for a confusion matrix are these single-color maps: Greys, Purples, Blues, Greens, Oranges. You can take your pick. This is going to be a tough choice: Facebook blue perhaps, or purple. Maybe I'll just go for green. Coming back here and going to this imshow line, we can supply a colormap with the cmap argument. So let's try this out: cmap is equal to, and then colormaps I can get through plt, which is our Matplotlib, then cm, which stands for colormap, and then a colormap of our choice. So I'll go for Greens; the name has to correspond to what we see in the reference here. If I hit Shift+Enter, the color of my confusion matrix changes.

Now, why are some of these fields darker than others? That's because the color is supposed to signify a value: a light color means a low value and a dark color means a high value. To make this a lot clearer, we can add a so-called colorbar on the right-hand side, and with plt.colorbar we can do just that; don't forget the parentheses at the end. With Shift+Enter we can refresh, and there we see this beautiful colorbar next to our confusion matrix. One thing that we checked earlier was the maximum value in the confusion matrix, which was around 645, and the minimum value, which was around 5.
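Here is a rough sketch of the plotting cell as it stands at this point, assuming conf_matrix from above and a LABEL_NAMES list holding the ten class names defined at the top of the notebook (the exact constant names in the notebook may differ):

```python
import numpy as np
import matplotlib.pyplot as plt

plt.figure(figsize=(7, 7))
plt.imshow(conf_matrix, cmap=plt.cm.Greens)  # single-colour map: light = low, dark = high

plt.title('Confusion Matrix', fontsize=16)
plt.ylabel('Actual Labels')
plt.xlabel('Predicted Labels')

tick_marks = np.arange(10)           # one tick per class, 0 through 9
plt.yticks(tick_marks, LABEL_NAMES)  # show the class names instead of the numbers
plt.xticks(tick_marks, LABEL_NAMES)

plt.colorbar()
plt.show()
```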
Looking at our confusion matrix again, this darkest square here should correspond to 645, and any of the very white squares should correspond to very low values, like 5 or 10. But instead of using our imagination to interpret what's going on, let's actually print the individual values onto each of these squares. To do that, we're going to write a for loop: we have to iterate along each row and along each column to print the value onto this matrix. So what we would need is a nested for loop. That's one way of doing it, and we've worked with nested for loops before, so in this lesson I want to show you an efficient alternative to what we've done previously. And when I say efficient, I mean computationally efficient.

Python has something called itertools, iteration tools. These are functions for creating iterators for efficient looping. That's quite a mouthful, but if we scroll down here and look at these tables, we can see that there are all sorts of problems for which the iteration tools provide a solution. One of these is the nested for loop: the product method here is equivalent to a nested for loop, and it will give us a way to loop through our confusion matrix using these iteration tools. So let's try it out.

I'll scroll to the very top to our import statements and, as always, import my module before I can use it: import itertools, and hit Shift+Enter. Now I can scroll back down to my confusion matrix and add my code. Here's how we're going to write our nested for loop using these iteration tools. I'll say "for i, j", and remember, I've got two dimensions, rows and columns, so i and j, "in itertools.product", open parentheses, range(10) comma range(10).

Why are we using range? Well, just like with a normal for loop, we're going to start at zero and end at 10 minus one, or nine. And because we've got i and j, we want two ranges here as arguments for this product method. Now, if we didn't want to use this magic number 10, we could pull the dimensions directly out of the confusion matrix, which we've actually done up above: conf_matrix.shape[0] and conf_matrix.shape[1].
We stored those in nr_rows and nr_cols, so instead of the 10 here we could have nr_rows and nr_cols. So that's how we're going to set up our loop. Now, what are we doing inside the body of the loop? Well, the goal of this whole thing was to print out the actual value in each of these cells, and we can do that with plt.text.

So what are the values we actually want printed in this first row? Let's take a look at the confusion matrix and pull them out: conf_matrix[0] will pull out that first row, and it should have the values 581, 33, 71, 17 and so on. How do we get these printed here? Well, we have to iterate through our confusion matrix, and with plt.text we have to supply an x and a y on the coordinate system of the plot, and a string. This is going to be j for the x, i for the y, and for the string, for now, I'll just write a lowercase "o".

Now let's hit Shift+Enter, and what we see is that a lowercase "o" is printed in all of these cells. This lowercase "o" is not what we want; that first row should actually have these values, so we need to go back into our confusion matrix and pull them out. The way I'm going to do this is with conf_matrix, square brackets, i comma j, to pull out the individual values from this two-dimensional matrix. Let me hit Shift+Enter and see what we get.

So this is good news, right? We've got the values we expected in that first row: it starts with 581 and ends with 58, exactly what we've got here. The only thing is, this is really hard to read. First of all, it's shifted, so it would be nice if we could center it so that it actually shows up in the middle of each square. There is an argument called horizontalalignment that we can add to plt.text, and we can set that equal to 'center'. Hitting Shift+Enter will center these numbers, but of course it will only do that if you've actually spelled this correctly: horizontalalignment. Let's try again. Here we go: now our numbers are centered, but there's still one slight problem.
The higher the number, the darker the cell, and the darker the cell, the more difficult it is to read. So ideally we want the color of this text to be black if the cell is white or very light, and white if the cell is very dark. We can do that by supplying a color argument to this text method. I'll put a comma, hit Enter, and say color is equal to, and now I can include a little bit of logic here, which is very, very cool. I can say that the color should be white if the value in this cell, so conf_matrix[i, j], is greater than some number. Which numbers are hard to read? Maybe all the numbers above, I don't know, 450. So if the number in the cell is greater than 450, then the color will be white; otherwise, else, the color should be black. Let's try this. Perfect, right?

If we wanted to make this number a little bit less arbitrary and use a cutoff point in the middle of this confusion matrix, so that it depends on the maximum value, then we could replace it with conf_matrix.max() divided by two. Hitting Shift+Enter on this gives us a cutoff at around 320. Brilliant.

So what are we looking at here? We've spent quite a bit of time creating this visualization, but we actually haven't talked about how to interpret it yet. To make it a little easier on the eyes, I might scale it up a little, so up here inside plt.figure I'll add a comma and scale it up with dpi equal to 227, which is the resolution of my screen. Now the whole thing has a much higher resolution and should be much easier to read for you watching this video.

What I'd like to do at this stage is pose a challenge to you. I'd like you to have a think about the interpretation of this confusion matrix. For example, what do the numbers on the diagonal represent, and what do the numbers in a single row that are not on the diagonal represent, so this 33, 71, 17, 29 and so on? My challenge to you is to try to identify the false positives, the false negatives, and the true positives in the confusion matrix.
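While you give that some thought, here is a sketch of the annotation loop we just built up. It lives in the same cell as the plotting code above, just before plt.show(); nr_rows and nr_cols are the dimensions we pulled out of conf_matrix.shape earlier.

```python
import itertools

# Annotation loop: goes in the plotting cell, just before plt.show().
# itertools.product(range(a), range(b)) is equivalent to a nested for loop.
for i, j in itertools.product(range(nr_rows), range(nr_cols)):
    plt.text(j, i, conf_matrix[i, j],
             horizontalalignment='center',
             color='white' if conf_matrix[i, j] > conf_matrix.max() / 2 else 'black')
```

To get the crisper version shown on screen, the plt.figure call at the top of the cell would also take dpi=227 as an extra argument.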
All right, here's the solution. I've scaled the matrix down a little bit so that you can see both axes on the entire screen. Let's tackle the true positives first. This is the case where our model predicted the correct outcome: for example, it predicted a plane when there was in fact a picture of a plane, and it predicted a car when there was in fact a picture of a car. So the values along the diagonal are the true positives.

Now, what about the values down a column? In this case our model said there was a plane when in fact there was a car, and here, 106 times, our model said there was a plane when it was in fact a bird. The definition of a false positive is a false alarm: crying wolf when there is no wolf, crying plane when there is no plane. So this number, 39, represents the number of times our model cried plane when in fact there was no plane. In other words, the values down this column are false positives, and if we sum all those values, excluding the value on the diagonal, we get all the false positives for one particular category.

Now, what about the false negatives? In this case our model is saying there is no plane, but in fact there is a plane. Where would we find that value? Here we have to look at a row: in 33 cases there was a picture of a plane, but our model predicted a car; it said there was no plane, there was something else. So all these values represent the false negatives. Summing up a row, excluding the diagonal, gives us the false negatives for a particular category, and summing up a column, apart from the diagonal, gives us the false positives.

One thing that's really interesting about the confusion matrix is looking at the categories that were most often classified incorrectly. For example, our model confused trucks and cars with each other more than any other categories. Similarly, dogs and cats were very difficult for our model to tell apart, as were ships and planes, and even, for some reason, birds and deer. Armed with this knowledge, we should be able to calculate both our precision and our recall.
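Just to make that row-and-column reading concrete, here is a quick sketch for a single category, the plane class at index 0. This is my own illustration rather than a cell from the lesson's notebook; in the next step we compute the metrics for all ten classes at once.

```python
# Illustration for a single class (plane, index 0). My own addition; the notebook
# computes everything for all classes at once in the next step.
plane = 0

true_positives_plane = conf_matrix[plane, plane]                             # value on the diagonal
false_positives_plane = conf_matrix[:, plane].sum() - true_positives_plane   # rest of the column
false_negatives_plane = conf_matrix[plane, :].sum() - true_positives_plane   # rest of the row
```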
For both precision and recall we need the true positives. How do we get hold of the true positives in our confusion matrix? That's actually fairly straightforward: the true positives are along the diagonal, and we can use NumPy's np.diag and supply our confusion matrix to get hold of these values. Check it out: 581, 565, 309 and so on; you can see that these are the true positive values along the diagonal.

Now, looking at our recall score, we need the true positives plus the false negatives in the denominator. How do we get that? Well, that would be the value on the diagonal plus all the values in the row that are not on the diagonal. But easier yet: if we sum up all the values in a row, which includes the true positive, we get the denominator for the recall score. So our recall is actually equal to np.diag(conf_matrix) divided by np.sum(conf_matrix, axis=1); axis equals one sums along the rows. Let's check it out: our recall is an array of these values, one recall score for every single category.

Now let's calculate the precision for each and every category. Once again we need the diagonal values, but this time the denominator is the true positives plus the false positives. The false positives, we said, were the values down an entire column, excluding the value on the diagonal, but since we need to add the true positive back in anyway, we can simply sum all the values in a column. So let's go for it: precision is going to be equal to np.diag(conf_matrix) divided by np.sum(conf_matrix, axis=0); this is how we sum along a column. Our precision for each and every category comes back as an array of 10 values, like so.

So now we've got 10 recall values and 10 precision values. How do we calculate the precision or the recall of the model overall? Well, the easiest thing to do is simply to average these values. Averaging the recall scores across every category gives us the average recall score for the model as a whole: the average recall is equal to np.mean(recall). We can print this out and say "Model 2 recall score is {avg_recall}", formatted as a percentage. Hit Shift+Enter and see what we get: it's about 49.14 percent.
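Here is a sketch of those cells, assuming conf_matrix from above (the variable names are my best guess at what is typed in the notebook):

```python
# Per-class recall: true positives on the diagonal divided by the row sums (TP + FN)
recall = np.diag(conf_matrix) / np.sum(conf_matrix, axis=1)

# Per-class precision: true positives divided by the column sums (TP + FP)
precision = np.diag(conf_matrix) / np.sum(conf_matrix, axis=0)

# Average recall across the 10 categories, for the model as a whole
avg_recall = np.mean(recall)
print(f'Model 2 recall score is {avg_recall:0.2%}')
```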
Now, as a challenge, can you calculate the average precision for the model as a whole and print out this value below? Afterwards, calculate the F-score for Model 2. I'll give you a few seconds to pause the video and give this a go.

Ready? Here's the solution. The average precision is going to be equal to np.mean of that array of all the precision values, so that's np.mean(precision), and we can print it out using an f-string again, formatted the same way as before. Looking back at the definition of how we calculate the F-score, we see that it is equal to two times precision times recall, divided by precision plus recall. Having already calculated the average recall score and the average precision score, we can calculate the F-score, or F1 score, simply by using these values: two times average precision times average recall, divided by average precision plus average recall. We can print this out, and there we go: our F-score for this model is 49.04 percent.
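Here is a sketch of that solution, continuing from the recall and precision arrays above (again, the exact variable names are assumptions):

```python
# Average precision across the 10 categories
avg_precision = np.mean(precision)
print(f'Model 2 precision score is {avg_precision:0.2%}')

# F-score (F1): combines average precision and average recall into a single number
f_score = 2 * (avg_precision * avg_recall) / (avg_precision + avg_recall)
print(f'Model 2 f-score is {f_score:0.2%}')
```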
So now we've calculated all our metrics: our accuracy, our F-score, our recall, and our precision. How are we stacking up? Well, considering that we're getting about 50 percent correct, I'd say this is not bad for a very first try and for such a simple model. If we take a look at the website that hosts our dataset, we can see that the baseline result is closer to an 18 percent test error, whereas we got close to 50 percent incorrect. At this point you might ask: why is our error so high? One thing that we saw was that the more data we had, the more accurate our model became, but there are other ways to improve accuracy as well. A more specialized neural network for computer vision would definitely fare better with the same amount of data than our multilayer perceptron. In fact, a model structure closer to the Inception-ResNet that we used with pre-trained weights, which is a convolutional neural network, or CNN, would achieve a much higher accuracy. However, I think we should give the multilayer perceptron another chance. It's a very, very simple model after all, and it will be interesting to see how it stacks up on a different dataset.

Another reason I want to do this is that these past few lessons in this module were very theory-heavy and we've covered a lot of new concepts. In the next module, I want to focus more on TensorFlow itself and take you on a deep dive into how to use TensorFlow without Keras. So what we're going to do next is continue building on our understanding of the multilayer perceptron for the time being, but we'll focus more on how TensorFlow actually works: what a tensor actually is, how to set up your layers and your weights in TensorFlow, and how to batch your data during your training session. Plus, we're going to be exploring TensorBoard on a whole new level. So well done for persevering through this challenging module, and I'm looking forward to seeing you in the next one. Until then, take care.