1
00:00:00,770 --> 00:00:02,560
So where are we now.

2
00:00:02,600 --> 00:00:05,420
What do we need to do next in this lesson.

3
00:00:05,420 --> 00:00:09,000
I want to talk about the workflow that's coming up for us.

4
00:00:09,050 --> 00:00:13,530
We're going to write the data pre processing logic for our Web site.

5
00:00:13,640 --> 00:00:16,230
And this lesson is going to be your roadmap.

6
00:00:16,250 --> 00:00:20,690
It's going to be your guide to the code that we're going to write shortly.

7
00:00:20,690 --> 00:00:26,570
This will make it a lot easier to follow the logic that we're gonna write in JavaScript over the next

8
00:00:26,570 --> 00:00:27,750
lessons.

9
00:00:27,770 --> 00:00:34,280
Now why do we need to pre process the data when a user draws on our canvas on our Web site.

10
00:00:34,430 --> 00:00:40,500
And the answer is is that the data for the amnesty data set was processed in a particular way.

11
00:00:40,760 --> 00:00:44,960
And then what we did was we trained our neural network on this data.

12
00:00:45,710 --> 00:00:51,500
So if we want to make sensible predictions using our hand drawn digits what we actually need to do is

13
00:00:51,500 --> 00:00:59,650
we need to pre process the inputs in exactly the same way as the data that was fed into our neural network.

14
00:01:00,350 --> 00:01:05,940
Otherwise our neural network will be confused and will give us nonsense predictions.

15
00:01:05,960 --> 00:01:11,780
So that raises the question how was the amnesty data pre processed.

16
00:01:11,780 --> 00:01:19,720
Well we already talked about a few things namely the fact that the images were 28 pixels wide and 28

17
00:01:19,730 --> 00:01:26,510
pixels high and that they were also black and white images meaning they only had one color channel and

18
00:01:26,510 --> 00:01:30,260
there were seven hundred and eighty four pixel values.

19
00:01:30,350 --> 00:01:36,590
However there's actually more that we need to understand about these individual hands on digits for

20
00:01:36,590 --> 00:01:44,360
example the actual number that we see in the image itself is only around 20 pixels wide and 20 pixels

21
00:01:44,360 --> 00:01:45,300
high.

22
00:01:45,380 --> 00:01:52,700
And that means that on the left side and on the right side of this digit are around four pixels of padding

23
00:01:53,630 --> 00:02:00,050
and there are also four pixels of padding at the bottom and at the top of the image and the digit inside

24
00:02:00,050 --> 00:02:05,220
this image is actually centered and it's centered in a particular way.

25
00:02:05,300 --> 00:02:13,540
So the researchers who created the endless dataset actually used each digits center of mass to center

26
00:02:13,550 --> 00:02:14,870
the digit.

27
00:02:14,870 --> 00:02:21,950
Now what is the center of mass the way that I like to think about it is through one of these bird toys.

28
00:02:21,950 --> 00:02:27,980
I'm sure you've seen one of these before but these are these plastic birds that you're able to balance

29
00:02:28,280 --> 00:02:30,460
on a fingertip.

30
00:02:30,470 --> 00:02:36,140
And the reason is is that even though it looks like the bird is about to fall off the center of mass

31
00:02:36,260 --> 00:02:39,620
of this little toy is on the beak.

32
00:02:39,620 --> 00:02:45,500
Now the same concept applies when you try to balance by standing on one leg or try to do some fancy

33
00:02:45,500 --> 00:02:51,890
break dancing move as long as your center of mass is not too far forward or too far backwards you remain

34
00:02:51,890 --> 00:02:52,940
balanced.

35
00:02:53,150 --> 00:02:58,790
So intuitively wearily familiar with this concept the thing that we have to understand though is that

36
00:02:58,790 --> 00:03:05,060
the center of mass doesn't just apply to three dimensions and our world that we're familiar with it

37
00:03:05,060 --> 00:03:08,000
also applies to two dimensions as well.

38
00:03:08,000 --> 00:03:15,980
So for example the center of mass in a square is going to be right in the middle but the center of mass

39
00:03:15,980 --> 00:03:20,050
for this triangle here is actually not bang in the middle.

40
00:03:20,120 --> 00:03:22,900
It's actually slightly below the middle.

41
00:03:22,910 --> 00:03:26,790
So this means the center of mass is something that we have to work out.

42
00:03:27,020 --> 00:03:29,290
It's something that we can't always see visually.

43
00:03:29,360 --> 00:03:31,120
It's something we have to calculate.

44
00:03:31,340 --> 00:03:34,280
Now suppose we have this number to hit this hand drawn digit.

45
00:03:34,490 --> 00:03:39,590
So what the researchers did was they worked out the center of mass for each of these hand drawn digits

46
00:03:40,250 --> 00:03:47,120
and then they use the center of mass to actually center the hand drawn digit inside their image.

47
00:03:47,390 --> 00:03:54,200
And that's pretty much all there is to understand about the key aspects of this M.A. dataset.

48
00:03:54,230 --> 00:04:01,040
So given that we know how these images were pre processed what are the key pre processing steps for

49
00:04:01,040 --> 00:04:02,270
us going to be.

50
00:04:02,510 --> 00:04:09,950
How do we get from a drawing on our canvas to a tensor that we can use to make a prediction.

51
00:04:11,060 --> 00:04:14,360
Well we spoke about a few key requirements already.

52
00:04:14,480 --> 00:04:16,430
We need to resize the image.

53
00:04:16,430 --> 00:04:20,300
We're going to have to make sure our image is black and white and we're going to make sure it's just

54
00:04:20,300 --> 00:04:26,360
got one color channel and then we're also gonna have to send to the image using the center of mass.

55
00:04:26,360 --> 00:04:31,970
Now I know this sounds like a lot of work and it sounds like we have to do a lot of computation with

56
00:04:31,970 --> 00:04:39,450
JavaScript from scratch but lucky for us there's actually a tool that can make all of this so much easier.

57
00:04:39,650 --> 00:04:50,000
And this tool is called Open CV open computer vision and this tool will help us with all our image processing.

58
00:04:50,000 --> 00:04:56,840
Now let me give you an overview of the entire workflow that we're going to code up in JavaScript using

59
00:04:57,290 --> 00:04:58,450
open CV.

60
00:04:58,460 --> 00:05:00,540
This is gonna be our roadmap.

61
00:05:00,760 --> 00:05:02,440
So where do we start.

62
00:05:02,440 --> 00:05:06,440
Well first we have to draw the image on the Web site.

63
00:05:06,700 --> 00:05:15,550
And then what we have to do is we have to take this drawing and loaded into open CV after that we're

64
00:05:15,550 --> 00:05:21,610
going to convert this image to black and white and reduce the number of color channels to 1.

65
00:05:21,800 --> 00:05:31,370
And now we will use open CV to find the contours of the digit that was drawn on this canvas and by controls

66
00:05:31,660 --> 00:05:36,670
I really mean like a border or an outline of this hand drawn digit.

67
00:05:36,710 --> 00:05:41,360
The reason we're getting these contours is because we're gonna be using them for the next step because

68
00:05:41,360 --> 00:05:49,550
it's the contours that will allow us to work out the bounding rectangle of the image using the bounding

69
00:05:49,550 --> 00:05:50,460
rectangle.

70
00:05:50,460 --> 00:05:57,840
We can now crop the image and discard the parts of the canvas where the user didn't draw anything next

71
00:05:58,140 --> 00:06:05,040
we're gonna work out how to scale down our image because what we want is we want the longest edge of

72
00:06:05,040 --> 00:06:07,980
the image to be 20 pixels long.

73
00:06:08,100 --> 00:06:13,800
So what we need to do here is we need to work out the new size we need to work out the proportions of

74
00:06:13,800 --> 00:06:18,600
this image the target size after scaling if you will.

75
00:06:18,600 --> 00:06:24,390
And once we know that we can use this information to actually resize the image.

76
00:06:24,510 --> 00:06:31,170
So the next step is doing the actual resizing at this point the long edge will be 20 pixels.

77
00:06:31,170 --> 00:06:38,020
And the short edge will be something less than 20 pixels depending on what was drawn on the canvas after

78
00:06:38,020 --> 00:06:38,600
that.

79
00:06:38,740 --> 00:06:45,130
We're going to be adding some padding to the image so we'll be adding around four pixels to each side

80
00:06:45,370 --> 00:06:51,220
to make the whole image 28 by 28 pixels again for the next step.

81
00:06:51,220 --> 00:06:57,610
What we're gonna do is we're going to find the center of mass in the image open CV makes this really

82
00:06:57,610 --> 00:06:58,490
really easy.

83
00:06:58,480 --> 00:07:03,920
There is no need for us to remember any complex mathematical formulas or do the math by hand.

84
00:07:04,870 --> 00:07:11,530
Now we can use the center of mass to shift the image so that it is centered in the same way that the

85
00:07:11,530 --> 00:07:14,040
researchers have shifted their images.

86
00:07:14,170 --> 00:07:18,640
Again this is very important for the accuracy of our predictions.

87
00:07:18,760 --> 00:07:20,840
Our training dataset had this done to it.

88
00:07:20,890 --> 00:07:26,620
So we also need to emulate this when we're doing it with our live data that we're getting from our Web

89
00:07:26,620 --> 00:07:27,400
site.

90
00:07:27,400 --> 00:07:33,880
Otherwise our neural network will get very very confused and give us very terrible predictions after

91
00:07:33,880 --> 00:07:35,250
we've shifted our image.

92
00:07:35,260 --> 00:07:39,550
The next step is normalizing the pixel values.

93
00:07:39,580 --> 00:07:40,990
Now what I mean by that.

94
00:07:41,350 --> 00:07:47,560
Each of our black and white pixels is gonna have some value between 0 and 255.

95
00:07:48,100 --> 00:07:55,650
What normalizing the pixel values means is we don't want the values to be between 0 and 255.

96
00:07:55,690 --> 00:08:00,160
What we want the values to be is between 0 and 1.

97
00:08:00,160 --> 00:08:01,730
So how do we do that.

98
00:08:01,780 --> 00:08:02,640
Well simple.

99
00:08:02,650 --> 00:08:07,800
We just divide all the pixel values by two hundred and fifty five.

100
00:08:07,900 --> 00:08:11,300
And again we're doing this because this was done to our training dataset.

101
00:08:11,320 --> 00:08:13,130
So we need to be consistent.

102
00:08:13,240 --> 00:08:19,370
The last step is actually creating a tensor from the pixel values in the image.

103
00:08:19,450 --> 00:08:24,180
This tensor will have the shape one by seven hundred and eighty four.

104
00:08:24,220 --> 00:08:31,810
Just like in our training dataset and we can use this tensor to make a prediction and that's it.

105
00:08:32,230 --> 00:08:37,810
That's the overview of the entire data pre processing workflow.

106
00:08:37,810 --> 00:08:45,340
And as you can see there's quite a few steps involved and learning how to pre process data prior to

107
00:08:45,340 --> 00:08:50,170
feeding it into a machine learning algorithm is one of the things that has the heart of this course

108
00:08:50,860 --> 00:08:56,290
and the reason this is that in reality around 90 percent of the work in the real world involves data

109
00:08:56,320 --> 00:09:02,380
processing and about 10 percent of the work will involve actually running the algorithm running the

110
00:09:02,380 --> 00:09:06,340
actual algorithm is the easiest part in machine learning.

111
00:09:06,340 --> 00:09:10,650
Now without further ado let's dive straight into open CV.

112
00:09:10,720 --> 00:09:11,890
I'll see you on the next lesson.
