1
00:00:00,690 --> 00:00:02,980
Let's start by importing our dataset.

2
00:00:05,040 --> 00:00:12,060
As we discussed earlier, we are going to use the Fashion MNIST dataset to classify images of

3
00:00:12,060 --> 00:00:21,730
fashion objects such as trousers, coats, boots, etc. Fashion MNIST is a very popular dataset.

4
00:00:23,310 --> 00:00:29,470
It is relatively small and is often used to verify whether an algorithm works as expected.

5
00:00:31,500 --> 00:00:40,130
Fashion MNIST consists of a training set of 60,000 examples and a test set of 10,000 examples;

6
00:00:41,720 --> 00:00:46,700
each example is a 28 by 28 grayscale image.

7
00:00:47,330 --> 00:00:54,980
That is, its dimensions are 28 pixels by 28 pixels, and it is a black-and-white

8
00:00:55,070 --> 00:00:55,490
image.

9
00:00:58,950 --> 00:01:04,090
Each image has an associated label from one of 10 classes.

10
00:01:06,340 --> 00:01:08,840
I'll show you the images after we import the dataset.

11
00:01:10,300 --> 00:01:13,510
Let's run this line of code to import the dataset.

12
00:01:20,310 --> 00:01:23,550
Note that there are multiple ways to import data here.

13
00:01:23,840 --> 00:01:27,130
We are using the inbuilt dataset that comes with the

14
00:01:27,160 --> 00:01:28,080
Keras library.

15
00:01:29,100 --> 00:01:33,450
So if the Keras package is not installed, this line will not work.

16
00:01:34,470 --> 00:01:41,200
If it is installed, you will get the dataset imported into this variable, which is

17
00:01:41,230 --> 00:01:42,750
fashion_mnist.

18
00:01:46,790 --> 00:01:51,890
You can see it on the right, in the environment variable window here.

19
00:01:54,280 --> 00:01:56,340
Next, view this dataset by clicking on it.

20
00:02:00,640 --> 00:02:05,140
Here you can see that this dataset has two parts, train and test.

21
00:02:06,580 --> 00:02:11,110
This means that it is already divided into two parts, training and testing.

22
00:02:12,730 --> 00:02:14,470
We do not need to do this separately.

23
00:02:15,310 --> 00:02:22,330
However, if you want to learn how to separate any dataset into train and test, which is not in this

24
00:02:22,330 --> 00:02:25,610
format, please check the opening section of this course.

25
00:02:26,640 --> 00:02:29,320
There you will find a lecture titled Test-Train Split.

26
00:02:30,610 --> 00:02:33,080
With that, you will be able to split any dataset.

27
00:02:34,470 --> 00:02:36,020
Moving ahead.

28
00:02:36,190 --> 00:02:37,810
Our dataset is already split.

29
00:02:39,430 --> 00:02:45,910
Let's look at the train set first. It has two parts, X and Y.

30
00:02:46,900 --> 00:02:52,060
X is the set of predictor variables and Y is the list of output values,

31
00:02:52,720 --> 00:02:55,110
that is, the class of the fashion object.

32
00:02:58,940 --> 00:03:08,890
You can also see the structure of X and Y here. X is a set of 60,000 images, which are 28 pixels by 28

33
00:03:08,900 --> 00:03:09,300
pixels.

34
00:03:10,760 --> 00:03:16,800
So for each pixel of each image, we have a value between 0 and 255.

35
00:03:18,080 --> 00:03:21,530
If the value is zero, that pixel is black.

36
00:03:22,460 --> 00:03:25,780
If it is 255, that pixel is white.

37
00:03:27,440 --> 00:03:33,170
So each individual pixel's data for all the 60,000 images is stored

38
00:03:33,260 --> 00:03:34,430
in this X variable.

39
00:03:37,580 --> 00:03:46,440
Similarly, Y has the class values of the 60,000 images. For example, the first image has the class

40
00:03:46,790 --> 00:03:50,090
nine. What this nine represents,

41
00:03:50,390 --> 00:03:57,920
we'll look at that in some time. Similar to the train set, we have test data. The only difference is that in the training set

42
00:03:57,950 --> 00:04:00,810
we have data for 60,000 images, and in the test set

43
00:04:01,130 --> 00:04:02,660
we have data for 10,000.

44
00:04:04,250 --> 00:04:06,800
We will use just the train data to train our model.

45
00:04:07,790 --> 00:04:14,230
And later on, we will predict the Y values for the test set using the X values of the test set.

46
00:04:15,920 --> 00:04:22,550
Then we will compare the actual Y values in this test set with the predicted Y values from our model

47
00:04:23,150 --> 00:04:25,400
to find out the accuracy of our model.

48
00:04:26,960 --> 00:04:28,150
Now let's go back to our code.

49
00:04:30,300 --> 00:04:35,120
We'll be assigning the X and Y train values to separate variables.

50
00:04:36,500 --> 00:04:42,470
To do that, this line of code is the standard way in which we assign a value to a variable.

51
00:04:43,820 --> 00:04:46,740
You can run this line also and it will give you the same result.

52
00:04:47,000 --> 00:04:54,740
It will assign the X values of the training set of the fashion_mnist variable to the train_images

53
00:04:54,800 --> 00:04:58,190
variable. However, Keras

54
00:04:58,250 --> 00:05:00,590
allows us to do that in a different way.

55
00:05:01,460 --> 00:05:08,180
In this format, you can assign the two variables train_images and train_labels at the same time.

56
00:05:09,690 --> 00:05:16,760
So if you run this line of code, it will assign the X values of train to train_images and the Y

57
00:05:16,760 --> 00:05:18,980
values of train to train_labels.

58
00:05:20,690 --> 00:05:21,840
Let's run this line of code.

59
00:05:21,970 --> 00:05:31,470
Now you can see that we have a train_images variable and a train_labels variable. train_images has the

60
00:05:31,490 --> 00:05:33,950
X part and train_labels has the Y part.

61
00:05:35,990 --> 00:05:38,420
The same goes for test_images and test_labels.

62
00:05:39,020 --> 00:05:39,380
Let's run

63
00:05:39,410 --> 00:05:40,130
this code also.

64
00:05:42,470 --> 00:05:49,300
And we have two more variables here. Although we have seen the structure of the training data and test

65
00:05:49,310 --> 00:05:49,670
data,

66
00:05:50,360 --> 00:05:56,990
if you still want to check out the structure of these new variables, you can run these two lines

67
00:05:56,990 --> 00:05:57,460
of code.

68
00:05:58,900 --> 00:05:59,350
dim,

69
00:05:59,800 --> 00:06:07,130
with the variable name within brackets, gives you the dimensions of this variable. So this variable has three

70
00:06:07,130 --> 00:06:07,760
dimensions.

71
00:06:07,910 --> 00:06:15,210
First is the 60,000 values for the different images, and then 28 cross 28

72
00:06:15,450 --> 00:06:22,270
for all the individual pixels. If you run the str command, which gives you the structure, there'll

73
00:06:22,280 --> 00:06:26,720
be some additional information: that it has integer-type values

74
00:06:28,670 --> 00:06:31,120
and that the initial few values are 0, 0, 0, 0.

75
00:06:33,050 --> 00:06:40,210
So both of these are used for the same thing: to understand the structure of the variable that

76
00:06:40,220 --> 00:06:40,580
we have.

77
00:06:42,260 --> 00:06:47,100
Now let me show you the images so that you get a feel of what kind of data we have here.

78
00:06:49,160 --> 00:06:54,530
We can store the information of one image in a variable called fobject.

79
00:06:56,240 --> 00:07:05,150
So when I run this line of code, it will assign the information of the fifth image, all the pixels,

80
00:07:05,750 --> 00:07:06,680
to this object,

81
00:07:07,220 --> 00:07:08,150
which is fobject.

82
00:07:09,490 --> 00:07:10,310
Let's run this.

83
00:07:12,230 --> 00:07:16,430
You can see that fobject is a 28 cross 28 two-dimensional array

84
00:07:17,300 --> 00:07:20,750
containing all the pixel data of this fifth image.

85
00:07:22,700 --> 00:07:30,870
Now, if you want to plot this image, you can run this line of code, which has the plot function. In the

86
00:07:30,980 --> 00:07:31,790
plot function,

87
00:07:31,880 --> 00:07:39,560
we are telling it that we have to plot this variable as a raster image. A raster image is basically a

88
00:07:39,580 --> 00:07:40,700
pixelated image.

89
00:07:41,510 --> 00:07:43,010
So let's run this line of code.

90
00:07:43,520 --> 00:07:45,800
So here you can see the image on the right.

91
00:07:46,760 --> 00:07:50,150
It's a small 28 cross 28 pixel image.

92
00:07:50,450 --> 00:07:55,370
So the image quality is not good, but you can make out the object.

93
00:07:56,210 --> 00:08:00,320
It probably looks like a top. If you want to check what it is,

94
00:08:00,530 --> 00:08:05,050
we need to see the image label, which is stored in the train_labels

95
00:08:05,160 --> 00:08:05,510
variable.

96
00:08:09,070 --> 00:08:14,570
In the train_labels variable, we saw that the values are in a coded format; that is, it has integers

97
00:08:14,570 --> 00:08:16,340
from zero to nine.

98
00:08:18,290 --> 00:08:23,260
So to get the actual name of the class, we first create a class names

99
00:08:23,390 --> 00:08:25,430
array. This

100
00:08:25,490 --> 00:08:30,900
array contains the list of names in the order in which we have coded these names.

101
00:08:31,670 --> 00:08:33,420
So zero stands for T-shirt/top.

102
00:08:35,570 --> 00:08:42,020
So if you see nine here, nine stands for ankle boot, and two stands for pullover.

103
00:08:42,540 --> 00:08:43,550
It starts with zero.

104
00:08:44,120 --> 00:08:45,170
This is the second element.

105
00:08:45,680 --> 00:08:46,790
This is the ninth element.

106
00:08:48,560 --> 00:08:57,050
Once we have created this array, we can find out the name of this object. We will take the fifth image

107
00:08:58,060 --> 00:09:00,680
in the training labels variable.

108
00:09:02,750 --> 00:09:10,360
So we take the label of the fifth image plus one, because the coding started with zero. So we just want the

109
00:09:10,370 --> 00:09:14,810
(label plus one)th element from this array.

110
00:09:16,760 --> 00:09:24,110
So let's first create this array and now find out the name of this fifth image.

111
00:09:25,520 --> 00:09:29,090
You can see that the fifth image is a T-shirt/top.

112
00:09:32,170 --> 00:09:34,720
You can check this again

113
00:09:34,910 --> 00:09:35,770
for another image.

114
00:09:36,370 --> 00:09:43,760
So let's try it out for the 19th image. Run this with 19 and plot it.

115
00:09:43,870 --> 00:09:47,950
I mean, this looks like a sandal.

116
00:09:50,100 --> 00:09:51,220
Now if we take the

117
00:09:55,180 --> 00:10:00,860
label of the 19th image and check the class name, it comes out to be sandal.

118
00:10:03,610 --> 00:10:04,660
So this is our data.

119
00:10:06,220 --> 00:10:13,240
We have created four variables. train_images contains all the predictor variables.

120
00:10:13,990 --> 00:10:18,610
train_labels contains the output variable. Using these two variables,

121
00:10:18,640 --> 00:10:20,260
we will be training our model.

122
00:10:21,770 --> 00:10:26,700
Then we will be using that model to predict on the test images.

123
00:10:27,250 --> 00:10:29,890
And we will compare the predictions with the test labels.

124
00:10:32,980 --> 00:10:38,740
The last thing I'm going to discuss in this video is normalization of data.

125
00:10:40,450 --> 00:10:46,780
When we have heterogeneous data, a learning model takes a lot of time to converge. To handle this problem,

126
00:10:47,110 --> 00:10:50,920
we normalize the data. To normalize data,

127
00:10:51,310 --> 00:10:53,770
the usual general formula is this:

128
00:10:54,430 --> 00:11:01,630
we subtract the mean of that variable from each value and divide by the standard deviation.

129
00:11:02,920 --> 00:11:04,330
So this is the general formula.

130
00:11:05,260 --> 00:11:14,020
But our training data is not that heterogeneous: every value is a pixel value between

131
00:11:14,020 --> 00:11:15,310
0 and 255.

132
00:11:17,410 --> 00:11:22,760
So we can just divide all the pixel values by 255.

133
00:11:23,800 --> 00:11:26,410
This will result in values between zero and one.

134
00:11:27,520 --> 00:11:30,870
And we can input these values into our training model.

135
00:11:32,230 --> 00:11:36,760
So normalization is required when we have different types of variables in our dataset.

136
00:11:37,630 --> 00:11:38,590
If that is the case,

137
00:11:38,890 --> 00:11:43,930
use this formula to normalize. Here, since our data is already very homogeneous,

138
00:11:44,470 --> 00:11:49,660
we can just divide the pixel values by the highest value to get a

139
00:11:50,080 --> 00:11:51,370
simple normalized value.

140
00:11:53,530 --> 00:11:59,130
Now, using these train and test values, we will be creating a model in the next video.