Since we've built a model which is able to make predictions, and we've made some on the test dataset and exported them, the people you share these predictions with (or in fact yourself, and I know in my case I'm very, very interested) might be curious about which parts of the data led to these predictions. So this is where feature importance comes in. Feature importance, in other words, seeks to figure out which different attributes of the data were most important when it comes to predicting the target variable, or in our case the sale price. Let's write that down: feature importance seeks to figure out which different attributes of the data were most important when it comes to predicting (let's put this in bold) the target variable. So in our case it's the sale price. All right, so how might we do this? We know we're using a RandomForestRegressor. We've seen in a previous video, in the classification project as well as the scikit-learn project, that some scikit-learn models have the attribute feature_importances_. But if we didn't know, if we were using a model we weren't familiar with, maybe we could search something like "random forest regressor feature importance". Beautiful, so maybe this will give us some results to explore. We've got the scikit-learn documentation, which is always helpful, so we're looking in here.
We might search for what we want: "feature importances". "Return the feature importances; the higher, the more important the feature." Let's see what that actually does. So let's take our model and find the feature importances of our best model: ideal_model.feature_importances_. Well now, that returns a fairly large array, a whole bunch of different values here. Some of them are zero, some of them are pretty low, ten to the negative six there. But what does this have in common with our training dataset? Let's check the length of it: 102. So we're inferring that we've got 102 columns here and 102 values here, which means we're getting a value for every feature. So SalesID would map to this, and MachineID would map to this. Now we could make a dictionary by zipping X_train.columns together with these values, but I don't know about you, I prefer to see things visually. So let's make a helper function which helps us do that: a helper function for plotting feature importances so we can see them visually.
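The idea described above can be sketched like this. This is a minimal example on synthetic stand-in data, since the project's preprocessed X_train and y_train aren't available here; the column names are just placeholders borrowed from the bulldozers dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Stand-in data: in the project this would be the preprocessed X_train / y_train
rng = np.random.default_rng(42)
X_train = pd.DataFrame(rng.random((200, 5)),
                       columns=["YearMade", "ProductSize", "saleYear",
                                "Enclosure", "MachineID"])
# Target driven almost entirely by one feature, so its importance stands out
y_train = X_train["YearMade"] * 10 + rng.normal(0, 0.1, 200)

model = RandomForestRegressor(n_estimators=50, random_state=42)
model.fit(X_train, y_train)

# One importance score per column; scores sum to 1, higher = more important
importances = model.feature_importances_

# Pair each column name with its importance (the "dictionary" idea from above)
importance_dict = dict(zip(X_train.columns, importances))
```

Here `len(importances)` matches `len(X_train.columns)`, which is exactly the 102-for-102 correspondence noted above.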
So we go def plot_features, and we might take a list of columns and a list of importances, and we're going to set n to 20. Now this will make sense in a second, because we have 102 values here, but realistically, when we look at a plot, we only want the top 20 values, and that's why we set n, so we can adjust it with this little function here. Helper functions are great. I should have started using functions earlier; writing functions saves a lot of time, put it that way. I used to just write the same line of code over and over and over again in different cells, but I've started to get into the habit of writing functions for different uses. So what we might do is use a little pandas trick here called chaining, and what that means is simply putting a number of different pandas functions in brackets. We'll see that in a second, because we're going to make a DataFrame called df, and it'll have two columns: "features" is going to be columns and "feature_importances" is going to be importances, right? So we're just creating a DataFrame here, and then we're going to call sort_values. See here, these are still within the brackets; what this means is it's going to do pd.DataFrame and .sort_values in one hit. We want to sort it by "feature_importances", and we want ascending equal to False, so it goes from highest to lowest.
That makes sense, and then we want to reset the index with drop equals True. Actually, this bracket should be down here. Because you've got a bracket here and here and two dots, that means it's just going to do these three pandas steps in one hit. And now we need to plot, so let's plot the DataFrame we've created. What we might do is go fig, ax to instantiate a plot with plt.subplots(), and then we might do a horizontal bar, because in my experience, plotting feature importances looks really good on a horizontal bar. Now we only want up to n; that's why we have n there, so that's going to be the first 20 examples, and the same with the feature importances, we only want up to the first 20. Then we're going to set the y label, so ax.set_ylabel, to "Features" (because on a horizontal bar, x and y are rearranged), and we go ax.set_xlabel, to add some communication to it, and put "Feature importance" down the bottom. Let's see what this looks like. So we run our helper function there, and all we have to do to call it is go plot_features, and then we'll pass it the columns, so we just want the columns from X_train, because that's a variable there, and then the feature importances are just ideal_model.feature_importances_. Let's see what this looks like. Okay, so we've got the features on the left here.
And their values here. I want the top one, the most valuable feature, to be at the top. So what we might do is ax dot... how do we do it? Invert? Can we invert? I think it's ax.invert_yaxis(). Maybe that's not a function. If in doubt, run the code. invert_yaxis... and that didn't work. Maybe it is a function. Hmm. If in doubt, run the code. Trust your instincts, right? That's how you learn things: you practice by running it without looking it up. And if in doubt, you can always look it up, right? You can always ask "how to invert a horizontal bar plot matplotlib"; Google should be able to help. All right, so what can we infer from this? There's a fair bit going on; we've got about 20 different features here. So it's saying that YearMade... let's have a look at our X_train, let's get that in here... it's saying that the year the bulldozer was made is the most important feature according to the ideal model. And then ProductSize. What is that? Let's look at our data dictionary: ProductSize, ProductSize... don't know what this is. OK, that's great.
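Pulled together, the helper function described above might look like this. This is a sketch of what's built in the video; it also returns the sorted DataFrame (which the video doesn't do) purely so the result is easy to inspect.

```python
import pandas as pd
import matplotlib.pyplot as plt

def plot_features(columns, importances, n=20):
    """Plot the top-n feature importances as a horizontal bar chart."""
    # Pandas chaining: build, sort, and re-index the DataFrame in one expression
    df = (pd.DataFrame({"features": columns,
                        "feature_importances": importances})
          .sort_values("feature_importances", ascending=False)
          .reset_index(drop=True))

    fig, ax = plt.subplots()
    ax.barh(df["features"][:n], df["feature_importances"][:n])
    ax.set_ylabel("Features")          # x and y swap on a horizontal bar
    ax.set_xlabel("Feature importance")
    ax.invert_yaxis()                  # most important feature at the top
    return df

# In the notebook it would be called like:
# plot_features(X_train.columns, ideal_model.feature_importances_)
```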
The size grouping for a product group; subsets within product group. This is an example of where you maybe need to do some more research and figure out what features are what. If you've got something like this in your data dictionary, you might have to reach out to, like, a subject matter expert, or we could just go X_train["ProductSize"], and maybe we check the value counts. Okay, so there's one, two, three, four, five, six different sizes. Now zero is going to be missing; a lot of missing values there. All right, but maybe this doesn't really tell us much, so maybe we could do it on our original DataFrame. There we go. All right, so the product size... that kind of makes sense, that the product size is influencing the sale price. saleYear and Enclosure are also doing a fair bit of damage there. Enclosure value counts... okay, so that's... mm hmm, that doesn't really make much sense to me, so then we go back to the data dictionary and we find Enclosure: machine configuration, does the machine have an enclosed cab or not. All right, so maybe we'd need to figure out what these different values might mean.
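For reference, value_counts on a ProductSize-style column looks like this. The values here are made up for illustration, not the real ones from the bulldozers DataFrame.

```python
import pandas as pd

# Illustrative data only; the real column comes from the bulldozers DataFrame
sizes = pd.Series(["Medium", "Small", None, "Large / Medium", "Medium",
                   None, "Mini", "Compact", "Large", "Small"],
                  name="ProductSize")

print(sizes.value_counts())              # counts per category, NaN excluded
print(sizes.value_counts(dropna=False))  # include the missing values too
```

By default `value_counts()` skips missing values, which is why counting on the numerically encoded X_train (where missing became 0) can look different from counting on the original DataFrame.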
But this is the kind of exploration you'd probably do towards the end, after you've built a model, right? You might bring this to someone after you've made some predictions; you might take this into a sort of presentation. So maybe you're meeting with a client or meeting with another team member or something, and you're going, "Hey, these are the most important features our model has found. Does this agree with your intuition, with what someone already knows about the data? Does it make sense that the model is using these features to derive its predictions?" Or this information here might influence how you go about collecting data in the future on your bulldozer sales, so you might put a bit more effort into the values here that are contributing most to predicting the sale price of a bulldozer. The last question that I'll leave you with (we've kind of just glazed over it a little bit, but I want you to do a little bit of research) is: why might knowing the feature importances of a trained model be helpful? That's the finishing question. So, a question to finish, which is going to involve some research: why might knowing the feature importances of a trained machine learning model be helpful?
So that's going to finish off this project, finishing off with a little question here. What you might want to try and do, if you've followed along with all of these steps here, is see how far you can go with hyperparameter tuning. So something like this: maybe you could leave it running for a while on your computer, see what it finds, see if you can improve the validation RMSLE score, and see how far you'd get up on the leaderboard. Now, if you do reach a point where you're not really improving with a random forest model, a final extension, a final challenge, may be: what other machine learning models could you try on our dataset? And what that might involve is something like searching for the scikit-learn machine learning map. So what we've done is we've gone through this: we've got a regression problem, we found ensemble regressors, and we've used a random forest. Maybe you want to try another one of these regression models using the format of data that we've worked through in this notebook. So I'll leave a little hint here: check out the regression section of this map, or try to look at something like catboost.ai or xgboost.ai.
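As a sketch of that challenge, here's how you might drop in a different scikit-learn ensemble regressor and score it with RMSLE. I'm using GradientBoostingRegressor here as one plausible choice from the map (the hint above names CatBoost and XGBoost, which aren't part of scikit-learn), and stand-in data; in the notebook you would reuse the preprocessed X_train and y_train instead.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_log_error

# Stand-in data with positive targets (RMSLE needs non-negative values)
rng = np.random.default_rng(0)
X_train = rng.random((200, 5))
y_train = 1000 + 5000 * X_train[:, 0] + rng.normal(0, 100, 200)

# Same fit/predict interface as RandomForestRegressor, so it slots in as-is
model = GradientBoostingRegressor(random_state=42)
model.fit(X_train, y_train)
preds = model.predict(X_train)

# RMSLE, the evaluation metric used on the competition leaderboard
rmsle = np.sqrt(mean_squared_log_error(y_train, preds))
print(f"Training RMSLE: {rmsle:.4f}")
```

Because scikit-learn estimators share the same fit/predict interface, swapping models is usually just a matter of changing the import and the constructor call; the data preparation stays the same.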
So these are two extra resources, extracurricular, and of course they're optional at this point. Because once we've started to work through these sorts of projects, now that we've gone end to end on a full machine learning project, your next steps are kind of trying to figure things out on your own, taking what you've learnt from here and then expanding on that. That's the kind of knowledge that is really going to help out, right? So rather than sort of always going through projects like this, it's doing some research and trying out new things. Remember step six of our little framework here: experimentation. That's the challenge I leave to you. But that being said, if you have made it this far, you have gone through the entire project. Congratulations! We've just gone end to end on a regression problem using machine learning. How cool is that? So have a look in the notebook, see what improvements you can make, and if you have any questions whatsoever, feel free to ask them. Leave them in the Discord chat or in the Udemy interface, somewhere where you can leave a question, and we'll see what we can do from there. But congratulations on working through a first end-to-end regression project. How exciting. All the best.