1
00:00:00,866 --> 00:00:01,133
Hello and

2
00:00:01,133 --> 00:00:04,133
welcome back to the course on Deep
Natural Language Processing.

3
00:00:04,166 --> 00:00:07,166
Today we're looking at the Bag of Words
model.

4
00:00:07,300 --> 00:00:10,166
The first thing I'd like us to look at
is an email.

5
00:00:10,166 --> 00:00:12,866
An email I received just a few days ago.

6
00:00:12,866 --> 00:00:13,833
So here we go.

7
00:00:13,833 --> 00:00:18,633
The email is about a catch-up, and
my friend is asking: Hello, Carol.

8
00:00:18,633 --> 00:00:21,600
Checking if you're back in Oz,
which is slang for Australia.

9
00:00:21,600 --> 00:00:25,500
Let me know if you're around
and keen to sync on how things are going.

10
00:00:25,600 --> 00:00:27,933
I deffo, as in definitely,

11
00:00:27,933 --> 00:00:30,733
could use some of your creative
thinking to help with mine.

12
00:00:30,733 --> 00:00:32,166
Cheers, V.

13
00:00:32,166 --> 00:00:36,600
And so, here's
what I'd like us to pay attention to.

14
00:00:36,600 --> 00:00:40,133
First of all, of course, you can see that
I sent this email to myself, but,

15
00:00:40,700 --> 00:00:43,633
that's just because
I wanted to keep my friend... well,

16
00:00:45,900 --> 00:00:49,066
actually, it's because I already replied to
the email, and then I wanted to reset it.

17
00:00:49,066 --> 00:00:52,633
And also I wanted to protect my friend,
keep his privacy.

18
00:00:52,933 --> 00:00:54,233
But this is a real email.

19
00:00:54,233 --> 00:00:58,466
This is the exact text that I got
literally a couple of days ago.

20
00:00:58,800 --> 00:01:01,566
And the title was different,
but I just called it...

21
00:01:01,566 --> 00:01:04,566
I changed it to "Catch up". And so,

22
00:01:04,966 --> 00:01:08,933
what is interesting about this,
we're going to be looking at how,

23
00:01:08,933 --> 00:01:12,733
we can apply natural language
processing to this email

24
00:01:12,733 --> 00:01:14,233
in the next couple of tutorials,

25
00:01:14,233 --> 00:01:17,566
and it will help us work
with a real life example.

26
00:01:18,000 --> 00:01:23,600
And then the other thing is that here,
you can see, in Google,

27
00:01:23,966 --> 00:01:28,366
the Gmail app for iPhone,

28
00:01:28,366 --> 00:01:31,366
you can see that
it's giving me some suggestions.

29
00:01:31,733 --> 00:01:32,866
Very interesting.

30
00:01:32,866 --> 00:01:36,233
It's already offering
some quick replies that I can use.

31
00:01:36,233 --> 00:01:39,933
It can be "Yes, I'm around and back"
or "Sorry, I'm not". Very interesting.

32
00:01:39,933 --> 00:01:43,866
So let's keep that in mind
and we will come back to this later.

33
00:01:44,300 --> 00:01:46,500
In the meantime,
the text of the email is here.

34
00:01:46,500 --> 00:01:48,533
What can we do with it?

35
00:01:48,533 --> 00:01:48,900
All right.

36
00:01:48,900 --> 00:01:51,633
So first things first: we're going to start off
simple.

37
00:01:51,633 --> 00:01:53,700
We're going to create a model.

38
00:01:53,700 --> 00:01:54,166
We're going to

39
00:01:54,166 --> 00:01:59,033
look at how we can create a model
that will give us a yes/no response.

40
00:01:59,033 --> 00:02:00,866
Because that's one of those questions.

41
00:02:00,866 --> 00:02:03,600
The question is:
are you back in Australia?

42
00:02:03,600 --> 00:02:05,833
Let me know if you're around
and keen to sync. So, yes or no.

43
00:02:05,833 --> 00:02:09,100
Of course,
it's better to have a long response.

44
00:02:09,100 --> 00:02:12,000
And that's the social norm.

45
00:02:12,000 --> 00:02:17,233
And it's the etiquette
to, like, converse with people,

46
00:02:17,233 --> 00:02:21,033
not just say yes or no. But even so, let's
try to get a yes/no response.

47
00:02:21,033 --> 00:02:22,500
Let's see how we would go about that.

48
00:02:22,500 --> 00:02:24,766
Because that's the first step into NLP.

49
00:02:24,766 --> 00:02:28,700
And then further on we will see how
we can expand that even more.

50
00:02:29,500 --> 00:02:29,866
All right.

51
00:02:29,866 --> 00:02:35,000
So we're going to start off
with a vector, or

52
00:02:35,400 --> 00:02:40,200
just an array, full of zeros.

53
00:02:40,266 --> 00:02:42,266
Yeah, let's call it a vector.
So it looks like that.

54
00:02:42,266 --> 00:02:45,266
So just 0, 0, 0, 0. How many zeros?

55
00:02:45,433 --> 00:02:47,400
Well, a lot of zeros.

56
00:02:47,400 --> 00:02:50,466
20,000 elements in total.

57
00:02:50,466 --> 00:02:52,266
20,000. Why is that?

58
00:02:52,266 --> 00:02:55,566
Well, it's because of the way
that we're building it.

59
00:02:55,900 --> 00:02:58,966
20,000 is the number of words

60
00:02:59,400 --> 00:03:04,566
that are commonly used by the average
native English speaker.

61
00:03:04,566 --> 00:03:08,733
So here's a quick search on Google:
"how many words in the English".

62
00:03:09,000 --> 00:03:10,366
So that's the search I typed.

63
00:03:10,366 --> 00:03:13,700
It came up with: how many words are there
in the English language?

64
00:03:13,866 --> 00:03:16,800
171,476 words.

65
00:03:16,800 --> 00:03:20,266
That's how many entries in the Oxford
Dictionary, plus some obsolete words,

66
00:03:20,733 --> 00:03:22,500
plus derivative words.

67
00:03:22,500 --> 00:03:24,200
Yeah. And so on. But also,

68
00:03:25,500 --> 00:03:28,600
under "People also ask", you can see Google's
giving us a suggestion

69
00:03:28,966 --> 00:03:31,966
that most adult native test takers

70
00:03:31,966 --> 00:03:35,233
range from 20,000 to 35,000 words.

71
00:03:35,300 --> 00:03:39,400
Average native test takers of age
eight already know 10,000 words. Average

72
00:03:39,400 --> 00:03:42,900
native test takers of age four already
know 5,000 words.

73
00:03:43,133 --> 00:03:47,000
And adult native test takers learn almost...
whatever.

74
00:03:47,866 --> 00:03:50,866
the science is going into so much detail.

75
00:03:51,466 --> 00:03:54,466
But the interesting thing here is that,

76
00:03:56,133 --> 00:04:00,000
what I wanted to point out is,
first of all, 20,000.

77
00:04:00,000 --> 00:04:02,600
And you will see why exactly
we use this number, not more.

78
00:04:02,600 --> 00:04:08,033
The other thing I wanted to point out is "how many
words are there in the English language".

79
00:04:08,033 --> 00:04:12,300
Even this on its own is actually Google
applying natural language processing.

80
00:04:12,300 --> 00:04:16,233
It's looking at what we wrote
and then it's also

81
00:04:16,800 --> 00:04:19,000
checking other similar questions.

82
00:04:19,000 --> 00:04:22,000
How many words in the English language
does that other person,

83
00:04:22,133 --> 00:04:24,566
the average person, know?
So that's not the question I asked.

84
00:04:24,566 --> 00:04:27,500
But it came up with that. 
Then it came up with many other questions.

85
00:04:27,500 --> 00:04:31,966
So you can see that the irony is that
even in this search on its own,

86
00:04:32,366 --> 00:04:36,266
we're already falling
victim to natural language processing.

87
00:04:37,000 --> 00:04:38,400
Even though that wasn't our intention,

88
00:04:38,400 --> 00:04:40,033
that's not
what we're going to be talking about.

89
00:04:40,033 --> 00:04:42,466
But it's just funny
that it came up anyway.

90
00:04:42,466 --> 00:04:45,533
So 20,000 words. And a fun fact

91
00:04:46,166 --> 00:04:49,066
is that we actually use

92
00:04:49,066 --> 00:04:51,900
about 3,000 words.

93
00:04:51,900 --> 00:04:56,633
Out of those 171,476 words,
we only use 3,000 words,

94
00:04:56,633 --> 00:05:01,100
not just in conversational language,
but you can see here,

95
00:05:01,333 --> 00:05:04,300
a vocabulary of just 3000 words
provides coverage

96
00:05:04,300 --> 00:05:07,300
for around 95% of common texts,

97
00:05:07,666 --> 00:05:11,166
95% of common texts.

98
00:05:11,166 --> 00:05:13,733
I'm assuming that's
including books and stuff like that.

99
00:05:13,733 --> 00:05:18,400
So if you do the math,
we only use 1.75%

100
00:05:18,400 --> 00:05:21,433
of the total number of words
in the English language?
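
The percentage above can be checked with a few lines of Python. This is only a sketch of the arithmetic using the figures quoted from the search results; the constant and function names are made up for illustration.

```python
# Figures quoted in the video's Google search results.
TOTAL_ENGLISH_WORDS = 171_476  # entries cited for the Oxford dictionary
COMMON_WORDS = 3_000           # words said to cover about 95% of common texts
VECTOR_SIZE = 20_000           # vocabulary size chosen for the vector


def share_of_language(vocab_size: int, total: int = TOTAL_ENGLISH_WORDS) -> float:
    """Return vocab_size as a percentage of the total word count."""
    return 100.0 * vocab_size / total


common_share = share_of_language(COMMON_WORDS)
vector_share = share_of_language(VECTOR_SIZE)

print(f"3,000 words are {common_share:.2f}% of the total")   # about 1.75%
print(f"20,000 words are {vector_share:.2f}% of the total")
```

So the 20,000-element vector is several times larger than the 3,000-word core, yet still a small fraction of the full dictionary.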

101
00:05:21,733 --> 00:05:26,100
So as you can see,
our 20,000

102
00:05:26,100 --> 00:05:31,566
is more than even the 3000 that covers
95% of situations.

103
00:05:31,566 --> 00:05:33,133
So we're pretty good.

104
00:05:33,133 --> 00:05:37,766
We're definitely covered
if we say that our vocabulary,

105
00:05:38,000 --> 00:05:43,866
all possible words that we can encounter
is going to fit into a vector of 20,000.

106
00:05:43,866 --> 00:05:46,866
So basically, what we're saying...
this is important.

107
00:05:47,166 --> 00:05:47,666
What we're saying

108
00:05:47,666 --> 00:05:52,666
is that every word in the English language
has a position somewhere on this vector.

109
00:05:52,666 --> 00:05:55,833
So for example,
the word "if" could have this position.

110
00:05:55,833 --> 00:06:00,866
So if you count 1, 2, 3, 4, 5, 6,
the seventh position in our custom-made

111
00:06:00,933 --> 00:06:05,966
vector is that word, and it's
always going to be in that position.

112
00:06:05,966 --> 00:06:07,933
That's very crucial for this.

113
00:06:07,933 --> 00:06:10,900
For instance the word badminton
let's just say like that

114
00:06:10,900 --> 00:06:13,166
we can construct this vector
any way we want.

115
00:06:13,166 --> 00:06:15,000
The word
badminton could be on this position.

116
00:06:15,000 --> 00:06:16,633
It's always going to be on this position.

117
00:06:16,633 --> 00:06:18,433
And the word table
is going to be on this position.

118
00:06:18,433 --> 00:06:21,333
And this is like how this bag of words
model works.

119
00:06:21,333 --> 00:06:26,333
So just keep in mind that
once we've taken all 20,000 words,

120
00:06:27,133 --> 00:06:30,300
and then we've assigned each of them a position,
that's the word

121
00:06:30,800 --> 00:06:34,600
that this position
in the vector will always be associated with.

122
00:06:34,600 --> 00:06:35,666
This one will be associated with the word "if".

123
00:06:35,666 --> 00:06:38,933
This one will be associated
with the word "badminton", and

124
00:06:38,933 --> 00:06:41,000
this position will be associated
with the word table.

125
00:06:42,700 --> 00:06:44,333
And the other thing is,

126
00:06:44,333 --> 00:06:47,333
here you can see I've grayed out
the first two and the last one,

127
00:06:47,333 --> 00:06:51,200
The first two are going to be reserved
for SOS and EOS.

128
00:06:51,200 --> 00:06:55,266
SOS stands for start of sentence,
EOS stands for end of sentence.

129
00:06:55,766 --> 00:06:58,733
And the last one will be reserved
for special words.

130
00:06:58,733 --> 00:07:01,366
And that's for those words
that you're wondering about.

131
00:07:01,366 --> 00:07:04,200
I can
hear your brain churning right now.

132
00:07:04,200 --> 00:07:09,400
What about those other 150,000 words
that we didn't take into account?

133
00:07:09,400 --> 00:07:10,466
What if they come up?

134
00:07:10,466 --> 00:07:13,266
Well, if they come up,
we're going to just associate them

135
00:07:13,266 --> 00:07:16,766
with this last thing,
this last element.

136
00:07:16,766 --> 00:07:18,033
We can just throw them all in there.

137
00:07:18,033 --> 00:07:21,033
Any kind of words
that we can't recognize among the 20,000,

138
00:07:21,100 --> 00:07:24,000
we can throw
them into that last element.
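
The layout described so far (a fixed position per word, the first two positions reserved for SOS and EOS, the last one for unknown words) can be sketched in Python roughly like this. The token names `<SOS>`, `<EOS>`, `<UNK>` and the helper functions are assumptions made for illustration; the video does not show any code.

```python
# Hypothetical marker names; the video only calls them SOS, EOS and "special words".
SOS, EOS, UNK = "<SOS>", "<EOS>", "<UNK>"


def build_vocab(words, size):
    """Give every word a fixed position in a vector of `size` elements.

    Positions 0 and 1 are reserved for SOS and EOS,
    and the last position collects every unknown word.
    """
    capacity = size - 3                           # slots left for ordinary words
    kept = list(dict.fromkeys(words))[:capacity]  # drop duplicates, keep order
    index = {SOS: 0, EOS: 1}
    for offset, word in enumerate(kept):
        index[word] = 2 + offset
    index[UNK] = size - 1
    return index


def position_of(word, index):
    """Look up a word's position; unknown words fall into the last slot."""
    return index.get(word, index[UNK])


vocab = build_vocab(
    ["hello", ",", "checking", "if", "you", "badminton", "table"],
    size=20_000,
)
print(position_of("if", vocab))         # same position every time it is asked
print(position_of("badminton", vocab))
print(position_of("oz", vocab))         # not in the vocabulary: last position
```

The important property is that the mapping never changes after it has been built, so a given position always means the same word.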

139
00:07:25,100 --> 00:07:25,433
All right.

140
00:07:25,433 --> 00:07:27,600
So let's go back to our email text.

141
00:07:27,600 --> 00:07:30,233
Here it is. Hello, Carol. Checking
if you're back in Oz.

142
00:07:30,233 --> 00:07:33,000
Let me know if you are around
etc., etc., etc.

143
00:07:33,000 --> 00:07:33,833
Cheers,

144
00:07:33,833 --> 00:07:37,633
V. And so let's see

145
00:07:37,633 --> 00:07:41,833
how this can be put into our bag of words.

146
00:07:41,833 --> 00:07:43,566
You've probably noticed by now

147
00:07:43,566 --> 00:07:46,566
that this is our bag of words
that we're constructing here.

148
00:07:46,566 --> 00:07:50,433
So now we're going to throw the text
into this bag of words.

149
00:07:50,933 --> 00:07:51,700
How's that going to happen?

150
00:07:51,700 --> 00:07:54,933
I'm just going to throw it in and then
I'll just I'll explain how it happens.

151
00:07:54,933 --> 00:07:57,933
So there it is. That's the result.

152
00:07:58,100 --> 00:07:59,900
That's it.

153
00:07:59,900 --> 00:08:02,133
It all, of course,
depends on how we construct our vector.

154
00:08:02,133 --> 00:08:05,100
But this is our result
in the way we construct our vector.

155
00:08:05,100 --> 00:08:07,566
And let's look at this.

156
00:08:07,566 --> 00:08:12,266
So, as we've discussed previously,
we took the 20,000 words

157
00:08:12,266 --> 00:08:15,266
and we associated each position
with a word.

158
00:08:15,300 --> 00:08:19,366
And now we go through our text, and for each word we

159
00:08:19,366 --> 00:08:23,700
increase the counter
in the position associated with that word.

160
00:08:23,700 --> 00:08:29,133
So "hello", let's say, in our vector
is in position number five.

161
00:08:29,400 --> 00:08:30,966
Because we only have one "hello"

162
00:08:30,966 --> 00:08:34,600
in this whole email,
we're going to put a one here.

163
00:08:35,000 --> 00:08:38,000
"Carol" is definitely not
an English language word.

164
00:08:38,033 --> 00:08:40,833
So we're going to have to put it
into there.

165
00:08:40,833 --> 00:08:44,733
And the reason why there's three here
is because we have "Carol",

166
00:08:45,333 --> 00:08:47,900
then also "Oz" and "V".

167
00:08:47,900 --> 00:08:51,233
Those are non-English language words,
not among those 20,000.

168
00:08:51,266 --> 00:08:52,466
They're all going to go here.

169
00:08:53,700 --> 00:08:56,400
Then we've got the comma. Surprise:

170
00:08:56,400 --> 00:08:57,966
The comma also has a position.

171
00:08:57,966 --> 00:08:59,966
Let's say it was in position number...

172
00:08:59,966 --> 00:09:02,333
So 3... 6, 7, 8, 9.

173
00:09:02,333 --> 00:09:04,733
So the ninth position is associated
with a comma

174
00:09:04,733 --> 00:09:06,733
because we have one comma in our email.

175
00:09:06,733 --> 00:09:09,600
Oh, actually we have two commas. OK,
so this should be a two.

176
00:09:09,600 --> 00:09:11,333
But let's not think about that.

177
00:09:11,333 --> 00:09:14,100
Let's forget about that comma.

178
00:09:14,100 --> 00:09:14,933
I didn't notice it.

179
00:09:14,933 --> 00:09:19,566
So assuming we have one comma
in our email, this is a one. "Checking":

180
00:09:19,966 --> 00:09:24,600
let's say that this
element is associated with "checking".

181
00:09:24,600 --> 00:09:25,933
This is a one because there's only one

182
00:09:25,933 --> 00:09:31,333
"checking". "If" is a two
because we have two "if"s in our email.

183
00:09:31,633 --> 00:09:35,033
So it's going to be a two. "You" is a two
because we have two

184
00:09:35,033 --> 00:09:38,300
"you"s in our email, including
the rest of the text.

185
00:09:38,300 --> 00:09:40,433
I don't think there are any more
"you"s in there. And so on.

186
00:09:40,433 --> 00:09:44,100
So that's
basically how we fill this bag of words.

187
00:09:44,100 --> 00:09:47,100
We just put in the

188
00:09:47,100 --> 00:09:50,300
count of words for every position.
Pretty straightforward.

189
00:09:50,300 --> 00:09:51,366
We're just,

190
00:09:51,366 --> 00:09:55,033
filling in this vector. As you can
see, it's going to be quite a sparse vector.

191
00:09:55,033 --> 00:09:58,466
There are going to be lots of zeros,
almost 20,000 zeros.

192
00:09:58,466 --> 00:10:00,066
And only some of the positions
are going to be filled in.
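
The counting step can be sketched as follows. The small word list, the regular-expression tokenizer (which splits "you're" into "you" and "'re") and the shortened email text are all assumptions chosen to keep the example tiny; a real vocabulary would fill most of the 20,000 positions.

```python
import re

VECTOR_SIZE = 20_000

# A tiny illustrative vocabulary. Positions 0 and 1 stay reserved for SOS/EOS,
# ordinary words start at position 2, and the last position collects unknowns.
KNOWN_WORDS = ["hello", ",", ".", "checking", "if", "you", "'re", "back",
               "in", "let", "me", "know", "around", "cheers"]
INDEX = {word: position for position, word in enumerate(KNOWN_WORDS, start=2)}
UNKNOWN_POSITION = VECTOR_SIZE - 1


def tokenize(text):
    """Split lowercased text into contraction suffixes, words and punctuation."""
    return re.findall(r"'[a-z]+|[a-z]+|[^\w\s]", text.lower())


def bag_of_words(text):
    """Count how often each vocabulary word occurs in the text."""
    vector = [0] * VECTOR_SIZE
    for token in tokenize(text):
        vector[INDEX.get(token, UNKNOWN_POSITION)] += 1
    return vector


email = ("Hello Carol, checking if you're back in Oz. "
         "Let me know if you're around. Cheers, V")
vector = bag_of_words(email)

print(vector[INDEX["if"]])       # "if" appears twice
print(vector[UNKNOWN_POSITION])  # "Carol", "Oz" and "V" share the last slot
print(sum(1 for count in vector if count))  # only a handful of non-zero entries
```

Word order is lost entirely: the vector only records how many times each word occurred, which is where the "bag" in the name comes from.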

193
00:10:01,533 --> 00:10:02,266
Okay.

194
00:10:02,266 --> 00:10:03,600
So what is our goal?

195
00:10:03,600 --> 00:10:07,333
So our goal as we discussed
before is to come up with a reply

196
00:10:07,333 --> 00:10:12,433
yes or no to this email,
which is now in the form of a vector.

197
00:10:12,933 --> 00:10:14,266
And how are we going to do that?

198
00:10:14,266 --> 00:10:16,433
Well, we're going to do it
through training data.

199
00:10:16,433 --> 00:10:18,366
So we're going to look at all
of the emails

200
00:10:18,366 --> 00:10:22,133
that I have replied to,
because this is us training a model

201
00:10:22,333 --> 00:10:26,266
to reply to my emails
or in your case, in anybody's

202
00:10:26,266 --> 00:10:29,466
case, it's going to be training the model
to reply to their emails.

203
00:10:29,766 --> 00:10:32,000
We're going to look at training data.
We're going to need some training data.

204
00:10:32,000 --> 00:10:35,000
I'm going to fish it
out of the inbox or outbox.

205
00:10:35,233 --> 00:10:37,500
So let's look at a couple.

206
00:10:37,500 --> 00:10:39,266
So here we've got: Hey mate,

207
00:10:39,266 --> 00:10:44,300
have you read about Hinton's capsule
networks? And the general reply to that: no.

208
00:10:45,066 --> 00:10:47,733
So we're going to use that
as a training example.

209
00:10:47,733 --> 00:10:48,166
Next one.

210
00:10:48,166 --> 00:10:50,300
Did you like that recipe
I sent you last week?

211
00:10:50,300 --> 00:10:52,833
The result? The answer was yes.

212
00:10:52,833 --> 00:10:54,000
It was a good recipe, I guess.

213
00:10:54,000 --> 00:10:57,366
So there we go. So now we have two. Three.

214
00:10:57,666 --> 00:10:59,666
Hi, Carol.
Are you coming to dinner tonight?

215
00:10:59,666 --> 00:11:01,066
Yes. Dear

216
00:11:01,066 --> 00:11:04,100
Carol, would you like to service your car
with us again? No.

217
00:11:04,700 --> 00:11:07,666
Are you coming to Australia
in December? Yes.

218
00:11:07,666 --> 00:11:08,400
And so on.

219
00:11:08,400 --> 00:11:11,766
So ideally we would have tens or hundreds

220
00:11:11,766 --> 00:11:14,833
of thousands of emails
like that and responses like that,

221
00:11:14,833 --> 00:11:16,000
yes/no responses.

222
00:11:16,000 --> 00:11:19,333
That, of course, would be
a lot of groundwork to get that data

223
00:11:19,333 --> 00:11:22,333
because we usually don't just respond
yes or no to emails.

224
00:11:22,333 --> 00:11:27,666
So we'd have to look at this answer
and understand what was the sentiment.

225
00:11:27,700 --> 00:11:30,700
Was the sentiment a no?
What was it overall?

226
00:11:30,900 --> 00:11:33,133
Was it a
yes or a no? And so on.

227
00:11:34,233 --> 00:11:36,833
Of course, it's kind of more of a
theoretical example.

228
00:11:36,833 --> 00:11:40,500
Nobody's going to do this for their own
inbox, but nevertheless the point stands.

229
00:11:40,866 --> 00:11:42,433
So how would we train?

230
00:11:42,433 --> 00:11:44,466
How would we use this training data?

231
00:11:44,466 --> 00:11:47,466
We would use a similar principle
and convert each one of those emails

232
00:11:47,466 --> 00:11:48,533
to a vector.

233
00:11:48,533 --> 00:11:53,900
And again,
each vector would be 20,000 elements long.

234
00:11:53,900 --> 00:11:59,400
So yeah, I just threw some numbers in here
to get the point across.

235
00:11:59,666 --> 00:12:00,766
It's not exactly accurate.

236
00:12:00,766 --> 00:12:04,533
But so we have these vectors
like lots and lots, lots of vectors.

237
00:12:04,766 --> 00:12:06,333
Lots and lots and lots of responses.

238
00:12:06,333 --> 00:12:07,933
Yes and no.

239
00:12:07,933 --> 00:12:09,033
And yeah.

240
00:12:09,033 --> 00:12:12,100
So now what we're going to do is we're

241
00:12:12,100 --> 00:12:15,100
going to,

242
00:12:15,700 --> 00:12:16,766
apply a model.

243
00:12:16,766 --> 00:12:18,600
Once we have all this data
we're going to apply a model.

244
00:12:18,600 --> 00:12:22,033
So one of the models
we can apply to create our bag of words,

245
00:12:22,766 --> 00:12:26,266
or one of the algorithms
we can apply to create our bag of words

246
00:12:26,266 --> 00:12:28,800
model is logistic regression.

247
00:12:28,800 --> 00:12:31,600
So we apply logistic regression
to our yes/

248
00:12:31,600 --> 00:12:35,166
no responses, to
this information that we have.

249
00:12:35,666 --> 00:12:37,966
And then,

250
00:12:39,100 --> 00:12:42,100
once we have that model, once
we've separated the two classes,

251
00:12:42,300 --> 00:12:45,033
so that we've modeled

252
00:12:45,033 --> 00:12:48,033
what goes into a yes,

253
00:12:48,166 --> 00:12:49,733
that is,

254
00:12:49,733 --> 00:12:53,100
what is likely to yield a yes,
what is likely to yield a no,

255
00:12:53,100 --> 00:12:56,133
and the border between them,

256
00:12:56,366 --> 00:13:02,600
then we can feed the actual email
that we got into this model

257
00:13:03,100 --> 00:13:06,066
and then get a response,
for instance, yes.

258
00:13:06,066 --> 00:13:06,700
And that's it.
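
Here is a minimal sketch of that train-then-predict loop, written in plain Python so it runs without extra libraries. It uses the five example emails mentioned above as training data and a tiny vocabulary built from them instead of the full 20,000 words; the tokenizer, learning rate and number of epochs are arbitrary choices for illustration, not values from the course.

```python
import math
import re

# The example emails from the video with their yes (1) / no (0) replies.
TRAINING = [
    ("Hey mate, have you read about Hinton's capsule networks?", 0),
    ("Did you like that recipe I sent you last week?", 1),
    ("Hi Carol, are you coming to dinner tonight?", 1),
    ("Dear Carol, would you like to service your car with us again?", 0),
    ("Are you coming to Australia in December?", 1),
]


def tokenize(text):
    return re.findall(r"[a-z]+|[^\w\s]", text.lower())


# Fixed word positions, built once from the training emails.
vocab = {}
for text, _ in TRAINING:
    for token in tokenize(text):
        vocab.setdefault(token, len(vocab))
UNKNOWN = len(vocab)   # last position collects unseen words
SIZE = UNKNOWN + 1


def vectorize(text):
    vector = [0.0] * SIZE
    for token in tokenize(text):
        vector[vocab.get(token, UNKNOWN)] += 1.0
    return vector


def predict_proba(weights, bias, x):
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, epochs=500, lr=0.1):
    """Fit logistic regression with per-sample gradient descent."""
    weights, bias = [0.0] * SIZE, 0.0
    for _ in range(epochs):
        for x, y in samples:
            error = predict_proba(weights, bias, x) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


samples = [(vectorize(text), label) for text, label in TRAINING]
weights, bias = train(samples)

new_email = "Hello Carol, checking if you're back in Oz."
p_yes = predict_proba(weights, bias, vectorize(new_email))
print("yes" if p_yes >= 0.5 else "no", round(p_yes, 3))
```

With only five training emails the prediction for the new email is not meaningful; the point is that the new email goes through exactly the same `vectorize` step as the training emails before it reaches the model.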

259
00:13:06,700 --> 00:13:09,200
So we use all the training data
to create a model.

260
00:13:09,200 --> 00:13:14,200
We feed in our actual email,
which, and this is important,

261
00:13:14,466 --> 00:13:16,066
which has exactly the same format.

262
00:13:16,066 --> 00:13:20,933
So you can see that for every input here,

263
00:13:20,933 --> 00:13:23,300
every time we were training the model,

264
00:13:23,300 --> 00:13:28,166
the independent variable

265
00:13:28,166 --> 00:13:31,933
vector always had the same length, 20,000,
and always had the same format.

266
00:13:31,933 --> 00:13:35,466
So we know that this position
always corresponds to a certain word.

267
00:13:35,933 --> 00:13:37,633
This position is always a certain word.

268
00:13:39,066 --> 00:13:41,033
This position, let's say... one, two, three...

269
00:13:41,033 --> 00:13:44,200
Where was it? 1, 2, 3, 4, 5, 6, 7.

270
00:13:44,500 --> 00:13:45,933
Right.

271
00:13:45,933 --> 00:13:48,700
So this was one... No,
this one is the "if", right?

272
00:13:48,700 --> 00:13:51,266
So this corresponds to "if"
or something like that.

273
00:13:51,266 --> 00:13:53,300
So we know that it's
the same format.

274
00:13:53,300 --> 00:13:55,000
It's always the same length 20,000.

275
00:13:55,000 --> 00:13:58,000
So we can safely feed this vector
into there.

276
00:13:58,300 --> 00:14:00,200
It's got the same number of features.

277
00:14:00,200 --> 00:14:01,566
And we get an answer.

278
00:14:01,566 --> 00:14:03,033
So for instance we get yes.

279
00:14:03,033 --> 00:14:04,933
And then we can look back:

280
00:14:04,933 --> 00:14:07,500
Oh, what did the actual email say?
It said, "Hello Carol, checking..."

281
00:14:07,500 --> 00:14:08,133
Oh okay.

282
00:14:08,133 --> 00:14:11,133
So based on my training, I would have,

283
00:14:11,500 --> 00:14:14,466
most likely replied to this with a yes.

284
00:14:14,466 --> 00:14:15,433
Interesting.

285
00:14:15,433 --> 00:14:18,200
The other approach that we can take here...
well, first of all, let's

286
00:14:18,200 --> 00:14:20,533
put this on our diagram.
There's our diagram.

287
00:14:20,533 --> 00:14:24,866
And the natural language processing
algorithm which is called Bag of Words

288
00:14:25,500 --> 00:14:26,633
sits over there.

289
00:14:26,633 --> 00:14:29,833
The other approach
that we could take here is that,

290
00:14:29,833 --> 00:14:35,466
instead of logistic regression,
we could use a neural network.

291
00:14:35,800 --> 00:14:38,000
We could, because we have a vector, right?

292
00:14:38,000 --> 00:14:42,233
So we have all these vectors,
and we could feed them in as an input layer

293
00:14:42,233 --> 00:14:46,766
of, like, 20,000 neurons
into our neural network.

294
00:14:46,833 --> 00:14:47,533
They would go through

295
00:14:47,533 --> 00:14:50,533
one hidden layer, two hidden layers,
as many hidden layers as we want;

296
00:14:50,733 --> 00:14:53,600
it's our own decision how to structure it.

297
00:14:53,600 --> 00:14:56,833
And then bam, we've got an output
layer that tells us yes or no.

298
00:14:56,833 --> 00:14:58,633
And so again, we'd use all this data

299
00:14:58,633 --> 00:15:03,000
that we have here, all our millions
and millions and millions of emails

300
00:15:03,000 --> 00:15:06,900
and responses, we would use that
to train our neural network,

301
00:15:07,233 --> 00:15:11,466
all through backpropagation and
stochastic gradient descent.

302
00:15:11,466 --> 00:15:14,533
All the weights would be updated and bam,
we have an answer.

303
00:15:14,533 --> 00:15:15,900
Sorry, not "bam, we have an answer".

304
00:15:15,900 --> 00:15:18,000
So we would use these answers here

305
00:15:18,000 --> 00:15:22,100
to train that. We would use the pairs:
vector and answer, vector and answer.

306
00:15:22,100 --> 00:15:23,533
So, to minimize the error:

307
00:15:23,533 --> 00:15:27,100
stochastic gradient descent,
backpropagation, updated weights. Bam.

308
00:15:27,100 --> 00:15:28,133
We have a neural network.

309
00:15:28,133 --> 00:15:32,333
It's all trained up.
Now we feed in our vector here,

310
00:15:32,366 --> 00:15:35,366
which represents our new email
into the neural network.

311
00:15:35,366 --> 00:15:38,300
And voila, we get our answer.
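
To make the "vector in, yes/no out" idea concrete, here is a very small neural network in plain Python, trained with per-sample (stochastic) gradient descent and backpropagation. Everything specific in it is an assumption for illustration: the vectors have 6 positions instead of 20,000, the counts are invented, and the layer size, activation functions and learning rate are arbitrary.

```python
import math
import random

random.seed(0)

# Toy bag-of-words count vectors (6 positions instead of 20,000), each with
# a yes (1) / no (0) label. The numbers are invented for illustration.
SAMPLES = [
    ([1, 0, 2, 0, 0, 1], 1),
    ([0, 1, 0, 1, 0, 0], 0),
    ([1, 0, 1, 0, 1, 0], 1),
    ([0, 2, 0, 1, 0, 1], 0),
]
N_INPUT, N_HIDDEN = 6, 4

# Small random starting weights: one hidden layer, one output neuron.
w1 = [[random.uniform(-0.5, 0.5) for _ in range(N_INPUT)] for _ in range(N_HIDDEN)]
b1 = [0.0] * N_HIDDEN
w2 = [random.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
b2 = 0.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x):
    """Return hidden activations and the predicted probability of 'yes'."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    output = sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)
    return hidden, output


def mean_loss():
    """Average cross-entropy error over the training samples."""
    total = 0.0
    for x, y in SAMPLES:
        _, p = forward(x)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(SAMPLES)


def train(epochs=200, lr=0.1):
    """One weight update per training pair (stochastic gradient descent)."""
    global b2
    for _ in range(epochs):
        for x, y in SAMPLES:
            hidden, p = forward(x)
            d_out = p - y  # error signal at the output neuron
            for j in range(N_HIDDEN):
                # Backpropagate through tanh before w2[j] is changed.
                d_hidden = d_out * w2[j] * (1.0 - hidden[j] ** 2)
                w2[j] -= lr * d_out * hidden[j]
                for i in range(N_INPUT):
                    w1[j][i] -= lr * d_hidden * x[i]
                b1[j] -= lr * d_hidden
            b2 -= lr * d_out


initial_loss = mean_loss()
train()
final_loss = mean_loss()

new_vector = [1, 0, 1, 0, 0, 0]  # a new email in the same 6-position format
_, p_yes = forward(new_vector)
print(round(initial_loss, 3), round(final_loss, 3), "yes" if p_yes >= 0.5 else "no")
```

The only structural difference from the logistic regression case is the hidden layer between the input vector and the yes/no output; the input format stays exactly the same.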

312
00:15:38,300 --> 00:15:43,100
And in this case, it might also be yes.
The two models might yield different results.

313
00:15:43,100 --> 00:15:47,066
But if the models are constructed
well, more or less,

314
00:15:47,066 --> 00:15:52,000
they should be coming up with similar
or the same answers most of the time.

315
00:15:52,500 --> 00:15:55,200
And so in this case, we've got a deep
natural language

316
00:15:55,200 --> 00:15:58,200
processing algorithm going on. Note the
emphasis right there.

317
00:15:58,200 --> 00:16:01,133
We've got a deep natural language
processing algorithm.

318
00:16:01,133 --> 00:16:03,566
Right.
Because we're using a neural network.

319
00:16:03,566 --> 00:16:05,533
And that is different.

320
00:16:05,533 --> 00:16:10,766
So in both cases it's the bag of words model:
in one case it's an NLP bag of words,

321
00:16:10,766 --> 00:16:14,033
in the other case it's a deep NLP bag of words.

322
00:16:14,866 --> 00:16:17,500
But in both cases
it is still a bag of words.
it is still a bag of words.

323
00:16:17,500 --> 00:16:21,200
And it has its own limitations
and it has its own.

324
00:16:22,233 --> 00:16:22,600
yeah,

325
00:16:22,600 --> 00:16:26,100
limitations
and issues that are not that great.

326
00:16:26,100 --> 00:16:29,766
And I'll point out one of them right
now: the response is very simple.

327
00:16:29,766 --> 00:16:31,933
It's just a yes or no, right?

328
00:16:31,933 --> 00:16:33,966
Like we want something more sophisticated.

329
00:16:33,966 --> 00:16:35,433
We want like a conversation.

330
00:16:35,433 --> 00:16:37,566
You can't really have a conversation,
can't really build a chatbot,

331
00:16:37,566 --> 00:16:39,300
if you're just going to be saying yes or
no all the time.

332
00:16:39,300 --> 00:16:41,033
So that's one of the limitations.

333
00:16:41,033 --> 00:16:43,033
We'll talk about some more of them

334
00:16:43,033 --> 00:16:45,066
in the upcoming tutorials.

335
00:16:45,066 --> 00:16:48,466
And we'll also see how to overcome
those limitations

336
00:16:48,466 --> 00:16:51,800
and what models await us in the future.

337
00:16:52,200 --> 00:16:53,866
And I hope you enjoyed this tutorial.

338
00:16:53,866 --> 00:16:56,833
I really enjoyed going through all of this
with you together,

339
00:16:56,833 --> 00:16:59,166
and I can't wait to see you next time.

340
00:16:59,166 --> 00:17:02,766
Until then, enjoy
natural language processing.