1 00:00:00,866 --> 00:00:01,133 Hello and 2 00:00:01,133 --> 00:00:04,133 welcome back to the course on Deep Natural Language Processing. 3 00:00:04,166 --> 00:00:07,166 Today we're looking at the Bag of Words model. 4 00:00:07,300 --> 00:00:10,166 The first thing I'd like us to look at is an email. 5 00:00:10,166 --> 00:00:12,866 An email I received just a few days ago. 6 00:00:12,866 --> 00:00:13,833 So here we go. 7 00:00:13,833 --> 00:00:18,633 The email is about a catch-up, and my friend is asking: "Hello, Kirill. 8 00:00:18,633 --> 00:00:21,600 Checking if you're back in Oz." Oz stands for Australia. 9 00:00:21,600 --> 00:00:25,500 "Let me know if you're around and keen to sync on how things are going. 10 00:00:25,600 --> 00:00:27,933 I defo", as in definitely, 11 00:00:27,933 --> 00:00:30,733 "could use some of your creative thinking to help with mine. 12 00:00:30,733 --> 00:00:32,166 Cheers, V." 13 00:00:32,166 --> 00:00:36,600 And so, here's what I'd like us to pay attention to. 14 00:00:36,600 --> 00:00:40,133 First of all, of course, you can see that I sent this email to myself, but 15 00:00:40,700 --> 00:00:43,633 that's just because I wanted to keep my friend... 16 00:00:45,900 --> 00:00:49,066 Actually, it's because I had already replied to the email, and then I wanted to resend it. 17 00:00:49,066 --> 00:00:52,633 And also I wanted to protect my friend's privacy. 18 00:00:52,933 --> 00:00:54,233 But this is a real email. 19 00:00:54,233 --> 00:00:58,466 This is the exact text that I got literally a couple of days ago. 20 00:00:58,800 --> 00:01:01,566 The title was different, but 21 00:01:01,566 --> 00:01:04,566 I changed it to "Catch up". 
And so, 22 00:01:04,966 --> 00:01:08,933 what is interesting about this: we're going to be looking at how 23 00:01:08,933 --> 00:01:12,733 we can apply natural language processing to this email 24 00:01:12,733 --> 00:01:14,233 in the next couple of tutorials, 25 00:01:14,233 --> 00:01:17,566 and it will help us work with a real-life example. 26 00:01:18,000 --> 00:01:23,600 And then the other thing is that here you can see, in Google, 27 00:01:23,966 --> 00:01:28,366 in the Gmail app for iPhone, 28 00:01:28,366 --> 00:01:31,366 you can see that it's giving me some suggestions. 29 00:01:31,733 --> 00:01:32,866 Very interesting. 30 00:01:32,866 --> 00:01:36,233 It's already offering some quick replies that I can use. 31 00:01:36,233 --> 00:01:39,933 It can be "Yes, I'm around and back" or "Sorry, I'm not". Very interesting. 32 00:01:39,933 --> 00:01:43,866 So let's keep that in mind and we will come back to this later. 33 00:01:44,300 --> 00:01:46,500 In the meantime, the text of the email is here. 34 00:01:46,500 --> 00:01:48,533 What can we do with it? 35 00:01:48,533 --> 00:01:48,900 All right. 36 00:01:48,900 --> 00:01:51,633 So first, we're going to start off simple. 37 00:01:51,633 --> 00:01:53,700 We're going to create a model. 38 00:01:53,700 --> 00:01:54,166 We're going to 39 00:01:54,166 --> 00:01:59,033 look at how we can create a model that will give us a yes/no response. 40 00:01:59,033 --> 00:02:00,866 Because that's one of those questions. 41 00:02:00,866 --> 00:02:03,600 The question is: are you back in Australia? 42 00:02:03,600 --> 00:02:05,833 Let me know if you're around and keen to sync. So, yes or 43 00:02:05,833 --> 00:02:09,100 no. Of course, it's better to have a long response. 44 00:02:09,100 --> 00:02:12,000 That's the social norm. 
45 00:02:12,000 --> 00:02:17,233 And it's the etiquette to converse with people, 46 00:02:17,233 --> 00:02:21,033 not just say yes or no. But even so, let's try to get a yes/no response. 47 00:02:21,033 --> 00:02:22,500 Let's see how we would go about that, 48 00:02:22,500 --> 00:02:24,766 because that's the first step into NLP. 49 00:02:24,766 --> 00:02:28,700 And then further on we will see how we can expand that even more. 50 00:02:29,500 --> 00:02:29,866 All right. 51 00:02:29,866 --> 00:02:35,000 So we're going to start off with a vector, 52 00:02:35,400 --> 00:02:40,200 or just like an array, full of zeros. 53 00:02:40,266 --> 00:02:42,266 Yeah, let's call it a vector. So it looks like this. 54 00:02:42,266 --> 00:02:45,266 So just 0, 0, 0, 0. How many zeros? 55 00:02:45,433 --> 00:02:47,400 Well, a lot of zeros. 56 00:02:47,400 --> 00:02:50,466 20,000 elements in total. 57 00:02:50,466 --> 00:02:52,266 20,000. Why is that? 58 00:02:52,266 --> 00:02:55,566 Well, it's because of the way that we're building this. 59 00:02:55,900 --> 00:02:58,966 20,000 is the number of words 60 00:02:59,400 --> 00:03:04,566 that are commonly used by the average native English speaker. 61 00:03:04,566 --> 00:03:08,733 So here's a quick search on Google: "how many words in the English". 62 00:03:09,000 --> 00:03:10,366 So that's the search I typed. 63 00:03:10,366 --> 00:03:13,700 It came up with "How many words are there in the English language?" 64 00:03:13,866 --> 00:03:16,800 171,476 words. 65 00:03:16,800 --> 00:03:20,266 That's how many entries are in the Oxford Dictionary, plus some obsolete words, 66 00:03:20,733 --> 00:03:22,500 plus derivative words. 67 00:03:22,500 --> 00:03:24,200 Yeah, and so on. But also, 68 00:03:25,500 --> 00:03:28,600 you can see Google's giving us the suggestion 69 00:03:28,966 --> 00:03:31,966 that most adult native test takers 70 00:03:31,966 --> 00:03:35,233 range from 20,000 to 35,000 words. 
71 00:03:35,300 --> 00:03:39,400 Average native test takers of age eight already know 10,000 words; average 72 00:03:39,400 --> 00:03:42,900 native test takers of age four already know 5,000 words. 73 00:03:43,133 --> 00:03:47,000 And adult native test takers learn almost... whatever. 74 00:03:47,866 --> 00:03:50,866 The science is going into so much detail. 75 00:03:51,466 --> 00:03:54,466 But the interesting thing here is that, 76 00:03:56,133 --> 00:04:00,000 like, what I wanted to point out, first of all: 20,000. 77 00:04:00,000 --> 00:04:02,600 And you will see why exactly we use this number and not more. 78 00:04:02,600 --> 00:04:08,033 What I also wanted to point out is this: "how many words are there in the English language". 79 00:04:08,033 --> 00:04:12,300 Even this on its own is actually Google applying natural language processing. 80 00:04:12,300 --> 00:04:16,233 It's looking at what we wrote and then it's also 81 00:04:16,800 --> 00:04:19,000 checking other similar questions. 82 00:04:19,000 --> 00:04:22,000 "How many words in the English language does the 83 00:04:22,133 --> 00:04:24,566 average person know?" So that's not the question we asked, 84 00:04:24,566 --> 00:04:27,500 but it came up with that. Then it came up with many other questions. 85 00:04:27,500 --> 00:04:31,966 So you can see that the irony is that even in this search on its own, 86 00:04:32,366 --> 00:04:36,266 we're already falling victim to natural language processing, 87 00:04:37,000 --> 00:04:38,400 even though that wasn't our intention. 88 00:04:38,400 --> 00:04:40,033 That's not what we're going to be talking about, 89 00:04:40,033 --> 00:04:42,466 but it's just funny that it came up anyway. 
90 00:04:42,466 --> 00:04:45,533 So 20,000 words. And a fun fact 91 00:04:46,166 --> 00:04:49,066 is that we actually use 92 00:04:49,066 --> 00:04:51,900 about 3,000 words. 93 00:04:51,900 --> 00:04:56,633 Out of those 171,476 words, we only use 3,000 words, 94 00:04:56,633 --> 00:05:01,100 not just in conversational language, but you can see here: 95 00:05:01,333 --> 00:05:04,300 "a vocabulary of just 3,000 words provides coverage 96 00:05:04,300 --> 00:05:07,300 for around 95% of common texts". 97 00:05:07,666 --> 00:05:11,166 95% of common texts. 98 00:05:11,166 --> 00:05:13,733 I'm assuming that's including books and stuff like that. 99 00:05:13,733 --> 00:05:18,400 So if you do the math, we only use 1.75% 100 00:05:18,400 --> 00:05:21,433 of the total number of words in the English language. 101 00:05:21,733 --> 00:05:26,100 So as you can see, our 20,000 102 00:05:26,100 --> 00:05:31,566 is more than even the 3,000 that covers 95% of situations. 103 00:05:31,566 --> 00:05:33,133 So we're pretty good. 104 00:05:33,133 --> 00:05:37,766 We're definitely covered if we say that our vocabulary, 105 00:05:38,000 --> 00:05:43,866 all possible words that we can encounter, is going to fit into a vector of 20,000. 106 00:05:43,866 --> 00:05:46,866 So basically what we're saying, and this is important, 107 00:05:47,166 --> 00:05:47,666 what we're saying 108 00:05:47,666 --> 00:05:52,666 is that every word in the English language has a position somewhere on this vector. 109 00:05:52,666 --> 00:05:55,833 So for example, the word "if" could have this position. 110 00:05:55,833 --> 00:06:00,866 So if you count 1, 2, 3, 4, 5, 6, the seventh position in our 111 00:06:00,933 --> 00:06:05,966 custom-made vector. The word "if" is always going to be in that position. 112 00:06:05,966 --> 00:06:07,933 That's very crucial for this. 
113 00:06:07,933 --> 00:06:10,900 For instance, the word "badminton": let's just say, since 114 00:06:10,900 --> 00:06:13,166 we can construct this vector any way we want, 115 00:06:13,166 --> 00:06:15,000 the word "badminton" could be in this position. 116 00:06:15,000 --> 00:06:16,633 It's always going to be in this position. 117 00:06:16,633 --> 00:06:18,433 And the word "table" is going to be in this position. 118 00:06:18,433 --> 00:06:21,333 And this is how the bag of words model works. 119 00:06:21,333 --> 00:06:26,333 So just keep in mind that once we've taken all 20,000 words 120 00:06:27,133 --> 00:06:30,300 and assigned each of them a space, that's 121 00:06:30,800 --> 00:06:34,600 what this space in the vector will be associated with. 122 00:06:34,600 --> 00:06:35,666 It'll be associated with that word. 123 00:06:35,666 --> 00:06:38,933 So this position will be associated with the word "badminton", and 124 00:06:38,933 --> 00:06:41,000 this position will be associated with the word "table". 125 00:06:42,700 --> 00:06:44,333 And the other thing is, 126 00:06:44,333 --> 00:06:47,333 here you can see I've grayed out the first two and the last one. 127 00:06:47,333 --> 00:06:51,200 The first two are going to be reserved for SOS and EOS. 128 00:06:51,200 --> 00:06:55,266 SOS stands for start of sentence, EOS stands for end of sentence. 129 00:06:55,766 --> 00:06:58,733 And the last one will be reserved for special words. 130 00:06:58,733 --> 00:07:01,366 And that's for those words that you're wondering about. 131 00:07:01,366 --> 00:07:04,200 I can hear your brain churning right now: 132 00:07:04,200 --> 00:07:09,400 what about those other 150,000 words that we didn't take into account? 133 00:07:09,400 --> 00:07:10,466 What if they come up?
134 00:07:10,466 --> 00:07:13,266 Well, if they come up, we're going to just associate them 135 00:07:13,266 --> 00:07:16,766 with this last element. 136 00:07:16,766 --> 00:07:18,033 We can just throw them all in there. 137 00:07:18,033 --> 00:07:21,033 Any words that we can't recognize among the 20,000, 138 00:07:21,100 --> 00:07:24,000 we can just throw them into that last element. 139 00:07:25,100 --> 00:07:25,433 All right. 140 00:07:25,433 --> 00:07:27,600 So let's go back to our email text. 141 00:07:27,600 --> 00:07:30,233 Here it is. "Hello, Kirill. Checking if you're back in Oz. 142 00:07:30,233 --> 00:07:33,000 Let me know if you are around", et cetera, et cetera. 143 00:07:33,000 --> 00:07:33,833 "Cheers, 144 00:07:33,833 --> 00:07:37,633 V." And so let's see 145 00:07:37,633 --> 00:07:41,833 how this can be put into our bag of words. 146 00:07:41,833 --> 00:07:43,566 You've probably noticed by now 147 00:07:43,566 --> 00:07:46,566 that this is our bag of words that we're constructing here. 148 00:07:46,566 --> 00:07:50,433 So now we're going to throw the text into this bag of words. 149 00:07:50,933 --> 00:07:51,700 How's that going to happen? 150 00:07:51,700 --> 00:07:54,933 I'm just going to throw it in and then I'll explain how it happens. 151 00:07:54,933 --> 00:07:57,933 So there it is. That's the result. 152 00:07:58,100 --> 00:07:59,900 That's it. 153 00:07:59,900 --> 00:08:02,133 It all, of course, depends on how we construct our vector. 154 00:08:02,133 --> 00:08:05,100 But this is our result given the way we constructed our vector. 155 00:08:05,100 --> 00:08:07,566 And let's look at it this way. 156 00:08:07,566 --> 00:08:12,266 So as we've discussed previously, we took the 20,000 words 157 00:08:12,266 --> 00:08:15,266 and we associated each position with a word.
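The fixed word-to-position mapping described so far can be sketched as follows. This is only an illustration under assumptions: the vocabulary is a handful of words rather than 20,000, and the token names `<SOS>`, `<EOS>` and `<UNK>` as well as the helpers `build_index` and `position_of` are hypothetical names chosen for this sketch, not something shown in the tutorial.

```python
# Toy stand-in for the 20,000-slot vector: a few words only.
# The token names below are hypothetical labels for the reserved slots.
SOS, EOS, UNK = "<SOS>", "<EOS>", "<UNK>"

def build_index(words):
    """Give every token a fixed position: SOS and EOS first, UNK last."""
    ordered = [SOS, EOS] + sorted(set(words)) + [UNK]
    return {token: position for position, token in enumerate(ordered)}

word_to_index = build_index(
    ["hello", "if", "you", "checking", "table", "badminton", ","]
)

def position_of(token):
    # Any token outside the vocabulary shares the last slot.
    return word_to_index.get(token, word_to_index[UNK])

print(position_of("if"))      # the same position on every call
print(position_of("kirill"))  # not in the vocabulary, so the last slot
```

The only property that matters here is that the mapping is built once and never changes, so a given position always refers to the same word.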
158 00:08:15,300 --> 00:08:19,366 And now we go through our text and find 159 00:08:19,366 --> 00:08:23,700 and then increase the counter in the position of each associated word. 160 00:08:23,700 --> 00:08:29,133 So "hello": let's say in our vector it is in position number five. 161 00:08:29,400 --> 00:08:30,966 Because we only have one "hello" 162 00:08:30,966 --> 00:08:34,600 in this whole email, we're going to put a one here. 163 00:08:35,000 --> 00:08:38,000 "Kirill" is definitely not an English-language word, 164 00:08:38,033 --> 00:08:40,833 so we're going to have to put it into the last element. 165 00:08:40,833 --> 00:08:44,733 And the reason why there's a three here is because we have "Kirill", 166 00:08:45,333 --> 00:08:47,900 then also "Oz" and "V". 167 00:08:47,900 --> 00:08:51,233 Those are non-English-language words, not among those 20,000. 168 00:08:51,266 --> 00:08:52,466 They're all going to go here. 169 00:08:53,700 --> 00:08:56,400 Then we've got the comma. Surprise: 170 00:08:56,400 --> 00:08:57,966 the comma also has a position. 171 00:08:57,966 --> 00:08:59,966 Let's say it was in position number... 172 00:08:59,966 --> 00:09:02,333 so 3, 6, 7, 8, 9. 173 00:09:02,333 --> 00:09:04,733 So the ninth position is associated with a comma. 174 00:09:04,733 --> 00:09:06,733 Because we have one comma in our email... 175 00:09:06,733 --> 00:09:09,600 oh, actually we have two commas, okay. So this should be a two. 176 00:09:09,600 --> 00:09:11,333 But let's not think about that. 177 00:09:11,333 --> 00:09:14,100 Let's forget about that comma. 178 00:09:14,100 --> 00:09:14,933 I didn't notice it. 179 00:09:14,933 --> 00:09:19,566 So assuming we have one comma in our email, this is a one. "Checking": 180 00:09:19,966 --> 00:09:24,600 let's say that this element is associated with "checking". 181 00:09:24,600 --> 00:09:25,933 This is a one because there's only one 182 00:09:25,933 --> 00:09:31,333 "checking". "If": it's a two because we have two "if"s in our email.
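The counting step can be sketched like this. It is a simplified illustration, not the tutorial's own code: the eight-slot `VOCAB`, the regular-expression tokenizer and the shortened email text (with "you are" written out instead of "you're") are all assumptions made for the sketch.

```python
import re

# Hypothetical toy vocabulary: two reserved slots first, unknown slot last.
VOCAB = ["<SOS>", "<EOS>", "hello", ",", "checking", "if", "you", "<UNK>"]
INDEX = {token: i for i, token in enumerate(VOCAB)}

def tokenize(text):
    # Lowercase, then keep word tokens and commas (a deliberate simplification).
    return re.findall(r"[a-z']+|,", text.lower())

def bag_of_words(text):
    """Count how often each vocabulary entry occurs in the text."""
    vector = [0] * len(VOCAB)
    for token in tokenize(text):
        # Words outside the vocabulary all increment the last slot.
        vector[INDEX.get(token, INDEX["<UNK>"])] += 1
    return vector

email = "Hello Kirill, checking if you are back in Oz. Let me know if you are around."
print(bag_of_words(email))
```

Because this toy vocabulary is so small, ordinary words such as "back" and "know" also land in the unknown slot; with the full 20,000-word vocabulary only the names would.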
183 00:09:31,633 --> 00:09:35,033 So it's going to be a two. "You" is a two because we have two 184 00:09:35,033 --> 00:09:38,300 "you"s in our email. In the rest of the text, 185 00:09:38,300 --> 00:09:40,433 I don't think there are any more "you"s in there. And so on. 186 00:09:40,433 --> 00:09:44,100 So that's basically how we fill this bag of words. 187 00:09:44,100 --> 00:09:47,100 We just put in the 188 00:09:47,100 --> 00:09:50,300 quantity of words for every position. It's pretty straightforward. 189 00:09:50,300 --> 00:09:51,366 We're just 190 00:09:51,366 --> 00:09:55,033 filling in this vector. As you can see, it's going to be quite a sparse vector. 191 00:09:55,033 --> 00:09:58,466 It's going to be lots of zeros, almost 20,000 zeros, 192 00:09:58,466 --> 00:10:00,066 and only some of the positions are going to be filled in. 193 00:10:01,533 --> 00:10:02,266 Okay. 194 00:10:02,266 --> 00:10:03,600 So what is our goal? 195 00:10:03,600 --> 00:10:07,333 Our goal, as we discussed before, is to come up with a reply, 196 00:10:07,333 --> 00:10:12,433 yes or no, to this email, which is now in the form of a vector. 197 00:10:12,933 --> 00:10:14,266 And how are we going to do that? 198 00:10:14,266 --> 00:10:16,433 Well, we're going to do it through training data. 199 00:10:16,433 --> 00:10:18,366 So we're going to look at all of the emails 200 00:10:18,366 --> 00:10:22,133 that I have replied to, because this is us training a model 201 00:10:22,333 --> 00:10:26,266 to reply to my emails. Or in your case, in anybody's 202 00:10:26,266 --> 00:10:29,466 case, it's going to be training the model to reply to their emails. 203 00:10:29,766 --> 00:10:32,000 We're going to need some training data. 204 00:10:32,000 --> 00:10:35,000 I'm going to fish it out of the inbox or outbox. 205 00:10:35,233 --> 00:10:37,500 So let's look at a couple. 206 00:10:37,500 --> 00:10:39,266 So here we've got: "Hey mate.
207 00:10:39,266 --> 00:10:44,300 Have you read about Hinton's capsule networks?" And my reply to that: no. 208 00:10:45,066 --> 00:10:47,733 So we're going to use that as a training example. 209 00:10:47,733 --> 00:10:48,166 Next one. 210 00:10:48,166 --> 00:10:50,300 "Did you like that recipe I sent you last week?" 211 00:10:50,300 --> 00:10:52,833 The answer was yes. 212 00:10:52,833 --> 00:10:54,000 It was a good recipe, I guess. 213 00:10:54,000 --> 00:10:57,366 So there we go. So now we have two. Three: 214 00:10:57,666 --> 00:10:59,666 "Hi, Kirill. Are you coming to dinner tonight?" 215 00:10:59,666 --> 00:11:01,066 Yes. "Dear 216 00:11:01,066 --> 00:11:04,100 Kirill, would you like to service your car with us again?" No. 217 00:11:04,700 --> 00:11:07,666 "Are you coming to Australia in December?" Yes. 218 00:11:07,666 --> 00:11:08,400 And so on. 219 00:11:08,400 --> 00:11:11,766 So ideally we would have tens or hundreds 220 00:11:11,766 --> 00:11:14,833 of thousands of emails like that and responses like that: yes/ 221 00:11:14,833 --> 00:11:16,000 no responses. 222 00:11:16,000 --> 00:11:19,333 It would, of course, take a lot of groundwork to get that data, 223 00:11:19,333 --> 00:11:22,333 because we usually don't just respond yes or no to emails. 224 00:11:22,333 --> 00:11:27,666 So we'd have to look at each answer and understand what the sentiment was. 225 00:11:27,700 --> 00:11:30,700 Was the sentiment a no? What was it overall? 226 00:11:30,900 --> 00:11:33,133 Was it a yes or a no? And so on. 227 00:11:34,233 --> 00:11:36,833 Of course, it's more of a theoretical example. 228 00:11:36,833 --> 00:11:40,500 Nobody's going to do this for their own inbox, but nevertheless the point stands. 229 00:11:40,866 --> 00:11:42,433 So how would we train? 230 00:11:42,433 --> 00:11:44,466 How would we use this training data?
231 00:11:44,466 --> 00:11:47,466 We would use a similar principle and convert each one of those emails 232 00:11:47,466 --> 00:11:48,533 to a vector. 233 00:11:48,533 --> 00:11:53,900 And again, each vector would be 20,000 elements long. 234 00:11:53,900 --> 00:11:59,400 So yeah, I just threw some numbers in here to get the point across. 235 00:11:59,666 --> 00:12:00,766 It's not exactly accurate. 236 00:12:00,766 --> 00:12:04,533 But so we have these vectors, lots and lots of vectors, 237 00:12:04,766 --> 00:12:06,333 and lots and lots of responses, 238 00:12:06,333 --> 00:12:07,933 yes and no. 239 00:12:07,933 --> 00:12:09,033 And yeah. 240 00:12:09,033 --> 00:12:12,100 So now what we're going to do is we're 241 00:12:12,100 --> 00:12:15,100 going to 242 00:12:15,700 --> 00:12:16,766 apply a model. 243 00:12:16,766 --> 00:12:18,600 Once we have all this data, we're going to apply a model. 244 00:12:18,600 --> 00:12:22,033 So one of the models we can apply to create our bag of words, 245 00:12:22,766 --> 00:12:26,266 or one of the algorithms we can apply to create our bag of words 246 00:12:26,266 --> 00:12:28,800 model, is logistic regression. 247 00:12:28,800 --> 00:12:31,600 So we apply logistic regression to our yes/ 248 00:12:31,600 --> 00:12:35,166 no responses, to this information that we have. 249 00:12:35,666 --> 00:12:37,966 And then, 250 00:12:39,100 --> 00:12:42,100 once we have that model, once we've separated them, 251 00:12:42,300 --> 00:12:45,033 so we've kind of modeled 252 00:12:45,033 --> 00:12:48,033 what goes into a yes, 253 00:12:48,166 --> 00:12:49,733 like, what 254 00:12:49,733 --> 00:12:53,100 is likely to yield a yes, what is likely to yield a no, 255 00:12:53,100 --> 00:12:56,133 and the border between them.
256 00:12:56,366 --> 00:13:02,600 Then we can feed the actual email that we got into this model 257 00:13:03,100 --> 00:13:06,066 and get a response, for instance, yes. 258 00:13:06,066 --> 00:13:06,700 And that's it. 259 00:13:06,700 --> 00:13:09,200 So we use all the training data to create a model. 260 00:13:09,200 --> 00:13:14,200 We feed in our actual email, which, and this is important, 261 00:13:14,466 --> 00:13:16,066 has exactly the same format. 262 00:13:16,066 --> 00:13:20,933 So you can see that for every input here, 263 00:13:20,933 --> 00:13:23,300 every time we were training the model, 264 00:13:23,300 --> 00:13:28,166 the independent variable 265 00:13:28,166 --> 00:13:31,933 vector always had the same length, 20,000, and always had the same format. 266 00:13:31,933 --> 00:13:35,466 So we know that this position always corresponds to a certain word. 267 00:13:35,933 --> 00:13:37,633 This position is always a certain word. 268 00:13:39,066 --> 00:13:41,033 This position, let's say, one, two, three... 269 00:13:41,033 --> 00:13:44,200 Where was it? 1, 2, 3, 4, 5, 6, 7. 270 00:13:44,500 --> 00:13:45,933 Right. 271 00:13:45,933 --> 00:13:48,700 So this one is the "if", right? 272 00:13:48,700 --> 00:13:51,266 So this corresponds to "if" or something like that. 273 00:13:51,266 --> 00:13:53,300 So we know that it's the same format. 274 00:13:53,300 --> 00:13:55,000 It's always the same length, 20,000. 275 00:13:55,000 --> 00:13:58,000 So we can safely feed this vector in there. 276 00:13:58,300 --> 00:14:00,200 It's got the same number of features. 277 00:14:00,200 --> 00:14:01,566 We get an answer. 278 00:14:01,566 --> 00:14:03,033 So for instance we get yes. 279 00:14:03,033 --> 00:14:04,933 And then we can look back: 280 00:14:04,933 --> 00:14:07,500 oh, what did the actual email say? It said, "Hello, Kirill. Checking..." 281 00:14:07,500 --> 00:14:08,133 Oh, okay.
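The logistic regression step can be sketched in plain Python. Several things here are assumptions rather than anything stated in the tutorial: the eight-slot `VOCAB`, the tokenizer, fitting by simple stochastic gradient descent, and the values of `epochs` and `lr`. The five training emails and their yes/no labels are the ones quoted above, with 1 standing for yes and 0 for no.

```python
import math
import re

# Hypothetical toy vocabulary; the last slot collects every other word.
VOCAB = ["are", "you", "coming", "like", "recipe", "read", "service", "<UNK>"]
INDEX = {word: i for i, word in enumerate(VOCAB)}

def vectorize(text):
    """Turn an email into a count vector that always has len(VOCAB) entries."""
    vector = [0.0] * len(VOCAB)
    for token in re.findall(r"[a-z']+", text.lower()):
        vector[INDEX.get(token, INDEX["<UNK>"])] += 1.0
    return vector

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(weights, bias, x):
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

def train(examples, epochs=300, lr=0.05):
    """Fit weights by gradient descent on the log loss, one email at a time."""
    weights, bias = [0.0] * len(VOCAB), 0.0
    for _ in range(epochs):
        for text, label in examples:
            x = vectorize(text)
            error = predict_proba(weights, bias, x) - label
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias

TRAIN = [
    ("Have you read about Hinton's capsule networks?", 0),
    ("Did you like that recipe I sent you last week?", 1),
    ("Are you coming to dinner tonight?", 1),
    ("Would you like to service your car with us again?", 0),
    ("Are you coming to Australia in December?", 1),
]

weights, bias = train(TRAIN)
probability = predict_proba(weights, bias, vectorize("Checking if you are back in Oz"))
reply = "yes" if probability >= 0.5 else "no"
print(reply, round(probability, 3))
```

The new email goes through the same `vectorize` function as the training emails, which is what guarantees the same length and the same word-to-position format.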
282 00:14:08,133 --> 00:14:11,133 So based on my training data, I would have 283 00:14:11,500 --> 00:14:14,466 most likely replied to this with a yes. 284 00:14:14,466 --> 00:14:15,433 Interesting. 285 00:14:15,433 --> 00:14:18,200 The other approach that we can take here... oh, first of all, let's 286 00:14:18,200 --> 00:14:20,533 put this on our diagram. There's our diagram. 287 00:14:20,533 --> 00:14:24,866 And that's a natural language processing algorithm: Bag of Words 288 00:14:25,500 --> 00:14:26,633 sits over there. 289 00:14:26,633 --> 00:14:29,833 The other approach that we could take here is, 290 00:14:29,833 --> 00:14:35,466 instead of logistic regression, we could use a neural network. 291 00:14:35,800 --> 00:14:38,000 We could, because we have a vector, right? 292 00:14:38,000 --> 00:14:42,233 So we have all these vectors. We could feed them in as an input layer, 293 00:14:42,233 --> 00:14:46,766 like, all 20,000 neurons, into our neural network. 294 00:14:46,833 --> 00:14:47,533 They would go through 295 00:14:47,533 --> 00:14:50,533 one hidden layer, two hidden layers, as many hidden layers as we want; 296 00:14:50,733 --> 00:14:53,600 it's our own decision how to structure it. 297 00:14:53,600 --> 00:14:56,833 And then bam, we've got an output layer that tells us yes or no. 298 00:14:56,833 --> 00:14:58,633 And so again, we would use all this data 299 00:14:58,633 --> 00:15:03,000 that we have here, all our millions and millions of emails 300 00:15:03,000 --> 00:15:06,900 and responses, to train our neural network, 301 00:15:07,233 --> 00:15:11,466 all through backpropagation and stochastic gradient descent. 302 00:15:11,466 --> 00:15:14,533 All the weights would be updated and bam, we have an answer. 303 00:15:14,533 --> 00:15:15,900 Well, not bam, we have an answer.
304 00:15:15,900 --> 00:15:18,000 We would use these answers here 305 00:15:18,000 --> 00:15:22,100 to train it. We would use the pairs: vector and answer, vector and answer. 306 00:15:22,100 --> 00:15:23,533 So, to minimize the error: 307 00:15:23,533 --> 00:15:27,100 stochastic gradient descent, backpropagation, updated weights. Bam. 308 00:15:27,100 --> 00:15:28,133 We have a neural network. 309 00:15:28,133 --> 00:15:32,333 It's all trained up. Now we feed in our vector here, 310 00:15:32,366 --> 00:15:35,366 which represents our new email, into the neural network. 311 00:15:35,366 --> 00:15:38,300 And voilà, we get our answer. 312 00:15:38,300 --> 00:15:43,100 And in this case it might also be yes. The two models might yield different results, 313 00:15:43,100 --> 00:15:47,066 but if the model is constructed well, more or less 314 00:15:47,066 --> 00:15:52,000 it should be coming up with similar or the same answers most of the time. 315 00:15:52,500 --> 00:15:55,200 And so in this case, we've got deep natural language 316 00:15:55,200 --> 00:15:58,200 processing going on. The emphasis is right there: 317 00:15:58,200 --> 00:16:01,133 we've got a deep natural language processing algorithm, 318 00:16:01,133 --> 00:16:03,566 right? Because we're using a neural network. 319 00:16:03,566 --> 00:16:05,533 And that is different. 320 00:16:05,533 --> 00:16:10,766 So in both cases it's the bag of words model: in one case it's an NLP bag of words, 321 00:16:10,766 --> 00:16:14,033 in the other case it's a deep NLP bag of words. 322 00:16:14,866 --> 00:16:17,500 But in both cases it is still a bag of words, 323 00:16:17,500 --> 00:16:21,200 and it has its own limitations and it has its own, 324 00:16:22,233 --> 00:16:22,600 yeah, 325 00:16:22,600 --> 00:16:26,100 limitations and issues that are not that great. 326 00:16:26,100 --> 00:16:29,766 And so I'll point out one of them right now: the response is very simple.
327 00:16:29,766 --> 00:16:31,933 It's just a yes or no, right? 328 00:16:31,933 --> 00:16:33,966 We want something more sophisticated. 329 00:16:33,966 --> 00:16:35,433 We want a conversation. 330 00:16:35,433 --> 00:16:37,566 You can't really have a conversation, can't really build a chatbot, 331 00:16:37,566 --> 00:16:39,300 if you're just going to be saying yes or no all the time. 332 00:16:39,300 --> 00:16:41,033 So that's one of the limitations. 333 00:16:41,033 --> 00:16:43,033 We'll talk about some more of them 334 00:16:43,033 --> 00:16:45,066 in upcoming tutorials. 335 00:16:45,066 --> 00:16:48,466 And we'll also see how to overcome those limitations 336 00:16:48,466 --> 00:16:51,800 and what models await us in the future. 337 00:16:52,200 --> 00:16:53,866 And I hope you enjoyed this tutorial. 338 00:16:53,866 --> 00:16:56,833 I really enjoyed going through all of this together with you, 339 00:16:56,833 --> 00:16:59,166 and I can't wait to see you next time. 340 00:16:59,166 --> 00:17:02,766 Until then, enjoy natural language processing.
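The neural-network variant from this tutorial (bag-of-words vector in, yes/no out, trained with backpropagation and stochastic gradient descent) can be sketched as below. Everything concrete in it is an assumption for illustration: the class name `TinyNet`, one hidden layer of four sigmoid units, six input slots instead of 20,000, the made-up count vectors in `DATA`, and the learning rate and epoch count.

```python
import math
import random

random.seed(0)  # fixed seed so repeated runs of the sketch behave the same

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

class TinyNet:
    """One hidden layer with sigmoid units and a single sigmoid output."""

    def __init__(self, n_inputs, n_hidden):
        self.w1 = [[random.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [random.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [sigmoid(b + sum(w * xi for w, xi in zip(row, x)))
                  for row, b in zip(self.w1, self.b1)]
        output = sigmoid(self.b2 + sum(w * h for w, h in zip(self.w2, hidden)))
        return hidden, output

    def train_step(self, x, target, lr):
        """One stochastic gradient descent update via backpropagation."""
        hidden, output = self.forward(x)
        delta_out = output - target  # log-loss gradient at the output unit
        for j, h in enumerate(hidden):
            # Error pushed back to hidden unit j, using the weight before its update.
            delta_hidden = delta_out * self.w2[j] * h * (1.0 - h)
            self.w2[j] -= lr * delta_out * h
            self.b1[j] -= lr * delta_hidden
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * delta_hidden * xi
        self.b2 -= lr * delta_out

# Made-up (vector, answer) pairs; 1 means yes, 0 means no.
DATA = [
    ([0, 1, 0, 1, 0, 5], 0),
    ([0, 2, 1, 0, 0, 6], 1),
    ([1, 1, 0, 0, 1, 3], 1),
    ([0, 1, 1, 1, 0, 7], 0),
    ([1, 1, 0, 0, 1, 4], 1),
]

net = TinyNet(n_inputs=6, n_hidden=4)
for epoch in range(200):
    random.shuffle(DATA)  # visit the examples in a random order each epoch
    for x, y in DATA:
        net.train_step(x, y, lr=0.1)

new_email = [0, 2, 0, 0, 0, 3]  # hypothetical vector for a new email
_, probability = net.forward(new_email)
print("yes" if probability >= 0.5 else "no", round(probability, 3))
```

As with logistic regression, the network only works because every input vector has the same length and the same word-to-position layout as the vectors it was trained on.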