1 00:00:00,866 --> 00:00:02,966 Hello and welcome back to the course on Machine Learning. 2 00:00:02,966 --> 00:00:06,600 Today we're talking about the intuition behind the Apriori algorithm. 3 00:00:06,900 --> 00:00:08,733 So let's get started. 4 00:00:08,733 --> 00:00:11,300 And we're going to get started by talking about a story. 5 00:00:11,300 --> 00:00:15,266 It's somewhat of a legend of data science, 6 00:00:15,266 --> 00:00:18,900 or a legend that is quite well known in data science. 7 00:00:18,900 --> 00:00:21,833 And you may have heard of this legend. 8 00:00:21,833 --> 00:00:23,866 It's not a myth. It actually happened. 9 00:00:23,866 --> 00:00:28,000 But, as you know, when things happened a long time ago 10 00:00:28,200 --> 00:00:32,233 and then time passes, the facts of the story get blurred. 11 00:00:32,233 --> 00:00:34,800 But I'll tell you my story of this legend. 12 00:00:34,800 --> 00:00:37,900 And it might not be exactly correct, but this is how I 13 00:00:38,533 --> 00:00:41,533 know it and how I've heard about it. 14 00:00:41,700 --> 00:00:45,933 So, what do you think the commonality is between these two products? 15 00:00:46,433 --> 00:00:50,000 Pampers, or diapers, and beer? 16 00:00:50,166 --> 00:00:53,700 What do you think they have in common, and why are they 17 00:00:53,700 --> 00:00:56,733 part of this urban legend? 18 00:00:56,866 --> 00:00:59,166 Why are they part of this data science lesson? 19 00:00:59,166 --> 00:01:01,400 Well, as the story goes, 20 00:01:02,533 --> 00:01:05,400 a company, and we're not going to name the company, 21 00:01:05,400 --> 00:01:10,866 but a company that is actually like a convenience store, 22 00:01:11,300 --> 00:01:15,133 did some analytics around the 23 00:01:15,900 --> 00:01:18,166 products that people are purchasing. 24 00:01:18,166 --> 00:01:20,800 And so they were looking at, 25 00:01:20,800 --> 00:01:23,833 you know, what people are checking out with and what the commonalities are.
26 00:01:23,833 --> 00:01:27,433 And they analyzed thousands and thousands and thousands of transactions. 27 00:01:27,433 --> 00:01:31,800 So thousands of people who actually checked out, if not tens of thousands, and 28 00:01:32,233 --> 00:01:35,233 they found a very interesting thing: that 29 00:01:35,766 --> 00:01:39,033 very often, during certain times of the day, 30 00:01:39,033 --> 00:01:44,433 when people shop in the afternoon, between 6 and 9 p.m., 31 00:01:44,966 --> 00:01:49,933 people who buy diapers also buy beer. 32 00:01:50,500 --> 00:01:53,500 And it was, like, out of the blue, completely out of the blue. 33 00:01:53,766 --> 00:01:56,966 Like, how? Why? These two products are completely unconnected. 34 00:01:56,966 --> 00:01:57,666 Right? 35 00:01:57,666 --> 00:02:00,933 Why would somebody buy beer when they're buying diapers? 36 00:02:00,933 --> 00:02:03,533 Or why buy diapers when they're buying beer? Right? 37 00:02:03,533 --> 00:02:07,566 So that was the fact that they came across in the data, 38 00:02:08,533 --> 00:02:10,733 and the explanation of this 39 00:02:10,733 --> 00:02:13,733 fact, one of the plausible explanations, is that 40 00:02:14,700 --> 00:02:17,566 in the afternoons or in the evenings, when 41 00:02:17,566 --> 00:02:22,666 the husband gets home and they, 42 00:02:22,833 --> 00:02:25,833 the husband and the wife, are taking care of their baby, 43 00:02:26,533 --> 00:02:28,800 they sometimes find that they've run out of diapers. 44 00:02:28,800 --> 00:02:30,933 And who has to go pick up the diapers? 45 00:02:30,933 --> 00:02:33,566 Well, the husband has to go pick up the diapers, right? 46 00:02:33,566 --> 00:02:35,866 Or the wife sends the husband to go pick up the diapers. 47 00:02:35,866 --> 00:02:37,733 And while he's picking up the diapers, 48 00:02:37,733 --> 00:02:40,233 because it's really after hours, after work, 49 00:02:40,233 --> 00:02:42,166 he's already in the convenience store.
50 00:02:42,166 --> 00:02:44,333 He also picks up some beer. Right? 51 00:02:44,333 --> 00:02:46,800 And so that is a plausible explanation. 52 00:02:46,800 --> 00:02:49,800 Might be the case, might not be the case, but it sounds pretty reasonable. 53 00:02:50,233 --> 00:02:51,900 And based on that... 54 00:02:51,900 --> 00:02:54,900 So that's something that you can't really think of just by yourself. 55 00:02:54,900 --> 00:02:56,366 But that comes from the data. Right? 56 00:02:56,366 --> 00:03:01,100 And based on that you can decide how to arrange products in your store. 57 00:03:01,100 --> 00:03:01,300 Right? 58 00:03:01,300 --> 00:03:03,800 So some stores might decide to put these two products 59 00:03:03,800 --> 00:03:07,533 closer together to entice people to buy beer when they're buying diapers. 60 00:03:07,533 --> 00:03:08,866 But actually, a lot of stores 61 00:03:09,833 --> 00:03:10,833 do the opposite. 62 00:03:10,833 --> 00:03:16,466 A lot of stores separate beer and diapers, right? 63 00:03:16,466 --> 00:03:19,866 Just like they try to separate, and you'll probably notice 64 00:03:19,866 --> 00:03:23,100 this from your convenience store, they try to separate 65 00:03:23,733 --> 00:03:26,300 bread and milk as far as possible. Why? 66 00:03:26,300 --> 00:03:28,066 Because that way, 67 00:03:28,066 --> 00:03:31,066 yeah, they already know that these two products are bought together. 68 00:03:31,200 --> 00:03:36,300 And so you actually have to walk through the whole store to pick up, 69 00:03:36,533 --> 00:03:38,866 you know, you've picked up your bread and then to get to the milk, 70 00:03:38,866 --> 00:03:42,300 you have to go all the way through the whole store to the completely opposite 71 00:03:42,600 --> 00:03:44,100 corner of the store.
72 00:03:44,100 --> 00:03:47,566 So as you're walking through the store, you see more of the other products 73 00:03:47,566 --> 00:03:49,533 and you're more likely to pick up 74 00:03:49,533 --> 00:03:52,666 an additional item that you weren't actually planning on buying 75 00:03:52,666 --> 00:03:54,100 when you got to the store in the first place. 76 00:03:54,100 --> 00:03:57,700 So there are a lot of interesting marketing tactics that are used based on this data. 77 00:03:57,700 --> 00:04:00,133 But the question is, how do you get to this data? 78 00:04:00,133 --> 00:04:03,300 And one of the ways to get to it is the Apriori algorithm. 79 00:04:04,000 --> 00:04:07,000 So let's talk about Apriori in a bit more detail now. 80 00:04:07,733 --> 00:04:08,133 All right. 81 00:04:08,133 --> 00:04:13,200 So Apriori is about: people who bought something also bought something else, 82 00:04:13,200 --> 00:04:16,000 or who watched something also watched something else, 83 00:04:16,000 --> 00:04:17,966 or who did something also did something else. 84 00:04:17,966 --> 00:04:21,766 So it analyzes, and this whole association 85 00:04:22,266 --> 00:04:26,300 rule learning part of the course is all about analyzing, when things 86 00:04:26,933 --> 00:04:31,166 come in pairs or in triplets or, well, not in sequence, 87 00:04:31,166 --> 00:04:34,433 but combined together for some reason, 88 00:04:35,100 --> 00:04:39,900 looking for those rules and those ways that this happens. 89 00:04:40,866 --> 00:04:41,166 All right. 90 00:04:41,166 --> 00:04:42,400 So let's have a look. 91 00:04:42,400 --> 00:04:44,966 For instance, movie recommendation. Right? 92 00:04:44,966 --> 00:04:48,733 So you've got user IDs, you've got movies that the people liked. 93 00:04:49,133 --> 00:04:53,100 Movie one, two, three, four; movie one and two for the second person, and so on.
94 00:04:53,600 --> 00:04:57,833 And from here, just by looking at it, even without knowing anything about 95 00:04:58,333 --> 00:05:01,800 association rule learning or the Apriori algorithm, 96 00:05:01,966 --> 00:05:05,733 you can really tell that there are some potential rules 97 00:05:05,733 --> 00:05:08,800 that can come out of this. For instance, everybody who watches movie 98 00:05:08,800 --> 00:05:13,066 one, well, not everybody, but it is likely that people who watch movie one, 99 00:05:13,066 --> 00:05:16,533 or who like movie one, will also like movie number two. 100 00:05:17,000 --> 00:05:17,900 And people who like 101 00:05:17,900 --> 00:05:21,933 movie number two are quite likely to also like movie number four. 102 00:05:22,366 --> 00:05:26,133 And people who like movie number one are also quite likely to like movie 103 00:05:26,133 --> 00:05:27,166 number three. 104 00:05:27,166 --> 00:05:28,700 So there 105 00:05:28,700 --> 00:05:30,100 you can come up with lots of different 106 00:05:30,100 --> 00:05:32,000 potential rules, but some are going to be stronger, 107 00:05:32,000 --> 00:05:35,000 some are going to be weaker, and we want to find the very strong ones 108 00:05:35,333 --> 00:05:39,600 in order to base our business decisions, or our other decisions, 109 00:05:40,166 --> 00:05:43,200 on those rules that we can see in the data. 110 00:05:43,200 --> 00:05:43,400 Right? 111 00:05:43,400 --> 00:05:47,933 We don't have to go and ask people, hey, do you like movie number one? 112 00:05:47,933 --> 00:05:49,933 And would you like movie number two because of that? 113 00:05:49,933 --> 00:05:52,500 Do you like movie number two, or what is your taste and preference? 114 00:05:52,500 --> 00:05:55,866 We can see these things from the data and we want to extract this information.
115 00:05:55,866 --> 00:05:59,166 And as long as, you know, we have a large enough sample size, 116 00:05:59,233 --> 00:06:03,533 you know, if it's not just five people, if it's 50,000 or 500,000 people 117 00:06:03,533 --> 00:06:07,466 that we're analyzing, we can come up with some quite solid rules. 118 00:06:08,633 --> 00:06:08,966 All right. 119 00:06:08,966 --> 00:06:11,700 So here's 120 00:06:11,700 --> 00:06:14,600 another example, where we've got a market basket. 121 00:06:14,600 --> 00:06:20,200 So, an example of people who buy, not just groceries, 122 00:06:20,200 --> 00:06:24,733 but this is a small kind of restaurant or takeaway place. 123 00:06:25,000 --> 00:06:28,966 And here you can see there's a link, obviously, between burgers and French fries; 124 00:06:29,166 --> 00:06:31,333 interestingly, vegetables and fruits, people 125 00:06:31,333 --> 00:06:33,300 trying to be healthy; burgers, French fries and ketchup. 126 00:06:33,300 --> 00:06:35,900 So again, these are potential rules, not necessarily 127 00:06:35,900 --> 00:06:37,366 the ones that we're going to take away from the data. 128 00:06:37,366 --> 00:06:40,366 This is just an example of something that you might observe 129 00:06:40,566 --> 00:06:43,566 visually, just by looking at this data set. 130 00:06:43,900 --> 00:06:44,233 All right. 131 00:06:44,233 --> 00:06:47,100 So how does the Apriori algorithm work? 132 00:06:47,100 --> 00:06:49,866 Well, the Apriori algorithm has three parts to it. 133 00:06:49,866 --> 00:06:53,100 It has got the support, the confidence and the lift. 134 00:06:53,533 --> 00:06:55,266 So we're going to start off with the support. 135 00:06:55,266 --> 00:06:58,700 And you will see that it's very similar to 136 00:06:58,700 --> 00:06:59,900 something we've already discussed.
137 00:06:59,900 --> 00:07:04,800 It's very similar to the way we talked about the intuition for the Bayesian, 138 00:07:05,033 --> 00:07:09,100 for the Naive Bayes classifiers. 139 00:07:09,400 --> 00:07:10,633 So let's have a look here. 140 00:07:10,633 --> 00:07:15,033 We've got movie recommendations. Support for a movie M 141 00:07:16,300 --> 00:07:19,066 is defined as the number of users 142 00:07:19,066 --> 00:07:23,600 who watched movie M divided by the total number of users. 143 00:07:23,600 --> 00:07:24,400 Right? 144 00:07:24,400 --> 00:07:26,700 And market basket optimization, same thing: 145 00:07:26,700 --> 00:07:29,400 the number of transactions containing 146 00:07:29,400 --> 00:07:32,566 an item I divided by the total number of transactions. 147 00:07:32,900 --> 00:07:35,233 Let's have a look at an illustration here. 148 00:07:35,233 --> 00:07:39,133 We've got 100 people, so we've got five rows 149 00:07:39,133 --> 00:07:42,266 and 20 columns of human beings. 150 00:07:42,266 --> 00:07:45,266 And 151 00:07:45,466 --> 00:07:47,633 let's see how many of them... 152 00:07:47,633 --> 00:07:50,433 Let's say we're talking about a movie, 153 00:07:50,433 --> 00:07:54,600 and I'm going to give an example of one of my favorite movies, Ex Machina. 154 00:07:54,600 --> 00:07:56,933 And if you haven't seen it, definitely check it out. 155 00:07:56,933 --> 00:07:59,000 It's all about AI and machine learning. 156 00:07:59,000 --> 00:08:03,700 So let's see how many of these people have actually seen Ex Machina. 157 00:08:04,133 --> 00:08:05,000 So there we go. 158 00:08:05,000 --> 00:08:10,466 There are ten people who have seen Ex Machina, right, out of 100. 159 00:08:10,600 --> 00:08:11,600 So what does that mean? 160 00:08:11,600 --> 00:08:14,866 That means our support here is 10%. 161 00:08:15,000 --> 00:08:15,766 Okay. 162 00:08:15,766 --> 00:08:17,566 Now let's move on to step two.
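The support calculation just described can be sketched in a few lines of Python. This is only a minimal illustration, not the course's own code: the function name `support` and the toy `users` list are assumptions made up here to mirror the 100-person picture, where each user is represented as the set of movies they have watched.

```python
def support(transactions, item):
    """Fraction of transactions (or users) that contain `item`."""
    containing = sum(1 for t in transactions if item in t)
    return containing / len(transactions)


# Hypothetical data mirroring the illustration:
# 100 users, 10 of whom have watched Ex Machina.
users = [{"Ex Machina"}] * 10 + [set()] * 90

print(f"support(Ex Machina) = {support(users, 'Ex Machina'):.0%}")  # 10%
```

The same function works unchanged for market baskets: pass a list of baskets (sets of items) and the item whose support you want.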
163 00:08:17,566 --> 00:08:20,400 Step two is we need to find the confidence. 164 00:08:20,400 --> 00:08:21,300 What is the confidence? 165 00:08:21,300 --> 00:08:24,966 Well, confidence is defined as the number... 166 00:08:25,000 --> 00:08:25,966 Let's go for movies. 167 00:08:25,966 --> 00:08:30,000 So the number of people who have seen movies 168 00:08:30,000 --> 00:08:33,266 M1 and M2 divided by the number of people who have seen movie M1. 169 00:08:33,266 --> 00:08:36,800 So here we're going to assume that we're testing a rule. 170 00:08:36,800 --> 00:08:41,366 We're testing a rule about, let's say, people who have seen Interstellar, right? 171 00:08:41,366 --> 00:08:46,033 We have a hypothesis that says that people who have seen Interstellar, 172 00:08:46,166 --> 00:08:49,966 or have liked Interstellar, 173 00:08:50,500 --> 00:08:54,833 are also likely to like Ex Machina. 174 00:08:54,833 --> 00:08:56,000 Or let's even go with seen: 175 00:08:56,000 --> 00:08:59,066 people who have seen Interstellar are also likely 176 00:08:59,066 --> 00:09:02,066 to have seen Ex Machina. 177 00:09:02,100 --> 00:09:06,266 So basically here movie number one, M1, is going to be 178 00:09:06,266 --> 00:09:10,433 the Interstellar movie, 179 00:09:11,200 --> 00:09:14,233 the one that we're saying... Okay, so we're going to take 180 00:09:14,233 --> 00:09:16,033 everybody who's seen Interstellar and we're going to check 181 00:09:16,033 --> 00:09:17,633 how many of them have seen Ex Machina. 182 00:09:17,633 --> 00:09:19,900 And that's exactly what we're doing here. 183 00:09:19,900 --> 00:09:21,900 And market basket optimization, same thing. 184 00:09:21,900 --> 00:09:24,733 You can think of an example of French fries and burgers, for instance. 185 00:09:24,733 --> 00:09:26,100 People who have had burgers, 186 00:09:26,100 --> 00:09:29,000 who've ordered burgers, are also likely to order French fries.
187 00:09:29,000 --> 00:09:32,833 So at the top you would have people who have ordered burgers and French fries. 188 00:09:33,066 --> 00:09:36,400 And at the bottom you have people who have ordered burgers, 189 00:09:37,066 --> 00:09:39,600 who have ordered burgers regardless of whether they've ordered 190 00:09:39,600 --> 00:09:40,900 French fries or not. 191 00:09:40,900 --> 00:09:43,900 It's much easier to talk about this with an illustration. 192 00:09:44,200 --> 00:09:48,366 Let's say those people colored in green 193 00:09:48,900 --> 00:09:51,400 are the ones who have seen Interstellar, right? 194 00:09:51,400 --> 00:09:54,333 Who have watched this movie. 195 00:09:54,333 --> 00:09:57,166 Now we want to know, not out of the whole population, 196 00:09:57,166 --> 00:10:00,166 but out of just those people who have seen Interstellar, 197 00:10:00,533 --> 00:10:02,933 how many of them have seen Ex Machina? 198 00:10:02,933 --> 00:10:07,166 So out of them, we have seven people who have also seen Ex Machina. 199 00:10:07,166 --> 00:10:09,733 So there are only seven people who have seen both movies. 200 00:10:09,733 --> 00:10:11,566 That's what we're after. 201 00:10:11,566 --> 00:10:16,333 And so our confidence is going to be seven divided by 40, just by definition. 202 00:10:16,333 --> 00:10:18,133 This is how it's calculated. 203 00:10:18,133 --> 00:10:20,666 40 people have seen Interstellar 204 00:10:20,666 --> 00:10:24,833 and seven people out of those 40 have actually also seen Ex Machina. 205 00:10:24,833 --> 00:10:28,066 So the confidence here is 17.5%. 206 00:10:28,966 --> 00:10:29,733 Good. 207 00:10:29,733 --> 00:10:33,066 And the next part, or the third and last step, is the lift. 208 00:10:33,066 --> 00:10:34,033 And what is the lift? 209 00:10:34,033 --> 00:10:35,066 Lift is very simple.
210 00:10:35,066 --> 00:10:41,400 Again, it's going to be very similar to what we had in the naive Bayes, naive 211 00:10:41,400 --> 00:10:45,866 Bayesian classifiers, in that algorithm when we were discussing it. 212 00:10:46,200 --> 00:10:50,566 So the lift is basically the confidence divided by the support, 213 00:10:51,500 --> 00:10:55,200 so what we calculated in step two divided by what we calculated in step one. 214 00:10:55,366 --> 00:10:57,833 And let's just talk about it with the illustration, 215 00:10:57,833 --> 00:11:00,300 because it's going to make way more sense that way. 216 00:11:00,300 --> 00:11:03,166 So here's our population. 217 00:11:03,166 --> 00:11:06,166 Those people in green are the ones who have seen Interstellar. 218 00:11:06,600 --> 00:11:10,233 And all of these people in red are the ones who have seen Ex Machina. 219 00:11:10,233 --> 00:11:13,366 So basically our lift is... All right. 220 00:11:13,366 --> 00:11:17,366 So if we just randomly, right, randomly suggest 221 00:11:17,366 --> 00:11:20,433 to a person to watch Ex Machina, right, 222 00:11:20,800 --> 00:11:26,266 what is the likelihood that, you know, it's a movie for them? 223 00:11:26,266 --> 00:11:28,766 It's a person who's not in this population. 224 00:11:28,766 --> 00:11:30,366 Like, out of this population, 225 00:11:30,366 --> 00:11:33,400 we know that out of 100 people, only ten actually 226 00:11:33,433 --> 00:11:34,200 watched Ex Machina. 227 00:11:34,200 --> 00:11:37,733 And we're going to assume that watched and liked are interchangeable terms here. 228 00:11:37,733 --> 00:11:40,300 So we're going to assume that if they didn't watch it, 229 00:11:40,300 --> 00:11:41,666 they're not going to like it anyway.
230 00:11:41,666 --> 00:11:44,666 So if we take another random population, 231 00:11:44,800 --> 00:11:50,500 like this population, then what is the likelihood 232 00:11:50,533 --> 00:11:51,700 that, if we recommend 233 00:11:51,700 --> 00:11:55,200 to a random person in that population, that brand new population, 234 00:11:55,833 --> 00:11:59,033 we recommend the Ex Machina movie, what is the likelihood 235 00:11:59,033 --> 00:12:00,300 that they will like it? 236 00:12:00,300 --> 00:12:03,866 Well, the likelihood is 10%, right? 237 00:12:03,866 --> 00:12:07,733 Because out of 100 people, only ten of them actually liked that movie. 238 00:12:08,100 --> 00:12:13,933 But now the question is, can we improve that result by using some prior knowledge? 239 00:12:13,933 --> 00:12:15,733 That's why the algorithm is called Apriori. 240 00:12:17,566 --> 00:12:19,533 In that new population, let's 241 00:12:19,533 --> 00:12:24,133 only recommend Ex Machina to people who have already seen Interstellar. 242 00:12:24,133 --> 00:12:27,133 So people who are marked as green in this population. 243 00:12:27,166 --> 00:12:29,133 So we will only find out, 244 00:12:29,133 --> 00:12:30,700 we will only ask, have you seen Interstellar? 245 00:12:30,700 --> 00:12:32,700 If they have, then we'll recommend Ex Machina. 246 00:12:32,700 --> 00:12:36,833 What is the likelihood that a person will actually like Ex Machina 247 00:12:37,400 --> 00:12:38,500 if we recommend it to them that way? 248 00:12:38,500 --> 00:12:42,466 Well, in that case, the likelihood, as we've calculated, out of the green people 249 00:12:42,466 --> 00:12:49,200 only, of the green people, 17.5% actually liked Ex Machina. 250 00:12:49,533 --> 00:12:55,133 So the lift is the improvement in your prediction. 251 00:12:55,133 --> 00:12:58,133 So your original prediction, your original prediction is 10%. 252 00:12:58,200 --> 00:12:58,433 Right?
253 00:12:58,433 --> 00:13:01,333 If you just randomly take a person out of your new population 254 00:13:01,333 --> 00:13:04,466 and recommend them Ex Machina, they'll like it with a likelihood of 10%. 255 00:13:04,900 --> 00:13:09,400 If you first ask the question, have you seen and liked Interstellar? 256 00:13:09,866 --> 00:13:13,766 If they say yes and then you recommend Ex Machina, the likelihood 257 00:13:13,766 --> 00:13:16,966 of a successful recommendation there is 17.5%. 258 00:13:17,133 --> 00:13:20,600 So the lift is by definition 17.5% divided by 10%, which is 1.75. 259 00:13:21,466 --> 00:13:21,900 There we go. 260 00:13:21,900 --> 00:13:25,500 That is what the lift is defined as, 261 00:13:26,366 --> 00:13:29,933 and that's pretty much the whole Apriori algorithm. 262 00:13:29,933 --> 00:13:31,500 Those are the steps that it involves. 263 00:13:31,500 --> 00:13:34,000 And now we're just going to put it all together 264 00:13:34,000 --> 00:13:39,033 in this one kind of step-by-step process. 265 00:13:39,033 --> 00:13:43,566 So step one, you need to set a minimum support and confidence. 266 00:13:43,566 --> 00:13:43,766 Right? 267 00:13:43,766 --> 00:13:48,633 You want to do that because there are so many different recommendations. 268 00:13:48,633 --> 00:13:50,866 Right? We only looked at one example, 269 00:13:50,866 --> 00:13:53,566 well, one specific example; to simplify things, we talked about 270 00:13:53,566 --> 00:13:56,700 Ex Machina and Interstellar. 271 00:13:56,700 --> 00:13:59,966 But as you can see in the examples before that, you could have like 272 00:13:59,966 --> 00:14:04,433 100 different movies and all the different combinations. Apriori 273 00:14:04,633 --> 00:14:07,633 is actually quite a slow algorithm because it just goes through 274 00:14:07,933 --> 00:14:11,333 all of these different combinations.
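Looking back at the Interstellar and Ex Machina illustration, the confidence and lift figures can be reproduced with a short Python sketch. Everything named here is an assumption for illustration only: the functions `confidence` and `lift`, and the toy `users` list, which is constructed so that 40 of 100 people have seen Interstellar, 10 have seen Ex Machina, and 7 have seen both.

```python
def confidence(transactions, lhs, rhs):
    """Of the transactions containing `lhs`, the fraction that also contain `rhs`."""
    with_lhs = [t for t in transactions if lhs in t]
    with_both = [t for t in with_lhs if rhs in t]
    return len(with_both) / len(with_lhs)


def lift(transactions, lhs, rhs):
    """Confidence of the rule lhs -> rhs divided by the support of rhs on its own."""
    rhs_support = sum(1 for t in transactions if rhs in t) / len(transactions)
    return confidence(transactions, lhs, rhs) / rhs_support


# Hypothetical population matching the illustration (100 users in total).
users = (
    [{"Interstellar", "Ex Machina"}] * 7   # seen both
    + [{"Interstellar"}] * 33              # seen Interstellar only
    + [{"Ex Machina"}] * 3                 # seen Ex Machina only
    + [set()] * 57                         # seen neither
)

print(f"confidence = {confidence(users, 'Interstellar', 'Ex Machina'):.3f}")  # 0.175
print(f"lift       = {lift(users, 'Interstellar', 'Ex Machina'):.2f}")        # 1.75
```

A lift above 1 means that asking about Interstellar first improves on recommending Ex Machina at random; a lift of exactly 1 would mean the prior knowledge does not help.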
275 00:14:11,333 --> 00:14:14,500 So it says, what if movie one is a good recommendation 276 00:14:14,500 --> 00:14:18,000 for movie two, or movie one means a person will like movie two? 277 00:14:18,000 --> 00:14:21,533 Movie one means a person will like movie three, and movie one, movie four. 278 00:14:21,533 --> 00:14:23,000 And then it actually combines more. 279 00:14:23,000 --> 00:14:26,800 It says movie one and movie two might mean that a person will like movie three, 280 00:14:27,000 --> 00:14:27,400 and so on. 281 00:14:27,400 --> 00:14:31,200 And so it actually combines lots and lots and lots of, not just pairs, not just triplets. 282 00:14:31,933 --> 00:14:34,633 It combines four, five, 283 00:14:34,633 --> 00:14:37,733 six, seven items in one set, and so on. 284 00:14:38,566 --> 00:14:41,666 And yeah, so it gets quite big. 285 00:14:41,666 --> 00:14:44,666 And therefore you need to set some kind of limitations. 286 00:14:45,033 --> 00:14:46,866 So you need to set a minimum support. 287 00:14:46,866 --> 00:14:50,400 For instance, you might not want to look at products 288 00:14:50,400 --> 00:14:55,266 that have a support of less than 20%. 289 00:14:55,266 --> 00:14:59,400 You might not even want to consider them because you don't want to waste your time 290 00:14:59,900 --> 00:15:03,733 building a model for something that only has 291 00:15:03,733 --> 00:15:06,733 a success rate of less than 20% on its own. 292 00:15:06,733 --> 00:15:09,733 Right? Or you might set the limit at 5%. 293 00:15:10,133 --> 00:15:13,433 Then you might also want to limit the confidence. 294 00:15:13,433 --> 00:15:17,533 So in our example the confidence was 17.5%, 295 00:15:17,533 --> 00:15:18,133 right,
296 00:15:18,133 --> 00:15:22,466 that somebody who liked one movie will like the other one. 297 00:15:22,600 --> 00:15:28,500 Maybe you might want to limit it so that, you know, anything less than 12%, 298 00:15:28,500 --> 00:15:31,500 you don't want to look at it because it's not a strong enough 299 00:15:31,966 --> 00:15:35,866 factor for you, it's not a strong enough rule for you, because there are going 300 00:15:35,866 --> 00:15:39,200 to be so many different rules in the output of this algorithm. 301 00:15:40,100 --> 00:15:42,266 You already know that you'll have much stronger ones. 302 00:15:42,266 --> 00:15:46,100 So you don't want to consider anything that's less than 12% or 20% or 303 00:15:46,633 --> 00:15:49,900 whatever percentage you decide to set for that specific scenario. 304 00:15:50,533 --> 00:15:53,900 Then, once you've set those, you take all the subsets in 305 00:15:54,000 --> 00:15:57,266 transactions having higher support 306 00:15:57,266 --> 00:15:59,766 than the minimum support. Then take all the rules of these subsets 307 00:15:59,766 --> 00:16:01,500 having higher confidence than the minimum confidence. 308 00:16:01,500 --> 00:16:03,900 Basically, apply those two minimums that you've set. 309 00:16:03,900 --> 00:16:08,433 And then at the end, of course, you sort the rules by decreasing lift. 310 00:16:08,433 --> 00:16:10,033 So that's where the lift comes in. 311 00:16:10,033 --> 00:16:12,333 The rule with the highest lift 312 00:16:12,333 --> 00:16:16,466 given these criteria is going to be the strongest rule. 313 00:16:16,500 --> 00:16:19,300 And that's the one you might want to look into first. Right? 314 00:16:19,300 --> 00:16:23,233 Something like, I don't know, if a person buys a burger and French fries, 315 00:16:23,233 --> 00:16:27,133 then they're likely to buy tomato sauce or ketchup as well.
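The four steps just listed (set the minimums, keep itemsets above the minimum support, keep rules above the minimum confidence, sort by decreasing lift) can be sketched as follows. This is a rough illustration under stated assumptions, not a production Apriori implementation: it enumerates every combination by brute force up to a hypothetical `max_len`, whereas real Apriori implementations prune candidates level by level. The function name `mine_rules` and the `baskets` data are made up for the example.

```python
from itertools import combinations


def mine_rules(transactions, min_support, min_confidence, max_len=3):
    """Brute-force sketch of the step-by-step process described above."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    items = sorted(set().union(*transactions))

    def supp(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Step 2: keep every itemset whose support reaches the minimum support.
    frequent = {}
    for k in range(1, max_len + 1):
        for combo in combinations(items, k):
            s = supp(frozenset(combo))
            if s >= min_support:
                frequent[frozenset(combo)] = s

    # Step 3: split each frequent itemset into "lhs -> rhs" rules and keep
    # those whose confidence reaches the minimum confidence.
    rules = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                rhs = itemset - lhs
                conf = s / frequent[lhs]
                if conf >= min_confidence:
                    rules.append((set(lhs), set(rhs), s, conf, conf / frequent[rhs]))

    # Step 4: sort the rules by decreasing lift.
    return sorted(rules, key=lambda rule: rule[4], reverse=True)


# Hypothetical baskets from a small takeaway place.
baskets = [
    {"burger", "fries", "ketchup"},
    {"burger", "fries"},
    {"burger", "fries", "ketchup"},
    {"vegetables", "fruit"},
    {"burger"},
    {"vegetables", "fruit", "fries"},
]

# Step 1: the minimums (arbitrary values chosen for this toy data).
for lhs, rhs, s, conf, lft in mine_rules(baskets, min_support=0.3, min_confidence=0.6)[:5]:
    print(f"{sorted(lhs)} -> {sorted(rhs)}: support={s:.2f}, confidence={conf:.2f}, lift={lft:.2f}")
```

Raising either minimum shrinks the list of rules quickly, which is exactly why those limits are set before anything else.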
316 00:16:27,866 --> 00:16:31,433 And, you know, sometimes that makes sense, right? 317 00:16:31,433 --> 00:16:32,366 Because, you know, 318 00:16:32,366 --> 00:16:36,533 a lot of people like to eat ketchup with their burgers and French fries. 319 00:16:36,533 --> 00:16:39,233 So basically you find the ones with the highest lift, 320 00:16:39,233 --> 00:16:42,433 and those are the ones in your top ten or top five, and those are the ones 321 00:16:42,433 --> 00:16:46,033 that you consider for actually implementing a business decision 322 00:16:47,000 --> 00:16:48,666 and basing it on them. 323 00:16:48,666 --> 00:16:51,666 So that's pretty much how the Apriori algorithm works. 324 00:16:51,800 --> 00:16:54,100 It was quite a long story. 325 00:16:54,100 --> 00:16:56,700 Well, I thought we had some good fun here. 326 00:16:56,700 --> 00:16:59,866 There's another example that I wanted to share with you. 327 00:17:00,300 --> 00:17:03,300 Oh, okay. 328 00:17:03,300 --> 00:17:07,433 So I just wanted to mention that recommender systems, like the things that 329 00:17:07,466 --> 00:17:10,833 companies like Amazon use, and others, Netflix and so on, 330 00:17:11,566 --> 00:17:15,300 would be a good 331 00:17:15,700 --> 00:17:19,166 example for using Apriori. 332 00:17:19,266 --> 00:17:21,000 Apriori would be good there. 333 00:17:21,000 --> 00:17:23,500 But of course, they are much more sophisticated. 334 00:17:23,500 --> 00:17:27,833 They're not just Apriori; they actually use combinations, or very 335 00:17:28,966 --> 00:17:31,966 specific, specifically designed algorithms. So 336 00:17:33,033 --> 00:17:35,866 I just don't want you to be confused and think 337 00:17:35,866 --> 00:17:37,433 that everything uses Apriori.
338 00:17:37,433 --> 00:17:40,433 Apriori is just a basic, kind of 339 00:17:41,000 --> 00:17:43,366 straightforward approach to solving this problem. 340 00:17:43,366 --> 00:17:47,133 And it's a good example of, you know, how it can be done. 341 00:17:47,133 --> 00:17:49,933 But of course, there are other ways of doing it. 342 00:17:49,933 --> 00:17:52,600 And for instance, you know, 343 00:17:52,600 --> 00:17:54,700 we'll look at some of the methods, and in fact, 344 00:17:54,700 --> 00:17:57,700 some of the methods that we already use can be used to build 345 00:17:57,900 --> 00:17:59,433 recommender systems as well. 346 00:17:59,433 --> 00:17:59,733 All right. 347 00:17:59,733 --> 00:18:02,433 So on that note, thank you for your attention. 348 00:18:02,433 --> 00:18:06,333 And off we go to have a look at how 349 00:18:06,333 --> 00:18:10,133 we can code Apriori in R and Python. 350 00:18:10,133 --> 00:18:11,466 And I'll see you here next time. 351 00:18:11,466 --> 00:18:13,133 Until then, happy analyzing.