1
00:00:00,866 --> 00:00:02,966
Hello and welcome back to the course
on Machine Learning.

2
00:00:02,966 --> 00:00:06,600
Today we're talking about the intuition
behind the Apriori algorithm.

3
00:00:06,900 --> 00:00:08,733
So let's get started.

4
00:00:08,733 --> 00:00:11,300
And we're going to get started
by talking about a story.

5
00:00:11,300 --> 00:00:15,266
It's somewhat of a legend of data science,

6
00:00:15,266 --> 00:00:18,900
or a legend that is quite
well known in data science.

7
00:00:18,900 --> 00:00:21,833
And you may have heard of this legend.

8
00:00:21,833 --> 00:00:23,866
It's not a myth. It actually happened.

9
00:00:23,866 --> 00:00:28,000
But, as you know, when things
happened a long time ago

10
00:00:28,200 --> 00:00:32,233
and then time passes,
the facts get distorted.

11
00:00:32,233 --> 00:00:34,800
But I'll tell you my story of this legend.

12
00:00:34,800 --> 00:00:37,900
And it might not be exactly correct,
but this is how I

13
00:00:38,533 --> 00:00:41,533
know it and how I've heard about it.

14
00:00:41,700 --> 00:00:45,933
So, what do you think the commonality
is between these two products?

15
00:00:46,433 --> 00:00:50,000
Pampers or diapers and beer?

16
00:00:50,166 --> 00:00:53,700
What do you think they have in common,
and why are they

17
00:00:53,700 --> 00:00:56,733
part of this urban legend?

18
00:00:56,866 --> 00:00:59,166
Why are they part of this data science
lesson?

19
00:00:59,166 --> 00:01:01,400
Well, as the story goes,

20
00:01:02,533 --> 00:01:05,400
a company,
we're not going to name the company,

21
00:01:05,400 --> 00:01:10,866
but a company that is actually
like a convenience store,

22
00:01:11,300 --> 00:01:15,133
did some analytics around the

23
00:01:15,900 --> 00:01:18,166
products that people are purchasing.

24
00:01:18,166 --> 00:01:20,800
And so they were looking at,

25
00:01:20,800 --> 00:01:23,833
you know, what people are checking out
with, what are the commonalities?

26
00:01:23,833 --> 00:01:27,433
And they analyzed thousands and thousands
and thousands of transactions.

27
00:01:27,433 --> 00:01:31,800
So thousands of people who actually
checked out, if not tens of thousands. And

28
00:01:32,233 --> 00:01:35,233
they found a very interesting thing: that

29
00:01:35,766 --> 00:01:39,033
very often
during certain times of the day

30
00:01:39,033 --> 00:01:44,433
when people shop in the afternoon,
between 6 and 9 p.m.,

31
00:01:44,966 --> 00:01:49,933
people who buy diapers also buy beer.

32
00:01:50,500 --> 00:01:53,500
And it was like, out of the blue,
completely out of the blue.

33
00:01:53,766 --> 00:01:56,966
Like, how? Why? These two products
are completely unconnected.

34
00:01:56,966 --> 00:01:57,666
Right.

35
00:01:57,666 --> 00:02:00,933
Why would somebody buy beer
when they're buying diapers?

36
00:02:00,933 --> 00:02:03,533
Or why buy diapers
when they're buying beer? Right.

37
00:02:03,533 --> 00:02:07,566
So, that was the fact that
they came across in the data

38
00:02:08,533 --> 00:02:10,733
And the explanation for this

39
00:02:10,733 --> 00:02:13,733
fact,
or one of the plausible explanations, is that

40
00:02:14,700 --> 00:02:17,566
in the afternoons or in the evenings, when

41
00:02:17,566 --> 00:02:22,666
the husband gets home
and, you know,

42
00:02:22,833 --> 00:02:25,833
the husband and the wife
are taking care of their baby,

43
00:02:26,533 --> 00:02:28,800
they sometimes find that
they run out of diapers.

44
00:02:28,800 --> 00:02:30,933
And who has to go pick up the diapers?

45
00:02:30,933 --> 00:02:33,566
Well, the husband has to go
pick up the diapers, right?

46
00:02:33,566 --> 00:02:35,866
Or the wife sends the husband
to go pick up the diapers.

47
00:02:35,866 --> 00:02:37,733
And while he's picking up the diapers,

48
00:02:37,733 --> 00:02:40,233
because it's really after hours
after work,

49
00:02:40,233 --> 00:02:42,166
and he's already
in the convenience store,

50
00:02:42,166 --> 00:02:44,333
he also picks up some beer. Right?

51
00:02:44,333 --> 00:02:46,800
And so that is a plausible explanation.

52
00:02:46,800 --> 00:02:49,800
Might be the case, might not be the case,
but it sounds pretty reasonable.

53
00:02:50,233 --> 00:02:51,900
And based on that...

54
00:02:51,900 --> 00:02:54,900
So that's something that you can't really
think of just by yourself.

55
00:02:54,900 --> 00:02:56,366
But that comes from the data. Right.

56
00:02:56,366 --> 00:03:01,100
And based on that you can decide
how to arrange products in your store.

57
00:03:01,100 --> 00:03:01,300
Right.

58
00:03:01,300 --> 00:03:03,800
So some stores might decide
to put these two products

59
00:03:03,800 --> 00:03:07,533
closer to entice people to buy a beer
when they're buying diapers.

60
00:03:07,533 --> 00:03:08,866
But actually, a lot of stores

61
00:03:09,833 --> 00:03:10,833
do the opposite.

62
00:03:10,833 --> 00:03:16,466
A lot of stores
separate beer and diapers, right?

63
00:03:16,466 --> 00:03:19,866
Just like they try to separate
and you'll probably notice

64
00:03:19,866 --> 00:03:23,100
this from your convenience store
that they try to separate,

65
00:03:23,733 --> 00:03:26,300
bread and milk as far as possible. Why?

66
00:03:26,300 --> 00:03:28,066
Because that way.

67
00:03:28,066 --> 00:03:31,066
Yeah, they already know that
these two products are bought together.

68
00:03:31,200 --> 00:03:36,300
And so you actually have to walk
through the whole store to pick up,

69
00:03:36,533 --> 00:03:38,866
you know, you've picked up your bread
and then to get to the milk,

70
00:03:38,866 --> 00:03:42,300
you have to get all the way through
the whole store to the completely opposite

71
00:03:42,600 --> 00:03:44,100
corner of the store.

72
00:03:44,100 --> 00:03:47,566
So as you're walking through the store,
you see more products

73
00:03:47,566 --> 00:03:49,533
and you're more likely to pick up

74
00:03:49,533 --> 00:03:52,666
an additional item that you weren't
actually planning on buying

75
00:03:52,666 --> 00:03:54,100
when you got to the store
in the first place.

76
00:03:54,100 --> 00:03:57,700
So there's a lot of interesting marketing
tactics that are used based on this data.

77
00:03:57,700 --> 00:04:00,133
But the question is,
how do you get to this data?

78
00:04:00,133 --> 00:04:03,300
And one of the ways to get to it
is the Apriori algorithm.

79
00:04:04,000 --> 00:04:07,000
So let's talk about Apriori
in a bit more detail now.

80
00:04:07,733 --> 00:04:08,133
All right.

81
00:04:08,133 --> 00:04:13,200
So Apriori is about: people who bought
something also bought something else,

82
00:04:13,200 --> 00:04:16,000
or who watched something
also watched something else,

83
00:04:16,000 --> 00:04:17,966
or who did something
also did something else.

84
00:04:17,966 --> 00:04:21,766
So it analyzes that,
and this whole association

85
00:04:22,266 --> 00:04:26,300
rule learning part of the course
is all about analyzing when things

86
00:04:26,933 --> 00:04:31,166
come in pairs or in triplets
or in larger sets, not in sequence,

87
00:04:31,166 --> 00:04:34,433
but they are combined together
for some reason,

88
00:04:35,100 --> 00:04:39,900
looking for those rules
and those ways that this happens.

89
00:04:40,866 --> 00:04:41,166
All right.

90
00:04:41,166 --> 00:04:42,400
So let's have a look.

91
00:04:42,400 --> 00:04:44,966
For instance, movie
recommendation, right?

92
00:04:44,966 --> 00:04:48,733
So you've got user IDs,
you've got movies that the people liked.

93
00:04:49,133 --> 00:04:53,100
Movie one, two, three, four, movie one
and two for the second person, and so on.

94
00:04:53,600 --> 00:04:57,833
And from here, just by looking at it,
even without knowing anything about

95
00:04:58,333 --> 00:05:01,800
association rule learning or Apriori,
or the Apriori algorithm,

96
00:05:01,966 --> 00:05:05,733
you can really tell that,
there are some potential rules

97
00:05:05,733 --> 00:05:08,800
that can come out of this that,
for instance, everybody who watches movie

98
00:05:08,800 --> 00:05:13,066
one, not everybody, but it is likely
that people who watch movie one will,

99
00:05:13,066 --> 00:05:16,533
or who like movie
one will also like movie number two.

100
00:05:17,000 --> 00:05:17,900
And people who like

101
00:05:17,900 --> 00:05:21,933
movie number two are quite likely
to also like movie number four.

102
00:05:22,366 --> 00:05:26,133
And people who like movie number one
are also quite likely to like movie

103
00:05:26,133 --> 00:05:27,166
number three.

104
00:05:27,166 --> 00:05:28,700
So there you can

105
00:05:28,700 --> 00:05:30,100
you can come up with lots of different

106
00:05:30,100 --> 00:05:32,000
potential rules,
but some are going to be stronger,

107
00:05:32,000 --> 00:05:35,000
some are going to be weaker,
and we want to find the very strong ones

108
00:05:35,333 --> 00:05:39,600
in order to build our business decisions
or our other decisions,

109
00:05:40,166 --> 00:05:43,200
on those rules
that we can see in the data.

110
00:05:43,200 --> 00:05:43,400
Right.

111
00:05:43,400 --> 00:05:47,933
We don't have to go and ask people, hey,
do you like movie number one?

112
00:05:47,933 --> 00:05:49,933
And would you like movie number two
because of that?

113
00:05:49,933 --> 00:05:52,500
Do you like movie number two
or what is your taste and preference?

114
00:05:52,500 --> 00:05:55,866
We can see these things from the data
and we want to extract this information.

115
00:05:55,866 --> 00:05:59,166
And as long as, you know,
we have a large enough sample size,

116
00:05:59,233 --> 00:06:03,533
so if it's not just, like, five people,
if it's 50,000 or 500,000 people

117
00:06:03,533 --> 00:06:07,466
that we're analyzing, we can come up with
some quite solid rules.

118
00:06:08,633 --> 00:06:08,966
All right.

119
00:06:08,966 --> 00:06:11,700
So, here's

120
00:06:11,700 --> 00:06:14,600
another example
where we've got a market basket.

121
00:06:14,600 --> 00:06:20,200
So, an example of people who buy
groceries, or not just groceries,

122
00:06:20,200 --> 00:06:24,733
but this is kind of like a small restaurant
or a takeaway place.

123
00:06:25,000 --> 00:06:28,966
And here you can see there's a link,
obviously, between burgers and French fries,

124
00:06:29,166 --> 00:06:31,333
interestingly, vegetables and fruits,
with people

125
00:06:31,333 --> 00:06:33,300
trying to be healthy; and burgers,
French fries and ketchup.

126
00:06:33,300 --> 00:06:35,900
So again, these are potential rules,
not necessarily

127
00:06:35,900 --> 00:06:37,366
the ones that we're going
to take away from data.

128
00:06:37,366 --> 00:06:40,366
This is just an example of something
that you might observe,

129
00:06:40,566 --> 00:06:43,566
visually
just by looking at this data set.

130
00:06:43,900 --> 00:06:44,233
All right.

131
00:06:44,233 --> 00:06:47,100
So how does the Apriori algorithm work?

132
00:06:47,100 --> 00:06:49,866
Well, the Apriori algorithm
has three parts to it.

133
00:06:49,866 --> 00:06:53,100
It has got the support,
the confidence and the lift.

134
00:06:53,533 --> 00:06:55,266
So we're going to start off
with the support.

135
00:06:55,266 --> 00:06:58,700
And you will see that it's
very similar to

136
00:06:58,700 --> 00:06:59,900
something we've already discussed.

137
00:06:59,900 --> 00:07:04,800
It's very similar to the way we talked
about the intuition for the Bayesian,

138
00:07:05,033 --> 00:07:09,100
for the Naive Bayes classifier.

139
00:07:09,400 --> 00:07:10,633
So let's have a look here.

140
00:07:10,633 --> 00:07:15,033
For movie
recommendations, the support for a movie M

141
00:07:16,300 --> 00:07:19,066
is defined as the number of users

142
00:07:19,066 --> 00:07:23,600
who watched movie
M divided by the total number of users.

143
00:07:23,600 --> 00:07:24,400
Right.

144
00:07:24,400 --> 00:07:26,700
And for market basket
optimization, same thing:

145
00:07:26,700 --> 00:07:29,400
the number of transactions containing

146
00:07:29,400 --> 00:07:32,566
an item I divided by the total number
of transactions.

147
00:07:32,900 --> 00:07:35,233
Let's have a look at an illustration
here.

148
00:07:35,233 --> 00:07:39,133
We've got 100 people,
so we've got five rows

149
00:07:39,133 --> 00:07:42,266
and 20 columns of human beings.

150
00:07:42,266 --> 00:07:45,266
and,

151
00:07:45,466 --> 00:07:47,633
let's see how many of them,

152
00:07:47,633 --> 00:07:50,433
let's say we're talking about a movie

153
00:07:50,433 --> 00:07:54,600
and I'm going to give an example
of one of my favorite movies, Ex Machina.

154
00:07:54,600 --> 00:07:56,933
And if you haven't seen it, definitely
check it out.

155
00:07:56,933 --> 00:07:59,000
It's all about AI and machine learning.

156
00:07:59,000 --> 00:08:03,700
So let's see how many of these
people have actually seen Ex Machina.

157
00:08:04,133 --> 00:08:05,000
So there we go.

158
00:08:05,000 --> 00:08:10,466
There are ten people
who have seen Ex Machina, right, out of 100.

159
00:08:10,600 --> 00:08:11,600
So what does that mean?

160
00:08:11,600 --> 00:08:14,866
That means our support here is 10%.
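
The support calculation can be sketched in a few lines of Python. This is only an illustration: the `users` list and the `support` helper are hypothetical names, and the data is made up to match the picture (10 of 100 people have seen Ex Machina).

```python
# Hypothetical data: 100 users, each represented by the set of movies they watched.
# The first 10 users have seen Ex Machina, matching the illustration.
users = [{"Ex Machina"} if i < 10 else set() for i in range(100)]


def support(item, baskets):
    """Fraction of baskets (users or transactions) that contain `item`."""
    return sum(item in basket for basket in baskets) / len(baskets)


print(support("Ex Machina", users))  # 0.1
```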

161
00:08:15,000 --> 00:08:15,766
Okay.

162
00:08:15,766 --> 00:08:17,566
Now let's move on to step two.

163
00:08:17,566 --> 00:08:20,400
Step two is
we need to find the confidence.

164
00:08:20,400 --> 00:08:21,300
What is the confidence?

165
00:08:21,300 --> 00:08:24,966
Well, confidence
is defined as the number...

166
00:08:25,000 --> 00:08:25,966
Let's go for movies.

167
00:08:25,966 --> 00:08:30,000
So the number of
people who have seen both movies

168
00:08:30,000 --> 00:08:33,266
M1 and M2 divided by
the number of people who have seen movie M1.

169
00:08:33,266 --> 00:08:36,800
So here we're going to assume
that we're testing a rule.

170
00:08:36,800 --> 00:08:41,366
We're testing a rule that, let's say,
people who have seen Interstellar, right?

171
00:08:41,366 --> 00:08:46,033
We have a hypothesis that says
that people who have seen Interstellar,

172
00:08:46,166 --> 00:08:49,966
or rather who have liked Interstellar,

173
00:08:50,500 --> 00:08:54,833
are also likely to like Ex Machina.

174
00:08:54,833 --> 00:08:56,000
Or let's even go with this:

175
00:08:56,000 --> 00:08:59,066
people who have seen
Interstellar are also likely

176
00:08:59,066 --> 00:09:02,066
to have seen Ex Machina.

177
00:09:02,100 --> 00:09:06,266
So basically here
movie number one, M1 is going to be

178
00:09:06,266 --> 00:09:10,433
the Interstellar movie,

179
00:09:11,200 --> 00:09:14,233
the one that we're saying, okay,
so we're going to take

180
00:09:14,233 --> 00:09:16,033
everybody who's seen Interstellar
and we're going to check

181
00:09:16,033 --> 00:09:17,633
how many of them have seen Ex Machina.

182
00:09:17,633 --> 00:09:19,900
And that's exactly what we're doing here.

183
00:09:19,900 --> 00:09:21,900
And for market basket
optimization, same thing.

184
00:09:21,900 --> 00:09:24,733
You can think of an example of French
fries and burgers, for instance.

185
00:09:24,733 --> 00:09:26,100
People who have had burgers,

186
00:09:26,100 --> 00:09:29,000
who've ordered burgers, are
also likely to order French fries.

187
00:09:29,000 --> 00:09:32,833
So, at the top you would have people
who have ordered burgers and French fries.

188
00:09:33,066 --> 00:09:36,400
And at the bottom you have people
who have ordered burgers only,

189
00:09:37,066 --> 00:09:39,600
who have ordered burgers,
regardless of whether they've ordered

190
00:09:39,600 --> 00:09:40,900
French fries or not.

191
00:09:40,900 --> 00:09:43,900
It's much easier to talk about this
with an illustration.

192
00:09:44,200 --> 00:09:48,366
Let's say
those people colored in green

193
00:09:48,900 --> 00:09:51,400
are the ones who have seen Interstellar,
right?

194
00:09:51,400 --> 00:09:54,333
Who have watched this movie.

195
00:09:54,333 --> 00:09:57,166
Now we want to know,
not out of the whole population,

196
00:09:57,166 --> 00:10:00,166
but out of just
those people who have seen Interstellar:

197
00:10:00,533 --> 00:10:02,933
How many of them have seen Ex Machina?

198
00:10:02,933 --> 00:10:07,166
So out of them, we have seven people
who have also seen Ex Machina.

199
00:10:07,166 --> 00:10:09,733
So there are only seven people
who have seen both movies.

200
00:10:09,733 --> 00:10:11,566
That's what we're after.

201
00:10:11,566 --> 00:10:16,333
And so our confidence is going to be seven
divided by 40, just by definition.

202
00:10:16,333 --> 00:10:18,133
This is how it's calculated.

203
00:10:18,133 --> 00:10:20,666
40 people have seen Interstellar

204
00:10:20,666 --> 00:10:24,833
and seven people out of those 40
have actually also seen Ex Machina.

205
00:10:24,833 --> 00:10:28,066
So the confidence here is 17.5%.
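
The confidence of the rule "Interstellar, then Ex Machina" can be sketched like this. The `users` list and the `confidence` helper are hypothetical; the counts are invented to match the illustration (40 people saw Interstellar, 7 of them also saw Ex Machina, 10 saw Ex Machina overall).

```python
# Hypothetical population of 100 users matching the illustration.
users = (
    [{"Interstellar", "Ex Machina"}] * 7   # saw both movies
    + [{"Interstellar"}] * 33              # saw Interstellar only
    + [{"Ex Machina"}] * 3                 # saw Ex Machina only
    + [set()] * 57                         # saw neither
)


def confidence(antecedent, consequent, baskets):
    """Among baskets containing `antecedent`, the fraction that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    both = sum(consequent in b for b in with_antecedent)
    return both / len(with_antecedent)


print(confidence("Interstellar", "Ex Machina", users))  # 0.175
```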

206
00:10:28,966 --> 00:10:29,733
Good.

207
00:10:29,733 --> 00:10:33,066
And the next part
or the third and last step is the lift.

208
00:10:33,066 --> 00:10:34,033
And what is the lift?

209
00:10:34,033 --> 00:10:35,066
Lift is very simple.

210
00:10:35,066 --> 00:10:41,400
Again, it's going to be very similar to
what we had in the Naive Bayes

211
00:10:41,400 --> 00:10:45,866
classifier, in that algorithm
when we were discussing it.

212
00:10:46,200 --> 00:10:50,566
So the lift is basically
the confidence divided by the support.

213
00:10:51,500 --> 00:10:55,200
so what we calculated in step two divided
by what we calculated in step one.

214
00:10:55,366 --> 00:10:57,833
And let's just talk about it
in the illustration,

215
00:10:57,833 --> 00:11:00,300
because it's going to make way more sense
that way.

216
00:11:00,300 --> 00:11:03,166
so here's our population.

217
00:11:03,166 --> 00:11:06,166
Those people in green are
the ones who have seen Interstellar.

218
00:11:06,600 --> 00:11:10,233
And all of these people in red
are the ones who have seen Ex Machina.

219
00:11:10,233 --> 00:11:13,366
So basically our lift is all right.

220
00:11:13,366 --> 00:11:17,366
So if we just randomly, right,
randomly suggest

221
00:11:17,366 --> 00:11:20,433
to a person to watch Ex Machina, right.

222
00:11:20,800 --> 00:11:26,266
what is the likelihood that they will,
you know, that it's a movie for them.

223
00:11:26,266 --> 00:11:28,766
It's a movie
that's not in this population.

224
00:11:28,766 --> 00:11:30,366
Like out of the out of this population.

225
00:11:30,366 --> 00:11:33,400
We know that out of 100 people,
only ten actually watched

226
00:11:33,433 --> 00:11:34,200
Ex Machina.

227
00:11:34,200 --> 00:11:37,733
And we're going to assume that watched
and like are interchangeable terms here.

228
00:11:37,733 --> 00:11:40,300
So we're going to assume
that if they if they didn't watch it,

229
00:11:40,300 --> 00:11:41,666
they're not going to like it anyway.

230
00:11:41,666 --> 00:11:44,666
So if we take another random

231
00:11:44,800 --> 00:11:50,500
population like this one, then
what is the likelihood

232
00:11:50,533 --> 00:11:51,700
that if we recommend

233
00:11:51,700 --> 00:11:55,200
to a random person in that population,
that brand new population,

234
00:11:55,833 --> 00:11:59,033
we recommend the Ex Machina movie,
what is the likelihood

235
00:11:59,033 --> 00:12:00,300
that they will like it?

236
00:12:00,300 --> 00:12:03,866
Well, the likelihood is 10%, right?

237
00:12:03,866 --> 00:12:07,733
Because out of 100 people, only
ten of them actually liked that movie.

238
00:12:08,100 --> 00:12:13,933
But now the question is, can we improve
that result by using some prior knowledge?

239
00:12:13,933 --> 00:12:15,733
That's
why the algorithm is called Apriori.

240
00:12:17,566 --> 00:12:19,533
In that new population, let's

241
00:12:19,533 --> 00:12:24,133
only recommend Ex Machina to people
who have already seen Interstellar.

242
00:12:24,133 --> 00:12:27,133
So people who are marked
as green in this population.

243
00:12:27,166 --> 00:12:29,133
So we will only find out.

244
00:12:29,133 --> 00:12:30,700
We will only ask,
have you seen Interstellar?

245
00:12:30,700 --> 00:12:32,700
If they have,
then we'll recommend Ex Machina.

246
00:12:32,700 --> 00:12:36,833
What is the likelihood that a person
will actually like Ex Machina

247
00:12:37,400 --> 00:12:38,500
if we recommend them that way?

248
00:12:38,500 --> 00:12:42,466
Well, in that case, as
we've calculated, out of the green people

249
00:12:42,466 --> 00:12:49,200
only, 17.5%
actually liked Ex Machina.

250
00:12:49,533 --> 00:12:55,133
So the lift is the improvement
in your prediction.

251
00:12:55,133 --> 00:12:58,133
So your original prediction,
your original prediction, is 10%.

252
00:12:58,200 --> 00:12:58,433
Right.

253
00:12:58,433 --> 00:13:01,333
If you just randomly take a person
out of your new population

254
00:13:01,333 --> 00:13:04,466
and recommend them Ex Machina,
they'll like it with a likelihood of 10%.

255
00:13:04,900 --> 00:13:09,400
If you first ask the question,
have you seen and liked Interstellar?

256
00:13:09,866 --> 00:13:13,766
If they say yes and then you recommend Ex
Machina, the likelihood

257
00:13:13,766 --> 00:13:16,966
of a successful recommendation
there is 17.5%.

258
00:13:17,133 --> 00:13:20,600
So the lift is by definition 1.75.
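
A quick check of that arithmetic, using only the counts from the illustration. The variable names are hypothetical. Note that the support in the denominator is the support of the recommended movie (Ex Machina), not of Interstellar.

```python
# Counts taken from the illustration in the lecture.
total_users = 100
saw_ex_machina = 10      # red people
saw_interstellar = 40    # green people
saw_both = 7

support_ex_machina = saw_ex_machina / total_users   # baseline: 10%
confidence_rule = saw_both / saw_interstellar       # Interstellar -> Ex Machina: 17.5%

# Lift: how much better the rule does than a random recommendation.
lift = confidence_rule / support_ex_machina
print(round(lift, 2))  # 1.75
```

A lift above 1 means that asking about Interstellar first improves on the random recommendation; a lift of about 1 would mean the rule adds nothing.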

259
00:13:21,466 --> 00:13:21,900
There we go.

260
00:13:21,900 --> 00:13:25,500
That is what the lift is defined as,

261
00:13:26,366 --> 00:13:29,933
and that's pretty much the whole
Apriori algorithm.

262
00:13:29,933 --> 00:13:31,500
Those are the steps that it involves.

263
00:13:31,500 --> 00:13:34,000
And now we're
just going to put it all together,

264
00:13:34,000 --> 00:13:39,033
in this one
kind of step-by-step process.

265
00:13:39,033 --> 00:13:43,566
So step one, you need to set up a minimum
support and confidence.

266
00:13:43,566 --> 00:13:43,766
Right?

267
00:13:43,766 --> 00:13:48,633
You need this because
there are so many different recommendations.

268
00:13:48,633 --> 00:13:50,866
Right. We only looked at, one example.

269
00:13:50,866 --> 00:13:53,566
Well, one specific example
to simplify things: we talked about

270
00:13:53,566 --> 00:13:56,700
Ex Machina and Interstellar.

271
00:13:56,700 --> 00:13:59,966
But as you can see in the examples
before that you could have like

272
00:13:59,966 --> 00:14:04,433
100 different movies and many different
combinations. Apriori

273
00:14:04,633 --> 00:14:07,633
is actually quite a slow algorithm
because it just goes through

274
00:14:07,933 --> 00:14:11,333
all of these different
combinations.

275
00:14:11,333 --> 00:14:14,500
So it says, what
if movie one is a good recommendation

276
00:14:14,500 --> 00:14:18,000
for movie two or movie
one means a person will like movie two.

277
00:14:18,000 --> 00:14:21,533
Movie one means a person will like movie
three, and movie one, movie four.

278
00:14:21,533 --> 00:14:23,000
And then it actually combines more.

279
00:14:23,000 --> 00:14:26,800
It says movie one and movie two might mean
that a person will like movie three

280
00:14:27,000 --> 00:14:27,400
and so on.

281
00:14:27,400 --> 00:14:31,200
And so it actually combines lots and lots
of items, not just pairs, not just triplets.

282
00:14:31,933 --> 00:14:34,633
It combines four, five,

283
00:14:34,633 --> 00:14:37,733
six, seven items in one set
and so on.

284
00:14:38,566 --> 00:14:41,666
And yeah, so it gets quite big.
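
To get a feel for how big, here is a small count of candidate item sets, assuming the hypothetical catalogue of 100 movies mentioned above. It only counts combinations; it is not part of the algorithm itself.

```python
from math import comb

n_movies = 100  # hypothetical catalogue size from the example above

pairs = comb(n_movies, 2)         # item sets of size 2
triplets = comb(n_movies, 3)      # item sets of size 3
all_itemsets = 2 ** n_movies - 1  # every non-empty combination of movies

print(pairs)     # 4950
print(triplets)  # 161700
print(all_itemsets)
```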

285
00:14:41,666 --> 00:14:44,666
And therefore
you need to set some kind of limitations.

286
00:14:45,033 --> 00:14:46,866
so you need to set a minimum support.

287
00:14:46,866 --> 00:14:50,400
For instance,
you might not want to look at products

288
00:14:50,400 --> 00:14:55,266
that
have a support of less than 20%.

289
00:14:55,266 --> 00:14:59,400
You might not even want to consider them
because you don't want to waste your time,

290
00:14:59,900 --> 00:15:03,733
building a model
for something that only has

291
00:15:03,733 --> 00:15:06,733
a success rate of 20% on its own.

292
00:15:06,733 --> 00:15:09,733
Right? Or you might set the limit at 5%.

293
00:15:10,133 --> 00:15:13,433
Then you
might also want to limit the confidence.

294
00:15:13,433 --> 00:15:17,533
So in our example
the confidence was 17.5%.

295
00:15:17,533 --> 00:15:18,133
Right.

296
00:15:18,133 --> 00:15:22,466
That is, somebody who liked
one movie will like the other one.

297
00:15:22,600 --> 00:15:28,500
Maybe you might want to limit it to,
you know, anything less than 12%,

298
00:15:28,500 --> 00:15:31,500
you don't want to look at it
because it's not a strong enough

299
00:15:31,966 --> 00:15:35,866
factor for you, it's not a strong enough
rule for you, because there are going

300
00:15:35,866 --> 00:15:39,200
to be so many different rules
on the output of this algorithm.

301
00:15:40,100 --> 00:15:42,266
you already know that
you'll have much stronger ones.

302
00:15:42,266 --> 00:15:46,100
So you don't want to consider anything
that's less than 12% or 20% or,

303
00:15:46,633 --> 00:15:49,900
whatever percentage you decide
to set for you in that specific scenario.

304
00:15:50,533 --> 00:15:53,900
Then, once you've set those,
you take all the subsets in

305
00:15:54,000 --> 00:15:57,266
transactions
having higher support than

306
00:15:57,266 --> 00:15:59,766
the minimum support. Then you
take all the rules of these subsets

307
00:15:59,766 --> 00:16:01,500
having higher confidence
than the minimum confidence.

308
00:16:01,500 --> 00:16:03,900
Basically apply
those two minimums that you've set.

309
00:16:03,900 --> 00:16:08,433
And then at the end, of course,
you sort the rules by the decreasing lift.

310
00:16:08,433 --> 00:16:10,033
So that's where the lift comes in.

311
00:16:10,033 --> 00:16:12,333
The rule with the highest lift

312
00:16:12,333 --> 00:16:16,466
given these criteria is going to be
the strongest rule.
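
The whole procedure can be sketched as follows. This is a simplified brute-force illustration restricted to rules between single items; it is not the real Apriori implementation, which also handles larger item sets and prunes candidates. The function name `mine_pair_rules`, the thresholds, and the baskets are all hypothetical.

```python
from itertools import permutations


def mine_pair_rules(baskets, min_support, min_confidence):
    """Return rules (a, b, support, confidence, lift) for 'a -> b', sorted by decreasing lift."""
    n = len(baskets)
    items = sorted(set().union(*baskets))

    def supp(*wanted):
        # Fraction of baskets that contain every item in `wanted`.
        return sum(all(i in basket for i in wanted) for basket in baskets) / n

    rules = []
    for a, b in permutations(items, 2):
        support_ab = supp(a, b)
        if support_ab < min_support:        # step 1: apply the minimum support
            continue
        conf = support_ab / supp(a)
        if conf < min_confidence:           # step 2: apply the minimum confidence
            continue
        lift = conf / supp(b)
        rules.append((a, b, support_ab, conf, lift))

    # Step 3: sort the surviving rules by decreasing lift.
    return sorted(rules, key=lambda rule: rule[4], reverse=True)


# Made-up transactions, loosely based on the takeaway example.
baskets = [
    {"burger", "fries", "ketchup"},
    {"burger", "fries"},
    {"burger", "fries", "ketchup"},
    {"vegetables", "fruit"},
    {"vegetables", "fruit", "burger"},
    {"fries"},
]

for a, b, s, c, l in mine_pair_rules(baskets, min_support=0.3, min_confidence=0.6):
    print(f"{a} -> {b}: support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```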

313
00:16:16,500 --> 00:16:19,300
And that's the one
you might want to look into first. Right.

314
00:16:19,300 --> 00:16:23,233
Something like I don't know if a person
buys a burger and French fries,

315
00:16:23,233 --> 00:16:27,133
then they're likely to buy, tomato sauce
or ketchup as well.

316
00:16:27,866 --> 00:16:31,433
And, you know,
sometimes that makes sense, right?

317
00:16:31,433 --> 00:16:32,366
Because

318
00:16:32,366 --> 00:16:36,533
a lot of people like to eat ketchup
with their burgers and French fries.

319
00:16:36,533 --> 00:16:39,233
So basically you find the ones
with the highest lift,

320
00:16:39,233 --> 00:16:42,433
and those are the ones in your top
ten or top five, and those are the ones

321
00:16:42,433 --> 00:16:46,033
that you consider for
actually implementing a business decision.

322
00:16:47,000 --> 00:16:48,666
and basing it on them.

323
00:16:48,666 --> 00:16:51,666
So that's pretty much
how the Apriori algorithm works.

324
00:16:51,800 --> 00:16:54,100
it was quite a long story.

325
00:16:54,100 --> 00:16:56,700
Well,
I thought we had some good fun here.

326
00:16:56,700 --> 00:16:59,866
There's another example
that I wanted to share with you.

327
00:17:00,300 --> 00:17:03,300
Oh, okay.

328
00:17:03,300 --> 00:17:07,433
So I just wanted to mention
that recommender systems, things that

329
00:17:07,466 --> 00:17:10,833
companies like Amazon use,
and others, Netflix and so on,

330
00:17:11,566 --> 00:17:15,300
they would be a
good

331
00:17:15,700 --> 00:17:19,166
example for using Apriori.

332
00:17:19,266 --> 00:17:21,000
Apriori would probably be good there.

333
00:17:21,000 --> 00:17:23,500
But of course,
they are much more sophisticated.

334
00:17:23,500 --> 00:17:27,833
They're not just Apriori;
they actually use combinations of very

335
00:17:28,966 --> 00:17:31,966
specific,
or specifically designed, algorithms. So

336
00:17:33,033 --> 00:17:35,866
I just don't want you to be confused
and think

337
00:17:35,866 --> 00:17:37,433
that everything uses Apriori.

338
00:17:37,433 --> 00:17:40,433
Apriori is just a basic, kind of,

339
00:17:41,000 --> 00:17:43,366
straightforward approach
to solving this problem.

340
00:17:43,366 --> 00:17:47,133
And it's a good example of,
you know, how it can be done.

341
00:17:47,133 --> 00:17:49,933
But of course,
there are other ways of doing it.

342
00:17:49,933 --> 00:17:52,600
And for instance, 
you know, we'll look at the

343
00:17:52,600 --> 00:17:54,700
we'll look at some of the methods
and in fact,

344
00:17:54,700 --> 00:17:57,700
some of the methods that we already use
can be used to build,

345
00:17:57,900 --> 00:17:59,433
recommender systems as well.

346
00:17:59,433 --> 00:17:59,733
All right.

347
00:17:59,733 --> 00:18:02,433
So on that note, 
thank you for your attention.

348
00:18:02,433 --> 00:18:06,333
And off we go to have a look at how

349
00:18:06,333 --> 00:18:10,133
we can code Apriori in R and Python.

350
00:18:10,133 --> 00:18:11,466
And I'll see you here next time.

351
00:18:11,466 --> 00:18:13,133
Until then, happy analyzing.