1
00:00:00,166 --> 00:00:02,166
It beats, by very little,

2
00:00:02,166 --> 00:00:04,800
you know, the accuracy of the kernel SVM.

3
00:00:04,800 --> 00:00:07,800
The kernel SVM made eight incorrect
predictions.

4
00:00:07,800 --> 00:00:08,866
Five plus three.

5
00:00:08,866 --> 00:00:13,600
And the decision tree classification made
four plus three, so seven, incorrect

6
00:00:13,600 --> 00:00:18,300
predictions, resulting
in an accuracy almost equal to 96%.

7
00:00:18,600 --> 00:00:19,800
Well, that's really, really good.

8
00:00:19,800 --> 00:00:22,333
You know, this is the first time
actually that I'm doing this.

9
00:00:22,333 --> 00:00:26,233
You know, trying this breast cancer data
set with you, with all these models.

10
00:00:26,233 --> 00:00:29,133
I tried it once with logistic regression
in another course.

11
00:00:29,133 --> 00:00:32,100
But this is the first time
that I'm implementing all these models

12
00:00:32,100 --> 00:00:35,400
and deploying them to do a model selection
process on this data set.

13
00:00:35,666 --> 00:00:38,900
So, you know, that shows you how
confident I am about these code templates.

14
00:00:38,900 --> 00:00:40,966
I never tried them before on this data set.

15
00:00:40,966 --> 00:00:44,466
And now I'm making this demo with you
for the first time on this data set.

16
00:00:45,000 --> 00:00:46,600
All right. So, hence the excitement.

17
00:00:46,600 --> 00:00:50,433
And now, the final one: I'm very curious
to see if we can still beat it

18
00:00:50,433 --> 00:00:52,000
with random forest classification.

19
00:00:52,000 --> 00:00:57,000
So let's do this: Run All,
and the final accuracy is... whoa.

20
00:00:57,400 --> 00:01:00,000
Oh. All right, okay.
So I'm very disappointed.

21
00:01:00,000 --> 00:01:03,400
The random forest classification
indeed really messed up the teamwork.

22
00:01:03,400 --> 00:01:05,333
And that's another exception to the rule
because,

23
00:01:05,333 --> 00:01:08,333
you know, usually teamwork
is better than individual work.

24
00:01:08,366 --> 00:01:12,500
But no, we had a very powerful decision
tree classification model before

25
00:01:12,600 --> 00:01:15,500
and it didn't need anybody else
to perform well.

26
00:01:15,500 --> 00:01:15,866
All right.

27
00:01:15,866 --> 00:01:16,866
So that's very interesting.

28
00:01:16,866 --> 00:01:18,866
Actually, this is a very surprising result.

29
00:01:18,866 --> 00:01:20,366
But that's why it's really,

30
00:01:20,366 --> 00:01:24,400
really important to try all your models
and have these very efficient code

31
00:01:24,400 --> 00:01:27,400
templates
that allow you to do a quick and efficient

32
00:01:27,400 --> 00:01:30,700
model selection process to quickly
figure out the best model.

33
00:01:31,100 --> 00:01:31,600
All right.

34
00:01:31,600 --> 00:01:33,833
You have to understand
that there is no rule of thumb.

35
00:01:33,833 --> 00:01:37,033
And for some other data
sets with other machine learning problems,

36
00:01:37,200 --> 00:01:40,500
the best model will be another
one of these five models.

37
00:01:40,633 --> 00:01:40,966
All right.

38
00:01:40,966 --> 00:01:45,300
So it was really important for me to
show you this and to give you this. Okay.

39
00:01:45,300 --> 00:01:47,066
So now, there we go, my friends.

40
00:01:47,066 --> 00:01:50,100
We are at the end of part
three, classification.

41
00:01:50,200 --> 00:01:53,400
Big congratulations to you for, well,
completing this part

42
00:01:53,400 --> 00:01:56,300
three and making such good progress
in this course.

43
00:01:56,300 --> 00:01:57,366
Now in the next part

44
00:01:57,366 --> 00:02:01,700
we will start clustering, which will be
our first unsupervised model.

45
00:02:01,800 --> 00:02:04,533
I remind you that the difference
between supervised and unsupervised

46
00:02:04,533 --> 00:02:07,266
is that with supervised learning,
you know what to predict.

47
00:02:07,266 --> 00:02:09,366
You know
which dependent variable to predict.

48
00:02:09,366 --> 00:02:12,066
And with unsupervised learning,
you don't know what to predict,

49
00:02:12,066 --> 00:02:14,533
and you will actually have to find
some patterns in the data

50
00:02:14,533 --> 00:02:18,033
to figure out something
you could predict as a dependent variable,

51
00:02:18,033 --> 00:02:21,900
but that you don't have a priori
and that you create a posteriori.

52
00:02:22,066 --> 00:02:25,500
All right, so no worries,
we will see all this in the next part.

53
00:02:25,700 --> 00:02:28,500
And until then,
I think you deserve a good break now.

54
00:02:28,500 --> 00:02:32,366
So get some good rest, relax a bit,
and as soon as you're back

55
00:02:32,366 --> 00:02:36,566
with some great energy, well,
join us in the next part to tackle clustering,

56
00:02:36,933 --> 00:02:38,600
and until then, enjoy machine learning.