1 00:00:00,880 --> 00:00:06,760 Tree-based models are very popular and are widely used by analysts and data scientists.
2 00:00:08,190 --> 00:00:12,460 A model based on a simple decision tree is very easy to interpret.
3 00:00:13,200 --> 00:00:18,750 And it gives very clear decision points, which can be used in making business decisions.
4 00:00:21,330 --> 00:00:26,920 A simple decision tree does lack a little bit in terms of accuracy.
5 00:00:27,940 --> 00:00:34,450 But there are variants, or advanced techniques based on decision trees, which we will discuss in the
6 00:00:34,450 --> 00:00:40,470 later part of the course, using which we can increase the accuracy really significantly.
7 00:00:42,730 --> 00:00:50,500 So if interpretation is your goal, that is, you are presenting a concept to people who are not very enthusiastic
8 00:00:50,500 --> 00:00:53,830 about numbers and complex mathematical models,
9 00:00:55,510 --> 00:01:00,340 you should use a simple decision tree, which we will learn first in this course.
10 00:01:02,960 --> 00:01:05,180 If prediction accuracy is the goal,
11 00:01:05,570 --> 00:01:09,230 and you can let go of some of the interpretability of the model,
12 00:01:10,810 --> 00:01:15,330 we must use the advanced techniques that we are going to learn in the later part of this course.
13 00:01:19,470 --> 00:01:24,090 OK, so what is a decision tree? In a decision tree,
14 00:01:24,800 --> 00:01:31,680 we are trying to split, or segment, the population into different parts or regions.
15 00:01:32,720 --> 00:01:38,210 And each region has a certain set of characteristics of the predictor variables.
16 00:01:41,440 --> 00:01:42,820 So in this decision tree,
17 00:01:44,300 --> 00:01:46,550 we are finally getting four regions,
18 00:01:47,840 --> 00:01:50,900 which are classifying each person into unfit or fit.
19 00:01:52,430 --> 00:01:59,240 This first region, which is classifying a person into the unfit category, has two characteristics.
20 00:02:00,650 --> 00:02:06,650 A person who belongs to this region has age less than 30 and eats a lot of pizzas.
21 00:02:08,180 --> 00:02:11,300 Similarly, for the second region, age is less than 30,
22 00:02:11,660 --> 00:02:14,990 but that person is not eating a lot of pizzas.
23 00:02:16,840 --> 00:02:17,690 So in this way,
24 00:02:17,920 --> 00:02:18,050 our
25 00:02:18,400 --> 00:02:24,430 aim is to divide the population into several regions, and each region will have a certain set of characteristics
26 00:02:24,430 --> 00:02:26,140 of the predictor variables.
27 00:02:27,690 --> 00:02:29,820 Let me give you an example to explain it further.
28 00:02:32,830 --> 00:02:40,120 Suppose you are trying to predict the scores of students based on the number of hours they have studied prior
29 00:02:40,120 --> 00:02:40,830 to the exam
30 00:02:42,150 --> 00:02:44,280 and their score in the midterm exams.
31 00:02:45,610 --> 00:02:48,970 We have the data of these 10 students in this table.
32 00:02:50,980 --> 00:02:56,740 The first column contains the score that they actually scored in the final exam,
33 00:02:56,890 --> 00:03:03,370 the second column has the number of hours studied, and the third column has the midterm score of the student.
34 00:03:05,180 --> 00:03:11,900 Using this data of ten students, for an eleventh student I want to predict the score, given the
35 00:03:11,900 --> 00:03:15,650 number of hours that he has studied and the midterm score he has scored.
36 00:03:18,050 --> 00:03:23,330 So if we want to build a decision tree, we want to split this data into regions.
37 00:03:24,920 --> 00:03:32,000 If I separate the students on the basis of the number of hours studied, that is, students who have studied
38 00:03:32,000 --> 00:03:34,570 less than 10 hours can be one group,
39 00:03:34,700 --> 00:03:39,360 and students who have studied more than 10 hours prior to the exam can be another group.
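As a sketch of the table the lecture describes, the ten-student data can be written out in Python. The individual scores below are hypothetical, not from the lecture; they are chosen only so that the group averages match the numbers quoted later (roughly 57 overall, and 39 versus 75 for the two hours-studied groups).

```python
# Hypothetical data for 10 students: (final score, hours studied, midterm score).
# Values are illustrative; they are chosen so the averages match the lecture.
students = [
    (35, 4, 40), (38, 6, 50), (39, 7, 45), (41, 8, 55), (42, 9, 60),   # < 10 hours
    (68, 11, 55), (70, 12, 60), (72, 13, 62), (80, 14, 70), (84, 15, 80),  # > 10 hours
]

# The first column (final score) is what we want to predict for a new student.
finals = [final for final, hours, midterm in students]
overall_mean = sum(finals) / len(finals)
print(overall_mean)  # 56.9, i.e. roughly the 57 quoted in the lecture
```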
40 00:03:41,600 --> 00:03:44,830 Do we see any major difference in the scores of these two groups?
41 00:03:48,410 --> 00:03:53,060 It turns out that there is a major difference in the average score of these two groups.
42 00:03:54,520 --> 00:04:01,480 Students who study less than 10 hours on an average score thirty-nine marks, whereas students who
43 00:04:01,480 --> 00:04:05,650 study more than 10 hours score 75 marks on an average.
44 00:04:07,620 --> 00:04:15,180 So at the first step of this tree, you can see these 10 students had an average score of 57, which is
45 00:04:15,300 --> 00:04:15,830 written here.
46 00:04:17,010 --> 00:04:18,600 But this is the entire population;
47 00:04:19,050 --> 00:04:21,210 that is why it is written as a hundred percent.
48 00:04:22,500 --> 00:04:30,090 When I split this population using the hours variable and check whether the hours studied are less than 10
49 00:04:30,420 --> 00:04:31,420 or more than 10,
50 00:04:32,510 --> 00:04:33,650 I get this decision tree.
51 00:04:35,180 --> 00:04:37,580 If hours studied is less than 10,
52 00:04:38,700 --> 00:04:42,240 then we have this left part. In this left part,
53 00:04:42,720 --> 00:04:44,850 the average score of the students is 39,
54 00:04:46,380 --> 00:04:49,050 and it contains 50 percent of the population.
55 00:04:49,320 --> 00:04:55,100 Since we have 10 students, five students are coming on this side of the tree.
56 00:04:55,350 --> 00:04:57,510 That is, they have studied less than 10 hours.
57 00:04:59,700 --> 00:05:00,660 And on the other side,
58 00:05:03,120 --> 00:05:05,730 we have the students who studied more than 10 hours.
59 00:05:07,830 --> 00:05:10,300 For them, the average score is seventy-five marks,
60 00:05:11,100 --> 00:05:14,130 and here also the population is 50 percent.
61 00:05:14,850 --> 00:05:19,070 So 50 percent of the students went to the left and 50 percent are on the right.
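The split at 10 hours can be checked directly in Python. The records below are the same hypothetical ones as before, chosen to reproduce the lecture's group averages; only the split logic itself comes from the lecture.

```python
# (final score, hours studied, midterm score) for 10 hypothetical students.
students = [
    (35, 4, 40), (38, 6, 50), (39, 7, 45), (41, 8, 55), (42, 9, 60),
    (68, 11, 55), (70, 12, 60), (72, 13, 62), (80, 14, 70), (84, 15, 80),
]

# Split the population on the hours variable at the threshold 10.
left = [final for final, hours, midterm in students if hours < 10]
right = [final for final, hours, midterm in students if hours >= 10]

print(sum(left) / len(left))      # 39.0  (lecture: 39 marks)
print(sum(right) / len(right))    # 74.8  (lecture: ~75 marks)
print(len(left) / len(students))  # 0.5, i.e. 50 percent of the population
```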
62 00:05:21,970 --> 00:05:29,410 Next, if I look at the midterm score of these students, I can further separate this region, or this
63 00:05:29,410 --> 00:05:30,370 class of students.
64 00:05:31,210 --> 00:05:38,790 So students who have scored less than sixty-five marks in the midterm on an average scored 70 marks in the final exam,
65 00:05:39,840 --> 00:05:42,570 which is 30 percent of the total population.
66 00:05:43,230 --> 00:05:45,860 In other words, three students belong to this class.
67 00:05:48,030 --> 00:05:54,530 And if I look at the others, the students who have scored more than 65 marks in the midterm on an average
68 00:05:54,540 --> 00:05:58,230 scored eighty-two marks in the final exam.
69 00:06:00,550 --> 00:06:04,750 This class has 20 percent of the total population, which is two students.
70 00:06:06,240 --> 00:06:10,270 Now, if I add more predictor variables to my problem
71 00:06:11,780 --> 00:06:13,490 and continue making these splits,
72 00:06:15,080 --> 00:06:18,140 we get something which resembles an inverted tree.
73 00:06:19,730 --> 00:06:22,550 This is why such a model is called a decision tree.
74 00:06:25,210 --> 00:06:29,970 You can see how easy it is to interpret this visual representation.
75 00:06:31,270 --> 00:06:35,230 And also, it is giving us some clear, actionable insights.
76 00:06:36,540 --> 00:06:38,520 So if you want to score more in the exams,
77 00:06:39,630 --> 00:06:46,590 definitely study more than ten hours, and also try to get more than 65 marks in your midterm exam.
78 00:06:50,810 --> 00:06:56,530 Now, let us see what the different types of decision trees are. Just like in machine learning models,
79 00:06:57,610 --> 00:07:02,160 where we have two types of models, classification and regression, in decision trees also
80 00:07:02,590 --> 00:07:06,430 we have two types: regression trees and classification trees.
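The two-level tree built above can be written as a small prediction function. The thresholds (10 hours, 65 midterm marks) and the leaf averages (39, 70, 82) are the ones from the lecture; the function name itself is just illustrative.

```python
def predict_final_score(hours, midterm):
    """Predict a final-exam score by walking the lecture's two-level tree."""
    if hours < 10:       # first split: hours studied
        return 39        # leaf: average of the low-hours group (50% of students)
    if midterm < 65:     # second split: midterm score
        return 70        # leaf: 3 students (30% of the population)
    return 82            # leaf: 2 students (20% of the population)

print(predict_final_score(5, 70))   # 39
print(predict_final_score(12, 60))  # 70
print(predict_final_score(14, 80))  # 82
```

Reading the function top to bottom is exactly reading the tree from root to leaf, which is why this representation is so easy to interpret.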
81 00:07:08,530 --> 00:07:16,030 So if the variable you are predicting is a quantitative type of variable, like the height of a person or the
82 00:07:16,210 --> 00:07:19,000 number of prospective customers of your business,
83 00:07:19,780 --> 00:07:21,580 then we use regression trees.
84 00:07:23,600 --> 00:07:29,990 Whereas if the variable is categorical, such as will a player score a goal in the football match,
85 00:07:31,000 --> 00:07:33,490 or does a patient have heart disease,
86 00:07:33,800 --> 00:07:37,000 these are yes-or-no type responses. For such problems,
87 00:07:37,180 --> 00:07:38,860 we adopt classification trees.
88 00:07:41,550 --> 00:07:47,410 We will be discussing both of these types of trees in our course, but we will discuss regression trees
89 00:07:47,410 --> 00:07:47,800 first.
90 00:07:52,080 --> 00:07:58,000 Before we move further, it is important that we take note of important terminologies related to decision
91 00:07:58,030 --> 00:07:58,500 trees.
92 00:08:00,040 --> 00:08:08,070 First is the root node. The first node, which contains the entire population which we want to subdivide
93 00:08:08,100 --> 00:08:11,010 into regions, is called the root node.
94 00:08:11,780 --> 00:08:14,390 The root node has a hundred percent of the population.
95 00:08:17,170 --> 00:08:23,680 Then there is the action of splitting. Splitting means dividing a region into subregions.
96 00:08:24,490 --> 00:08:31,220 So when we divide this root node based on hours, we are performing an action of splitting.
97 00:08:32,590 --> 00:08:34,180 Then there is the decision node.
98 00:08:35,710 --> 00:08:40,090 Every node where we perform splitting is called a decision node.
99 00:08:40,990 --> 00:08:44,740 For example, the root node is also a decision node.
100 00:08:45,300 --> 00:08:49,070 Then, when we take a decision here that we are going to split this
101 00:08:49,310 --> 00:08:51,520 on the midterm score of the student,
102 00:08:52,330 --> 00:08:53,800 this is also a decision node.
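For a categorical target like "goal or no goal", a classification tree chooses its split by making the resulting groups as pure as possible. As a sketch of that idea (not the lecture's own example), here is a minimal split search using Gini impurity on one hypothetical feature; the data values and function names are made up for illustration.

```python
def gini(labels):
    """Gini impurity of a list of class labels: 0 means a pure group."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(xs, ys):
    """Return the threshold on one feature minimizing the weighted Gini."""
    best = (None, float("inf"))
    for t in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical data: did the player score a goal (1) or not (0)?
minutes_played = [10, 20, 30, 70, 80, 90]
scored = [0, 0, 0, 1, 1, 1]
print(best_split(minutes_played, scored))  # (70, 0.0): a perfectly pure split
```

A regression tree works the same way, except the split score is the variance (or squared error) of the two groups instead of their impurity.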
103 00:08:56,190 --> 00:08:58,240 Then we have leaf, or terminal, nodes.
104 00:09:00,040 --> 00:09:05,110 These are the last nodes, beyond which we do not split any further.
105 00:09:07,190 --> 00:09:16,440 So here the second, sixth, and seventh nodes, which are not split further, are the leaf or terminal
106 00:09:16,570 --> 00:09:16,870 nodes.
107 00:09:19,330 --> 00:09:25,640 Then there is the subtree: a small subsection of the entire tree is called a subtree.
108 00:09:26,260 --> 00:09:31,570 So if I take this decision node and these two leaf nodes,
109 00:09:31,960 --> 00:09:33,910 we obtain a subtree.
110 00:09:37,110 --> 00:09:40,650 You will also hear about parent and child nodes.
111 00:09:41,340 --> 00:09:43,140 So whenever we split a node,
112 00:09:44,150 --> 00:09:50,550 that node becomes the parent node, and the nodes that we get after this split are the child nodes.
113 00:09:50,880 --> 00:09:55,680 So this light blue third node is the parent node,
114 00:09:56,560 --> 00:10:01,020 and these two, six and seven, are the child nodes of this parent node.
115 00:10:03,110 --> 00:10:06,570 So we will be using this terminology as we go along in this course.
116 00:10:07,180 --> 00:10:08,250 Remember this terminology,
117 00:10:08,770 --> 00:10:10,000 and see you in the next video.
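The terminology above maps naturally onto a data structure. As an illustrative sketch (the node numbers and field names are assumptions, not from the lecture), the student tree can be held as nested dictionaries, where a decision node carries a split question and a leaf carries only a value:

```python
# The lecture's tree as nested dicts: a decision node has a "split" question
# and two children; a leaf node has only a predicted "value".
tree = {                            # node 1: root (also a decision node)
    "split": "hours < 10",
    "left": {"value": 39},          # node 2: leaf / terminal node
    "right": {                      # node 3: decision node, parent of 6 and 7
        "split": "midterm < 65",
        "left": {"value": 70},      # node 6: leaf, child node
        "right": {"value": 82},     # node 7: leaf, child node
    },
}

def count_leaves(node):
    """A leaf is any node we do not split any further."""
    if "split" not in node:
        return 1
    return count_leaves(node["left"]) + count_leaves(node["right"])

print(count_leaves(tree))  # 3 leaves: nodes 2, 6, and 7
```

The subtree rooted at node 3, with its two child leaves, is exactly the kind of subsection the lecture calls a subtree.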