Hello and welcome back to another class of our course, the complete introduction to data science. In today's class we are still talking about statistical concepts, and one of the most important statistical concepts, one that is used all the time in data science, is called the decision tree. Once again, we could make a full 40-hour course about decision trees, but my goal here is not to create a full course about decision trees and cover all the mathematical concepts around them. It is just to give you guys a brief introduction, so you understand how they work and how present they are in data science and machine learning, as well as everything that surrounds those concepts.

All right. So what exactly is a decision tree? If we talk about ourselves, we have a brain, and let's say we want to make a decision. Our brain is a huge computer: when it wants to make a decision, it automatically creates a decision tree. Once again, it's not going to look exactly like the one on the slide, it will be a bit different, but the brain will say: OK, if I do this, I will have this outcome, and if I do that, I will have that outcome. For example, let's say you have the choice between eating healthy and eating a pizza. Your brain will say: OK, if I eat healthy, what will happen? The outcome, for example, is that I will build muscle; if I build muscle, this will happen; is that good or bad? OK, it's good. And if, for example, I eat pizza, what's going to happen? I will be satisfied right now because I just ate pizza, so I will be happy, but what will happen in the long run? If I eat pizza I will be happy now, but after that I won't be happy anymore because I won't be in shape, et cetera.

So that is how we can see decision trees: they are simply diagrams or charts that people use to show statistical probabilities or to determine the possibility of an action happening. You can see one just right here; once again, I tried to find a very simple decision tree. In this example we first ask whether A is lower than B. If yes, we then ask whether B is lower than C, and so on, and at the end we can read off an answer just by following the yes and no branches: for example A < B < C, or A < C <= B, or C <= A < B. The same thing happens on the other side of the tree, where B is greater than or equal to A; each leaf is one possible outcome.
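If it helps to read that comparison tree as code, here is a minimal sketch of the same idea written as nested if/else statements. The function name and the exact outcome labels are my own assumptions for illustration, not taken from the slide.

```python
def order_three(a, b, c):
    """A hand-written decision tree that orders three numbers.

    Each `if` is one decision node of the tree, and each `return`
    is a leaf, i.e. one of the possible outcomes.
    """
    if a < b:                # node: is A lower than B?
        if b < c:            # node: is B lower than C?
            return "A < B < C"
        elif a < c:          # node: is A lower than C?
            return "A < C <= B"
        else:
            return "C <= A < B"
    else:                    # here B <= A
        if a < c:
            return "B <= A < C"
        elif b < c:
            return "B <= C <= A"
        else:
            return "C <= B <= A"


print(order_three(1, 2, 3))  # -> "A < B < C"
print(order_three(3, 1, 2))  # -> "B <= C <= A"
```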
So the main goal right here is really to go from a question that we ask ourselves to an outcome that we get at the end. Basically, the final answer will be one of those leaves, and the best outcome usually comes with... well, there are mathematical formulas that we will talk about a bit later in this course. We will not go really in depth about them, but I want to give you guys an introduction. So, as I said, you have a question that needs to be resolved, and it is resolved by using mathematical formulas and by building up a decision tree. Like here, for example, we have a decision tree, and the decision at the end is made with a mathematical formula: for example, the entropy will be calculated, and some other quantities will be calculated as well.

If we stay with decision trees, we have a couple more examples right here. Once again, these example decision trees are pretty simple, just so you guys understand what a decision tree is. Right here, for example: is a person fit? If the person is younger than 30 years old, we ask whether they eat a lot of pizza. If yes, the person is unfit; if no, the person is fit. Same thing on the other side: if the person is older than 30, we ask whether they exercise in the morning. If yes, the person is fit; if not, the person is unfit. Once again, this is really, really simple; even this decision tree is pretty simple. Usually you have way more variables, and it is much harder to calculate everything and find the best combination. But once again, this is just a small example of what you can find when you work with decision trees.
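In data science you usually don't draw these trees by hand; a library learns them from data. As a minimal sketch of how that could look with scikit-learn (the tiny dataset, the labels, and the feature names below are invented just for this illustration), something like this would learn an "is this person fit?" tree:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Invented toy data: [age, eats a lot of pizza (0/1), exercises in the morning (0/1)]
X = [
    [25, 1, 0],
    [25, 0, 1],
    [45, 0, 1],
    [45, 0, 0],
    [35, 1, 0],
    [22, 0, 0],
]
y = [0, 1, 1, 0, 0, 1]  # 1 = fit, 0 = unfit

# criterion="entropy" makes the library choose its splits with entropy / information gain
tree = DecisionTreeClassifier(criterion="entropy", max_depth=2, random_state=0)
tree.fit(X, y)

# Print the learned rules; they read like the diagram described above
print(export_text(tree, feature_names=["age", "eats_pizza", "exercises_am"]))

# Ask the tree about a new person: 28 years old, no pizza, morning exercise
print(tree.predict([[28, 0, 1]]))  # -> [1], i.e. "fit" on this toy data
```

Setting criterion="entropy" is exactly the point where the two formulas discussed next, entropy and information gain, come into play.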
So basically, your brain does this automatically; you don't even think about it. Your brain does this automatically whenever you have to make a certain decision. It could be a basic decision, for example while I'm walking or driving, do I turn right or left, or it could be a more complex decision, for example when you are sitting an exam and you need to write down an answer.

Right. So, the mathematical aspect of this: there are a lot of quantities that can be calculated, but personally I think there are two very important formulas that we will talk about right now. Once again, you don't necessarily need to understand them 100 percent; this is just an introduction. The first one is the formula for entropy. Entropy is used to measure the homogeneity of a sample, in other words how pure or mixed a set of data is. Usually entropy is used when you work with a dataset and build your decision tree from that dataset. What you need to understand is that if the entropy is zero, the sample is homogeneous, like this one for example, and if the entropy is one, the sample is equally divided. It is also important to understand that entropy usually sits between zero and one; it can be higher than one when there are more than two classes, but most of the time it stays in that range. Entropy also matters for the second part of our calculation, because it controls how a decision tree decides to split the data, and that has an effect on how the tree grows; it is the basis of everything that follows.
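The formula itself is the classic Shannon entropy, E = -sum(p_i * log2(p_i)), where p_i is the proportion of each class in the sample. Here is a minimal sketch of it in Python; the label names are just examples:

```python
import math

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    result = 0.0
    for label in set(labels):
        p = labels.count(label) / n   # proportion of this class
        result -= p * math.log2(p)
    return result

print(entropy(["fit", "fit", "fit", "fit"]))      # 0.0   -> homogeneous sample
print(entropy(["fit", "unfit", "fit", "unfit"]))  # 1.0   -> equally divided
print(entropy(["fit", "fit", "fit", "unfit"]))    # ~0.81 -> somewhere in between
```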
So that is it for entropy; entropy is the first part of the calculation. After that we have another part that is very, very important, which is the information gain. Entropy helps us, after that, to work with information gain. Information gain is another mathematical calculation that we make in order to choose the right path to take inside of our decision tree. Here is our information gain formula. It looks a bit complicated, but it's pretty simple: right here we have the complete entropy, so basically the entropy of the whole set of data before the split, and here we have, for example, the entropy of the left side and the entropy of the right side. Usually the calculation is a bit more involved, because you take the weighted sum of the entropies of all the children, but in this example there is just one left side and one right side.

What does information gain give us? It tells us how much information a certain variable gives us about the outcome, and interpreting it is pretty simple: a high information gain allows us to create a decision tree that will be more representative of the results we expect to have. So if we come back to our decision tree right here, and let's say this split right here has an information gain of 0.9, well, this is the split that we will go for. Basically, the higher the information gain, the more likely that branch is to be the part of the decision tree that we take, because the results will be more representative of what we are looking for.
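As a minimal sketch of that calculation, with made-up label lists and the same entropy formula as above:

```python
import math

def entropy(labels):
    # Same Shannon entropy as in the previous sketch, in bits
    n = len(labels)
    return -sum(
        (labels.count(c) / n) * math.log2(labels.count(c) / n)
        for c in set(labels)
    )

def information_gain(parent, left, right):
    """Entropy of the parent node minus the weighted entropy of its children.

    The bigger the value, the more this split tells us about the outcome,
    so the tree prefers the question with the highest gain.
    """
    n = len(parent)
    children = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(parent) - children

parent = ["fit", "fit", "unfit", "unfit"]
# A split that separates the classes perfectly gets the maximum gain here (1.0)
print(information_gain(parent, ["fit", "fit"], ["unfit", "unfit"]))  # 1.0
# A split that does not separate them at all gets a gain of 0.0
print(information_gain(parent, ["fit", "unfit"], ["fit", "unfit"]))  # 0.0
```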
So besides that, those are the two formulas that are pretty important: entropy and information gain. But what I really want you guys to understand in this course is the concept of the decision tree. It is a really, really important concept; as for the mathematical part, once again, you are not professionals right now, and it will take time for you to understand it as a whole. The concept of the decision tree, as I said, works pretty much like the human brain: when you want to make a decision, you automatically create a decision tree inside your head, you don't even need to think about it, and in data science this works pretty much the same way.

For example, let's say you are building a machine that has to think by itself; we won't go too deep into this, but what you add to it is an algorithm that learns from its experiences. Say, for example, you want to create an algorithm that identifies faces. You will give this algorithm around one hundred thousand faces, the algorithm will work out what each face looks like, and it will learn by itself. After that, when you give this algorithm an image and ask it, "Is there a face, yes or no?", what the algorithm does is build up a decision tree, one that is much more complex than our basic examples, to be able to tell you at the end whether this is a face, based on variable one, two, three, four, five. It will test some things, for example the number of pixels, and a lot of other things, but at the end of the day it will have a certain decision tree to be able to tell you: yes, this is a face, or no, this is not a face, this is, I don't know, a water bottle, for example.

So I hope you guys understand the whole concept of the decision tree and the two formulas that come with it. That's it for this class, guys; see you in our next class, where we will still talk about some other statistical concepts.