Now, we have discussed the individual cell. Next, we are going to stack these cells to create networks of cells.

Just to avoid confusion with the biological neuron, I'll be calling a neuron a perceptron. So "perceptron", from now on, means an artificial neuron.

Now, there are two ways we can stack cells: parallelly or sequentially.

Let's see what happens when we stack cells parallelly. Here is a single perceptron, with three inputs and one output. Now, we add another perceptron right next to it. This cell also gets the same three inputs, but it has a different output, Y2. We can keep on adding more cells this way, maybe a third or a fourth or even more than that; we will just keep getting new outputs. In other words, we can predict multiple outputs using the same input features.

For example, when we are doing image recognition and we are trying to find the face of a person, we may also want to find the X and Y coordinates of that face too. Then these two coordinates become Y1 and Y2. Although face recognition needs a much more complex network, by giving this example I want to make the point that neural networks are not bound to only one output. With the same inputs, you can get multiple outputs, because we can do parallel stacking of the artificial neurons.

Now, let's see sequential stacking.

In the image above, we have five inputs which we feed to three parallel perceptrons. The output from this set of perceptrons is then taken and fed as input to another set of parallel perceptrons: here, I am feeding the outputs of these three into these four perceptrons. Again, I take the four outputs of these perceptrons and feed them into this single perceptron. Lastly, this single perceptron gives out one single output, which is the variable we want to predict.

So this is sequential stacking, in which the output of one set of parallelly stacked neurons is sequentially given as input to the next set of parallelly stacked neurons.
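To make this concrete, here is a minimal NumPy sketch of the 5-3-4-1 arrangement just described. The random weight values, the sigmoid activation, and the function names are my own illustrative assumptions; the lecture has not yet discussed how weights are chosen.

```python
import numpy as np

def sigmoid(z):
    # A common activation choice; the lecture has not fixed a particular one.
    return 1.0 / (1.0 + np.exp(-z))

def layer(x, W, b):
    # One set of parallelly stacked perceptrons: every perceptron in the
    # layer sees the same input vector x and produces its own output.
    return sigmoid(W @ x + b)

rng = np.random.default_rng(0)

# Randomly initialised weights and biases, purely for illustration.
W1, b1 = rng.normal(size=(3, 5)), rng.normal(size=3)  # 5 inputs -> 3 perceptrons
W2, b2 = rng.normal(size=(4, 3)), rng.normal(size=4)  # 3 outputs -> 4 perceptrons
W3, b3 = rng.normal(size=(1, 4)), rng.normal(size=1)  # 4 outputs -> 1 perceptron

x = rng.normal(size=5)   # the five input features
h1 = layer(x, W1, b1)    # first parallel set (3 outputs)
h2 = layer(h1, W2, b2)   # second parallel set (4 outputs)
y = layer(h2, W3, b3)    # final single perceptron: one prediction
print(y)
```

Notice that sequential stacking is just function composition: the output vector of one parallel set becomes the input vector of the next.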
Let's first understand the benefit of doing this. That is, why did we not just feed all five inputs into a single cell and use its output to predict the variable? How is stacking these additional sets of neurons helpful?

Let's say we have this type of data. There are two input variables, maybe height and weight, and we are trying to classify whether a given animal is a cow or a dog. Cows generally lie here: they have more weight and more height than dogs. And dogs generally lie here, in the area represented by the red dots.

Now, when we are classifying this set of data, we can have a linear separator, that is, a straight line, to separate the two classes. Anything on the right side will be predicted as a cow, and anything on the left side will be predicted as a dog. This is the capability of a single perceptron: a single perceptron can find the best straight line to classify the given data. So if we had this problem, using a single perceptron would suffice.

But real-life situations are more complex. In fact, in real life we would never use a neural network to classify a situation as simple as this; the situations where neural networks are needed are usually far more complex.

Let me complicate the example a little bit. What if we wanted to classify objects which have this distribution? Anything to the left of the first line and anything to the right of the second line is class A, shown as a red dot, and anything in between the two lines is class B, shown as a green dot. This type of classification cannot be handled by a single perceptron, but a network such as the one shown on the right can easily handle it.

For example, this first neuron will fire, that is, give an output of 1, when the point lies to the left of line 1. The second neuron will give an output of 1 when the point lies to the right of line 2. And this final neuron gives an output of 1 when either one of its two inputs is 1.

You can pause the video here, think about it for a couple of minutes, and see how this small network is handling this classification.

This is the power of a neural network. In the network we created, each neuron can focus on a particular feature of the object and not on the final output. The final output is then predicted based on the results of these feature detectors. In this way, neural networks can do really sophisticated decision making that basic machine learning techniques, such as linear regression, cannot do with good accuracy.
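Here is a small sketch of that three-neuron network. I am assuming, for illustration only, that the two lines are the vertical lines x1 = 2 and x1 = 4 (the actual lines in the figure may differ), and I use a hard step activation so each neuron either fires (1) or stays silent (0).

```python
import numpy as np

def step(z):
    # Hard threshold: the perceptron "fires" (1) or stays silent (0).
    return (z >= 0).astype(float)

def perceptron(x, w, b):
    return step(np.dot(w, x) + b)

def classify(point):
    # Assumed lines for illustration: x1 = 2 and x1 = 4.
    left_of_line1 = perceptron(point, np.array([-1.0, 0.0]), 2.0)    # fires when x1 <= 2
    right_of_line2 = perceptron(point, np.array([1.0, 0.0]), -4.0)   # fires when x1 >= 4
    # Final neuron acts as an OR gate: fires when either input fires.
    inputs = np.array([left_of_line1, right_of_line2])
    return perceptron(inputs, np.array([1.0, 1.0]), -0.5)  # 1 = class A (red), 0 = class B (green)

print(classify(np.array([1.0, 0.0])))  # left of line 1  -> 1.0 (class A)
print(classify(np.array([3.0, 0.0])))  # between lines   -> 0.0 (class B)
print(classify(np.array([5.0, 0.0])))  # right of line 2 -> 1.0 (class A)
```

No single straight line can separate the green band from red dots on both sides of it, but three perceptrons, each responsible for one simple feature, handle it easily.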
Before we move on, let's take a minute to discuss this network's nomenclature.

This is a neural network. Each set of parallelly stacked neurons is called a layer. The first is the input layer, the last is the output layer, and the ones in between are the hidden layers. This network has five inputs, three neurons in hidden layer one, four in hidden layer two, and one in the output layer. So, for brevity, this network can also be called a 5-3-4-1 network.

Also note that as we process information, it flows in only the forward direction, which is why such a network is also called a feed-forward network. In comparison, if the output of one of the cells of a later layer goes back as input to any of the cells of an earlier layer, then it is called a cyclic network. Recurrent neural networks, also known as RNNs, are an example of cyclic networks; RNNs are used in natural language processing and language modeling. For now, let's come back to our standard feed-forward network.

Now, you can notice here that the output from this neuron is going to all four cells. These are not four different outputs; it is only one output, and the same output goes as input to all four cells. Also note that every neuron in each layer is connected to every neuron in the next forward layer. Therefore, this network is fully connected. If some of the links were missing, we would call it partially connected, but for most practical purposes we use fully connected networks.

Before I close this lecture, I would like to tell you that within the short span of this lecture, we have entered the world of deep learning. Such artificial neural networks are what deep learning is made of. Basically, think of this as a system which learns the relationship between input and output. The more layers we have in this system, the deeper our system is, and the more capable it is of establishing complex relationships between input and output.

So I hope that you understand the basics of neural networks. In the next lecture, we will go deeper and see how these networks process the inputs and find the optimum values of the weights and biases to get good prediction accuracy. See you in the next lecture.
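As a quick exercise before you go (my own addition, not part of the lecture), you can count exactly which numbers those weights and biases are. In a fully connected feed-forward network, every neuron connects to every neuron in the next layer, and each neuron carries one bias; a short sketch for the 5-3-4-1 network:

```python
# Each layer of a fully connected network has (inputs x neurons) weights
# plus one bias per neuron in that layer.
layers = [5, 3, 4, 1]  # the 5-3-4-1 network from this lecture

weights = sum(n_in * n_out for n_in, n_out in zip(layers, layers[1:]))
biases = sum(layers[1:])
print(weights, biases, weights + biases)  # 31 8 39
```

Those 39 numbers are the parameters whose optimum values the next lecture will show how to find.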