1 00:00:01,733 --> 00:00:04,033 Hello and welcome back to the course on Deep Learning. 2 00:00:04,033 --> 00:00:07,033 Today we're going to wrap up with backpropagation. 3 00:00:07,200 --> 00:00:07,500 All right. 4 00:00:07,500 --> 00:00:08,666 So we already know 5 00:00:08,666 --> 00:00:11,800 pretty much everything we need to know about what happens in the neural network. 6 00:00:12,000 --> 00:00:15,100 We know that there's a process called forward propagation 7 00:00:15,100 --> 00:00:18,566 where information is entered into the input layer 8 00:00:18,566 --> 00:00:23,533 and then it's propagated forward to get our Y hats, our output values. 9 00:00:23,533 --> 00:00:28,666 And then we compare those to the actual values that we have in our training set. 10 00:00:29,066 --> 00:00:31,866 And then we calculate the errors. 11 00:00:31,866 --> 00:00:35,366 Then the errors are backpropagated through the network 12 00:00:35,366 --> 00:00:36,866 in the opposite direction. 13 00:00:36,866 --> 00:00:41,100 And that allows us to train the network by adjusting the weights. 14 00:00:41,500 --> 00:00:45,000 So the one key thing to remember here 15 00:00:45,000 --> 00:00:50,933 is that backpropagation is an advanced algorithm driven by very 16 00:00:51,300 --> 00:00:55,200 interesting and sophisticated mathematics, 17 00:00:55,400 --> 00:01:00,200 which allows us to adjust the weights, all of them, at the same time. 18 00:01:00,200 --> 00:01:02,400 All of the weights are adjusted simultaneously. 19 00:01:02,400 --> 00:01:06,633 So if we were doing this manually, or if we were coming up 20 00:01:06,633 --> 00:01:10,300 with a different type of algorithm, then even if we calculated the error 21 00:01:10,300 --> 00:01:14,066 and then tried to understand what effect each of the weights has 22 00:01:14,066 --> 00:01:17,300 on the error, we'd have to somehow 23 00:01:17,300 --> 00:01:20,700 adjust each of the weights independently or individually. 
24 00:01:21,900 --> 00:01:23,400 The huge advantage 25 00:01:23,400 --> 00:01:26,400 of backpropagation, and this is a key thing to remember, 26 00:01:26,433 --> 00:01:29,866 is that during the process of backpropagation, 27 00:01:30,133 --> 00:01:33,333 simply because of the way 28 00:01:33,333 --> 00:01:36,333 the algorithm is structured, 29 00:01:36,733 --> 00:01:40,500 you are able to adjust all of the weights at the same time. 30 00:01:40,500 --> 00:01:43,500 So you basically know which part of the error 31 00:01:43,500 --> 00:01:46,866 each of your weights in the neural network is responsible for. 32 00:01:47,266 --> 00:01:50,266 Now that is the key fundamental 33 00:01:50,433 --> 00:01:54,133 underlying principle of backpropagation. 34 00:01:54,133 --> 00:02:00,566 And this was why it picked up so rapidly in the 1980s. 35 00:02:00,566 --> 00:02:02,633 And this was the major breakthrough. 36 00:02:02,633 --> 00:02:06,300 And if you'd like to learn more about that and how exactly the mathematics 37 00:02:06,900 --> 00:02:10,066 works in the background, then a good article, 38 00:02:10,066 --> 00:02:11,366 which we've already mentioned, 39 00:02:11,366 --> 00:02:16,166 is Neural Networks and Deep Learning. It's actually a book by Michael Nielsen. 40 00:02:16,400 --> 00:02:19,400 There you'll find the mathematics written out 41 00:02:19,800 --> 00:02:23,533 and it'll help you understand how exactly this is possible. 42 00:02:23,533 --> 00:02:28,066 But for now, for our purposes, from an intuition point of view, 43 00:02:28,066 --> 00:02:33,200 the important part is to remember that that's what backpropagation does. 44 00:02:33,200 --> 00:02:36,200 It adjusts all of the weights at the same time. 45 00:02:36,800 --> 00:02:40,366 And now we're going to just wrap everything up with a step-by-step 46 00:02:40,366 --> 00:02:44,900 walkthrough of what happens in the training of a neural network. 47 00:02:45,266 --> 00:02:45,566 All right. 
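To make the point about adjusting all of the weights at the same time a bit more concrete, here is a minimal sketch in plain Python. It is not taken from the course: the tiny network (two inputs, two sigmoid hidden neurons, one sigmoid output, no biases), the squared-error cost, and the names `forward`, `cost`, `backprop` and `numeric_gradient` are all assumptions chosen for illustration. The `backprop` function gets the share of the error for every weight from a single backward pass, while `numeric_gradient` does it the slow way, nudging one weight at a time and re-running the network.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(w_hidden, w_out, x):
    """Forward pass: returns the hidden activations and the prediction y_hat."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    y_hat = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, y_hat


def cost(w_hidden, w_out, x, y):
    """Squared-error cost for one observation (an assumed choice of cost)."""
    _, y_hat = forward(w_hidden, w_out, x)
    return 0.5 * (y_hat - y) ** 2


def backprop(w_hidden, w_out, x, y):
    """One backward pass gives the gradient of the cost for every weight."""
    hidden, y_hat = forward(w_hidden, w_out, x)
    # Error signal at the output neuron (chain rule through the sigmoid).
    delta_out = (y_hat - y) * y_hat * (1.0 - y_hat)
    grad_out = [delta_out * h for h in hidden]
    grad_hidden = []
    for j, h in enumerate(hidden):
        # The output error flows back to hidden neuron j through w_out[j].
        delta_h = delta_out * w_out[j] * h * (1.0 - h)
        grad_hidden.append([delta_h * xi for xi in x])
    return grad_hidden, grad_out


def numeric_gradient(w_hidden, w_out, x, y, eps=1e-6):
    """Slow alternative: perturb each weight individually (central differences)."""
    grad_hidden = [[0.0] * len(row) for row in w_hidden]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            original = row[i]
            row[i] = original + eps
            c_plus = cost(w_hidden, w_out, x, y)
            row[i] = original - eps
            c_minus = cost(w_hidden, w_out, x, y)
            row[i] = original  # restore the weight
            grad_hidden[j][i] = (c_plus - c_minus) / (2 * eps)
    grad_out = [0.0] * len(w_out)
    for j in range(len(w_out)):
        original = w_out[j]
        w_out[j] = original + eps
        c_plus = cost(w_hidden, w_out, x, y)
        w_out[j] = original - eps
        c_minus = cost(w_hidden, w_out, x, y)
        w_out[j] = original  # restore the weight
        grad_out[j] = (c_plus - c_minus) / (2 * eps)
    return grad_hidden, grad_out


# Illustrative values only; any small weights and inputs would do.
w_hidden = [[0.15, -0.2], [0.25, 0.3]]
w_out = [0.4, -0.45]
x, y = [1.0, 0.5], 1.0

bp_hidden, bp_out = backprop(w_hidden, w_out, x, y)
num_hidden, num_out = numeric_gradient(w_hidden, w_out, x, y)

# Largest disagreement between the two methods over all six weights.
max_diff = max(
    [abs(a - b) for ra, rb in zip(bp_hidden, num_hidden) for a, b in zip(ra, rb)]
    + [abs(a - b) for a, b in zip(bp_out, num_out)]
)
```

In this sketch the two methods should agree closely, but the numeric version needs two extra forward passes per weight, whereas the backward pass covers every weight at once. That difference is what matters once a network has thousands or millions of weights.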
48 00:02:45,566 --> 00:02:48,233 So step one: we randomly initialize the weights 49 00:02:48,233 --> 00:02:50,966 to small numbers close to zero, but not zero. 50 00:02:50,966 --> 00:02:53,600 We didn't really focus on the initialization of weights 51 00:02:53,600 --> 00:02:58,033 during the intuition tutorials, but the weights have to start somewhere, 52 00:02:58,200 --> 00:03:02,533 and they are initialized with random values near zero. 53 00:03:02,533 --> 00:03:06,600 And from there, through the process of forward propagation and backpropagation, 54 00:03:06,600 --> 00:03:10,866 these weights are adjusted until the error is minimized, 55 00:03:11,833 --> 00:03:13,766 until the cost function is minimized. 56 00:03:13,766 --> 00:03:17,566 Then step two: input the first observation of your dataset, 57 00:03:17,566 --> 00:03:19,300 so the first row, into the input layer. 58 00:03:19,300 --> 00:03:21,366 Each feature is one input node. 59 00:03:21,366 --> 00:03:24,700 So basically take the columns and put them into the input nodes. 60 00:03:25,300 --> 00:03:27,800 Step three: forward propagation. From left to right, 61 00:03:27,800 --> 00:03:28,966 the neurons are activated 62 00:03:28,966 --> 00:03:32,800 in a way that the impact of each neuron's activation is limited by the weights. 63 00:03:32,800 --> 00:03:37,800 The weights basically determine how important each neuron's activation is. 64 00:03:38,033 --> 00:03:41,733 Then propagate the activations until getting the predicted result, 65 00:03:41,966 --> 00:03:43,800 Y hat in this case. 66 00:03:43,800 --> 00:03:46,633 So basically you propagate from left to right. 67 00:03:46,633 --> 00:03:48,833 You go all the way until you get to the end. 68 00:03:48,833 --> 00:03:50,166 You get your Y hat. 69 00:03:50,166 --> 00:03:52,600 Then step four: compare the predicted result to the actual result. 70 00:03:52,600 --> 00:03:55,266 Measure the generated error. 
71 00:03:55,266 --> 00:03:57,333 And then step five: backpropagation from right to left. 72 00:03:57,333 --> 00:03:58,000 The error is 73 00:03:58,000 --> 00:03:58,500 backpropagated. 74 00:03:58,500 --> 00:04:01,800 Update the weights according to how much they're responsible for the error. 75 00:04:02,100 --> 00:04:06,200 Again, you are able to calculate that because of the way that the 76 00:04:06,500 --> 00:04:09,366 backpropagation algorithm is structured. 77 00:04:09,366 --> 00:04:12,566 The learning rate decides by how much we update the weights. 78 00:04:12,566 --> 00:04:16,900 The learning rate is a parameter you can control in your neural network. 79 00:04:17,666 --> 00:04:18,600 Step six. 80 00:04:18,600 --> 00:04:22,600 Repeat steps 1 to 5 and update the weights after each observation. 81 00:04:22,933 --> 00:04:24,766 That is called online learning. 82 00:04:24,766 --> 00:04:30,600 And in our case that was stochastic gradient descent. Or repeat 83 00:04:30,600 --> 00:04:33,833 steps 1 to 5, but update the weights only after a batch of observations. 84 00:04:33,833 --> 00:04:37,800 So that's batch learning: it's either full gradient descent, 85 00:04:37,800 --> 00:04:40,800 or batch gradient descent, or mini-batch gradient descent. 86 00:04:40,833 --> 00:04:44,266 And step seven: when the whole training set has passed through the artificial 87 00:04:44,400 --> 00:04:48,933 neural network, that makes an epoch. Redo more epochs. 88 00:04:48,933 --> 00:04:51,933 So basically you just keep doing that and doing that and doing that, 89 00:04:52,200 --> 00:04:54,800 allowing your neural network 90 00:04:54,800 --> 00:04:58,366 to train better and better and better and constantly adjust itself 91 00:04:59,633 --> 00:05:02,600 as you minimize the cost function. 92 00:05:02,600 --> 00:05:04,300 So there we go. 
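The seven steps above can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the code from the practical tutorials: the `train` function, its parameter names, the squared-error cost, the single hidden layer without bias terms, and the toy OR dataset (with a constant 1.0 added as a third feature to act as a bias input) are all made up for this example. Setting `batch_size=1` updates the weights after each observation, and `batch_size=len(data)` updates them once per pass over the whole training set. The sketch initializes the weights once and then repeats the remaining steps, and it does not shuffle the observations, which a real stochastic gradient descent loop would usually do.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(w_hidden, w_out, x):
    """Step 3: propagate the inputs from left to right to get y_hat."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    y_hat = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, y_hat


def mean_cost(w_hidden, w_out, data):
    """Average squared-error cost over the dataset (an assumed cost function)."""
    total = sum(0.5 * (forward(w_hidden, w_out, x)[1] - y) ** 2 for x, y in data)
    return total / len(data)


def train(data, n_hidden=3, learning_rate=0.5, epochs=500, batch_size=1, seed=0):
    rng = random.Random(seed)
    n_inputs = len(data[0][0])
    # Step 1: initialize the weights to small random numbers close to zero.
    w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
    initial_cost = mean_cost(w_hidden, w_out, data)

    for _ in range(epochs):  # Step 7: one full pass over the data is an epoch.
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            g_hidden = [[0.0] * n_inputs for _ in range(n_hidden)]
            g_out = [0.0] * n_hidden
            for x, y in batch:  # Step 2: feed one observation (row) at a time.
                hidden, y_hat = forward(w_hidden, w_out, x)  # Step 3
                error = y_hat - y  # Step 4: compare prediction and actual value.
                # Step 5: backpropagate the error from right to left.
                delta_out = error * y_hat * (1.0 - y_hat)
                for j, h in enumerate(hidden):
                    g_out[j] += delta_out * h
                    delta_h = delta_out * w_out[j] * h * (1.0 - h)
                    for i, xi in enumerate(x):
                        g_hidden[j][i] += delta_h * xi
            # Step 6: update all weights together; the learning rate scales
            # the size of the update. Gradients are averaged over the batch.
            for j in range(n_hidden):
                w_out[j] -= learning_rate * g_out[j] / len(batch)
                for i in range(n_inputs):
                    w_hidden[j][i] -= learning_rate * g_hidden[j][i] / len(batch)

    return initial_cost, mean_cost(w_hidden, w_out, data)


# Toy OR dataset; the last feature is a constant 1.0 standing in for a bias.
data = [
    ([0.0, 0.0, 1.0], 0.0),
    ([0.0, 1.0, 1.0], 1.0),
    ([1.0, 0.0, 1.0], 1.0),
    ([1.0, 1.0, 1.0], 1.0),
]

# Update after every observation versus once per full batch.
sgd_before, sgd_after = train(data, batch_size=1)
batch_before, batch_after = train(data, batch_size=len(data))
```

In both runs the cost after training should be lower than the cost at initialization. With the same number of epochs, the per-observation variant makes four times as many weight updates on this dataset, which is one reason the two settings can behave differently in practice.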
93 00:05:04,300 --> 00:05:06,266 Those are the steps you need to take 94 00:05:06,266 --> 00:05:09,466 to build your artificial neural network and train it. 95 00:05:09,900 --> 00:05:13,533 And these are the steps that you will be taking 96 00:05:13,600 --> 00:05:15,933 in the practical tutorials. 97 00:05:15,933 --> 00:05:19,200 Wish you the best of luck, and I look forward to seeing you next time. 98 00:05:19,400 --> 00:05:22,400 Until then, enjoy deep learning.