1 00:00:01,630 --> 00:00:05,250 In the last lecture, we discussed a single cell called the Perceptron. 2 00:00:06,280 --> 00:00:10,470 Now, in this lecture, we are going to extend the concepts that we learned in the last one. 3 00:00:12,360 --> 00:00:19,950 I told you that a Perceptron takes in binary inputs, that is, ones and zeros, and gives out a single binary 4 00:00:19,950 --> 00:00:20,410 output. 5 00:00:21,900 --> 00:00:24,840 But there is no logical reason to keep this limitation. 6 00:00:25,980 --> 00:00:29,670 We can easily extend this to any real input values. 7 00:00:31,860 --> 00:00:39,960 So instead of having black and white only, or zero and one only, we can have different shades of grey 8 00:00:39,960 --> 00:00:40,410 as well. 9 00:00:40,890 --> 00:00:48,330 That is, we accept any real values as input, and the weights and threshold still function in the same way. 10 00:00:52,420 --> 00:00:59,320 Next, we will take a look at the equation of the Perceptron and slightly modify it to get the generally 11 00:00:59,320 --> 00:01:02,530 used equation. In this equation, 12 00:01:02,980 --> 00:01:07,930 we are multiplying weights with inputs, adding these terms, and comparing them with the threshold. 13 00:01:10,510 --> 00:01:16,060 We will make a small change here: bring this threshold to the left and write 14 00:01:16,180 --> 00:01:23,950 this new term as b. Basically, it means that we have b equal to minus the threshold. 15 00:01:25,510 --> 00:01:31,270 People usually call this constant the bias. It doesn't really make any difference, 16 00:01:31,420 --> 00:01:36,970 but this is the mathematical representation of the Perceptron, as you would find in most of the books. 17 00:01:38,920 --> 00:01:39,790 Now, let's move on 18 00:01:39,880 --> 00:01:42,430 and look at the graphical representation of this function. 
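The Perceptron rule described above, rewritten with the bias b standing in for minus the threshold, can be sketched in Python (a minimal illustration; the function and variable names are my own, not from the lecture):

```python
# Perceptron rule with the threshold moved to the left-hand side as a bias term.
# Here b = -threshold, so the cell fires when the weighted sum plus bias is >= 0.

def perceptron(x, w, b):
    """Return 1 if sum(w_i * x_i) + b >= 0, else 0."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if z >= 0 else 0

# Real-valued inputs work just as well as binary ones:
print(perceptron([0.5, 0.8], [0.4, 0.6], -0.5))  # 0.68 - 0.5 = 0.18 >= 0, prints 1
```

Nothing else in the cell changes when inputs become real-valued; only the allowed range of x is extended.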
19 00:01:45,920 --> 00:01:54,410 If you look at this graph: if the calculated value of this left part, that is, the summation of weights 20 00:01:54,440 --> 00:02:02,870 multiplied by features, plus the bias, is less than zero, the 21 00:02:02,870 --> 00:02:04,370 output comes out to be zero. 22 00:02:05,870 --> 00:02:11,270 So you can see in the graph: below zero, the output of the function is also zero. 23 00:02:14,210 --> 00:02:17,390 When this left part is greater than zero, 24 00:02:17,990 --> 00:02:22,010 this function suddenly activates and gives an output of one. 25 00:02:25,030 --> 00:02:28,930 This type of function is called a simple step function. 26 00:02:30,640 --> 00:02:36,910 This is one type of activation function. Activation functions are basically those functions which take 27 00:02:36,910 --> 00:02:41,610 into account some threshold value. Here, 28 00:02:42,520 --> 00:02:43,930 the threshold value is zero, 29 00:02:44,680 --> 00:02:52,450 and this function takes a sudden step at this threshold value, which is why it is called a step activation 30 00:02:52,450 --> 00:02:52,870 function. 31 00:02:57,180 --> 00:02:59,820 There are many other types of activation functions. 32 00:03:01,200 --> 00:03:03,570 The most popular one is the sigmoid function. 33 00:03:06,120 --> 00:03:09,630 Here is a pictorial representation of how the sigmoid function looks. 34 00:03:11,070 --> 00:03:13,550 It is a smooth S-shaped curve. 35 00:03:14,430 --> 00:03:21,780 It also has a minimum of zero at minus infinity and a maximum of one at plus infinity. 36 00:03:22,950 --> 00:03:31,110 But instead of having a step and rising suddenly, this function rises gradually and continuously. 37 00:03:32,490 --> 00:03:38,100 This function is also called the logistic function and is also used in logistic regression, which is a 38 00:03:38,100 --> 00:03:39,990 very basic classification algorithm. 
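The two activation functions discussed so far, the step function and the sigmoid, can be written down directly (a sketch under the lecture's conventions; the threshold for the step function is zero):

```python
import math

def step(z):
    # Jumps suddenly from 0 to 1 at the threshold (here, zero).
    return 1 if z >= 0 else 0

def sigmoid(z):
    # Smooth S-shaped curve: approaches 0 at minus infinity, 1 at plus infinity.
    return 1 / (1 + math.exp(-z))

print(step(-2), step(3))  # 0 1
print(sigmoid(0))         # 0.5, exactly halfway between the two extremes
```

Both functions map the same input z, the weighted sum plus bias, into an output; only the shape of the mapping differs.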
39 00:03:43,420 --> 00:03:50,700 Now, the sigmoid function solves a major problem that we have with the step function. When we are training 40 00:03:50,730 --> 00:03:55,490 our Perceptron using historical data to find the values of the weights and bias, 41 00:03:56,600 --> 00:04:00,180 the step function is very sensitive to individual observations. 42 00:04:01,230 --> 00:04:09,480 For example, suppose we are classifying fashion objects in our Fashion MNIST dataset and the algorithm 43 00:04:09,510 --> 00:04:18,000 is misclassifying a particular image of boots as trousers. To rectify this, the model will need to find 44 00:04:18,000 --> 00:04:19,800 new weight and bias values. 45 00:04:21,450 --> 00:04:22,770 This is where the problem comes in. 46 00:04:23,430 --> 00:04:30,720 A small change in the weight and bias values can completely flip the output for a lot of the other observations. 47 00:04:31,620 --> 00:04:37,530 This makes the step function very hard to control. With the sigmoid function, 48 00:04:37,710 --> 00:04:41,110 the change is gradual, so it is easier to control the behavior. 49 00:04:43,350 --> 00:04:50,340 Now, when we replace the step function with a sigmoid activation function, we call this new cell 50 00:04:50,460 --> 00:04:55,390 a sigmoid neuron, or a logistic neuron, instead of a Perceptron. 51 00:04:57,090 --> 00:05:00,840 Mathematically, the sigmoid function formula looks like this: 52 00:05:01,650 --> 00:05:03,780 it is sigmoid 53 00:05:03,800 --> 00:05:06,870 of z is equal to one upon one 54 00:05:06,870 --> 00:05:09,840 plus e to the power of minus z. 55 00:05:10,760 --> 00:05:17,340 And if you plot this function on a graph, that is, if you have z on the x-axis and you calculate 56 00:05:17,340 --> 00:05:21,420 the value of this function using this formula and plot it on the y-axis, 57 00:05:21,930 --> 00:05:23,880 this is how the formula looks. 58 00:05:25,620 --> 00:05:27,430 Now we will replace the value of z 
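The sensitivity argument in this part of the lecture can be illustrated numerically: near the threshold, a tiny change in the weighted sum flips the step function's output completely, while the sigmoid's output barely moves (an illustrative sketch, not from the lecture):

```python
import math

def step(z):
    return 1 if z >= 0 else 0

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# A weighted sum just above zero, then nudged just below it,
# as might happen after a small weight or bias update during training.
z_before, z_after = 0.01, -0.01

print(step(z_before), step(z_after))        # 1 0: the output flips entirely
print(sigmoid(z_before), sigmoid(z_after))  # both remain very close to 0.5
```

This gradual response is what makes the sigmoid neuron's behavior easier to control during training.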
59 00:05:27,780 --> 00:05:30,090 with the summation plus the bias value. 60 00:05:30,870 --> 00:05:37,050 So the summation of w times x, plus b, was the input to our activation function. 61 00:05:37,890 --> 00:05:40,630 So we input this in place of z. 62 00:05:41,220 --> 00:05:44,700 So this is what the output of our neuron looks like. 63 00:05:45,060 --> 00:05:51,570 It is one upon one plus the exponential of minus the summation of weights times features 64 00:05:51,780 --> 00:06:01,530 minus b. If you calculate this value, it will always lie between zero and one, and it will have a shape 65 00:06:01,530 --> 00:06:02,160 like this. 66 00:06:03,060 --> 00:06:06,930 So you can compare it with the step function as well. In the step function, 67 00:06:07,050 --> 00:06:14,280 we calculated the output using a formula with two parts, where we got zero 68 00:06:14,400 --> 00:06:19,320 if the summation was less than zero, and we got one if the summation was greater than or equal to zero. 69 00:06:20,280 --> 00:06:23,640 We have replaced the step with a sigmoid function. 70 00:06:23,850 --> 00:06:25,200 This is a continuous function; 71 00:06:25,260 --> 00:06:27,030 we do not need two parts to it. 72 00:06:27,750 --> 00:06:35,730 So we just input the values of the weights, the x's, and the bias to calculate the output, which is a continuous 73 00:06:35,730 --> 00:06:36,090 function. 74 00:06:37,270 --> 00:06:45,270 Now, with this, our artificial neural cell is ready, which takes in any number of real-valued inputs and 75 00:06:45,270 --> 00:06:47,760 gives an output between zero and one. 76 00:06:49,610 --> 00:06:56,240 It is time to create an artificial neural network, which is basically a network of these individual 77 00:06:56,240 --> 00:06:56,660 cells. 78 00:06:58,520 --> 00:07:00,890 So, just a brief recap of this class. 79 00:07:01,910 --> 00:07:07,760 Initially, I said that we took in binary inputs and gave out a single binary output. 
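Putting the pieces together, the sigmoid neuron described here just feeds the weighted sum plus bias into the sigmoid (a minimal sketch; the names are my own):

```python
import math

def sigmoid_neuron(x, w, b):
    """Output 1 / (1 + exp(-(sum(w_i * x_i) + b))), always strictly between 0 and 1."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 / (1 + math.exp(-z))

out = sigmoid_neuron([0.5, 0.8], [0.4, 0.6], -0.5)
print(out)  # about 0.545: a continuous value instead of a hard 0 or 1
```

Unlike the step function, there is no two-part case analysis; one continuous formula covers every input.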
80 00:07:08,570 --> 00:07:18,860 We replaced the binary inputs with any real values, and we replaced the binary output with a value 81 00:07:18,860 --> 00:07:20,240 between zero and one. 82 00:07:21,680 --> 00:07:28,460 So in this generalized form, we take in inputs which have any real value, and we get one output which 83 00:07:28,460 --> 00:07:30,020 lies between zero and one.