1
00:00:00,009 --> 00:00:02,519
Let's continue with our data structures.

2
00:00:02,529 --> 00:00:06,070
And the next is going to be what we call sets.

3
00:00:06,449 --> 00:00:12,590
Now on my screen right here, it appears that I have two different lists

4
00:00:13,010 --> 00:00:16,829
that are similar in the elements inside of them. However,

5
00:00:17,639 --> 00:00:20,629
notice that the compromised passwords uses the

6
00:00:20,639 --> 00:00:24,409
square brackets, while the new compromised passwords

7
00:00:24,649 --> 00:00:27,200
uses the curly braces.

8
00:00:27,379 --> 00:00:28,459
Now, believe it or not,

9
00:00:28,469 --> 00:00:31,940
this is extremely important because the square brackets indicate

10
00:00:32,049 --> 00:00:37,450
that these items are in a list while the curly braces will indicate

11
00:00:37,459 --> 00:00:42,459
that the items are not in a list but inside of a set.
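A minimal sketch of the two definitions described here. The variable names follow the spoken names, and the sample values are assumptions, since the full on-screen contents are not in the transcript.

```python
# Hypothetical sample values; only some of them are read aloud in the video.
compromised_passwords = ["password", "1234", "qwerty", "abc123", "monkey"]      # square brackets: list
new_compromised_passwords = {"password", "1234", "qwerty", "abc123", "monkey"}  # curly braces: set

print(type(compromised_passwords))      # <class 'list'>
print(type(new_compromised_passwords))  # <class 'set'>

# Note: empty curly braces create a dict, not a set, so an empty set needs set().
empty_set = set()
print(type(empty_set))                  # <class 'set'>
```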

12
00:00:43,299 --> 00:00:46,209
So the natural question right now would be, well,

13
00:00:46,220 --> 00:00:49,619
what is the difference between a list and a set?

14
00:00:49,810 --> 00:00:51,090
There are quite a few of them.

15
00:00:51,409 --> 00:00:56,849
The thing is, with the items in a list, you can order them,

16
00:00:56,860 --> 00:00:58,930
you can reference them using an index number

17
00:00:59,229 --> 00:01:00,409
while

18
00:01:00,779 --> 00:01:05,550
in a set they are not in any particular order, they are unordered,

19
00:01:05,660 --> 00:01:06,849
right? In fact,

20
00:01:07,589 --> 00:01:11,089
let me prove this to you. If I were to print right now the compromised

21
00:01:11,809 --> 00:01:13,019
passwords.

22
00:01:13,709 --> 00:01:14,730
All right. And then

23
00:01:15,050 --> 00:01:16,650
I also print out

24
00:01:17,720 --> 00:01:20,709
the new compromised passwords.

25
00:01:21,150 --> 00:01:22,849
Look at the order

26
00:01:23,300 --> 00:01:29,290
for the actual list, you can see it follows the same order: password, 1234,

27
00:01:29,410 --> 00:01:30,139
qwerty.

28
00:01:30,239 --> 00:01:31,660
But with the set,

29
00:01:31,800 --> 00:01:33,400
it's almost kind of random. It started at

30
00:01:33,580 --> 00:01:34,870
1234

31
00:01:35,019 --> 00:01:37,680
it went to abc123, it went to monkey

32
00:01:37,809 --> 00:01:39,069
and so on. So

33
00:01:39,269 --> 00:01:44,260
the thing is with sets, the items are not in any particular kind of order.
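A small sketch of what "unordered" means in practice, using assumed sample values: a list can be indexed by position, a set cannot, and `sorted()` is one way to get a predictable order out of a set when you need one.

```python
passwords_list = ["password", "1234", "qwerty"]
passwords_set = {"password", "1234", "qwerty"}

# Lists keep their order and support index numbers.
first_item = passwords_list[0]
print(first_item)  # password

# Sets have no positions, so indexing raises a TypeError.
try:
    passwords_set[0]
    set_is_indexable = True
except TypeError:
    set_is_indexable = False
print("Set supports indexing:", set_is_indexable)  # Set supports indexing: False

# sorted() returns a new list with a predictable order.
ordered_view = sorted(passwords_set)
print(ordered_view)  # ['1234', 'password', 'qwerty']
```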

34
00:01:44,660 --> 00:01:49,360
Another major difference is that your list will allow duplicates, right? So

35
00:01:49,519 --> 00:01:50,599
for example,

36
00:01:51,050 --> 00:01:52,069
I could add

37
00:01:52,290 --> 00:01:54,080
in my list 1234

38
00:01:54,250 --> 00:01:54,959
again.

39
00:01:55,209 --> 00:01:56,180
However,

40
00:01:57,120 --> 00:01:58,709
if I was to do the same

41
00:02:00,269 --> 00:02:01,459
in my set,

42
00:02:01,680 --> 00:02:04,620
you will notice the difference. So if I run again,

43
00:02:04,750 --> 00:02:07,809
you can see right now in the list, 1234 is repeated.

44
00:02:07,879 --> 00:02:11,979
While in the set, 1234 is listed only once.

45
00:02:11,990 --> 00:02:14,979
You're not allowed to have duplicates

46
00:02:15,330 --> 00:02:16,699
in your sets.
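The duplicate behavior can be sketched like this, with assumed sample values. Python does not raise an error for the repeated item in the set; it simply keeps one copy.

```python
# "1234" appears twice in both definitions.
compromised_passwords = ["password", "1234", "qwerty", "1234"]
new_compromised_passwords = {"password", "1234", "qwerty", "1234"}

print(len(compromised_passwords))      # 4, the list keeps the duplicate
print(len(new_compromised_passwords))  # 3, the set keeps only one "1234"

# Converting a list to a set is a common way to drop duplicates.
unique_passwords = set(compromised_passwords)
print(len(unique_passwords))           # 3
```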

47
00:02:16,899 --> 00:02:18,860
And this is why whenever you're trying to

48
00:02:18,869 --> 00:02:22,580
create a function or a program involving

49
00:02:23,270 --> 00:02:26,429
a collection of items that should be unique,

50
00:02:26,550 --> 00:02:31,970
like let's say passwords or email addresses or user accounts,

51
00:02:32,149 --> 00:02:37,020
you want to use sets for those as opposed to a list because you know that

52
00:02:37,759 --> 00:02:40,750
no two users can have the exact same credentials. So

53
00:02:41,350 --> 00:02:43,490
it would be more ideal to use a set.

54
00:02:43,740 --> 00:02:47,539
Sets are also typically a lot faster to search. In the background, Python

55
00:02:47,550 --> 00:02:51,500
can check whether an item is in a set a lot faster than it would in a list.
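A rough way to see the speed difference, assuming the claim refers to membership checks with `in`. A list is scanned item by item, while a set uses hashing, so the lookup is close to constant time on average. The sizes and names below are illustrative, and the timings will vary by machine.

```python
import timeit

# Illustrative data: 100,000 integers stored both ways.
items_list = list(range(100_000))
items_set = set(items_list)
target = 99_999  # worst case for the list: the last element

# Time 100 membership checks against each container.
list_time = timeit.timeit(lambda: target in items_list, number=100)
set_time = timeit.timeit(lambda: target in items_set, number=100)

print(f"list lookup: {list_time:.6f} s")
print(f"set lookup:  {set_time:.6f} s")
```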

56
00:02:51,610 --> 00:02:55,839
Now, both are mutable. Your lists and your sets are mutable, meaning that

57
00:02:55,850 --> 00:03:00,360
you can make changes: you can add elements, remove elements, you can modify them.

58
00:03:00,669 --> 00:03:03,039
And that's pretty much it.

59
00:03:03,199 --> 00:03:06,250
But just like with your lists,

60
00:03:06,399 --> 00:03:09,839
we also have operations or methods

61
00:03:09,979 --> 00:03:12,029
that we can use

62
00:03:12,199 --> 00:03:16,240
on our sets. We actually call them mathematical operations

63
00:03:16,509 --> 00:03:20,190
like union, intersection, difference. Let me just show you.

64
00:03:20,639 --> 00:03:21,750
let me remove

65
00:03:22,020 --> 00:03:23,830
this right here. And now

66
00:03:24,860 --> 00:03:26,190
let me come over here

67
00:03:27,339 --> 00:03:30,679
and let's work with our two different sets.

68
00:03:31,600 --> 00:03:34,929
OK? I don't think I need this line anymore actually.

69
00:03:35,229 --> 00:03:40,880
OK. So right here we have set one, we have set two and then both have two

70
00:03:41,149 --> 00:03:42,830
IP addresses.

71
00:03:42,960 --> 00:03:49,339
Now I could decide to find IP addresses that are common in both sets, right? So

72
00:03:49,679 --> 00:03:51,419
I could say, for example, common

73
00:03:52,660 --> 00:03:53,710
underscore

74
00:03:54,089 --> 00:03:54,639
uh

75
00:03:54,899 --> 00:03:55,320
IPs

76
00:03:56,160 --> 00:03:58,759
will now be equal to set one.

77
00:03:59,610 --> 00:04:00,809
And now

78
00:04:01,130 --> 00:04:02,360
union.

79
00:04:03,059 --> 00:04:05,509
OK. So intersection, right, intersection

80
00:04:05,690 --> 00:04:07,080
and now set two.

81
00:04:07,330 --> 00:04:11,770
So intersection right here is what we refer to as an operator

82
00:04:12,070 --> 00:04:14,720
that will find the common elements

83
00:04:15,119 --> 00:04:17,600
in both sets. So

84
00:04:18,320 --> 00:04:21,019
if I was to come over right now and then say print

85
00:04:21,540 --> 00:04:22,959
common_ips.

86
00:04:23,329 --> 00:04:24,500
And I run

87
00:04:24,839 --> 00:04:27,480
you will see 10.0.0.1

88
00:04:27,630 --> 00:04:31,459
is common in both. That's why it is printed out.
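A sketch of the intersection step. The names `set1`, `set2`, and `common_ips` and the two non-shared addresses are reconstructed from the spoken text, so treat them as assumptions. The `&` symbol is Python's intersection operator for sets.

```python
# Assumed contents: two IP addresses per set, with 10.0.0.1 in both.
set1 = {"10.0.0.1", "192.168.1.101"}
set2 = {"10.0.0.1", "192.168.1.102"}

# & keeps only the elements found in both sets.
common_ips = set1 & set2
print(common_ips)  # {'10.0.0.1'}

# The method form gives the same result.
print(set1.intersection(set2))  # {'10.0.0.1'}
```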

89
00:04:31,630 --> 00:04:36,600
But we also have what we call the union mathematical operator. So

90
00:04:37,390 --> 00:04:39,690
I can change the symbol right here to

91
00:04:40,029 --> 00:04:43,250
the union symbol. What this will do:

92
00:04:43,380 --> 00:04:45,049
It's kind of like the extend

93
00:04:45,630 --> 00:04:50,549
method for lists, where you'll add the elements in one list onto another.

94
00:04:50,559 --> 00:04:51,329
So right now,

95
00:04:51,619 --> 00:04:58,049
if I run, you can see we now have all the IP addresses. However, take note

96
00:04:58,170 --> 00:05:00,459
that again because this is a set,

97
00:05:00,559 --> 00:05:07,489
the 10.0.0.1 IP that's present in both will only be listed once, right?
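A sketch of the union step, with the same assumed set contents. The `|` symbol is Python's union operator for sets. The list comparison at the end is only there to show the contrast with combining lists.

```python
# Assumed contents: two IP addresses per set, with 10.0.0.1 in both.
set1 = {"10.0.0.1", "192.168.1.101"}
set2 = {"10.0.0.1", "192.168.1.102"}

# | combines both sets; the shared address appears only once.
all_ips = set1 | set2
print(len(all_ips))  # 3

# The method form gives the same result.
print(set1.union(set2) == all_ips)  # True

# Contrast: combining two lists keeps the duplicate.
combined_list = ["10.0.0.1", "192.168.1.101"] + ["10.0.0.1", "192.168.1.102"]
print(len(combined_list))  # 4
```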

98
00:05:07,739 --> 00:05:13,049
We have another operator in here, which will be the difference operator.

99
00:05:13,200 --> 00:05:16,920
Just use the minus sign. Now, what this will do

100
00:05:17,600 --> 00:05:21,940
is that it will find elements that are present in one set but not in the other.

101
00:05:21,950 --> 00:05:23,359
So if I run this right now,

102
00:05:23,459 --> 00:05:27,209
you can see that 192.168.1.101

103
00:05:27,320 --> 00:05:35,420
is an element that is present in the first set but is absent in the second set.

104
00:05:35,559 --> 00:05:38,579
So we can do the opposite. I can say set two

105
00:05:38,809 --> 00:05:40,660
minus set one.

106
00:05:41,209 --> 00:05:42,429
And now this

107
00:05:42,600 --> 00:05:47,519
will give us 192.168.1.102, which is of course

108
00:05:47,529 --> 00:05:51,750
present in set two but not present in set one.
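A sketch of the difference operator in both directions, again with assumed set contents reconstructed from the spoken addresses. The order of the operands matters.

```python
# Assumed contents: two IP addresses per set, with 10.0.0.1 in both.
set1 = {"10.0.0.1", "192.168.1.101"}
set2 = {"10.0.0.1", "192.168.1.102"}

# Elements in set1 that are not in set2.
only_in_set1 = set1 - set2
print(only_in_set1)  # {'192.168.1.101'}

# Reversing the operands gives the elements only in set2.
only_in_set2 = set2 - set1
print(only_in_set2)  # {'192.168.1.102'}

# The method form gives the same result.
print(set1.difference(set2) == only_in_set1)  # True
```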

109
00:05:51,760 --> 00:05:54,320
We also have methods

110
00:05:54,489 --> 00:05:57,579
like your add, remove, clear.

111
00:05:57,760 --> 00:05:58,350
So

112
00:05:58,600 --> 00:05:59,829
as an example,

113
00:06:00,119 --> 00:06:04,000
if I wanted to add a new IP address to set one,

114
00:06:04,179 --> 00:06:06,630
all I would need to do here is

115
00:06:06,760 --> 00:06:08,869
I would just say set one.

116
00:06:09,660 --> 00:06:11,399
And now dot add.

117
00:06:11,709 --> 00:06:13,070
And now in parentheses

118
00:06:13,260 --> 00:06:19,640
I can add the new IP address, so I can say

119
00:06:20,000 --> 00:06:22,519
192.168.0.10.

120
00:06:23,350 --> 00:06:25,420
All right, just as an example. So now

121
00:06:25,720 --> 00:06:27,339
if I was to print out

122
00:06:29,089 --> 00:06:30,459
set one

123
00:06:31,690 --> 00:06:32,399
run

124
00:06:32,839 --> 00:06:33,679
and there you go,

125
00:06:33,809 --> 00:06:40,609
you can see right now that 192.168.0.10 has in fact been added to our set one.
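A sketch of the `add` call, assuming the set is named `set1` and starts with the two addresses used earlier. Adding a value that is already present leaves the set unchanged.

```python
# Assumed starting contents for set1.
set1 = {"10.0.0.1", "192.168.1.101"}

# add() inserts a single element into the set.
set1.add("192.168.0.10")
print("192.168.0.10" in set1)  # True
print(len(set1))               # 3

# Adding the same element again has no effect.
set1.add("192.168.0.10")
print(len(set1))               # 3
```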

126
00:06:40,709 --> 00:06:41,779
You can also

127
00:06:41,910 --> 00:06:46,149
have other methods like remove, clear

128
00:06:46,339 --> 00:06:50,869
and so on. Feel free to look this up on the Python

129
00:06:51,299 --> 00:06:53,279
website if you want to.
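A short sketch of the other methods mentioned, with assumed set contents. `discard` is not named in the video; it is included here only because it is the non-raising counterpart of `remove`.

```python
# Assumed starting contents for set1.
set1 = {"10.0.0.1", "192.168.1.101", "192.168.0.10"}

# remove() deletes an element that is present.
set1.remove("192.168.0.10")
print(len(set1))  # 2

# remove() raises KeyError when the element is missing.
try:
    set1.remove("172.16.0.5")
    remove_raised = False
except KeyError:
    remove_raised = True
print("remove raised KeyError:", remove_raised)  # remove raised KeyError: True

# discard() does nothing when the element is missing.
set1.discard("172.16.0.5")
print(len(set1))  # 2

# clear() empties the set.
set1.clear()
print(set1)  # set()
```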

130
00:06:53,410 --> 00:06:57,269
So that's it for sets. Thank you for watching. I will see you in the next class.