1 00:00:00,009 --> 00:00:00,779 Welcome back. 2 00:00:00,790 --> 00:00:00,939 Now, 3 00:00:00,949 --> 00:00:03,029 let's talk about some very important concepts 4 00:00:03,039 --> 00:00:04,949 when it comes to dealing with strings. 5 00:00:05,230 --> 00:00:08,380 First, we have the concept of indexing. 6 00:00:08,390 --> 00:00:13,090 Now, don't worry, we're going to talk about indexing a lot more later on. 7 00:00:13,239 --> 00:00:16,270 But for now, let me give you a brief introduction. 8 00:00:16,670 --> 00:00:23,670 Say, for example, over here I have my text, which is equal to cybersecurity. 9 00:00:24,040 --> 00:00:29,090 Now, what if we specifically wanted, for one reason or another, 10 00:00:29,500 --> 00:00:32,490 to extract the very first letter 11 00:00:32,979 --> 00:00:37,119 in the word cybersecurity? The first letter here would be C, right? 12 00:00:37,479 --> 00:00:37,979 So 13 00:00:38,279 --> 00:00:44,520 I could do something like first_character equals 14 00:00:44,810 --> 00:00:45,900 text. 15 00:00:46,130 --> 00:00:48,159 And now I'm going to use my square brackets, 16 00:00:48,500 --> 00:00:53,159 and now I need to indicate the index position 17 00:00:53,369 --> 00:00:56,060 of the letter I want to target. 18 00:00:56,319 --> 00:01:00,250 Now, in cybersecurity, C is the first letter. 19 00:01:00,259 --> 00:01:02,709 So what do you think my index number here is going to be? 20 00:01:02,950 --> 00:01:06,410 Nope, it's not one, it's going to be zero. 21 00:01:06,970 --> 00:01:10,589 Please keep in mind that in most programming languages, not just Python, 22 00:01:10,599 --> 00:01:12,080 but programming in general, 23 00:01:12,199 --> 00:01:15,250 your indexing starts from zero 24 00:01:15,620 --> 00:01:16,720 and not one. 25 00:01:17,230 --> 00:01:17,779 So 26 00:01:18,519 --> 00:01:23,220 let me now print out the contents of 27 00:01:23,669 --> 00:01:26,940 my variable called first_character. 
28 00:01:27,739 --> 00:01:30,919 The answer here is going to be capital C. 29 00:01:31,250 --> 00:01:34,470 Likewise, if I change this 0 to 1, 30 00:01:34,620 --> 00:01:37,000 now it's going to be Y. 31 00:01:37,370 --> 00:01:40,930 So this right here is indexing, and we're going 32 00:01:40,940 --> 00:01:43,489 to be working with it throughout this course. 33 00:01:43,900 --> 00:01:45,959 Now, the second concept here 34 00:01:46,720 --> 00:01:49,230 is going to be the concept of 35 00:01:49,519 --> 00:01:50,339 slicing. 36 00:01:50,709 --> 00:01:51,190 Slicing 37 00:01:51,400 --> 00:01:55,620 allows us to take a portion of the string. 38 00:01:55,839 --> 00:02:01,139 Now, cybersecurity is a pretty long string. 39 00:02:01,639 --> 00:02:05,779 So what if we only wanted to target the very first five characters, 40 00:02:05,790 --> 00:02:07,360 which would be cyber? 41 00:02:07,569 --> 00:02:09,589 How would we do this? Well, 42 00:02:10,100 --> 00:02:10,940 in here, 43 00:02:11,880 --> 00:02:13,210 I can add 44 00:02:14,029 --> 00:02:14,869 my 45 00:02:15,210 --> 00:02:19,610 index position for the very first character, which is going to be zero, 46 00:02:20,410 --> 00:02:21,350 then a colon, and then 47 00:02:21,580 --> 00:02:23,899 where do I want to stop? Five. 48 00:02:24,309 --> 00:02:28,770 This right here is what we refer to as slicing. 49 00:02:28,990 --> 00:02:31,309 The very first number 50 00:02:31,580 --> 00:02:36,210 in here represents where we want to start the slicing from. 51 00:02:36,350 --> 00:02:37,309 In this case, 52 00:02:37,320 --> 00:02:41,229 I've added zero because I want to start from the very first letter. 53 00:02:41,699 --> 00:02:43,779 And the second number is where we want to stop, not including that position. 54 00:02:43,889 --> 00:02:46,520 Starting at zero and stopping at five gives us five characters. So now 55 00:02:46,699 --> 00:02:48,089 if I run the program, 56 00:02:48,190 --> 00:02:50,089 it's going to be cyber. 
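The indexing and slicing steps just described can be sketched as follows. This is only a minimal illustration: the exact on-screen string is not visible in the transcript, so the capitalization `CyberSecurity` is an assumption based on the mentions of capital C and capital S, and the variable names other than `text` and `first_character` are made up for the example.

```python
# Assumption: the string on screen is capitalized like this.
text = "CyberSecurity"

first_character = text[0]   # indexing starts at 0, so index 0 is the first letter
second_character = text[1]  # index 1 is the second letter
first_five = text[0:5]      # start at index 0, stop just before index 5

print(first_character)   # C
print(second_character)  # y
print(first_five)        # Cyber
```

The number after the colon is a stop position rather than a count; the two only happen to match here because the slice starts at 0.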
57 00:02:51,089 --> 00:02:54,419 Now, what if we wanted to extract 58 00:02:55,910 --> 00:02:56,470 security 59 00:02:56,860 --> 00:03:01,149 only? Right? We don't want to take cyber. We want to take security only. 60 00:03:01,399 --> 00:03:03,759 One thing I could do here is that 61 00:03:04,089 --> 00:03:05,039 I could 62 00:03:05,399 --> 00:03:06,320 decide 63 00:03:06,610 --> 00:03:07,600 to simply 64 00:03:08,729 --> 00:03:12,039 go with the index number of capital S, 65 00:03:12,240 --> 00:03:16,220 which would be 0, 1, 2, 3, 4, 5. 66 00:03:16,470 --> 00:03:18,339 OK, it would be five. 67 00:03:18,869 --> 00:03:19,779 And now 68 00:03:20,880 --> 00:03:22,149 I add my colon. 69 00:03:22,389 --> 00:03:26,699 I could indicate where I want the slice 70 00:03:26,830 --> 00:03:27,669 to stop. 71 00:03:27,789 --> 00:03:31,380 However, if I don't indicate a value in here, 72 00:03:31,389 --> 00:03:35,639 Python will automatically default to the end of the string. 73 00:03:35,910 --> 00:03:39,339 So now if I run the program, it's gonna say security. 74 00:03:40,020 --> 00:03:41,279 Did you see how that worked? 75 00:03:41,500 --> 00:03:42,820 I indicated, OK, 76 00:03:42,830 --> 00:03:47,460 I wanna start from capital S, and the index number of S here is going to be five. 77 00:03:47,710 --> 00:03:50,970 And because I did not indicate where I want the slice 78 00:03:51,110 --> 00:03:54,520 to stop, Python will just go on until the very end. 79 00:03:54,660 --> 00:03:56,649 That's why we have security. 80 00:03:56,940 --> 00:03:58,300 Likewise, 81 00:03:58,419 --> 00:04:00,279 I could do almost the opposite. 82 00:04:00,550 --> 00:04:03,479 I could leave the start empty 83 00:04:03,979 --> 00:04:05,800 and now I could go over here. 
84 00:04:06,369 --> 00:04:10,630 And if I indicated, let's say, five as an example, 85 00:04:11,179 --> 00:04:13,389 what do you think the value here is going to be? 86 00:04:13,839 --> 00:04:16,390 When you don't indicate your start position, 87 00:04:17,089 --> 00:04:19,358 Python will automatically start, 88 00:04:19,369 --> 00:04:22,048 by default, at the very beginning. 89 00:04:22,200 --> 00:04:25,779 So what do you think the answer here is going to be? The answer here is going to be cyber 90 00:04:26,109 --> 00:04:28,369 because even though we didn't indicate the beginning, 91 00:04:28,380 --> 00:04:31,350 Python will default to the very beginning, which is C, 92 00:04:31,519 --> 00:04:37,040 and then stopping at five gives us C, Y, B, E, R, and that's going to be cyber. 93 00:04:37,899 --> 00:04:38,779 However, 94 00:04:39,109 --> 00:04:40,820 that's not all. 95 00:04:41,079 --> 00:04:45,739 We could also use the concept of negative indexes. 96 00:04:46,149 --> 00:04:47,799 So for example, 97 00:04:47,959 --> 00:04:50,649 check this out. OK, I'm gonna go over here. 98 00:04:50,839 --> 00:04:51,369 All right. 99 00:04:51,760 --> 00:04:54,970 And if I wanted to target security, 100 00:04:55,359 --> 00:04:57,140 one thing I could do is 101 00:04:57,420 --> 00:04:59,410 I can start from the end, 102 00:04:59,899 --> 00:05:04,559 and I know that security is, what, eight characters. So I'm gonna say minus eight, 103 00:05:05,100 --> 00:05:05,890 and now 104 00:05:06,029 --> 00:05:07,579 I'm going to add my colon 105 00:05:07,899 --> 00:05:09,609 and default to the very end. 106 00:05:09,980 --> 00:05:12,519 And now if I run the program, there you go. 
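The open-ended and negative slices walked through above can be sketched like this. As before, the capitalized string `CyberSecurity` is an assumption (the transcript only mentions a capital C and a capital S), and the variable names are illustrative.

```python
# Assumption: the string on screen is capitalized like this.
text = "CyberSecurity"

from_index_five = text[5:]  # stop left empty: runs to the end of the string
up_to_five = text[:5]       # start left empty: begins at index 0
last_eight = text[-8:]      # negative start: begin 8 characters from the end
middle_part = text[5:8]     # stop is a position, so this is only indexes 5, 6, 7

print(from_index_five)  # Security
print(up_to_five)       # Cyber
print(last_eight)       # Security
print(middle_part)      # Sec
```

The last line is there to show that the second number is a stop position and not a length: `text[5:8]` yields three characters, not eight.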
107 00:05:12,529 --> 00:05:16,029 It is security because Python will start from the end, 108 00:05:16,040 --> 00:05:21,309 which is Y, and then it's gonna go eight letters to the left. It stops at S 109 00:05:21,480 --> 00:05:24,559 because S is the 110 00:05:24,570 --> 00:05:26,750 eighth character when you're starting from the end. 111 00:05:26,970 --> 00:05:30,000 That's why we have security as the answer. 112 00:05:30,190 --> 00:05:35,959 You might be wondering, OK, how will this be applicable to cybersecurity? 113 00:05:36,690 --> 00:05:39,709 Have you ever wondered how anti-malware and 114 00:05:39,720 --> 00:05:43,149 antivirus are able to scan and detect malware? 115 00:05:43,390 --> 00:05:47,609 Typically, the file extension will be .exe. 116 00:05:47,869 --> 00:05:50,269 So, for example, 117 00:05:50,709 --> 00:05:52,109 if our file name 118 00:05:52,519 --> 00:05:55,519 equals CMD, 119 00:05:56,049 --> 00:05:59,989 IOP, you know, something weird, right? And then .exe, 120 00:06:00,820 --> 00:06:02,920 this would be an example of the typical 121 00:06:03,079 --> 00:06:08,790 kind of malware file name that you would have. So how can we write a program 122 00:06:08,980 --> 00:06:14,609 that will scan and, as soon as it detects the .exe extension, know that 123 00:06:14,779 --> 00:06:17,980 this could potentially be malware? 124 00:06:18,190 --> 00:06:22,529 We haven't talked about the if statement yet; 125 00:06:22,679 --> 00:06:24,910 we're going to talk about it in the next section. But 126 00:06:25,079 --> 00:06:26,899 just assume right now that 127 00:06:27,109 --> 00:06:30,519 if the file name, 128 00:06:31,519 --> 00:06:34,820 OK. 
And now if I added, in square brackets, 129 00:06:35,170 --> 00:06:37,380 if I wanted to target 130 00:06:38,000 --> 00:06:43,559 the .exe from the end, because we never know how long 131 00:06:43,570 --> 00:06:47,000 the file name is going to be before the .exe, 132 00:06:47,010 --> 00:06:48,649 right, it's always better to target from the end. 133 00:06:48,869 --> 00:06:49,859 I know 134 00:06:50,149 --> 00:06:53,130 that's minus four, with a colon, starting from the end. 135 00:06:53,239 --> 00:06:54,850 If it's equal 136 00:06:55,850 --> 00:06:59,230 to what? .exe. 137 00:06:59,589 --> 00:07:03,429 If I know that the last four characters in the file name 138 00:07:03,850 --> 00:07:11,019 are equal to .exe, then I know that most likely this is going to be malware. 139 00:07:11,029 --> 00:07:13,390 So now I can just simply say print 140 00:07:13,869 --> 00:07:15,890 and then in parentheses, I can say, uh, 141 00:07:16,320 --> 00:07:20,029 malware found, you know, something like this. 142 00:07:20,480 --> 00:07:22,220 So this is how 143 00:07:22,769 --> 00:07:28,859 the concept of slicing can be applied in the world of cybersecurity. 144 00:07:28,869 --> 00:07:33,380 Now, before I round up this video, let me talk to you about one more function 145 00:07:33,829 --> 00:07:36,690 that is useful when it comes to working with strings, 146 00:07:36,850 --> 00:07:40,299 and that's going to be the split function. 147 00:07:40,510 --> 00:07:42,399 It is a function that will break a string into 148 00:07:42,410 --> 00:07:47,160 a list of substrings based on a specified delimiter. 149 00:07:47,170 --> 00:07:48,429 What am I talking about? 150 00:07:48,619 --> 00:07:49,739 Say, for example, 151 00:07:50,130 --> 00:07:53,220 I go back to my text and it 152 00:07:54,019 --> 00:07:55,769 equals 153 00:07:55,920 --> 00:07:58,850 cyber security. 154 00:07:59,079 --> 00:08:00,809 Now you can see it's two words, right? 
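Going back to the file extension check described a moment ago, a rough sketch might look like the following. The file name `cmdiop.exe` is a guess at the spelled-out example, the variable names `last_four`, `is_exe`, and `also_exe` are illustrative, and the `if` statement is only previewed here since the course covers it later.

```python
# Hypothetical file name, reconstructed from the spelled-out example.
file_name = "cmdiop.exe"

# Slice from the end, since the part before the extension can be any length.
last_four = file_name[-4:]
is_exe = last_four == ".exe"

if is_exe:
    print("Malware found")

# str.endswith is a built-in alternative that performs the same suffix check.
also_exe = file_name.endswith(".exe")
```

Note that `file_name[-4]` without the colon would return only the single character `.`, so the colon is what makes this a slice of the last four characters. A matching extension is only a hint worth flagging, which is why the result is stored as a simple true or false value here.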
155 00:08:00,820 --> 00:08:03,250 It's no longer just one single word, cybersecurity. 156 00:08:03,260 --> 00:08:05,559 Now I've got two words, cyber security. 157 00:08:05,820 --> 00:08:09,239 So if I wanted to split this, 158 00:08:09,510 --> 00:08:10,529 I can say 159 00:08:10,790 --> 00:08:12,059 words, 160 00:08:12,299 --> 00:08:12,989 all right, 161 00:08:13,440 --> 00:08:15,470 equals text, 162 00:08:15,750 --> 00:08:17,209 and now dot 163 00:08:17,470 --> 00:08:18,600 split. 164 00:08:19,059 --> 00:08:19,559 OK. 165 00:08:19,890 --> 00:08:21,700 I'm attaching the split function to the 166 00:08:21,709 --> 00:08:25,630 variable text because I wanna split cyber security. 167 00:08:25,829 --> 00:08:28,429 So now if I simply printed 168 00:08:28,720 --> 00:08:30,059 the words, 169 00:08:30,769 --> 00:08:32,520 what do you think the output is going to be? 170 00:08:32,659 --> 00:08:36,590 It is going to be a list containing cyber and then security. 171 00:08:36,739 --> 00:08:37,609 And if you're wondering 172 00:08:37,739 --> 00:08:41,619 how we are going to apply this in cybersecurity, 173 00:08:41,789 --> 00:08:42,260 join me in the 174 00:08:42,580 --> 00:08:43,159 next video, where we 175 00:08:43,640 --> 00:08:45,619 will begin to take a look at cybersecurity 176 00:08:45,630 --> 00:08:50,270 applications of slicing, splitting, and so much more.
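The split call described above can be sketched as follows. The capitalization of the two-word string is an assumption, and the comma-separated `log_line` is a made-up example added only to show an explicit delimiter.

```python
# Assumption: the two-word string on screen is capitalized like this.
text = "Cyber Security"

# With no argument, split() breaks the string on whitespace.
words = text.split()
print(words)  # ['Cyber', 'Security']

# Hypothetical log line, to show splitting on a specified delimiter.
log_line = "admin,192.168.1.10,failed"
fields = log_line.split(",")
print(fields)  # ['admin', '192.168.1.10', 'failed']
```

The result is always a list, so individual pieces can then be picked out with the same indexing shown earlier, for example `words[0]`.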