1 00:00:00,000 --> 00:00:01,589 Instructor: Welcome back, 2 00:00:01,589 --> 00:00:04,710 and in this bonus video we're going to be covering 3 00:00:04,710 --> 00:00:06,930 the tool that I created in Python 3, 4 00:00:06,930 --> 00:00:08,883 that is used to gather emails. 5 00:00:09,720 --> 00:00:12,210 Now, even though later in the course we'll be coding some 6 00:00:12,210 --> 00:00:14,010 of our own Python tools, 7 00:00:14,010 --> 00:00:15,660 this is one that we will not code, 8 00:00:15,660 --> 00:00:17,580 so we'll just see how it works. 9 00:00:17,580 --> 00:00:19,830 I will explain how it works, of course, 10 00:00:19,830 --> 00:00:23,490 and we are going to see how many emails it can gather. 11 00:00:23,490 --> 00:00:27,210 So, this is the tool right here called Email Scraper 12 00:00:27,210 --> 00:00:29,940 and you will have this to download in the resources 13 00:00:29,940 --> 00:00:31,440 of this lecture. 14 00:00:31,440 --> 00:00:33,480 But let me show you how you can transfer it 15 00:00:33,480 --> 00:00:35,700 to the Kali Linux desktop. 16 00:00:35,700 --> 00:00:38,490 If you go up here on Devices 17 00:00:38,490 --> 00:00:40,680 and you go on Drag and Drop 18 00:00:40,680 --> 00:00:42,870 and click on Bidirectional, 19 00:00:42,870 --> 00:00:44,520 then anything that you have 20 00:00:44,520 --> 00:00:48,000 on your desktop, since I had the program right here, 21 00:00:48,000 --> 00:00:52,410 if you go and drag it to your Kali Linux machine 22 00:00:52,410 --> 00:00:55,470 it'll get moved onto your Kali Linux desktop. 23 00:00:55,470 --> 00:00:57,510 As you can see right here, this folder already 24 00:00:57,510 --> 00:01:00,150 contains this file since I already have it on my desktop 25 00:01:00,150 --> 00:01:01,413 so I'll just skip this.
26 00:01:02,310 --> 00:01:04,860 But you, after setting this to bidirectional, 27 00:01:04,860 --> 00:01:07,500 can transfer any file from your host desktop 28 00:01:07,500 --> 00:01:09,330 to your Kali Linux desktop. 29 00:01:09,330 --> 00:01:10,260 Okay, good. 30 00:01:10,260 --> 00:01:12,150 Now that we know how we can transfer it, 31 00:01:12,150 --> 00:01:16,020 let us see what this tool is and how we can run it. 32 00:01:16,020 --> 00:01:18,450 So just to check out the code of this tool real quick, 33 00:01:18,450 --> 00:01:22,980 let us double-click on it and let me enlarge all this. 34 00:01:22,980 --> 00:01:27,000 And what this tool essentially does is it asks us 35 00:01:27,000 --> 00:01:29,580 for the URL and we provide it 36 00:01:29,580 --> 00:01:32,130 with the URL of a certain domain name. 37 00:01:32,130 --> 00:01:35,820 Then what this tool does is it tries to extract all 38 00:01:35,820 --> 00:01:36,870 of the emails 39 00:01:36,870 --> 00:01:40,770 that are in the HTML page of the URL that's specified, 40 00:01:40,770 --> 00:01:43,860 but what it also does is it tries to crawl 41 00:01:43,860 --> 00:01:47,550 within other URLs that are found inside of that page. 42 00:01:47,550 --> 00:01:49,950 For example, this count variable right here 43 00:01:49,950 --> 00:01:51,810 that is equal to 100, 44 00:01:51,810 --> 00:01:54,330 means that we will be searching for emails 45 00:01:54,330 --> 00:01:56,700 in 100 different links. 46 00:01:56,700 --> 00:01:59,010 So you specify the domain URL, 47 00:01:59,010 --> 00:02:00,690 then it goes through that URL, 48 00:02:00,690 --> 00:02:02,730 it extracts all of the emails, 49 00:02:02,730 --> 00:02:05,580 but it also extracts all of the other URLs 50 00:02:05,580 --> 00:02:07,560 that are leading to different pages. 51 00:02:07,560 --> 00:02:09,570 Then it goes to those different pages 52 00:02:09,570 --> 00:02:11,160 and performs the same thing.
53 00:02:11,160 --> 00:02:15,960 It tries to find the emails and it also finds more URLs 54 00:02:15,960 --> 00:02:19,920 and it does that until it reaches 100 URLs. 55 00:02:19,920 --> 00:02:22,470 This is a number that you can change if you want to. 56 00:02:22,470 --> 00:02:25,800 So you can set this to be lower or higher depending on 57 00:02:25,800 --> 00:02:27,990 how many results you want to find. 58 00:02:27,990 --> 00:02:30,480 Down here, we can see that it is finding 59 00:02:30,480 --> 00:02:33,780 those emails using regex. 60 00:02:33,780 --> 00:02:36,210 So this is the pattern that we are searching for, 61 00:02:36,210 --> 00:02:38,730 and don't worry if you don't understand any of this. 62 00:02:38,730 --> 00:02:42,540 Regex is just a way for us to find certain patterns in 63 00:02:42,540 --> 00:02:43,560 a lot of text. 64 00:02:43,560 --> 00:02:45,330 So for example, this is a pattern 65 00:02:45,330 --> 00:02:47,550 that will allow us to find emails 66 00:02:47,550 --> 00:02:49,773 in the HTML code of the page. 67 00:02:50,970 --> 00:02:54,510 And then, at the end of this, we print all of 68 00:02:54,510 --> 00:02:56,820 the emails that we found. 69 00:02:56,820 --> 00:03:00,360 So that is the basic principle behind this tool. 70 00:03:00,360 --> 00:03:01,751 Let us see how it runs and 71 00:03:01,751 --> 00:03:04,590 whether we manage to find more emails 72 00:03:04,590 --> 00:03:08,370 than we did with Hunter.io and theHarvester. 73 00:03:08,370 --> 00:03:10,290 So let's close this. 74 00:03:10,290 --> 00:03:14,820 Go to our terminal, find where you have this file downloaded, 75 00:03:14,820 --> 00:03:17,040 and I have it on my desktop. 76 00:03:17,040 --> 00:03:18,390 And to just run it 77 00:03:18,390 --> 00:03:21,693 we can type python3 and then the name of the file, 78 00:03:23,070 --> 00:03:25,710 and it'll tell us to enter the target URL to scan.
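The crawl-and-extract principle described above can be sketched in a few lines of Python. This is a minimal illustration and not the Email Scraper code from the lecture resources: the function names, the `max_pages` parameter (standing in for the count variable of 100), and both regular expressions are assumptions made for this example, and the email pattern is deliberately simplified.

```python
import re
import urllib.request
from collections import deque
from urllib.parse import urldefrag, urljoin, urlsplit

# Simplified email pattern (an assumption; the course tool's pattern may differ).
EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")
# Naive href extractor; a real HTML parser would be more robust.
HREF_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)


def extract_emails(html):
    """Return the set of email-like strings found in the page source."""
    return set(EMAIL_RE.findall(html))


def extract_links(html, base_url):
    """Return absolute http(s) URLs found in href attributes, without fragments."""
    links = set()
    for href in HREF_RE.findall(html):
        absolute = urldefrag(urljoin(base_url, href))[0]
        if urlsplit(absolute).scheme in ("http", "https"):
            links.add(absolute)
    return links


def fetch_html(url):
    """Download a page and decode it; return an empty string on any error."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.read().decode("utf-8", errors="replace")
    except Exception:
        return ""


def crawl(start_url, max_pages=100, fetch=fetch_html):
    """Visit up to max_pages URLs breadth-first and collect every email seen."""
    queue = deque([start_url])
    seen = {start_url}
    emails = set()
    processed = 0
    while queue and processed < max_pages:
        url = queue.popleft()
        processed += 1
        html = fetch(url)
        emails |= extract_emails(html)
        # Queue every link we have not seen yet so it gets scanned later.
        for link in extract_links(html, url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return emails
```

Calling `crawl("https://<your-lab-target>", max_pages=100)` would return the set of collected emails, which can then be printed one per line. The `fetch` parameter exists only so the logic can be tried offline with canned pages; only run a crawler like this against targets you have permission to scan.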
79 00:03:25,710 --> 00:03:28,660 And here I'm going to specify the full URL 80 00:03:29,790 --> 00:03:31,470 to the same domain name that we used 81 00:03:31,470 --> 00:03:34,320 for theHarvester and Hunter.io, 82 00:03:34,320 --> 00:03:36,690 just so we can compare how many results we get 83 00:03:36,690 --> 00:03:39,240 with this tool and how many results we got 84 00:03:39,240 --> 00:03:42,480 with Hunter.io and theHarvester. 85 00:03:42,480 --> 00:03:43,436 So if I type 86 00:03:43,436 --> 00:03:45,330 (keyboard typing) 87 00:03:45,330 --> 00:03:47,830 the domain name and press Enter, 88 00:03:49,320 --> 00:03:52,533 this will go and process 100 links. 89 00:03:53,760 --> 00:03:56,100 And depending on whether you change that number 90 00:03:56,100 --> 00:03:58,470 it might be higher or lower. 91 00:03:58,470 --> 00:04:00,780 And at the end of processing these links 92 00:04:00,780 --> 00:04:03,903 it will print out all of the emails that it managed to find. 93 00:04:04,890 --> 00:04:07,650 So if you remember, with Hunter.io, 94 00:04:07,650 --> 00:04:09,480 which we used with the free account, 95 00:04:09,480 --> 00:04:12,690 we managed to gather 10 different emails. 96 00:04:12,690 --> 00:04:13,680 With theHarvester, 97 00:04:13,680 --> 00:04:16,829 the first time we didn't manage to get any emails, 98 00:04:16,829 --> 00:04:18,209 but after running it a couple 99 00:04:18,209 --> 00:04:20,579 of times we might be able to get around 10 to 100 00:04:20,579 --> 00:04:23,790 15 different emails with theHarvester. 101 00:04:23,790 --> 00:04:26,460 But let's see how many this tool will find. 102 00:04:26,460 --> 00:04:28,050 So let's just wait for this to finish 103 00:04:28,050 --> 00:04:30,450 and I will get back to you as soon as it's done. 104 00:04:31,380 --> 00:04:33,180 Okay, so the tool has finished scanning 105 00:04:33,180 --> 00:04:36,810 and here are all of the emails that we managed to find.
106 00:04:36,810 --> 00:04:38,910 You can see there are at least 100 107 00:04:38,910 --> 00:04:43,860 or 150 of them and they all belong to the same domain. 108 00:04:43,860 --> 00:04:46,590 Now, we might occasionally find some email 109 00:04:46,590 --> 00:04:48,630 that doesn't belong to this domain, 110 00:04:48,630 --> 00:04:50,940 and we saw one down here, I believe. 111 00:04:50,940 --> 00:04:52,260 Let me just find it. 112 00:04:52,260 --> 00:04:55,200 This one, it doesn't have the domain name inside 113 00:04:55,200 --> 00:04:58,080 of the email, but all of the others do. 114 00:04:58,080 --> 00:05:01,260 And we got at least 5 to 10 times more results 115 00:05:01,260 --> 00:05:03,270 than we managed to get with theHarvester, 116 00:05:03,270 --> 00:05:04,800 which is a Kali Linux tool, 117 00:05:04,800 --> 00:05:07,503 or with the free account of Hunter.io. 118 00:05:08,400 --> 00:05:10,620 And here are all the links that it processed. 119 00:05:10,620 --> 00:05:13,050 So it clicked on all of these links 120 00:05:13,050 --> 00:05:16,230 and it tried to extract as many emails as it could 121 00:05:16,230 --> 00:05:17,970 from these links. 122 00:05:17,970 --> 00:05:18,930 Cool, right? 123 00:05:18,930 --> 00:05:22,260 So now you have a tool that will be able to capture 124 00:05:22,260 --> 00:05:25,890 a lot of emails based on the specified domain. 125 00:05:25,890 --> 00:05:28,350 Just make sure that once you run the tool 126 00:05:28,350 --> 00:05:33,153 you specify HTTP or HTTPS before the domain name. 127 00:05:34,260 --> 00:05:36,060 Okay, so this tool is now yours. 128 00:05:36,060 --> 00:05:38,460 Feel free to use it as much as you want. 129 00:05:38,460 --> 00:05:40,920 And later on in the course, we will also 130 00:05:40,920 --> 00:05:44,010 be coding our own Python tools.
131 00:05:44,010 --> 00:05:46,486 They will not be very advanced tools, 132 00:05:46,486 --> 00:05:47,910 but we will be covering the basics 133 00:05:47,910 --> 00:05:50,850 of creating our own hacking tools, 134 00:05:50,850 --> 00:05:53,100 which is something that every hacker should 135 00:05:53,100 --> 00:05:56,160 at some point of their journey learn. 136 00:05:56,160 --> 00:05:56,993 Great. 137 00:05:56,993 --> 00:05:58,530 So, now that we've finished 138 00:05:58,530 --> 00:06:01,080 with the information gathering section, 139 00:06:01,080 --> 00:06:04,710 we're ready to start off with the scanning section. 140 00:06:04,710 --> 00:06:06,780 And you might be wondering how are you going to 141 00:06:06,780 --> 00:06:08,970 be able to follow the scanning section 142 00:06:08,970 --> 00:06:10,650 and all the other sections, 143 00:06:10,650 --> 00:06:14,400 since you don't really have permission to scan any website? 144 00:06:14,400 --> 00:06:15,360 Don't worry. 145 00:06:15,360 --> 00:06:17,640 There are a lot of free vulnerable machines 146 00:06:17,640 --> 00:06:21,330 and websites that we can download and practice on. 147 00:06:21,330 --> 00:06:23,580 And we are going to be seeing how we can find them 148 00:06:23,580 --> 00:06:26,190 and install them inside of our VirtualBox, 149 00:06:26,190 --> 00:06:29,160 so we will have our own vulnerable lab where 150 00:06:29,160 --> 00:06:31,080 we can practice our hacking. 151 00:06:31,080 --> 00:06:32,580 So thank you for watching this section 152 00:06:32,580 --> 00:06:34,713 and I will see you in the next one.
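Two practical points from the demo, that the target must start with HTTP or HTTPS and that a few collected emails may not belong to the scanned domain, can be checked with a couple of small helpers. This is a hedged sketch: `has_http_scheme` and `split_by_domain` are hypothetical names introduced here for illustration, and nothing in the lecture confirms that the Email Scraper tool performs either check itself.

```python
from urllib.parse import urlsplit


def has_http_scheme(url):
    """Return True if the URL starts with http:// or https://."""
    return urlsplit(url).scheme in ("http", "https")


def split_by_domain(emails, domain):
    """Separate emails on the given domain (or a subdomain) from the rest."""
    domain = domain.lower()
    matching, other = set(), set()
    for email in emails:
        # Everything after the last '@' is the host part of the address.
        host = email.rsplit("@", 1)[-1].lower()
        if host == domain or host.endswith("." + domain):
            matching.add(email)
        else:
            other.add(email)
    return matching, other
```

A quick check with `has_http_scheme` before starting a scan avoids passing a bare domain name, and `split_by_domain` makes the occasional off-domain address easy to spot in the results.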