WEBVTT 0:00:03.460000 --> 0:00:06.160000 Hello everyone, welcome to this video. 0:00:06.160000 --> 0:00:10.080000 In this video, we're going to be taking a look at how to copy or download 0:00:10.080000 --> 0:00:13.940000 a website with a tool called HTTrack. 0:00:13.940000 --> 0:00:17.660000 Alright, so the objective here under this particular section is going 0:00:17.660000 --> 0:00:21.760000 to be focused on, you know, downloading or copying a website to your local 0:00:21.760000 --> 0:00:22.820000 system for analysis. 0:00:22.820000 --> 0:00:27.400000 Now, we're not going to dive too deep into the analysis section, but the 0:00:27.400000 --> 0:00:31.360000 reason why we're doing this is primarily because in certain cases, you 0:00:31.360000 --> 0:00:36.780000 may want to have a local version or rather copy of the website, so that 0:00:36.780000 --> 0:00:41.180000 you can actually analyze it, not only with regards to the actual structure 0:00:41.180000 --> 0:00:45.020000 of the website, but also trying to look for specific strings of text, 0:00:45.020000 --> 0:00:49.100000 maybe look for and identify comments that may lead to information leakage 0:00:49.100000 --> 0:00:51.280000 and disclosure, so on and so forth. 0:00:51.280000 --> 0:00:54.800000 So this is something, as I said, that is very, very important and it usually 0:00:54.800000 --> 0:00:55.720000 goes ignored. 0:00:55.720000 --> 0:01:00.020000 But for those of you that analyze web applications and websites, you know 0:01:00.020000 --> 0:01:02.180000 that this is, you know, very, very important. 0:01:02.180000 --> 0:01:06.140000 So I've been utilizing HTTrack for a very long time. 0:01:06.140000 --> 0:01:09.140000 And it is extremely useful. 0:01:09.140000 --> 0:01:13.380000 There's two versions or rather, there's two ways you can use HTTrack, 0:01:13.380000 --> 0:01:16.740000 one of which is through a graphical user interface.
0:01:16.740000 --> 0:01:19.080000 The second is through the command line utility. 0:01:19.080000 --> 0:01:22.520000 And this typically comes pre-packaged with Kali, but I'll be showing 0:01:22.520000 --> 0:01:26.060000 you how to install it anyway if it's not already installed. 0:01:26.060000 --> 0:01:28.680000 And again, in this case, we're not going to be utilizing any specific 0:01:28.680000 --> 0:01:32.940000 lab environment, because we're going to do this, you know, we're going 0:01:32.940000 --> 0:01:36.860000 to do this by taking a look at how we can copy, you know, a particular 0:01:36.860000 --> 0:01:40.080000 website that we've done or that we've explored before. 0:01:40.080000 --> 0:01:44.080000 So again, I'll just switch over to my Kali Linux system and I'll show 0:01:44.080000 --> 0:01:48.560000 you how to utilize HTTrack, specifically the command line interface. 0:01:48.560000 --> 0:01:52.040000 It really is very simple and it's an extremely powerful tool. 0:01:52.040000 --> 0:01:54.240000 So let me switch over. 0:01:54.240000 --> 0:01:59.360000 All right, so I'm currently on my Kali Linux system. 0:01:59.360000 --> 0:02:01.280000 And as you can see, this is the website. 0:02:01.280000 --> 0:02:03.600000 So it's httrack.com. 0:02:03.600000 --> 0:02:06.020000 This is the GUI version that you can utilize. 0:02:06.020000 --> 0:02:09.020000 You can also install this on Windows if you want. 0:02:09.020000 --> 0:02:12.760000 I'm covering the command line interface because it's very useful and very, 0:02:12.760000 --> 0:02:16.700000 very powerful. Plus, it actually makes much more sense to me than utilizing 0:02:16.700000 --> 0:02:18.580000 the graphical user interface. 0:02:18.580000 --> 0:02:20.820000 So there we are. 0:02:20.820000 --> 0:02:22.700000 That's the one you can get there.
0:02:22.700000 --> 0:02:26.380000 You can install HTTrack, that's the command line version on Kali 0:02:26.380000 --> 0:02:31.660000 and also WebHTTrack, which actually, I'm not sure exists within the 0:02:31.660000 --> 0:02:33.520000 Kali repos, but we can actually check it. 0:02:33.520000 --> 0:02:37.120000 So to get it installed, you can make sure you've updated your repositories 0:02:37.120000 --> 0:02:40.640000 and then sudo apt-get install httrack. 0:02:40.640000 --> 0:02:42.360000 That's the command line utility. 0:02:42.360000 --> 0:02:45.640000 I'll just put in my password, apt-get install. 0:02:45.640000 --> 0:02:46.740000 Let me type that in. 0:02:46.740000 --> 0:02:49.560000 There we are. So that's already installed. 0:02:49.560000 --> 0:02:54.620000 Let's see if we can actually, yeah, we actually have the web version, 0:02:54.620000 --> 0:02:58.420000 but in this case, we're only going to be utilizing the command line interface. 0:02:58.420000 --> 0:03:04.580000 So the next step, of course, is copying or downloading a website that 0:03:04.580000 --> 0:03:07.300000 you'd like to have locally for analysis. 0:03:07.300000 --> 0:03:11.740000 So a good example of this is something like, you know, for example, 0:03:11.740000 --> 0:03:14.820000 DigiNinja, we can just use a very simple example here. 0:03:14.820000 --> 0:03:20.420000 So for example, if we copied DigiNinja or we wanted to use this, we can 0:03:20.420000 --> 0:03:24.500000 easily do that. So if we wanted to copy a site, the first thing you need 0:03:24.500000 --> 0:03:26.660000 to do is create a folder for that site. 0:03:26.660000 --> 0:03:27.480000 I'm already on my desktop. 0:03:27.480000 --> 0:03:31.620000 So I'm going to create a folder called, we can just call it zonetransfer 0:03:31.620000 --> 0:03:35.440000 here. And I'll navigate into it. 0:03:35.440000 --> 0:03:38.600000 You don't actually need to, but it's really very simple.
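
A minimal shell sketch of the setup steps described above, assuming a Debian-based system such as Kali where the package is named `httrack`. The function names `ensure_httrack` and `make_mirror_dir` are hypothetical helpers introduced here for illustration; they are not part of HTTrack.

```shell
# Install HTTrack only when the `httrack` binary is not already on the PATH.
# Assumes apt-get and sudo are available (Debian, Ubuntu, Kali).
ensure_httrack() {
    if command -v httrack >/dev/null 2>&1; then
        echo "httrack is already installed"
    else
        sudo apt-get update && sudo apt-get install -y httrack
    fi
}

# Create a working folder for one mirrored site and print its path.
# Usage: make_mirror_dir DIRECTORY
make_mirror_dir() {
    mkdir -p "$1" && echo "$1"
}
```

Calling `ensure_httrack` followed by `make_mirror_dir zonetransfer` covers the install check and the folder creation from the demonstration; nothing runs until the functions are called.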
0:03:38.600000 --> 0:03:43.500000 So what you need to do or what you need to run fundamentally is just, 0:03:43.500000 --> 0:03:46.280000 you know, httrack. 0:03:46.280000 --> 0:03:51.640000 And you then specify the website that you'd like to download or copy. 0:03:51.640000 --> 0:03:55.500000 But before we do that, if I open up the help menu for this particular 0:03:55.500000 --> 0:03:59.220000 tool, it gives you some really cool examples of how 0:03:59.220000 --> 0:04:00.060000 this can be used. 0:04:00.060000 --> 0:04:03.820000 So in this case, if you just want to mirror a site, you can do it that 0:04:03.820000 --> 0:04:08.500000 way. If you want to output, you know, if you want to actually store your 0:04:08.500000 --> 0:04:11.720000 results, which is very important, you just need to specify where you want 0:04:11.720000 --> 0:04:16.900000 to store it. You also have the ability to mirror two sites. 0:04:16.900000 --> 0:04:22.800000 And in this particular case, only accept JPG files on .com sites. 0:04:22.800000 --> 0:04:24.500000 So again, that's one way. 0:04:24.500000 --> 0:04:29.580000 The second way, of course, is to specify your custom link depth, which 0:04:29.580000 --> 0:04:30.340000 is very important. 0:04:30.340000 --> 0:04:33.660000 If you don't want to download everything on the website, or that's being 0:04:33.660000 --> 0:04:35.860000 hosted rather on the domain. 0:04:35.860000 --> 0:04:38.000000 So there we are, that means get all files. 0:04:38.000000 --> 0:04:43.020000 And you can specify a custom extension, or rather file name, if you will. 0:04:43.020000 --> 0:04:48.280000 The other examples are, of course, to run the spider, but we'll be covering 0:04:48.280000 --> 0:04:50.120000 spidering when we get there.
0:04:50.120000 --> 0:04:55.640000 So if we wanted to again, download zonetransfer.me, and again, not 0:04:55.640000 --> 0:05:01.100000 specify any link depth here, what we would do is we would say 0:05:01.100000 --> 0:05:04.080000 httrack. 0:05:04.080000 --> 0:05:08.700000 If I can type that in, and we just specify the website. 0:05:08.700000 --> 0:05:13.540000 So in this case, we can say, www.zonetransfer.me. 0:05:13.540000 --> 0:05:17.400000 In this case, let's see whether this works, we'll save it under zonetransfer.me. 0:05:17.400000 --> 0:05:20.120000 That's the folder where we want to save it. 0:05:20.120000 --> 0:05:21.600000 So we'll give it a few seconds. 0:05:21.600000 --> 0:05:23.180000 There we are, looks like it's done. 0:05:23.180000 --> 0:05:27.140000 So if we navigate into where we stored the actual website, you can see 0:05:27.140000 --> 0:05:32.320000 that it downloads, you know, all the favicons, the index file, and then 0:05:32.320000 --> 0:05:37.420000 under zonetransfer.me, you can see it actually just downloads the index 0:05:37.420000 --> 0:05:41.260000 file. So if we take a step back, and we cat the contents of the index.html 0:05:41.260000 --> 0:05:48.100000 file, it should have downloaded the actual index, the actual index.html 0:05:48.100000 --> 0:05:49.700000 file on the website. 0:05:49.700000 --> 0:05:51.840000 So let's open this up. 0:05:51.840000 --> 0:05:54.380000 And I just want to highlight a few important things. 0:05:54.380000 --> 0:05:58.580000 So I'll head over to my desktop, and zonetransfer.me. 0:05:58.580000 --> 0:06:00.280000 We can actually open this up in our browser. 0:06:00.280000 --> 0:06:02.180000 And remember, this is stored locally. 0:06:02.180000 --> 0:06:05.100000 Now what you can see is happening here is there's a redirect. 0:06:05.100000 --> 0:06:06.540000 And that's what I wanted to highlight.
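
The basic mirror run described above can be sketched as follows. The exact flags typed on screen are not captured in the transcript, so the use of HTTrack's `-O` option for the output folder is an assumption about what was run; `mirror_site` and the `DRY_RUN` variable are hypothetical names introduced here so the command can be previewed without contacting the site.

```shell
# Mirror one site into an output folder.
# Usage: mirror_site URL OUTPUT_DIR
# With DRY_RUN=1 the command is only printed, so nothing is downloaded.
mirror_site() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "httrack $1 -O $2"
    else
        # -O sets the path where the mirror, logs and cache are written.
        httrack "$1" -O "$2"
    fi
}

# Preview the command for the site used in the demonstration.
DRY_RUN=1 mirror_site "www.zonetransfer.me" "./zonetransfer"
```

Running the function without `DRY_RUN=1` performs the actual download, so it should only be pointed at sites you are permitted to copy.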
0:06:06.540000 --> 0:06:10.060000 So ideally, we would need to run this on digi.ninja. 0:06:10.060000 --> 0:06:11.200000 So let's actually do that. 0:06:11.200000 --> 0:06:14.780000 I'm just going to take a step back here, and we'll create one, just call 0:06:14.780000 --> 0:06:19.080000 it digi. And sorry, make directory digi. 0:06:19.080000 --> 0:06:23.120000 There we are. And we can now say httrack. 0:06:23.120000 --> 0:06:27.260000 And we're just going to say www.digi.ninja. 0:06:27.260000 --> 0:06:30.740000 And I want to output this into the folder called digi. 0:06:30.740000 --> 0:06:33.120000 So I'll hit enter. 0:06:33.120000 --> 0:06:37.340000 And you can now see, let's see whether there is a redirect in this case. 0:06:37.340000 --> 0:06:42.180000 So I'll navigate to digi.ninja or digi rather, and open up the index file. 0:06:42.180000 --> 0:06:45.340000 There we are. So it looks like there is a redirect. 0:06:45.340000 --> 0:06:51.640000 So what we can do in this particular case is now specify link depth. 0:06:51.640000 --> 0:06:57.720000 All right. So if I open up the site itself and open up the index file, 0:06:57.720000 --> 0:06:59.540000 you can see it's the same thing. 0:06:59.540000 --> 0:07:01.860000 This is because there is a redirect. 0:07:01.860000 --> 0:07:05.580000 Now there's a way of handling redirects, which is very, very important. 0:07:05.580000 --> 0:07:09.620000 But if we actually, I'm just going to remove that directory. 0:07:09.620000 --> 0:07:11.220000 So digi, there we are. 0:07:11.220000 --> 0:07:13.240000 And I'll create it again. 0:07:13.240000 --> 0:07:16.920000 We're going to open up httrack. 0:07:16.920000 --> 0:07:19.040000 And I'm just going to open up the help menu. 0:07:19.040000 --> 0:07:22.160000 So I want you to take a look at some of these arguments here. 0:07:22.160000 --> 0:07:25.820000 Now there's tons of other options that you can run with regards to your 0:07:25.820000 --> 0:07:27.520000 own requirements.
0:07:27.520000 --> 0:07:33.740000 And you have the ability to stay on the same directory, or you can actually 0:07:33.740000 --> 0:07:35.960000 go down into subdirectories. 0:07:35.960000 --> 0:07:40.740000 And then more specifically, one second, let me see if I can show you this 0:07:40.740000 --> 0:07:42.540000 here, if I can actually find it. 0:07:42.540000 --> 0:07:55.300000 We're not looking for it. 0:07:55.300000 --> 0:07:56.100000 Yes, there we are. 0:07:56.100000 --> 0:08:01.020000 So we can also again specify custom options, as I mentioned, but the usage 0:08:01.020000 --> 0:08:06.600000 is very simple, the URL, the option, and then the URL filter, and any 0:08:06.600000 --> 0:08:09.700000 MIME or file filters, if you will. 0:08:09.700000 --> 0:08:13.320000 But what you can also do, and I'll show you this right now, is if you 0:08:13.320000 --> 0:08:17.800000 just launch httrack, you can enter a project name. 0:08:17.800000 --> 0:08:20.800000 So in this case, digi ninja. 0:08:20.800000 --> 0:08:27.340000 There we are, the base path, we're going to save this under home, kali, and 0:08:27.340000 --> 0:08:32.640000 we'll save it under the folder called digi right over there, the URLs. 0:08:32.640000 --> 0:08:36.860000 So you can specify more than one. In this particular case, we're just going 0:08:36.860000 --> 0:08:40.440000 to say, you know, digi.ninja. 0:08:40.440000 --> 0:08:44.080000 And in this particular case, let's say this is HTTPS. 0:08:44.080000 --> 0:08:48.120000 So digi.ninja, I'm just going to show you how to use the wizard first. 0:08:48.120000 --> 0:08:53.580000 So you can mirror the site, you can also mirror it with the wizard. 0:08:53.580000 --> 0:08:56.600000 And you can also, you know, just get the files that you've indicated. In 0:08:56.600000 --> 0:09:00.320000 this case, you have not specified any files. You can test links, or you 0:09:00.320000 --> 0:09:01.400000 can mirror all links.
0:09:01.400000 --> 0:09:05.800000 So we can run this with the wizard, the proxy is none. 0:09:05.800000 --> 0:09:07.740000 You can define wildcards. 0:09:07.740000 --> 0:09:11.720000 So you can look for specific file extensions or files with specific extensions, 0:09:11.720000 --> 0:09:15.160000 if you will. So none. Additional options, none. 0:09:15.160000 --> 0:09:18.100000 And yes, we can then launch this here. 0:09:18.100000 --> 0:09:21.880000 And it then gives you the command line argument you can run in this particular 0:09:21.880000 --> 0:09:23.820000 case, so launch, yes. 0:09:23.820000 --> 0:09:27.120000 And we're just going to wait for this to complete here. 0:09:27.120000 --> 0:09:30.560000 So just give it, this is going to take a couple of seconds to a couple 0:09:30.560000 --> 0:09:33.140000 of minutes, depending on the size of the website. 0:09:33.140000 --> 0:09:37.600000 And you can see you also have the ability to specify additional options, 0:09:37.600000 --> 0:09:42.180000 like the recursive, the recurse level. 0:09:42.180000 --> 0:09:46.120000 So you know, if you want to dive deeper into the folders or download, 0:09:46.120000 --> 0:09:50.740000 you know, all of the folders and navigate or essentially let HTTrack 0:09:50.740000 --> 0:09:55.780000 go into any directories and subdirectories, then you can use that option 0:09:55.780000 --> 0:09:58.020000 and I'll show you how to do that as well. 0:09:58.020000 --> 0:10:00.840000 So I'll just wait for this to complete here. 0:10:00.840000 --> 0:10:06.880000 All right, so the site or HTTrack has been copying the site or downloading 0:10:06.880000 --> 0:10:08.540000 it for a couple of minutes now. 0:10:08.540000 --> 0:10:11.240000 And that's because the site is actually quite large, but you can actually 0:10:11.240000 --> 0:10:13.100000 see it going through the process. 0:10:13.100000 --> 0:10:15.100000 Now I'm not going to wait for this to complete.
0:10:15.100000 --> 0:10:19.560000 What I am going to show you is, you know, what has actually 0:10:19.560000 --> 0:10:20.500000 been downloaded. 0:10:20.500000 --> 0:10:25.160000 So in this case, you can see that it's saved under the directory we specified, 0:10:25.160000 --> 0:10:28.060000 which is under digi and digi ninja. 0:10:28.060000 --> 0:10:29.620000 So I'll open this up here. 0:10:29.620000 --> 0:10:32.000000 So this is under kali, digi, digi ninja. 0:10:32.000000 --> 0:10:34.040000 So we firstly have the index file. 0:10:34.040000 --> 0:10:39.200000 So if I open that up, that should again allow us to view the site right 0:10:39.200000 --> 0:10:40.460000 over here locally. 0:10:40.460000 --> 0:10:42.480000 So this is all offline. 0:10:42.480000 --> 0:10:45.960000 However, most importantly, what I want you to take a look at is the actual 0:10:45.960000 --> 0:10:50.320000 digi.ninja folder, which is where all the website files are being stored. 0:10:50.320000 --> 0:10:53.780000 Now, you may be asking yourself, why is this important? 0:10:53.780000 --> 0:10:57.100000 I don't really see the reason as to why we're doing this. Well, 0:10:57.100000 --> 0:11:00.420000 think about it. If you have the local copy of the website, you can actually 0:11:00.420000 --> 0:11:03.840000 start analyzing things like firstly, the directory structure. 0:11:03.840000 --> 0:11:08.920000 So now just based on this, we're able to tell how things are stored on 0:11:08.920000 --> 0:11:09.940000 that actual website. 0:11:09.940000 --> 0:11:13.080000 So we can see we have the blog folder, which is where blog posts are kept. 0:11:13.080000 --> 0:11:15.020000 And these are all the HTML files. 0:11:15.020000 --> 0:11:16.660000 So we can use one as an example. 0:11:16.660000 --> 0:11:18.100000 Hopefully this was downloaded. 0:11:18.100000 --> 0:11:22.300000 There we are. So we've actually copied the entire website or rather all 0:11:22.300000 --> 0:11:25.180000 the content.
And that is actively being done. 0:11:25.180000 --> 0:11:27.640000 But that's the blog folder. 0:11:27.640000 --> 0:11:32.360000 Now what's very important is, of course, to identify files or the upload 0:11:32.360000 --> 0:11:35.600000 directory. So in this case, you can see we're downloading all of these 0:11:35.600000 --> 0:11:39.740000 resources. So they look like archive files and PDFs. 0:11:39.740000 --> 0:11:45.780000 We also have the existence of, if I can take a look at it here, we have 0:11:45.780000 --> 0:11:49.400000 the scripts folder, which in this case gives us some of the JavaScript 0:11:49.400000 --> 0:11:51.180000 libraries that are being used. 0:11:51.180000 --> 0:11:54.360000 So we could potentially try and find vulnerabilities within these JavaScript 0:11:54.360000 --> 0:11:59.020000 libraries. And then we also have the style sheet for the website, the 0:11:59.020000 --> 0:12:00.860000 images directory. 0:12:00.860000 --> 0:12:03.680000 And essentially all of the directories on the website. 0:12:03.680000 --> 0:12:08.220000 So we can see we have the contact page, the about page, the Google Analytics JavaScript 0:12:08.220000 --> 0:12:11.400000 library, the actual manifest, so on and so forth. 0:12:11.400000 --> 0:12:14.960000 So I'm actually going to terminate the copying there, because the site 0:12:14.960000 --> 0:12:16.640000 is quite large. 0:12:16.640000 --> 0:12:20.860000 And if I, you know, just view the properties here, you can see it's already 0:12:20.860000 --> 0:12:26.020000 50 megabytes. So I just wanted to use this as an example as to sort of 0:12:26.020000 --> 0:12:27.680000 demonstrate how this can be done. 0:12:27.680000 --> 0:12:31.260000 And of course, there's many options that you can utilize with HTTrack. 0:12:31.260000 --> 0:12:34.320000 And I definitely recommend going through the wizard.
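
The offline analysis described above (directory structure, then comments in the HTML) might be sketched like this. It assumes GNU grep, as shipped on Kali, for the `-r` and `--include` options; `list_dirs` and `find_html_comments` are hypothetical helper names. HTTrack typically writes its own marker comments into the pages it saves, so the sketch drops lines mentioning HTTrack, which is a heuristic and not an exact filter.

```shell
# Print every directory in a mirrored site, sorted, to show its structure.
# Usage: list_dirs MIRROR_DIR
list_dirs() {
    find "$1" -type d | sort
}

# Print file name, line number and text of lines that open an HTML comment.
# Lines mentioning HTTrack are dropped, since those are usually added by the
# mirroring tool itself and not by the site's authors.
# Usage: find_html_comments MIRROR_DIR
find_html_comments() {
    grep -rn --include='*.html' --include='*.htm' '<!--' "$1" | grep -v 'HTTrack'
}
```

For the mirror in the demonstration, `list_dirs digi` would list folders such as the blog and scripts directories, and `find_html_comments digi` would surface any developer comments worth reading by hand.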
0:12:34.320000 --> 0:12:38.400000 You know, if you want to specify a custom recurse level, which is very 0:12:38.400000 --> 0:12:42.620000 simple to do. And then of course, you have the ability to specify more 0:12:42.620000 --> 0:12:46.340000 than one site. So in my case, because I'm done with this, I'm just going 0:12:46.340000 --> 0:12:50.440000 to delete that. And I'm also going to clear the other, the other folders 0:12:50.440000 --> 0:12:51.320000 that I had created. 0:12:51.320000 --> 0:12:54.260000 So definitely try this out for yourself. 0:12:54.260000 --> 0:12:56.980000 And of course, this is still a passive technique. 0:12:56.980000 --> 0:13:01.180000 However, it could be considered an active technique if you're using this 0:13:01.180000 --> 0:13:05.440000 to plagiarize a website, or if you are a competitor and you're trying 0:13:05.440000 --> 0:13:09.400000 to learn how, you know, you're essentially trying to rip off what, you 0:13:09.400000 --> 0:13:11.560000 know, a company has done. 0:13:11.560000 --> 0:13:15.040000 But again, in your case, you're going to be using it for testing. 0:13:15.040000 --> 0:13:18.740000 I'm not going to cover the definition of wildcards primarily because 0:13:18.740000 --> 0:13:20.940000 that's where fuzzing comes into play. 0:13:20.940000 --> 0:13:22.940000 But it really is very important. 0:13:22.940000 --> 0:13:26.400000 So you have the ability, as I said, if you want to utilize wildcards, 0:13:26.400000 --> 0:13:30.500000 you can specify that you only want to download zip files, or HTML files, 0:13:30.500000 --> 0:13:31.640000 so on and so forth. 0:13:31.640000 --> 0:13:34.300000 So you can actually specify what you're looking for. 0:13:34.300000 --> 0:13:36.940000 Another example is JavaScript libraries. 0:13:36.940000 --> 0:13:40.220000 So if you only wanted to download them, you can also do that as well. 0:13:40.220000 --> 0:13:46.180000 So again, there's a ton of options that you can utilize here.
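
The recurse level and wildcard options mentioned above can be combined on one command line. The sketch below only assembles and prints that command; `build_httrack_cmd` is a hypothetical helper, and the depth and patterns shown are illustrative values, not ones used in the video. It relies on HTTrack's `-rN` depth option and `+pattern` scan rules as shown in the tool's own help examples. Whether other file types are still fetched depends on the remaining rules in effect, so the filter documentation is worth checking before relying on it.

```shell
# Assemble an httrack command with a depth limit and optional scan rules.
# Usage: build_httrack_cmd URL OUTPUT_DIR DEPTH [RULE...]
# The body runs in a subshell so its variables do not leak into the caller.
build_httrack_cmd() (
    url="$1"; out="$2"; depth="$3"
    shift 3
    cmd="httrack $url -O $out -r$depth"
    for rule in "$@"; do
        # Quote each rule so the shell does not expand the wildcard on paste.
        cmd="$cmd '$rule'"
    done
    printf '%s\n' "$cmd"
)

# Depth 2, accepting links that end in .zip or .js.
build_httrack_cmd "www.digi.ninja" "./digi" 2 "+*.zip" "+*.js"
```

Printing the command first makes it easy to review the depth and rules before starting a download that may take several minutes on a large site.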
0:13:46.180000 --> 0:13:49.120000 And as I said, you can do this for more than one site. 0:13:49.120000 --> 0:13:52.940000 And once you're done, you can then analyze it offline, try and identify 0:13:52.940000 --> 0:13:57.080000 comments within the source code, try and identify information leakage, 0:13:57.080000 --> 0:13:57.900000 so on and so forth. 0:13:57.900000 --> 0:14:00.300000 So very, very useful tool. 0:14:00.300000 --> 0:14:04.260000 And that is going to conclude the practical demonstration side of this