1 00:00:00,090 --> 00:00:01,410 Hello, beautiful people. 2 00:00:01,410 --> 00:00:06,660 And welcome to this video where we're going to be learning about the super useful sort command. 3 00:00:06,689 --> 00:00:11,610 Now, the sort command is one of those commands that you can kind of use anywhere to sort data and 4 00:00:11,610 --> 00:00:14,020 make it more ordered and easier to manage. 5 00:00:14,040 --> 00:00:18,540 And over the next few videos, I'm going to be showing you how you can use the sort command to sort 6 00:00:18,570 --> 00:00:20,880 data in a variety of different ways. 7 00:00:20,880 --> 00:00:25,530 And I'll also be showing you some use cases of when this can be really useful, and we'll be building 8 00:00:25,530 --> 00:00:30,300 a super powerful pipeline that will allow us to sort data in tabular format. 9 00:00:30,300 --> 00:00:34,350 So by the end, you're going to be amazing at sorting data and you'll have a really good understanding 10 00:00:34,350 --> 00:00:40,200 of how to use the sort command to sort data in Linux no matter what, no matter whether it's text, numbers, 11 00:00:40,200 --> 00:00:43,170 modified numbers, or even in tables. 12 00:00:43,170 --> 00:00:43,590 Ooh. 13 00:00:43,620 --> 00:00:45,790 So this is going to be a very useful section of the course. 14 00:00:45,810 --> 00:00:48,390 Let's go ahead and jump right into it. 15 00:00:50,180 --> 00:00:50,990 Okay. 16 00:00:50,990 --> 00:00:55,040 So the first thing I should probably say is that I've put three files on my desktop. 17 00:00:55,050 --> 00:01:01,580 I've got words.txt, which contains 100 random words in no particular order. 18 00:01:01,580 --> 00:01:04,519 They're just 100 random words that are just in that file. 19 00:01:04,700 --> 00:01:10,820 And we've got the numbers 0 to 9 text file, which is a file that has loads of numbers in there. 20 00:01:10,820 --> 00:01:14,840 I think 100 numbers, each between zero and nine. 
21 00:01:14,840 --> 00:01:21,560 And we've also got this other file here called numbers.txt, which is, I think, another hundred random 22 00:01:21,560 --> 00:01:22,190 numbers. 23 00:01:22,190 --> 00:01:24,770 But they're in no particular range. 24 00:01:24,980 --> 00:01:27,920 They're just 100 random numbers with no specific limit on them. 25 00:01:27,920 --> 00:01:34,610 And I've made these files available in the resources section as well as some information about how I 26 00:01:34,610 --> 00:01:35,300 created them. 27 00:01:35,300 --> 00:01:40,700 So if you want to grab those and try this stuff out for yourself, then you can go ahead and download 28 00:01:40,700 --> 00:01:41,600 them and give it a go. 29 00:01:41,630 --> 00:01:42,050 Okay. 30 00:01:42,050 --> 00:01:49,130 So let's start off by taking a look at this words.txt, which is a file that contains 100 randomly generated 31 00:01:49,130 --> 00:01:49,790 words. 32 00:01:50,480 --> 00:01:55,970 And as I said, if we just open that up again, it's just got 100 random words in there in no particular 33 00:01:55,970 --> 00:02:00,170 order, no particular length, just 100 randomly generated words. 34 00:02:01,100 --> 00:02:01,580 Okay. 35 00:02:01,580 --> 00:02:07,130 So first of all, how can we sort that file so that it's sorted alphabetically? 36 00:02:07,130 --> 00:02:12,500 So everything that starts with an A is at the top and everything that starts with a Z is at the bottom. 37 00:02:12,590 --> 00:02:17,930 Well, that's actually the default behaviour of the sort command, which is the perfectly named command 38 00:02:18,170 --> 00:02:19,220 for this kind of thing. 39 00:02:19,220 --> 00:02:20,810 So it's really simple to use as well. 40 00:02:20,810 --> 00:02:30,440 If we just type sort words.txt, then we've now got a list of sorted words from A to 41 00:02:30,440 --> 00:02:30,680 Z. 
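The default A to Z behaviour described above can be tried on a small scale. This is only a sketch: the five sample words stand in for the real words.txt, the temporary directory is there so nothing on your desktop is touched, and setting LC_ALL=C is an assumption added so the ordering comes out the same on every system.

```shell
# Assumption: LC_ALL=C fixes the collation so results are reproducible.
export LC_ALL=C

# Hypothetical stand-in for words.txt: five words in no particular order.
workdir=$(mktemp -d)
printf '%s\n' pear apple mango banana cherry > "$workdir/words.txt"

# With no options, sort orders the lines alphabetically, A to Z.
sorted=$(sort "$workdir/words.txt")
printf '%s\n' "$sorted"
```

The original file is left unchanged; sort only writes the ordered lines to standard output.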
42 00:02:30,680 --> 00:02:37,700 So you can see that all the words that start with A are at the top and all the words that start with Y are 43 00:02:37,700 --> 00:02:38,150 at the bottom. 44 00:02:38,150 --> 00:02:40,250 And it doesn't look like we've got anything that starts with a Z. 45 00:02:40,250 --> 00:02:42,260 So they've all been sorted alphabetically. 46 00:02:42,260 --> 00:02:43,160 So that's pretty cool. 47 00:02:43,160 --> 00:02:49,220 And of course we can redirect this standard output to a file or pipe it to the tee command. 48 00:02:49,370 --> 00:02:50,690 We can use that as well. 49 00:02:50,960 --> 00:02:56,900 So, like, we could redirect it into a file here called sorted.txt and that's going to be output there 50 00:02:56,900 --> 00:02:58,820 and we can pipe it down the pipeline. 51 00:02:58,820 --> 00:03:00,590 You know, it's just written on standard output. 52 00:03:00,590 --> 00:03:02,330 All the same rules apply. 53 00:03:03,020 --> 00:03:08,390 So what if, instead of sorting it A to Z, we wanted to sort it in reverse? 54 00:03:08,390 --> 00:03:08,960 How would we sort it 55 00:03:08,960 --> 00:03:10,850 so Z was at the top? 56 00:03:10,850 --> 00:03:12,980 Well, two options come to mind. 57 00:03:13,220 --> 00:03:19,070 First of all, what you could do is you could just sort the words just like normal, but then pipe 58 00:03:19,070 --> 00:03:20,900 that into the tac command. 59 00:03:20,900 --> 00:03:25,850 And as you can see here, what's happened is the words that start with A are now at the 60 00:03:25,850 --> 00:03:30,560 bottom and the words that start closer to Z are now at the top. 61 00:03:30,560 --> 00:03:37,130 And the reason for that is that, as we know, the tac command will flip output vertically. 62 00:03:37,130 --> 00:03:41,720 So what would be at the bottom is now at the top and what was at the top is now at the bottom. 
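Here is a minimal sketch of the redirection, tee, and tac ideas just described. The sample words and the file names sorted.txt and sorted_copy.txt inside a temporary directory are illustrative assumptions, as is LC_ALL=C, which is set only to make the ordering predictable.

```shell
# Assumption: LC_ALL=C fixes the collation so results are reproducible.
export LC_ALL=C

# Hypothetical stand-in for words.txt.
workdir=$(mktemp -d)
printf '%s\n' pear apple mango banana cherry > "$workdir/words.txt"

# Redirect the sorted standard output into a file.
sort "$workdir/words.txt" > "$workdir/sorted.txt"

# tee saves a copy and still passes the data along on standard output
# (discarded here, since nothing further is done with it).
sort "$workdir/words.txt" | tee "$workdir/sorted_copy.txt" > /dev/null

# Sort A to Z, then let tac print the lines from last to first.
reversed=$(sort "$workdir/words.txt" | tac)
printf '%s\n' "$reversed"
```

Because tac only flips the order of the lines it receives, the sorting itself still happens in sort.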
63 00:03:41,720 --> 00:03:44,780 So that's effectively as though we've just reversed the sorting. 64 00:03:44,780 --> 00:03:45,260 Right. 65 00:03:45,710 --> 00:03:50,780 But a better way to do it, I suppose a more built-in and efficient way to do it, would be to just give 66 00:03:50,780 --> 00:03:55,700 the sort command the -r option, and the -r option stands for reverse. 67 00:03:55,700 --> 00:04:00,710 So if I do that, we can see again the letters starting with A are towards the bottom. 68 00:04:00,710 --> 00:04:02,240 The words starting with A are towards the bottom. 69 00:04:02,240 --> 00:04:02,690 Sorry. 70 00:04:02,690 --> 00:04:06,980 And the words starting closer to Z are now at the top. 71 00:04:06,980 --> 00:04:11,750 And now you may notice that when I'm showing you this output that I'm scrolling up and down the shell 72 00:04:11,750 --> 00:04:12,410 quite a bit. 73 00:04:12,410 --> 00:04:14,390 So this is where the less command would come in. 74 00:04:14,390 --> 00:04:19,010 I could just pipe it into the less command to make the output a bit easier to view because now I'm using 75 00:04:19,010 --> 00:04:21,560 the arrow keys instead of scrolling with my mouse pad. 76 00:04:21,560 --> 00:04:24,050 So that's, you know, a bit of a nicer way to deal with things, and press 77 00:04:24,050 --> 00:04:25,430 Q to come out of that. 78 00:04:25,430 --> 00:04:28,070 So just building upon what we did before, so awesome. 79 00:04:28,070 --> 00:04:34,490 Notice how you now have enough building blocks that you can solve the same problem in multiple 80 00:04:34,490 --> 00:04:34,820 ways. 81 00:04:34,820 --> 00:04:38,540 You know that you could now use the -r option to reverse it. 82 00:04:38,540 --> 00:04:43,050 Or if you didn't want to do that, you could also pipe it into the tac command. 83 00:04:43,250 --> 00:04:46,100 This is one great thing about Linux. 
84 00:04:46,130 --> 00:04:51,800 It gives you multiple different ways to solve the same problem because you've got so many different 85 00:04:51,800 --> 00:04:55,670 building blocks, so there's plenty of room for creativity in the way that you approach these problems. 86 00:04:55,670 --> 00:04:58,580 There's not necessarily one right way to do it. 87 00:04:58,580 --> 00:05:00,530 So there we are. Now, 88 00:05:00,530 --> 00:05:02,690 how about sorting numbers? 89 00:05:02,690 --> 00:05:07,280 Well, I have a list here of 100 numbers called numbers.txt. 90 00:05:07,370 --> 00:05:13,010 And if we take a look, there's 100 numbers in there that don't have any particular size limit. 91 00:05:13,010 --> 00:05:15,800 They can be any number, but there's 100 of them. 92 00:05:15,800 --> 00:05:16,120 Okay. 93 00:05:16,220 --> 00:05:18,620 So how would we sort those? 94 00:05:19,370 --> 00:05:21,260 Well, sorting with letters, 95 00:05:21,260 --> 00:05:27,320 so sorting A to Z, sorting alphabetically, is different from sorting numbers, which is sorting numerically. 96 00:05:27,320 --> 00:05:35,840 So if we try to do sort numbers.txt just like that, you see it didn't really work that well. 97 00:05:35,840 --> 00:05:40,550 What kind of happened here is, if you notice, and we put this into the less command so we can make 98 00:05:40,550 --> 00:05:45,710 it a bit easier to see, all the numbers that start with one have come to the top. 99 00:05:45,710 --> 00:05:49,280 And then, as you see now, we're getting all the numbers that start with two. 100 00:05:49,570 --> 00:05:53,010 And then we're getting all the numbers that start with three and so on. 101 00:05:53,500 --> 00:05:56,560 And then four, five, six, seven, eight and nine. 102 00:05:56,770 --> 00:06:01,500 You can see it's sorting by the first digit, but it's not sorting by the value of that number. 103 00:06:01,510 --> 00:06:09,670 For example, 16,029 is much bigger than 1,641. 
104 00:06:09,670 --> 00:06:11,200 So why is it above it? 105 00:06:11,200 --> 00:06:11,380 Right. 106 00:06:11,440 --> 00:06:14,500 There's no real sense of the size of the number. 107 00:06:14,500 --> 00:06:18,640 It's just the fact that it starts with a number one and it's sorting digit by digit as it goes. 108 00:06:18,640 --> 00:06:22,000 It's sorting as it goes along by just looking at the number. 109 00:06:22,000 --> 00:06:24,700 It's not looking at the value of the number, it's just looking at the digits. 110 00:06:24,700 --> 00:06:31,150 But if you want to sort using the value of the number, you need to give the sort command the -n option. 111 00:06:31,150 --> 00:06:33,690 So the -n option allows it to sort numerically. 112 00:06:33,710 --> 00:06:37,420 When we take a look at that now, you can see that we're getting the smallest numbers at the top. 113 00:06:37,420 --> 00:06:43,480 So we're starting with 123 and then the numbers are getting bigger as they go down. 114 00:06:43,480 --> 00:06:45,760 So that's something important to bear in mind. 115 00:06:46,030 --> 00:06:51,280 Sorting just by the digits is different from sorting by the value of the whole number. 116 00:06:51,280 --> 00:06:55,030 And if you want to sort by the value of the whole number, you need to give it the -n option. 117 00:06:55,030 --> 00:07:00,310 And of course, you can reverse this so that the biggest is at the top by giving the -r option 118 00:07:00,310 --> 00:07:01,330 as well for reverse. 119 00:07:01,330 --> 00:07:06,670 So now we see that the biggest numbers are at the top and as we scroll down, the numbers get smaller. 120 00:07:06,670 --> 00:07:07,930 So that's very, very nice. 121 00:07:08,290 --> 00:07:14,020 So you've seen how to sort text alphabetically and you've seen how to sort numbers numerically. 
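The difference between sorting by digits and sorting by value can be seen with a handful of numbers. This is a sketch with made-up sample data standing in for numbers.txt; LC_ALL=C is an assumption added so the digit-by-digit ordering is the same everywhere.

```shell
# Assumption: LC_ALL=C fixes the collation so results are reproducible.
export LC_ALL=C

# Hypothetical stand-in for numbers.txt.
workdir=$(mktemp -d)
printf '%s\n' 1641 9 16029 123 250 > "$workdir/numbers.txt"

# No options: lines are compared character by character, so 16029 lands
# above 1641 because "0" comes before "4" in the third position.
by_digits=$(sort "$workdir/numbers.txt")

# -n compares the numeric value of each line, smallest first.
by_value=$(sort -n "$workdir/numbers.txt")

# -n and -r together put the largest value at the top.
largest_first=$(sort -nr "$workdir/numbers.txt")

printf 'by digits:     %s\n' "$(printf '%s' "$by_digits" | tr '\n' ' ')"
printf 'by value:      %s\n' "$(printf '%s' "$by_value" | tr '\n' ' ')"
printf 'largest first: %s\n' "$(printf '%s' "$largest_first" | tr '\n' ' ')"
```

Short options can be written together, so -nr here is the same as giving -n and -r separately.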
122 00:07:14,260 --> 00:07:19,990 But one thing that's very useful is to only show unique results rather than the same result over and 123 00:07:19,990 --> 00:07:20,890 over and over again. 124 00:07:20,890 --> 00:07:24,580 So we've got a file here, the numbers 0 to 9 text file. 125 00:07:24,610 --> 00:07:31,050 And in there there's 100 rows of data, but the rows only contain one of the numbers from 0 to 126 00:07:31,050 --> 00:07:31,330 9. 127 00:07:31,330 --> 00:07:33,790 So you'll necessarily have some duplicates. 128 00:07:33,790 --> 00:07:36,280 For example, here we've got 0, 0, 0. 129 00:07:36,820 --> 00:07:38,620 It's not just unique results. 130 00:07:38,620 --> 00:07:41,560 There are results that are repeated. 131 00:07:41,980 --> 00:07:48,160 So if we take a look in here and we try to sort that, which is just sort and then the numbers 0 to 9 text file, so let's 132 00:07:48,160 --> 00:07:49,180 pipe it into less. 133 00:07:49,180 --> 00:07:52,990 You'll see that all the zeros come first, then all the ones and all the twos and all the threes and 134 00:07:52,990 --> 00:07:55,780 all the fours, the fives, the sixes, the sevens, the eights and the nines. 135 00:07:56,020 --> 00:07:57,550 And that's a bit annoying. 136 00:07:57,550 --> 00:08:00,550 Okay, maybe we just want to have each result once. 137 00:08:00,550 --> 00:08:06,760 Well, to do that, what you do is you give the sort command the -u option, and the -u 138 00:08:06,760 --> 00:08:08,740 option stands for unique. 139 00:08:08,740 --> 00:08:13,330 And when you run that now you'll see that we only get the results once. 140 00:08:13,630 --> 00:08:16,150 So let me just set up piping that into less. 141 00:08:16,150 --> 00:08:16,900 There you go. 142 00:08:16,900 --> 00:08:21,520 You can see that you only get the results once, so 0 to 9, and this is useful. 143 00:08:21,520 --> 00:08:25,270 Sometimes you don't want to have the same result repeated over and over and over again. 
144 00:08:25,270 --> 00:08:27,700 You only want the results to appear once. 145 00:08:27,700 --> 00:08:31,840 And that's why you would use the -u option, which allows you to sort uniquely. 146 00:08:31,840 --> 00:08:36,850 And of course this still works with other options such as reverse, so now we're sorting backwards, 147 00:08:36,850 --> 00:08:39,970 and you can still sort numerically and whatever else you might need to do. 148 00:08:39,970 --> 00:08:44,290 But that's just how you can make sure that you only get each of the results once. 149 00:08:45,300 --> 00:08:49,110 So in this video you've had a bit of an introduction to the sort command, and you've seen that you 150 00:08:49,110 --> 00:08:52,320 can sort data using the sort command. 151 00:08:52,320 --> 00:08:54,750 Now, the sort command tends to sort 152 00:08:54,780 --> 00:08:56,370 smallest first. 153 00:08:56,370 --> 00:09:02,550 So it'll tend to sort from A to Z or from 0 to 9 by default. 154 00:09:03,030 --> 00:09:05,910 But you can reverse this by using the -r option. 155 00:09:05,910 --> 00:09:07,440 So the -r option stands for reverse. 156 00:09:07,440 --> 00:09:12,720 And if you're ever sorting numerical data and you want the sort command to take into account the actual 157 00:09:12,720 --> 00:09:18,600 value of the number rather than just the digits that it contains, then make sure that you give the 158 00:09:18,600 --> 00:09:22,080 sort command the -n option to allow it to sort numerically. 159 00:09:22,080 --> 00:09:27,240 And if you only want to get unique results out of the sort command, provide it with the -u option, which 160 00:09:27,240 --> 00:09:29,970 will show each distinct result only once. 161 00:09:29,970 --> 00:09:31,500 So that's what we covered in this video. 
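The three options recapped above can be put side by side in one short sketch. The sample data and the file names words.txt and digits.txt in a temporary directory are illustrative assumptions, not the course files, and LC_ALL=C is set only to keep the ordering predictable.

```shell
# Assumption: LC_ALL=C fixes the collation so results are reproducible.
export LC_ALL=C

# Hypothetical sample data: a few words, and digits with duplicates.
workdir=$(mktemp -d)
printf '%s\n' pear apple mango > "$workdir/words.txt"
printf '%s\n' 3 0 7 3 0 0 9 7 > "$workdir/digits.txt"

# -r reverses the order; for this data it matches a plain sort piped into tac.
with_r=$(sort -r "$workdir/words.txt")
with_tac=$(sort "$workdir/words.txt" | tac)

# -u prints each distinct line once instead of repeating the duplicates.
unique_rows=$(sort -nu "$workdir/digits.txt")

# The options combine: numeric, reversed, and unique in one command.
unique_reversed=$(sort -nru "$workdir/digits.txt")

printf '%s\n' "$unique_rows"
```

Note that -u keeps one copy of every value, including values that were repeated; it does not drop them.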
162 00:09:31,500 --> 00:09:36,000 But in the next video, we're going to take it to the next level and show you some advanced sorting techniques 163 00:09:36,000 --> 00:09:39,600 that will allow you to sort data that comes in table format. 164 00:09:39,600 --> 00:09:43,920 And we're also going to be building a really cool pipeline that will do some of this as well. 165 00:09:43,920 --> 00:09:48,360 So let's just go ahead and get right into it in the next video and take your sorting skills to the next 166 00:09:48,360 --> 00:09:48,870 level.