1 00:00:00,05 --> 00:00:02,01 - [Scott] When we talk about file metadata, 2 00:00:02,01 --> 00:00:04,07 we'll usually think first of things like EXIF data 3 00:00:04,07 --> 00:00:05,08 and authorship data, 4 00:00:05,08 --> 00:00:08,01 things we're accustomed to seeing in various software tools, 5 00:00:08,01 --> 00:00:10,07 like photo editors and word processing software. 6 00:00:10,07 --> 00:00:11,06 And as we've seen, 7 00:00:11,06 --> 00:00:14,06 this kind of metadata is important to understand. 8 00:00:14,06 --> 00:00:16,07 There's another kind of metadata we should be aware of 9 00:00:16,07 --> 00:00:18,03 when working with files, though. 10 00:00:18,03 --> 00:00:20,08 That's the metadata that's stored in the file system, 11 00:00:20,08 --> 00:00:23,00 outside of an actual file's data. 12 00:00:23,00 --> 00:00:24,00 Before we explore this, 13 00:00:24,00 --> 00:00:25,07 let's take a moment to refresh ourselves 14 00:00:25,07 --> 00:00:28,02 about how files are stored on computers. 15 00:00:28,02 --> 00:00:31,08 A file is made up of information encoded into binary bits. 16 00:00:31,08 --> 00:00:34,04 These bits represent whatever the content of the file is, 17 00:00:34,04 --> 00:00:37,05 whether it's text, an image, a video, or whatever. 18 00:00:37,05 --> 00:00:39,00 The bits are organized into bytes, 19 00:00:39,00 --> 00:00:40,05 and the bytes are stored on disk 20 00:00:40,05 --> 00:00:43,08 in little areas called blocks on Unix and Linux type systems 21 00:00:43,08 --> 00:00:45,06 and clusters on Windows.
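As a rough illustration of the block idea described above, the sketch below works out how many fixed-size blocks a file of a given size would occupy. The function name `blocks_needed` and the 4096-byte default are assumptions chosen for the example; real block and cluster sizes vary by file system and by how the volume was formatted.

```python
def blocks_needed(file_size: int, block_size: int = 4096) -> int:
    """Return how many whole blocks are needed to hold file_size bytes.

    block_size defaults to 4096 purely as an illustrative assumption.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    # Integer ceiling division: a partly filled block still uses a whole block.
    return (file_size + block_size - 1) // block_size


if __name__ == "__main__":
    for size in (0, 1, 4096, 4097, 10_000):
        print(f"{size} bytes -> {blocks_needed(size)} block(s)")
```

The point of the sketch is only that storage is handed out in whole units, so even a one-byte file takes up a full block.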
22 00:00:45,06 --> 00:00:48,05 Every file system, regardless of operating system, 23 00:00:48,05 --> 00:00:50,07 uses these fundamental little buckets of data 24 00:00:50,07 --> 00:00:53,06 to store files or groups of them called extents, 25 00:00:53,06 --> 00:00:55,00 and they can be of different sizes 26 00:00:55,00 --> 00:00:56,04 on different file systems, 27 00:00:56,04 --> 00:00:58,04 but these blocks, or clusters, or extents 28 00:00:58,04 --> 00:01:00,02 each have their own address. 29 00:01:00,02 --> 00:01:01,08 When we save a file on disk, 30 00:01:01,08 --> 00:01:04,05 its data takes up one or more little units, 31 00:01:04,05 --> 00:01:06,02 and the addresses of those storage units 32 00:01:06,02 --> 00:01:07,08 are recorded in a data structure 33 00:01:07,08 --> 00:01:10,08 that associates them with other information about the file. 34 00:01:10,08 --> 00:01:13,05 Most importantly, this is where we associate a name, 35 00:01:13,05 --> 00:01:16,02 permissions, and creation and modification times 36 00:01:16,02 --> 00:01:17,09 with the data of a file. 37 00:01:17,09 --> 00:01:19,02 Depending on the file system, 38 00:01:19,02 --> 00:01:21,07 other metadata may be associated with files here, 39 00:01:21,07 --> 00:01:24,02 including ACLs or access control lists 40 00:01:24,02 --> 00:01:26,00 and extended attributes. 41 00:01:26,00 --> 00:01:28,01 On Linux and Unix type file systems, 42 00:01:28,01 --> 00:01:30,01 these records that contain file metadata 43 00:01:30,01 --> 00:01:33,02 and point to disk blocks are called inodes. 44 00:01:33,02 --> 00:01:34,08 On a Windows NTFS drive, 45 00:01:34,08 --> 00:01:36,09 these records are kept as entries in a structure 46 00:01:36,09 --> 00:01:39,06 called the MFT or master file table. 47 00:01:39,06 --> 00:01:41,00 We'll explore both of these approaches 48 00:01:41,00 --> 00:01:44,02 to storing filesystem metadata later in the course.
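To make the metadata record described above a little more concrete, here is a minimal sketch that asks the operating system for a few of the fields it keeps about a file. It uses Python's standard `os.stat` call; the helper name `describe_file` and the choice of fields are assumptions made for this example, and what `st_ino` means differs between Unix-like systems and Windows.

```python
import os
import stat
import tempfile
import time


def describe_file(path):
    """Return a few metadata fields the file system keeps about path."""
    info = os.stat(path)  # reads file system metadata, not the file's contents
    return {
        "inode": info.st_ino,  # inode number on Unix-like systems
        "size": info.st_size,  # length of the file data in bytes
        "permissions": stat.filemode(info.st_mode),  # e.g. "-rw-------"
        "modified": time.ctime(info.st_mtime),  # last modification time
    }


if __name__ == "__main__":
    # Create a throwaway file so the sketch runs anywhere.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello")
    try:
        for field, value in describe_file(tmp.name).items():
            print(f"{field}: {value}")
    finally:
        os.remove(tmp.name)
```

Note that none of these values come from reading the file's bytes; they are all answered from the file system's own records.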
49 00:01:44,02 --> 00:01:46,03 Other file systems may use other structures, 50 00:01:46,03 --> 00:01:48,06 like a FAT or file allocation table 51 00:01:48,06 --> 00:01:50,00 to record basic information 52 00:01:50,00 --> 00:01:52,03 like the name and location of a file, 53 00:01:52,03 --> 00:01:54,03 but often they don't support the richer metadata 54 00:01:54,03 --> 00:01:56,02 that modern file systems handle. 55 00:01:56,02 --> 00:01:58,03 Support for certain characters in file names 56 00:01:58,03 --> 00:02:01,08 and case sensitivity also vary across file systems. 57 00:02:01,08 --> 00:02:04,00 While all file systems support basic metadata, 58 00:02:04,00 --> 00:02:06,08 like name, size, and modification date 59 00:02:06,08 --> 00:02:09,07 and while modern file systems have rich metadata support, 60 00:02:09,07 --> 00:02:11,08 they don't all work in the same way. 61 00:02:11,08 --> 00:02:13,08 This means that copying or moving a file 62 00:02:13,08 --> 00:02:15,06 isn't quite as simple an operation 63 00:02:15,06 --> 00:02:17,01 as it might appear to be. 64 00:02:17,01 --> 00:02:19,00 When a file is copied or moved, 65 00:02:19,00 --> 00:02:21,02 the copy operation includes the metadata 66 00:02:21,02 --> 00:02:23,03 from the file system or metadata store, 67 00:02:23,03 --> 00:02:25,05 along with the actual bytes of the file. 68 00:02:25,05 --> 00:02:27,07 The name, the modification dates, and so on 69 00:02:27,07 --> 00:02:30,01 are translated into whatever values can be stored 70 00:02:30,01 --> 00:02:32,00 in the destination file system. 71 00:02:32,00 --> 00:02:33,03 Though platform-specific data 72 00:02:33,03 --> 00:02:34,08 sometimes doesn't have a place to go 73 00:02:34,08 --> 00:02:37,02 and it temporarily gets separated from its file 74 00:02:37,02 --> 00:02:39,02 or even lost entirely.
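How much metadata travels with a copy also depends on the tool doing the copying. The sketch below, which is an illustration rather than anything shown in the course, copies one file two ways with Python's standard library: `shutil.copy` copies the data and permission bits, while `shutil.copy2` also tries to carry over timestamps. The helper name `compare_copies`, the file names, and the `OLD_MTIME` value are assumptions invented for the example.

```python
import os
import shutil
import tempfile

OLD_MTIME = 1_000_000_000  # arbitrary timestamp from 2001, used as a marker


def read_bytes(path):
    with open(path, "rb") as f:
        return f.read()


def compare_copies(workdir):
    """Copy one file with and without timestamps and report the result."""
    src = os.path.join(workdir, "source.txt")
    with open(src, "w") as f:
        f.write("same bytes either way\n")
    # Backdate the source so a preserved timestamp is easy to spot.
    os.utime(src, (OLD_MTIME, OLD_MTIME))

    plain = shutil.copy(src, os.path.join(workdir, "plain_copy.txt"))
    full = shutil.copy2(src, os.path.join(workdir, "copy_with_metadata.txt"))

    return {
        "same_bytes": read_bytes(src) == read_bytes(plain) == read_bytes(full),
        "plain_mtime": int(os.stat(plain).st_mtime),
        "full_mtime": int(os.stat(full).st_mtime),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        for key, value in compare_copies(workdir).items():
            print(f"{key}: {value}")
```

In both cases the bytes of the file are identical; only the metadata that came along differs.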
75 00:02:39,02 --> 00:02:40,05 As we'll see throughout the course, 76 00:02:40,05 --> 00:02:43,06 the loss of this metadata is often not a big problem 77 00:02:43,06 --> 00:02:45,07 because the bytes of the data itself are preserved 78 00:02:45,07 --> 00:02:47,03 when the file is moved or copied. 79 00:02:47,03 --> 00:02:50,01 However, metadata loss or cross-platform handling 80 00:02:50,01 --> 00:02:52,09 can have some odd effects unless you know what's going on. 81 00:02:52,09 --> 00:02:54,02 And in some situations, 82 00:02:54,02 --> 00:02:56,03 metadata is quite important, 83 00:02:56,03 --> 00:02:58,05 and so we need to consider carefully any action 84 00:02:58,05 --> 00:03:01,07 that might strip or otherwise separate it from its file. 85 00:03:01,07 --> 00:03:03,01 Because filesystem metadata 86 00:03:03,01 --> 00:03:05,05 is stored outside of the files it describes, 87 00:03:05,05 --> 00:03:08,01 we can change many of these metadata aspects of files 88 00:03:08,01 --> 00:03:11,00 without modifying the actual file data. 89 00:03:11,00 --> 00:03:13,01 That's what we do whenever we change permission modes 90 00:03:13,01 --> 00:03:15,00 or set ACLs. 91 00:03:15,00 --> 00:03:17,01 Those changes are made in the file system, 92 00:03:17,01 --> 00:03:19,05 not to the individual files themselves, 93 00:03:19,05 --> 00:03:20,06 and as a consequence, 94 00:03:20,06 --> 00:03:22,08 if we move or copy these files to a file system 95 00:03:22,08 --> 00:03:25,05 that doesn't support permissions, or flags, or whatever, 96 00:03:25,05 --> 00:03:28,00 that particular metadata can be lost. 97 00:03:28,00 --> 00:03:31,03 This can have security implications, as we'll discuss later. 
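The claim that permission changes happen in the file system rather than in the file can be checked with a small sketch. Assuming a POSIX-style system, the code below hashes a file, changes its permission mode with `os.chmod`, and hashes it again. The helper names `sha256_of` and `chmod_and_compare` and the mode `0o640` are assumptions chosen for the example.

```python
import hashlib
import os
import stat
import tempfile


def sha256_of(path):
    """Hash the file's bytes; metadata is not part of this digest."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def chmod_and_compare(path, new_mode=0o640):
    """Change the permission mode and report whether the bytes changed."""
    before = sha256_of(path)
    os.chmod(path, new_mode)  # updates file system metadata only
    after = sha256_of(path)
    mode_now = stat.S_IMODE(os.stat(path).st_mode)
    return before == after, mode_now


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"the data does not change\n")
    try:
        unchanged, mode_now = chmod_and_compare(tmp.name)
        print("bytes unchanged:", unchanged)
        print("mode now:", oct(mode_now))
    finally:
        os.remove(tmp.name)
```

The digest stays the same because the permission bits are not stored among the file's bytes.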
98 00:03:31,03 --> 00:03:32,08 Okay, so it turns out that files 99 00:03:32,08 --> 00:03:34,00 are a little bit more complicated 100 00:03:34,00 --> 00:03:36,00 than they might seem at first, 101 00:03:36,00 --> 00:03:37,06 but the important part to think about here 102 00:03:37,06 --> 00:03:40,05 is that the files themselves, the bytes on the disk, 103 00:03:40,05 --> 00:03:43,01 don't include all this other metadata information 104 00:03:43,01 --> 00:03:46,02 that we, as humans, think about when we think about files. 105 00:03:46,02 --> 00:03:49,00 They're separate even though they're presented as one thing. 106 00:03:49,00 --> 00:03:50,06 The data structures, strategies, 107 00:03:50,06 --> 00:03:52,02 and features of different file systems 108 00:03:52,02 --> 00:03:54,00 are really interesting to learn about, 109 00:03:54,00 --> 00:03:55,08 but that's not the focus of this course. 110 00:03:55,08 --> 00:03:57,00 It's important to recognize 111 00:03:57,00 --> 00:03:58,02 that there are differences, though, 112 00:03:58,02 --> 00:04:00,00 as we'll see in the next videos.