1
00:00:01,240 --> 00:00:04,810
What about the common problem we, Windows systems administrators,

2
00:00:04,810 --> 00:00:09,230
face in terms of our file servers having so much wasted or

3
00:00:09,230 --> 00:00:12,190
dead space due to duplicate file storage?

4
00:00:12,190 --> 00:00:13,300
That's a pain, isn't it?

5
00:00:13,300 --> 00:00:15,770
You may have all of your users' home folders.

6
00:00:15,770 --> 00:00:19,000
Just think of all of the identical copies of reports,

7
00:00:19,000 --> 00:00:22,980
PowerPoint files that weigh in at several megabytes in size,

8
00:00:22,980 --> 00:00:26,760
meeting recordings that are hundreds of megabytes or gigabytes in

9
00:00:26,760 --> 00:00:30,240
size being redundantly stored on these servers.

10
00:00:30,240 --> 00:00:31,350
What can we do about it?

11
00:00:31,350 --> 00:00:31,510
Well,

12
00:00:31,510 --> 00:00:34,310
we've got a built‑in feature that's been in Windows Server

13
00:00:34,310 --> 00:00:36,910
for a long time called Data Deduplication,

14
00:00:36,910 --> 00:00:41,440
which offers space savings without sacrificing access performance,

15
00:00:41,440 --> 00:00:44,390
and in the Microsoft docs, you can see some numbers.

16
00:00:44,390 --> 00:00:46,220
They say that for user documents,

17
00:00:46,220 --> 00:00:50,060
you can get 30 to 50% space savings on your file servers;

18
00:00:50,060 --> 00:00:53,710
for virtual hard disks, in other words, for your Hyper‑V VMs,

19
00:00:53,710 --> 00:00:58,150
80 to 95% savings; and then for your binaries,

20
00:00:58,150 --> 00:01:01,320
your deployment shares, 70 to 80% savings.

21
00:01:01,320 --> 00:01:05,580
So you see here that we can do data dedupe not only on user files,

22
00:01:05,580 --> 00:01:07,340
but on binary files as well.

23
00:01:07,340 --> 00:01:08,140
What happens?

24
00:01:08,140 --> 00:01:14,380
Well, in a data dedupe job, when the operating system senses that two artifacts,

25
00:01:14,380 --> 00:01:17,340
two files, are exactly the same through checksumming,

26
00:01:17,340 --> 00:01:21,380
only one of those will be kept, and then pointers will be internally

27
00:01:21,380 --> 00:01:25,040
maintained to wherever that file is referenced on your volume.

28
00:01:25,040 --> 00:01:28,460
There are a few different contexts that you can choose from when

29
00:01:28,460 --> 00:01:31,280
you turn on data deduplication for a disk,

30
00:01:31,280 --> 00:01:33,160
for a virtual disk or volume.

31
00:01:33,160 --> 00:01:36,370
One is the file server context where you're going to work with,

32
00:01:36,370 --> 00:01:39,700
like I mentioned a moment ago, home folders, Work Folders,

33
00:01:39,700 --> 00:01:41,420
software development shares.

34
00:01:41,420 --> 00:01:44,580
Another context is virtual desktop infrastructure.

35
00:01:44,580 --> 00:01:49,440
This is the VHD, or virtual machine virtual hard disk, context.

36
00:01:49,440 --> 00:01:56,000
And then we have the backup context, where you can optimize your backup snapshots by applying deduplication.
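
To make the checksum-and-pointer idea described above concrete, here is a minimal Python sketch. It is a toy model, not how Windows Server implements the feature: it compares whole files held in memory, whereas Windows Data Deduplication works on variable-size chunks inside files on an NTFS or ReFS volume. The function names (`deduplicate`, `read_file`, `space_savings`) and the sample paths are hypothetical, chosen only for illustration.

```python
import hashlib


def deduplicate(files):
    """Toy whole-file deduplication (illustrative only).

    `files` maps a path to its content as bytes. Returns (store, pointers):
    `store` keeps one copy of each distinct content, keyed by SHA-256 digest;
    `pointers` maps every original path to the digest of its content.
    """
    store = {}
    pointers = {}
    for path, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        store.setdefault(digest, content)  # keep only the first copy seen
        pointers[path] = digest            # every path points at that copy
    return store, pointers


def read_file(path, store, pointers):
    """Follow the pointer for `path` back to the single stored copy."""
    return store[pointers[path]]


def space_savings(files, store):
    """Fraction of bytes saved by keeping one copy per distinct content."""
    original = sum(len(content) for content in files.values())
    stored = sum(len(content) for content in store.values())
    return 0.0 if original == 0 else 1 - stored / original


if __name__ == "__main__":
    report = b"quarterly report" * 1000  # hypothetical shared document
    files = {
        "home/alice/report.docx": report,
        "home/bob/report.docx": report,
        "home/carol/notes.txt": b"unique notes",
    }
    store, pointers = deduplicate(files)
    print(len(store), "stored copies for", len(files), "files")
    print(f"space savings: {space_savings(files, store):.0%}")
```

Both copies of the report end up pointing at the same stored content, so reading either path returns the same bytes while the store holds them only once. The savings figure here depends entirely on the made-up sample data and says nothing about what a real volume would achieve.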