1
00:00:01,240 --> 00:00:04,810
What about the common problem we, Windows systems administrators,

2
00:00:04,810 --> 00:00:09,230
face in terms of our file servers having so much wasted or

3
00:00:09,230 --> 00:00:12,190
dead space due to duplicate file storage?

4
00:00:12,190 --> 00:00:13,300
That's a pain, isn't it?

5
00:00:13,300 --> 00:00:15,770
You may have all of your users' home folders.

6
00:00:15,770 --> 00:00:19,000
Just think of all of the same copies of reports,

7
00:00:19,000 --> 00:00:22,980
PowerPoint files that weigh in at several megabytes in size,

8
00:00:22,980 --> 00:00:26,760
meeting recordings that are hundreds of megabytes or gigabytes in

9
00:00:26,760 --> 00:00:30,240
size being redundantly stored on these servers.

10
00:00:30,240 --> 00:00:31,350
What can we do about it?

11
00:00:31,350 --> 00:00:31,510
Well,

12
00:00:31,510 --> 00:00:34,310
we've got a built‑in feature that's been in Windows Server

13
00:00:34,310 --> 00:00:36,910
for a long time called data deduplication,

14
00:00:36,910 --> 00:00:41,440
which offers space savings without sacrificing access performance,

15
00:00:41,440 --> 00:00:44,390
and in the Microsoft docs, you can see some numbers,

16
00:00:44,390 --> 00:00:46,220
they say that for user documents,

17
00:00:46,220 --> 00:00:50,060
you can get a 30 to 50% space savings on your file servers,

18
00:00:50,060 --> 00:00:53,710
for virtual hard disks, in other words, for your Hyper‑V VMs,

19
00:00:53,710 --> 00:00:58,150
80 to 95% savings, and then for your binaries,

20
00:00:58,150 --> 00:01:01,320
your deployment shares, 70 to 80% savings.

21
00:01:01,320 --> 00:01:05,580
So you see here that we can do data dedupe not only on user files,

22
00:01:05,580 --> 00:01:07,340
but on binary files as well.

23
00:01:07,340 --> 00:01:08,140
What happens?

24
00:01:08,140 --> 00:01:14,380
Well, in a data dedupe job, when the operating system senses that two artifacts,

25
00:01:14,380 --> 00:01:17,340
two files are exactly the same through checksumming,

26
00:01:17,340 --> 00:01:21,380
only one of those will be kept, and then pointers will be internally

27
00:01:21,380 --> 00:01:25,040
maintained to wherever that file is referenced on your volume.

28
00:01:25,040 --> 00:01:28,460
There are a few different contexts that you can choose from when

29
00:01:28,460 --> 00:01:31,280
you turn on data deduplication for a disk,

30
00:01:31,280 --> 00:01:33,160
for a virtual disk or volume.

31
00:01:33,160 --> 00:01:36,370
One is the file server context where you're going to work with,

32
00:01:36,370 --> 00:01:39,700
like I mentioned a moment ago, home folders, work folders,

33
00:01:39,700 --> 00:01:41,420
software development shares.

34
00:01:41,420 --> 00:01:44,580
Another context is virtual desktop infrastructure.

35
00:01:44,580 --> 00:01:49,440
This is the VHD, or the virtual machine virtual hard disk context.

36
00:01:49,440 --> 00:01:56,000
And then we have the backup context, where you can optimize your backup snapshots by applying deduplication.