Windows Storage Deduplication
Background
This article was written on 27-Nov-2023.
The company I work for is organized into multiple teams of 3–4 staff members each.
Although we do have a “team share” folder, nothing stops users from saving their own copy of a file in their personal folder.
I suspect that there are duplicate files in the company document storage.
The immediate benefit of reducing storage size is shorter backup time and smaller backup size.
The long-term benefit is saving money on storage hardware purchases.
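The suspicion above can be checked roughly before enabling anything. The following is a minimal Python sketch, not part of the original setup, that groups files with identical content by size and then by SHA-256 hash. The function names (find_duplicates, wasted_bytes) are made up for this illustration. Note that Data Deduplication works on chunks inside files, so a whole-file check like this one will likely underestimate what the feature can actually save.

```python
import hashlib
import os
from collections import defaultdict


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def find_duplicates(root):
    """Group files under root that have identical content.

    Returns a list of lists; each inner list holds paths with the same content.
    """
    by_size = defaultdict(list)
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            try:
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                continue  # unreadable or vanished file: skip it

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            try:
                by_hash[file_sha256(path)].append(path)
            except OSError:
                continue
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups


def wasted_bytes(groups):
    """Bytes that would be freed if each group kept only one copy."""
    return sum(os.path.getsize(g[0]) * (len(g) - 1) for g in groups)
```

Calling find_duplicates on the root of the document share (the path is up to you) and passing the result to wasted_bytes gives a first estimate of how much space the duplicate copies occupy.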
References
https://learn.microsoft.com/en-us/windows-server/storage/data-deduplication/overview
Precautions
So far I have not found anything we need to be concerned about.
Windows Data Deduplication just purrs in the background, wakes up on schedule, and takes care of my users' duplicate files.
Prerequisites
Windows Server 2016 or later, with the Data Deduplication feature installed.
A running company storage server and file sharing service.
Configuring Data Deduplication Service
Assuming that the prerequisites are met, open [Server Manager + File and Storage Services].
* Yes, at the time of writing, Data Deduplication is still not available in Windows Admin Center.
Then click on “Volumes”.
Right-click the volume that you want to “deduplicate” and select “configure data deduplication”.
Please refer to the reference above for an explanation of the data deduplication options.
In my case, I want to deduplicate my file server, so I chose “General purpose file server”.
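For readers who prefer scripting over Server Manager, the same step can be sketched with the Enable-DedupVolume PowerShell cmdlet, where the usage type Default corresponds to “General purpose file server”. The Python wrapper below is only an illustration: the helper name enable_dedup_command and the drive letter are assumptions, and this is not how the volume in this article was configured.

```python
import subprocess

# Usage types accepted by Enable-DedupVolume.
USAGE_TYPES = ("Default", "HyperV", "Backup")


def enable_dedup_command(volume, usage_type="Default"):
    """Build the PowerShell call that enables deduplication on one volume."""
    if usage_type not in USAGE_TYPES:
        raise ValueError(f"unknown usage type: {usage_type}")
    script = f"Enable-DedupVolume -Volume '{volume}' -UsageType {usage_type}"
    return ["powershell.exe", "-NoProfile", "-Command", script]


def run_on_server(command):
    """Run the command locally on the file server (elevated session needed)."""
    return subprocess.run(command, capture_output=True, text=True, check=True)
```

On the file server itself, run_on_server(enable_dedup_command("E:")) would enable deduplication for a hypothetical volume E:. Building the command separately from running it makes it easy to print and review before anything is changed.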
Deduplication runs either in the background or on a schedule. You can specify this in “set deduplication schedule”.
Since deduplication is not time-critical for me and I want to limit server load, I chose to run it on a schedule, only after the backup has finished and when nobody has files open (I have a script that shuts down user computers at the office).
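A scheduled optimization window like the one described above can also be created with the New-DedupSchedule cmdlet. The sketch below only builds the command line. The schedule name, start time, duration, and days in the usage note are placeholder values, not the ones used on my server, and the helper name is hypothetical.

```python
WEEKDAYS = (
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
)


def dedup_schedule_command(name, start, duration_hours, days):
    """Build a New-DedupSchedule call for an optimization job window."""
    unknown = [day for day in days if day not in WEEKDAYS]
    if unknown:
        raise ValueError(f"unknown day names: {unknown}")
    if duration_hours <= 0:
        raise ValueError("duration_hours must be positive")
    script = (
        f"New-DedupSchedule -Name '{name}' -Type Optimization "
        f"-Start '{start}' -DurationHours {duration_hours} "
        f"-Days {','.join(days)}"
    )
    return ["powershell.exe", "-NoProfile", "-Command", script]
```

For example, dedup_schedule_command("AfterBackup", "23:00", 5, ["Monday", "Wednesday", "Friday"]) describes a five-hour window starting at 23:00 on three weekdays. Pick a start time that falls after your own backup window.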
The following day, you can check the data deduplication result on the same page (Server Manager).
In my case, the deduplication savings were 43%.
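To make the percentage concrete, here is a small worked example. It assumes the savings rate is the saved space divided by the logical data size before deduplication (used space plus saved space). The byte counts in the usage note are invented round numbers chosen to land on 43%, not measurements from my server.

```python
def savings_rate_percent(saved_space, used_space):
    """Savings as a percentage of the logical (pre-deduplication) size.

    Both arguments must use the same unit, for example bytes or GB.
    """
    logical_size = saved_space + used_space
    if logical_size <= 0:
        raise ValueError("volume holds no data")
    return 100 * saved_space / logical_size
```

With a hypothetical 430 GB saved and 570 GB still used on disk, savings_rate_percent(430, 570) gives 43.0, meaning 1000 GB of files now occupy 570 GB.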
My suspicion that staff were saving their own copies of files (instead of using the team share) was confirmed :).
You may notice the volume name is “Data_Rep”; indeed, I am using another Microsoft technology/service, which is Storage Replica.
Please see my other article on that.
Happy working!