Microsoft File Server Redundancy Options
[Last updated: 11-Oct-24] I added more explanation on automated disaster recovery.
Background
This article was written on 12-Dec-23 as a companion to my other article on how to configure Microsoft Storage Replica, which became my choice for file server redundancy.
Even in a small organization, redundancy is something we should all pursue for a relatively calm, relaxed working environment.
Downtime is still inevitable given the limited options (read: budget) that we face, but at least we can keep the time to recover to a minimum.
Precautions
This article covers only Microsoft technology, although alternatives such as Syncthing or scheduled file copy tools are also available under Linux.
Prerequisites
I am assuming that you have at least one on-premises server and one more server, either on-premises or in the cloud.
On the software side, you need to be familiar with Microsoft Distributed File System (DFS), which includes the DFS Namespaces (“distributed namespace”) technology.
Review
Below is a summary of the subjects that concern me when selecting a file server redundancy option.
The objective of file server redundancy is to have all files (and network shares) available as soon as possible in case of a (God forbid) disaster.
Locked Files Handling
Many options cannot handle files that are locked or held open by a user.
For me, these locked files are the most recent files that users are working on, and I need to make them redundant. So this feature is very important to me.
Real Time Sync
Even after users have closed the (recent) files they were using, many options cannot replicate or synchronize the files in real time. As a result, the most recently updated files may not be made redundant.
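To illustrate the gap, here is a minimal Python sketch of a scheduled (not real-time) sync. It is only a toy model under my own assumptions: the sync runs instantly at fixed intervals, times are in minutes, and the file names and the function unreplicated_files are hypothetical.

```python
def unreplicated_files(last_saved, sync_interval, failure_time):
    """Return the files saved after the last sync that completed before the failure.

    Assumption: a sync runs (instantly) at every multiple of sync_interval.
    All times are in minutes since an arbitrary starting point.
    """
    last_sync = (failure_time // sync_interval) * sync_interval
    return sorted(
        name
        for name, saved_at in last_saved.items()
        if last_sync < saved_at <= failure_time
    )


# Hypothetical save times (minutes) for three files.
last_saved = {"report.docx": 90, "budget.xlsx": 130, "invoice.pdf": 170}

# Hourly sync, server fails at minute 175: the last sync ran at minute 120.
lost = unreplicated_files(last_saved, sync_interval=60, failure_time=175)
print(lost)
```

In this model, everything saved after minute 120 exists only on the failed server, so the shorter the interval, the fewer recent files are exposed.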
O/S Level Separation
Virtual machine replication is designed to replicate the entire virtual machine, not a specific virtual disk.
Even if you replicate only the virtual disk that contains the data, that disk is still largely administered by the source server, and it may cause issues or need further administration when attached to another VM.
And if you replicate the entire VM, there is a chance that the O/S itself is corrupt, and you will still be unable to start the replica.
After all, why would the original server be unavailable? Quite possibly because its O/S is having trouble.
Automatic Disaster Recovery
I would say that automatic disaster recovery is not workable.
What I mean by automatic disaster recovery is having multiple file servers (shared folders) enabled at the same time. In theory, if one server goes down, users are redirected to the DFS Namespace shared folders that are still enabled.
In practice, however, DFS Namespace distributes user access across all enabled servers. So users who open the same file at the same time may actually be working on different copies of it on different servers.
This results in chaos: when the copies are synchronized, the two saved versions conflict and the most recent one wins. The updates made by the earlier user are gone.
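The conflict can be sketched in a few lines of Python. This is only a toy model of "the most recent save wins", not actual DFS code; the FileVersion class, the logical timestamps, and the file content are my own assumptions, while FS2 and FS5 are the two servers from my setup.

```python
from dataclasses import dataclass


@dataclass
class FileVersion:
    content: str
    saved_at: int  # logical timestamp of the save (assumption: higher = later)


def last_writer_wins(a: FileVersion, b: FileVersion) -> FileVersion:
    """Keep whichever copy was saved most recently; the other copy is discarded."""
    return a if a.saved_at >= b.saved_at else b


# Both servers start with identical copies of the same file.
original = FileVersion("budget v1", saved_at=0)
replicas = {"FS2": original, "FS5": original}

# The namespace sends user A to FS2 and user B to FS5; each edits their own copy.
replicas["FS2"] = FileVersion(original.content + " + A's edit", saved_at=1)
replicas["FS5"] = FileVersion(original.content + " + B's edit", saved_at=2)

# Synchronization later reconciles the two copies by keeping the newest one.
winner = last_writer_wins(replicas["FS2"], replicas["FS5"])
replicas = {server: winner for server in replicas}

print(winner.content)
```

After reconciliation both servers hold user B's version, and user A's edit no longer exists anywhere.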
All in all, I am left with Microsoft Storage Replica, which I have already been using for several months.
DFS Namespace
I need to discuss this subject as it is the key technology that allows us to have file server redundancy.
Even if your files are redundant, if the shared folders have not been created on the standby server you will need a lot of time to recreate them (although this can be done with a script).
With DFS Namespace, we can define multiple locations of shared folders.
In my example, I have a shared namespace folder for “Accounting” which is configured on two servers (FS2 and FS5).
Please note that I disabled one of the namespace servers (FS5). See the Automatic Disaster Recovery section above for the explanation.
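The idea of keeping one location disabled can be modelled with a short Python sketch. This is a simplified illustration, not the DFS API: the classes FolderTarget and NamespaceFolder, the method names, and the UNC share paths are hypothetical, and I am assuming that a disabled location is simply never offered to users.

```python
from dataclasses import dataclass, field


@dataclass
class FolderTarget:
    unc_path: str
    enabled: bool = True


@dataclass
class NamespaceFolder:
    name: str
    targets: list[FolderTarget] = field(default_factory=list)

    def referrals(self) -> list[str]:
        """Return only the locations that users may be sent to."""
        return [t.unc_path for t in self.targets if t.enabled]

    def set_enabled(self, unc_path: str, enabled: bool) -> None:
        """Enable or disable one location by its path."""
        for target in self.targets:
            if target.unc_path == unc_path:
                target.enabled = enabled
                return
        raise KeyError(unc_path)


# "Accounting" exists on both servers, but FS5 is kept disabled in normal operation.
accounting = NamespaceFolder(
    "Accounting",
    [
        FolderTarget(r"\\FS2\Accounting"),
        FolderTarget(r"\\FS5\Accounting", enabled=False),
    ],
)
normal = accounting.referrals()

# Manual recovery: FS2 is down, so swap which location is enabled.
accounting.set_enabled(r"\\FS2\Accounting", False)
accounting.set_enabled(r"\\FS5\Accounting", True)
after_failover = accounting.referrals()

print(normal, after_failover)
```

Because only one location is enabled at any time, every user works on the same copy, and the switch to the other server is a deliberate manual step.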
Failover Clustering
Please see the references below for this technology:
a. Two node file server.
https://learn.microsoft.com/en-us/windows-server/failover-clustering/deploy-two-node-clustered-file-server
b. Storage Spaces Direct
This involves creating “Storage Spaces Direct” in a VM (which requires two virtual disks).
https://learn.microsoft.com/en-us/windows-server/storage/storage-spaces/storage-spaces-direct-in-vm
A cluster is a group of servers (nodes) that work together. Failover clustering means that when one node fails, another node in the same cluster takes over its workload, so it requires at least two nodes.
This is a “luxury” setup for small businesses, something which most of us cannot afford.
After reading these references, it seems that Storage Replica is intended more for replication between clusters than within a cluster.
Still, Storage Replica also works between individual servers; they do not have to be clusters.
The main drawback of Storage Spaces Direct, and the reason it is not suitable for small businesses, is that it requires two virtual disks on a fast storage device.
It unnecessarily consumes storage space, since the hypervisor most likely already has a RAID setup.
It is also uncommon nowadays to have a physical server dedicated to just one function.
Alright, have fun with your file server(s).