In this article, we’ll discuss several Azure File Sync troubleshooting methods you can use to:
- Ensure AFS is operating properly on your server.
- Receive support from AFS engineers (utilizing the AFSDiag tool).
- Monitor sync progress and identify sync errors.
- Troubleshoot cloud-tiering errors.
We’ll also discuss our own file sync solution, Resilio Connect, and how it provides superior replication that’s easier to monitor and troubleshoot.
Resilio Connect is an alternative to AFS and DFSR that, in addition to providing faster replication, provides a centralized Management Console that makes managing, monitoring, and troubleshooting replication easier. It’s a highly reliable (i.e., fault tolerant) replication solution that natively supports Azure Files, Azure Blobs, and other Azure storage services. It also supports multi-cloud and hybrid cloud deployments.
Resilio Connect is an agent-based replication solution that uses peer-to-peer file transfer and WAN optimization technology to provide fast, resilient, scalable replication: it places no limits on file size and can sync 400+ million files in a single job. It is designed to work over any network.
Resilio is a vendor-agnostic solution that natively supports any cloud storage provider (including Azure), multiple operating systems, and any deployment scenario (i.e., on-prem, cloud, or cloud-hybrid).
Organizations in gaming (2K Games, Blizzard), media (CBS, Warner Brothers), tech (Match.com, Microsoft), retail (Otto Bros, Mercedes-Benz), and more use Resilio Connect to synchronize their data and enhance business workflows.
After explaining common problems with Azure File Sync and their solutions, we’ll discuss Resilio’s:
- Simplicity and ease of management
- Fast, scalable file replication
- Superb performance over long-range, wide area networks
- Flexible, vendor-agnostic deployment
- Superior security
To learn more about synchronizing with Resilio Connect, schedule a demo.
Addressing General Issues with AFS on a Server
When trying to resolve sync, cloud tiering, or other issues with AFS on your registered servers, Microsoft strongly advises against removing and recreating the server endpoint (which many users may be inclined to try). Doing so can result in data loss and inaccessible files — issues that won’t be resolved when the server endpoint is recreated.
Instead, perform the following steps to ensure AFS is installed correctly on your server:
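- Go to the Event Viewer and review the operational, diagnostic, and telemetry event logs. Sync, tiering, and recall issues are logged under “Applications and Services\Microsoft\FileSync\Agent”. Server management issues (e.g., configuration settings) are logged in the operational and diagnostic event logs under “Applications and Services\Microsoft\FileSync\Management”.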
- Verify that the Azure File Sync service (FileSyncSvc) is running on the server.
- Then verify that the AFS filter drivers are running by running fltmc at an elevated command prompt. Confirm that the StorageSync.sys and StorageSyncGuard.sys file system filter drivers are listed.
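If you run this check on many servers, the filter driver step can be scripted. The following is a minimal sketch, not an official tool: it assumes fltmc prints one filter per line with the filter name in the first column, and that the names may appear without the ".sys" suffix. The function names are hypothetical.

```python
import subprocess

# Driver names come from the step above. Assumption: fltmc may list them
# without the ".sys" suffix, so the suffix is stripped before comparing.
REQUIRED_FILTERS = {"storagesync", "storagesyncguard"}


def missing_filters(fltmc_output: str) -> set[str]:
    """Return the required filter names that do not appear in fltmc output."""
    seen = set()
    for line in fltmc_output.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        name = tokens[0].lower()
        if name.endswith(".sys"):
            name = name[:-4]
        seen.add(name)
    return REQUIRED_FILTERS - seen


def check_server() -> set[str]:
    """Run fltmc (Windows only, elevated prompt) and report missing filters."""
    result = subprocess.run(["fltmc"], capture_output=True, text=True, check=True)
    return missing_filters(result.stdout)
```

An empty set from check_server() means both filter drivers were listed. Anything else names the driver that needs attention.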
If this does not resolve the issue, you’ll need to use the AFSDiag tool to get a readout of your system that can be sent to Azure engineers so they can better diagnose and troubleshoot Azure File Sync problems.
To use the AFSDiag tool, run the following PowerShell commands:
cd "c:\Program Files\Azure\StorageSyncAgent"
Import-Module .\afsdiag.ps1
Debug-AFS -OutputDirectory C:\output -KernelModeTraceLevel Verbose -UserModeTraceLevel Verbose
Be sure to specify an output location. AFS will create a .zip file that contains logs and trace files and save it in the designated output location. Then send the .zip file to your designated support engineer.
Monitoring Sync Status and Progress
You can monitor sync status and identify any sync issues within each sync group in your Azure portal.
Check Last Completed Sync Job
Check your last completed sync job to see if any files failed to sync. The “Files Not Syncing” value should be 0 if everything synced without issue. Otherwise, it will display the number of files that failed to sync.
Monitor Current Sync Progress
You can monitor the status of a sync job currently in progress by checking the Sync Activity section of your sync group. It will display how many files have been uploaded and downloaded in the current sync session (though this status will be delayed by roughly 5 minutes).
Identify Files That Failed to Sync
To see which specific files failed to sync, run the FileSyncErrorsReport.ps1 script in PowerShell. This will reveal files that failed to sync due to unsupported characters, open handles, or other issues, as well as the location of the file in relation to the root directory.
For a full list of common sync errors and how to resolve them, visit Microsoft’s troubleshooting page.
Troubleshooting Cloud Tiering Issues
AFS’ cloud tiering feature enables you to keep frequently accessed files cached on your local server for quicker access, while less frequently accessed files are tiered to Azure Files.
Issues can occur when a file fails to tier (i.e., it isn’t successfully moved to Azure Files and stays fully stored on the local server) or when a file fails to recall (i.e., a tiered file fails to download from Azure Files when a user attempts to access it on the server).
For a full list of common cloud tiering errors and how to resolve them, visit Microsoft’s troubleshooting page.
Note: Resilio doesn’t offer cloud-tiering like AFS; it’s simply not necessary. Instead, Resilio offers lightning-fast sync and provides file caching (where files are stored locally in a dehydrated state that doesn’t consume excessive disk space). End users get a centralized view of your entire file system and can easily access locally stored files on-demand.
Superior Sync with Resilio Connect
Resilio Connect is a high-performance file sync solution built on two technologies: a proven P2P replication architecture, which enables it to provide fast (20+ Gbps per server), resilient, highly scalable replication in any environment, and WAN optimization technology, which enables it to maximize bandwidth usage and provide fast file transfer over any network.
Resilio Connect is easier to deploy and manage than AFS. It’s a vendor-agnostic, multi-cloud solution that runs on popular servers (in virtual machines, containers, or on physical systems) and uses any type of storage (NAS, SAN, DAS, or object storage).
Resilio can be deployed cross-platform, on-prem, in cloud, or in cloud-hybrid scenarios. It’s easier to manage than AFS, doesn’t require sync groups or complex management tasks, and enables full control over replication from a single, unified dashboard.
Simplicity & Ease of Management
With AFS, managing larger sync jobs can become complex, requiring you to split large syncs into multiple namespaces and sync groups. But Resilio’s Management Console makes it easy to create, visualize, and automate sync jobs.
After you’ve installed Resilio agents on your servers, you simply need to add servers and shares to your sync jobs. The Management Console provides a centralized view of your entire sync environment and enables you to monitor sync status with real-time notifications.
You can also use it to control replication parameters, such as disk I/O threads, packet size, data hashing, bandwidth utilization (you can create schedules that control how much bandwidth a file server can utilize at a certain time of day or day of the week), and more. And you can automate sync jobs and script any type of functionality using Resilio’s REST API.
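To make the idea of a bandwidth schedule concrete, here is a minimal sketch of how a time-of-day and day-of-week rule could be evaluated. It does not use Resilio’s Management Console or REST API: the BandwidthRule class, the limit_for function, and the example policy are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BandwidthRule:
    days: frozenset   # weekday numbers, Monday=0 ... Sunday=6
    start_hour: int   # inclusive, 0-23
    end_hour: int     # exclusive, 1-24
    limit_mbps: int   # cap applied while the rule is active


def limit_for(rules, when: datetime, default_mbps: int) -> int:
    """Return the cap of the first rule matching `when`, else the default."""
    for rule in rules:
        if when.weekday() in rule.days and rule.start_hour <= when.hour < rule.end_hour:
            return rule.limit_mbps
    return default_mbps


# Hypothetical policy: throttle to 100 Mbps during weekday business hours.
RULES = [BandwidthRule(frozenset(range(5)), 8, 18, 100)]
```

With this policy, a transfer at 09:00 on a Monday would be capped at 100 Mbps, while the same transfer on a Saturday would fall through to the default cap passed to limit_for.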
Case Study: Wargaming
The video game design company Wargaming uses Resilio Connect to distribute video game files of 50+ GB across 20 offices over WANs.
“Resilio met all our technical requirements, but was also very easy to deploy and integrate into our workflow. We also liked the fact that the simplicity carried over to transparent pricing and clear documentation.”
Fast, Organically Scalable P2P Replication
Resilio Connect’s replication architecture is one of the major features that sets it apart from AFS and other replication solutions.
AFS uses a point-to-point file transfer architecture. In point-to-point replication, files are replicated from one server to another sequentially. This type of replication is more than just slow: it can create bottlenecks that impede full synchronization across your entire environment.
For example, if file replication is impeded on one server, the other servers in your environment won’t receive the replicated files until the transfer is complete.
Resilio Connect provides faster, more resilient replication using:
- Peer-to-peer replication
- File chunking
- Real-time replication
In a peer-to-peer replication environment, every server can take part in replication simultaneously. And with file chunking, replicated files are broken down into chunks that can transfer independently of each other.
This means that every server in your environment can work together to concurrently distribute files across your entire system, resulting in replication speeds 3-10x faster than traditional solutions.
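The chunking idea can be sketched in a few lines. This is an illustrative model only, not Resilio’s implementation: the function names are hypothetical and the tiny chunk size is chosen for readability. A file is split into indexed chunks, different peers supply different chunks, and the receiver reassembles them regardless of arrival order.

```python
def split_into_chunks(data: bytes, size: int) -> dict:
    """Map chunk index -> chunk bytes."""
    return {i // size: data[i:i + size] for i in range(0, len(data), size)}


def reassemble(chunks: dict, total: int) -> bytes:
    """Join chunks by index; they may arrive in any order, from any peer."""
    missing = [i for i in range(total) if i not in chunks]
    if missing:
        raise ValueError(f"still waiting for chunks {missing}")
    return b"".join(chunks[i] for i in range(total))


# Example: two peers each hold half of the chunks of the same file.
original = b"replicated file contents"
all_chunks = split_into_chunks(original, size=4)
from_peer_a = {i: c for i, c in all_chunks.items() if i % 2 == 0}
from_peer_b = {i: c for i, c in all_chunks.items() if i % 2 == 1}
rebuilt = reassemble({**from_peer_b, **from_peer_a}, total=len(all_chunks))
```

Because each chunk carries its own index, no single server has to deliver the whole file, which is what lets several peers contribute to one transfer at once.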
Resilio can also detect and replicate file changes in real time by using change notifications from the host operating system and optimized checksum calculations (checksums are identification markers for each file that change whenever the file’s contents change).
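As a rough illustration of checksum-based change detection, the sketch below hashes fixed-size blocks of two versions of a file and reports which blocks differ, so only those blocks would need to be sent. It is an assumption-laden example rather than Resilio’s algorithm: SHA-256, the block size, and the function names are all choices made here for illustration.

```python
import hashlib


def block_digests(data: bytes, size: int) -> list:
    """SHA-256 digest of each fixed-size block of `data`."""
    return [hashlib.sha256(data[i:i + size]).hexdigest()
            for i in range(0, len(data), size)]


def changed_blocks(old: bytes, new: bytes, size: int = 4) -> list:
    """Indices of blocks in `new` that differ from, or do not exist in, `old`."""
    old_d = block_digests(old, size)
    new_d = block_digests(new, size)
    return [i for i, d in enumerate(new_d) if i >= len(old_d) or d != old_d[i]]
```

For example, if only the second block of a file is modified and a new block is appended, changed_blocks returns just those two indices and the untouched blocks are skipped.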
In addition to faster replication, peer-to-peer transfer enables you to seamlessly scale your replication environment to grow along with your replication needs.
Since every server can take part in replication, adding more servers increases transfer speed and bandwidth — enabling Resilio to synchronize hundreds of servers in roughly the same time that traditional solutions would sync two servers.
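To get a feel for why adding peers helps, the sketch below compares two idealized lower bounds on distribution time: a single source uploading a full copy to each server, versus peers that re-share what they have received. These are standard textbook bounds for an idealized network, not measurements of AFS or Resilio. The function names and the example numbers are assumptions chosen for illustration.

```python
def single_source_time(file_bits, n_servers, source_up, min_down):
    """Lower bound when one source must upload a full copy to every server."""
    return max(n_servers * file_bits / source_up, file_bits / min_down)


def p2p_time(file_bits, source_up, peer_ups, min_down):
    """Lower bound when every peer can re-upload the pieces it already has."""
    n_servers = len(peer_ups)
    return max(
        file_bits / source_up,                                 # source sends each bit once
        file_bits / min_down,                                  # slowest downloader
        n_servers * file_bits / (source_up + sum(peer_ups)),   # total upload capacity
    )


# Example: a 1 GB file (8e9 bits), 10 servers, 1 Gbps links everywhere.
GBPS = 1_000_000_000
one_source = single_source_time(8 * GBPS, 10, GBPS, GBPS)
peer_to_peer = p2p_time(8 * GBPS, GBPS, [GBPS] * 10, GBPS)
```

In this idealized example the single-source bound is 80 seconds and the peer-to-peer bound is 8 seconds. The single-source bound grows linearly with the number of servers, while the peer-to-peer bound stays nearly flat because each new server also adds upload capacity.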
According to Microsoft’s deployment guide, AFS has been tested with 100 million files in a single sync job. However, Microsoft recommends keeping sync jobs to no more than 20-30 million files, which requires users to split their namespaces into multiple shares if they exceed these numbers.
Resilio, on the other hand, has been tested with 400+ million files in a single sync job. There are no limits on the file sizes or number of files you can transfer with Resilio. And because of its P2P transfer architecture, there’s no need to engage in any complex transfer schemes to synchronize your servers.
Optimized Transfer over WANs
Thanks to Resilio’s proprietary WAN acceleration protocol, Zero Gravity Transport™ (ZGT), Resilio Connect is an optimal replication solution for organizations that need to sync over high-latency, lossy WANs.
ZGT optimizes WAN transfer using:
- A congestion control algorithm: ZGT uses a congestion control algorithm that periodically probes the RTT (Round Trip Time) of packet transmission in order to constantly calculate and maintain the ideal packet send rate.
- Interval acknowledgements and delayed retransmission: Rather than sending acknowledgements for every packet receipt (like other transfer protocols), ZGT sends acknowledgements for groups of packets that contain additional info about lost packets. ZGT then retransmits lost packets just once per RTT in order to decrease unnecessary retransmissions and increase transfer speed.
- Dynamic rerouting: ZGT is able to dynamically route around slow or downed networks in your environment, enabling you to utilize 100% of your bandwidth and ensure your files always reach their destination.
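ZGT is proprietary, so its actual algorithms are not shown here. The sketch below only illustrates the two general ideas from the list above under simple assumptions: a delay-based controller that lowers the send rate when RTT inflates and probes upward otherwise, and a single acknowledgement that covers a window of packets and lists the gaps. All names, thresholds, and step sizes are hypothetical.

```python
def next_send_rate(rate, base_rtt, sample_rtt, step=0.05, tolerance=1.25,
                   min_rate=1.0):
    """Delay-based adjustment: back off when RTT inflates, probe up otherwise."""
    if sample_rtt > base_rtt * tolerance:
        # RTT well above the baseline suggests queues are building on the
        # path, so scale the rate down in proportion to the inflation.
        return max(min_rate, rate * base_rtt / sample_rtt)
    # Path looks uncongested: increase the rate by a small fraction.
    return rate * (1 + step)


def build_interval_ack(received, first_seq, last_seq):
    """One acknowledgement for a whole window, listing the missing packets."""
    got = set(received)
    missing = [s for s in range(first_seq, last_seq + 1) if s not in got]
    return {"ack_through": last_seq, "missing": missing}


def packets_to_retransmit(ack, already_resent_this_rtt):
    """Resend each lost packet at most once per RTT interval."""
    return [s for s in ack["missing"] if s not in already_resent_this_rtt]
```

The sender would call next_send_rate on each RTT probe, and the receiver would call build_interval_ack once per window instead of acknowledging every packet individually.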
ZGT also enables Resilio users to sync from the edge of networks and across areas with little to no network connectivity (e.g., in remote locations or across the ocean).
Case Study: Northern Marine Group
Northern Marine Group is a ship management and marine services provider that uses Resilio Connect to synchronize data across their fleet of sea vessels.
“In the future, Resilio Connect and its API will vastly improve our database replication mechanism. More importantly, it will allow us to realistically increase the frequency of replication, so that the information passing back and forth comes far more in real-time and we’ll be able to automatically remediate the discrepancies.”
Vendor-Agnostic Flexibility
Resilio Connect is a vendor-agnostic replication solution that offers more flexibility than AFS. For on-prem deployments, the AFS agent can only be installed on Windows Server. And for cloud deployments, it’s limited to Microsoft Azure’s cloud infrastructure.
Resilio doesn’t lock you into any specific storage provider. It has full compatibility with multiple operating systems, such as Windows, Linux, FreeBSD, Unix, and macOS, as well as all major NAS solutions. And it can be deployed on-prem, in the cloud (any type of cloud), and in hybrid cloud scenarios.
Resilio Connect also natively supports Azure cloud storage, so you can deploy Resilio in conjunction with AFS to sync data across sites and between Azure storage regions, sync data between on-premises sites, reduce recovery times for disaster recovery deployments, and enable Active-Active high availability.
End-to-End Security
Resilio Connect keeps your files protected using end-to-end security tactics that include:
- AES-256 encryption: Resilio encrypts your data at rest and in transit.
- Role-based permission controls: Resilio enables you to control access to settings, jobs, and agents, as well as create multi-level admins.
- Mutual authentication: Resilio only delivers your data to approved, designated endpoints.
- Cryptographic integrity validation: Integrity validation is used to ensure that your data arrives at its destination uncorrupted.
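Integrity validation in general works by comparing a cryptographic digest computed by the sender with one computed by the receiver. The sketch below shows that general technique with SHA-256 from Python’s standard library. It is not Resilio’s implementation, and the function names are hypothetical.

```python
import hashlib
import hmac
import io


def sha256_of_stream(stream, block_size=1 << 20) -> str:
    """Hash a file-like object in blocks so large files need not fit in memory."""
    h = hashlib.sha256()
    for block in iter(lambda: stream.read(block_size), b""):
        h.update(block)
    return h.hexdigest()


def verify_stream(stream, expected_digest: str) -> bool:
    """True only if the received data hashes to the sender's digest."""
    return hmac.compare_digest(sha256_of_stream(stream), expected_digest)


def verify_bytes(data: bytes, expected_digest: str) -> bool:
    """Convenience wrapper for data that is already in memory."""
    return verify_stream(io.BytesIO(data), expected_digest)
```

A receiver would call verify_stream on the downloaded file, opened in binary mode, and discard or re-request the file if the check fails.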
To learn more about synchronizing with Resilio Connect, schedule a demo.