AWS Snowball and Microsoft Azure Data Box are physical devices used for data transportation into the AWS or Azure clouds. Users order the devices, copy their data onto them, and then ship them back so AWS or Azure staff can upload the data to the cloud.
Conversely, Resilio Platform for edge ingest is a software-only data collection solution that runs on any device. It’s designed to reliably transfer files from the edge to locations anywhere — including other edge locations, core facilities, or any cloud provider. Resilio Platform is also purpose-built for use across poor-quality networks such as VSAT, cell, and Wi-Fi.
So, which solution is right for you?
AWS Snowball and Azure Data Box are good for:
- Organizations that have people physically located in remote locations able to handle, manage, and ship devices;
- Places with zero internet connectivity where there’s no other option for collecting and transporting data;
- Organizations that want to perform one-time, one-way data migrations from remote locations into AWS or Azure.
Put simply, if you have remote teams in the field able to physically collect, handle, and manage devices, you could use AWS Snowball or Azure Data Box.
However, if you are continuously generating data at the edge and need the flexibility to store and transfer data from the edge on an ongoing basis using any type of device, Resilio Platform is most definitely worth a look. There’s simply no better way to reliably and continuously retrieve data from the edge across VSAT, cell, radio, and broadband.
Please get in touch to learn more about how Resilio can help you replicate data from the edge by scheduling a demo.
In this article, we’ll compare AWS Snowball, Azure Data Box, and Resilio Connect — including how they work as well as differences in pricing, operation, and features.
We’ll also discuss the downsides of using physical data migration devices and how Resilio Platform overcomes these challenges to provide reliable data transfer, sync, and replication from edge to cloud over any network, in any location. Specifically, Resilio Connect:
- Provides centralized management: Resilio works with almost any type of device, operating system, or cloud. You can install it on your existing IT infrastructure and manage data replication across your entire environment with one solution. You can manage Resilio through the Centralized Management Console, which gives you granular control over how replication occurs across your entire environment.
- Fully utilizes any network: Resilio’s WAN acceleration protocol enables it to fully utilize any network, no matter how low-quality or unreliable. It works with any type of connectivity, such as VSATs, cell (3G, 4G, 5G), Wi-Fi, broadband, and any IP connection. You can ingest data from any remote location, such as at sea or in countries with underdeveloped network infrastructure.
- Offers flexible replication: You can use Resilio to ingest, replicate, and sync data from edge to edge, edge to core (cloud or on-premises), and across cloud regions. Resilio can sync data in any direction, such as one-way, bidirectional, one-to-many, many-to-one, and N-way.
- Delivers reliability: Resilio’s P2P architecture eliminates single points of failure. And it can dynamically route around outages and downed networks.
- Keeps data secure: Resilio secures data at rest and in transit with AES 256-bit encryption and protects data with other built-in security features.
Organizations use Resilio Platform to ingest, sync, and replicate data for media workflows (Turner Sports, Innovative), gaming (Wargaming, Larian Studios), remote operations (Mercedes-Benz, Buckeye Power Sales), and more. If you want to learn how Resilio Platform can help your business, schedule a demo with our team.
How AWS Snowball, Azure Data Box, and Resilio Solution Work
We’ll begin by quickly reviewing the basic operations of each solution, including the types of devices you can order, device specifications, and how to set up and use them.
AWS Snowball Setup and Operation
To get an AWS Snowball device, navigate to the AWS Snow Family Console and choose one of two device types:
- AWS Snowball Edge Compute Optimized: This device offers powerful computing resources for machine learning, full motion video analysis, analytics, and local computing stack use cases — such as 104 vCPUs, 416 GiB of memory, and an optional NVIDIA Tesla V100 GPU. It provides 28 TB usable NVMe SSD capacity for Amazon S3-compatible storage or EBS-compatible block volumes.
- AWS Snowball Edge Storage Optimized: This device is better suited to large, one-time data migrations and recurring migrations. It provides 80 TB of HDD or 210 TB of NVMe capacity for Amazon S3-compatible object storage. The 80 TB device provides 40 vCPUs and 80 GB of RAM (the 210 TB device currently supports only data migration use cases).
You’ll receive your device in 4-6 business days (which is long enough as is, but can be even longer in remote environments).
You can use AWS OpsHub to unlock the device, connect it to your LAN (local area network), transfer data, and launch Amazon EC2 instances.
Once your data is copied onto the device, ship it back to AWS, where their team will upload your data to an Amazon S3 bucket. Upon completion, your data is erased from the device in compliance with NIST standards.
Snowball also has features that make it better suited to certain use cases, including its:
Computing Resources
While you can use both solutions for one-time or periodic data migrations, AWS Snowball includes computing resources in their Snowball Edge Compute Optimized device that make it useful for machine learning, full-motion video analysis, analytics, and local computing stacks (e.g., 104 vCPUs, 416 GiB of memory, etc.).
On the other hand, Azure Data Box Heavy can store 1 PB of data and is better suited to ingesting massive workloads.
Faster Transfer Speeds
Snowball devices have high-speed network connections that support 10 Gbps to 100 Gbps links. Data Box’s network interfaces support speeds of 1 Gbps or 10 Gbps. This means that Snowball can upload your data and make it available to you in the cloud faster than Data Box.
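To put those link speeds in perspective, here is a minimal back-of-the-envelope sketch of how long it takes to move a dataset over a link of a given speed. It assumes the link runs at full line rate with no protocol overhead and that disks on both sides can keep up, so real transfers will be slower. The function name and the 80 TB example size are our own illustration, not vendor figures.

```python
def transfer_hours(data_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Best-case hours needed to move data_tb (decimal terabytes) over a link.

    efficiency is the assumed fraction of line rate actually achieved (0-1].
    """
    bits = data_tb * 1e12 * 8                      # terabytes -> bits
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


for gbps in (1, 10, 100):
    print(f"{gbps:>3} Gbps: {transfer_hours(80, gbps):6.1f} hours for 80 TB")
```

Under these assumptions, 80 TB takes roughly 178 hours at 1 Gbps, about 18 hours at 10 Gbps, and under 2 hours at 100 Gbps.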
Support for Kubernetes Deployments
You can use Snowball devices for Kubernetes deployments. Data Box does not offer this capability.
More resources: You can find instructions for ordering an AWS Snowball device in the AWS developer guides.
Azure Data Box Setup and Operation
Ordering Azure Data Box devices is pretty similar to ordering Snowball devices. Navigate to your Azure cloud portal. Create a resource, and choose Azure Data Box. Then select the Data Box product you want:
- Data Box: The specs for this device include 100 TB of storage space, 1×1/10 Gbps RJ45, and a 2×10 Gbps SFP+ interface. It encrypts data with AES 256-bit encryption.
- Data Box Disk: The specs for this device include 40 TB of storage space and a USB/SATA II or III interface. It encrypts data with AES 256-bit encryption.
- Data Box Heavy: The specs for this device include 1 Petabyte of storage space, 4×1 Gbps RJ45, and a 4×40 Gbps QSFP+ interface. It encrypts data with AES 256-bit encryption.
You can track the status of your order in the Azure portal. You can use Azure’s local web UI to set up the device, connect to your LAN, view copy logs, and contact Microsoft support.
Once you’ve imported data onto the device, ship it back to Microsoft where their team will upload it onto your Azure account (Azure Block Blob, Page Blob, or Azure Files) or a Managed Disk.
Some users feel that the naming structure for Azure services is more straightforward, logical, and easier to understand than AWS. AWS also has many more storage services, so learning which ones you need and how to use them takes a lot more time. Therefore, Azure has less of a learning curve.
Some users of both cloud storage platforms also feel that Azure has some features that make it more user-friendly, such as the ability to run Google-style searches across your cloud resources in Azure. For example, you can find any of your virtual machines by typing the name of the VM into Azure search, while AWS does not provide the ability to search your EC2 instances.
Additional resources: Learn more about how to set up and use a Data Box on Azure’s Quickstart page.
Resilio Platform Installation and Operation
Unlike Snowball and Data Box, Resilio Platform is a software-only file replication and sync solution that ingests, syncs, and replicates data online, across any network.
Resilio Connect’s proprietary WAN acceleration protocol enables it to transfer data predictably and reliably over any network — no matter how low-quality the connection is. There’s no need to order or manage physical devices and data transfers can still occur in real-time.
We have clients that continuously transfer data to geographically distributed sea vessels (e.g., Northern Marine Group, Lindblad Expeditions) and across countries with underdeveloped networks like Uganda (e.g., Shifo).
You can use Resilio for one-time or periodic data ingestion, but it’s primarily for continuous data ingestion, replication, and synchronization.
Simply install Resilio agents on the endpoints you want to replicate and sync (e.g., your cloud storage accounts and on-prem devices).
Use Resilio’s Management Console to configure how you want replication to occur:
- Real-time, scheduled, or manual synchronization
- Which files you want replicated to which endpoints
- Replication parameters
- Bandwidth allocation at each endpoint (which can be configured by time of the day and day of the week)
You can automate replication to occur exactly as you want it to, and program how Resilio responds to replication errors (such as file conflict resolution) so it can be operated with minimal human intervention.
AWS Snowball vs Azure Data Box vs Resilio Connect: Key Differences
While there are many small differences between Azure Data Box, AWS Snowball, and Resilio Connect, we’ll cover the most consequential ones below.
Pricing
The pricing structures of Azure Data Box and AWS Snowball consist of:
- A service fee: For each transfer job, you’ll be charged a service fee for every device you use. This applies to the first 10 days you possess the device.
- Extra day fees: For every additional day you possess the device after the first 10 days, you’ll be charged an extra day fee.
- Shipping costs: You’re responsible for round-trip shipping costs, which will vary depending on your location.
- Ingest charges: Transferring data into either cloud is free. But they charge for transferring data out of their cloud and across cloud regions. Pricing varies by region. See the Azure and AWS pricing pages for regional data transfer rates.
As a software-only solution, Resilio Connect’s pricing structure is very different. We detail each pricing model below.
How AWS Snowball Pricing Is Calculated
The table below displays Service and Extra Day Fees for AWS Snowball devices.
Fee | Snowball Edge Storage Optimized (data transfer only) | Snowball Edge Storage Optimized (with EC2 instances) | Snowball Edge Compute Optimized (data transfer only) | Snowball Edge Compute Optimized (with EC2 instances) |
Service Fee | $300 per device | $500 per device | $1,250 per device | $1,650 per device |
Extra Day Fee | $30 per day | $50 per day | $125 per day | $165 per day |
Long-Term AWS Snowball Plans
For continuous data transfer jobs, AWS offers 1 and 3-year long-term pricing plans (paid upfront). Long-term plans eliminate Extra Day Fees and allow you to save up to 62%.
Plan | Snowball Edge Storage Optimized Device (with EC2 instances) | Snowball Edge Compute Optimized Device (with EC2 instances) |
1-Year | $15,330 ($42 per day) | $29,200 ($80 per day) |
3-Year | $38,325 ($35 per day) | $60,225 ($55 per day) |
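As a rough way to reason about when a long-term plan starts to pay off, the sketch below compares the 1-year plan price with on-demand pricing for a single device kept continuously. It assumes on-demand cost is simply the service fee plus extra day fees from the fee table above, and it ignores shipping, data transfer charges, and any regional price differences. The function names are hypothetical.

```python
import math


def on_demand_cost(days: int, service_fee: float, extra_day_fee: float,
                   included_days: int = 10) -> float:
    """Service fee plus a per-day fee for every day beyond the included days."""
    extra_days = max(0, days - included_days)
    return service_fee + extra_days * extra_day_fee


def break_even_days(plan_price: float, service_fee: float, extra_day_fee: float,
                    included_days: int = 10) -> int:
    """Smallest number of days at which on-demand cost reaches the plan price."""
    extra_days = math.ceil(max(0, plan_price - service_fee) / extra_day_fee)
    return included_days + extra_days


# Snowball Edge Storage Optimized with EC2: $500 service fee, $50 per extra day,
# versus the $15,330 1-year plan.
print(round(15_330 / 365, 2))            # effective daily rate of the plan
print(break_even_days(15_330, 500, 50))  # days until on-demand costs as much
```

With these inputs, the plan works out to $42 per day, and on-demand pricing only catches up with the plan price after about 307 days of continuous use.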
AWS Snowball Pricing Examples
To give you a better idea of how much you can expect to pay for using AWS Snowball, we provide two example scenarios below:
Example Scenario 1
If you want to use a Snowball Storage Optimized device to import 50TB of data into the AWS EU (Ireland) region, your costs will include:
- Service Fee (per device): $300
- Ingestion charge: Free (since you’re importing into AWS)
- Total: $300
Example Scenario 2
Suppose you want to export 50 TB (51,200 GB) of data out of the Asia Pacific (Seoul) Region using one Snowball Edge Storage Optimized device, and you hold the device for 18 days, including one shipping day in each direction (you don’t get charged for shipping days). Your charges would be:
- Service Fee (per device): $400
- Extra Day Fees (minus 1-day shipping to you and 1-day shipping back to AWS): $40 x 6 extra days = $240
- Data transfer charges: $0.08 (per GB) x 51,200 GB = $4,096
- Total: $4,736 (plus shipping)
For more information on pricing, visit the AWS Snowball Pricing page.
How Azure Data Box Pricing Is Calculated
The table below displays Service and Extra Day Fees for Azure Data Box devices.
Fee | Data Box | Data Box Disk | Data Box Heavy |
Service Fee | $250 | $50 | $4,000 |
Extra Day Fee | $15 per day | $10 per day | $100 per day |
Standard Shipping Fee | $95 | $30 | (Starting at) $1,500 |
Long-Term Azure Data Box Plans
Azure doesn’t list their long-term pricing plans publicly. For more information, contact an Azure Sales Specialist.
Azure Data Box Pricing Examples
To give you a better idea of how much you can expect to pay for using Azure Data Box, we provide two example scenarios below:
Example Scenario 1
If you want to use a Data Box device to import 50TB of data into an Azure North America region, your costs will include:
- Service Fee (per device): $250
- Standard shipping: $95
- Ingestion charge: Free (since you’re importing into Azure)
- Total: $345
Example Scenario 2
If you want to export 50TB (51,200 GB) of data between North American regions using one Data Box device and you use the device for 18 days on-site (you don’t get charged for shipping days), your costs will include:
- Service Fee (per device): $250
- Extra Day Fees (minus 1-day shipping to you and 1-day shipping back to Microsoft): $15 x 6 extra days = $90
- Data transfer charges: $0.02 (per GB) x 51,200 GB = $1,024
- Standard shipping: $95
- Total: $1,459
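The AWS and Azure export scenarios above follow the same arithmetic, which can be sketched as a single function. This is only an illustration of the fee structure described in this article: the function and parameter names are our own, and it treats the 18 days as including two uncharged shipping days, which is the reading that yields the six extra days used in both scenarios.

```python
def offline_transfer_cost(service_fee: float, extra_day_fee: float,
                          days_held: int, shipping_days: int,
                          data_gb: float = 0.0, egress_per_gb: float = 0.0,
                          shipping_fee: float = 0.0,
                          included_days: int = 10) -> float:
    """Total cost of one device job under the fee structure described above."""
    billable_days = days_held - shipping_days            # shipping days are free
    extra_days = max(0, billable_days - included_days)   # first 10 days included
    return (service_fee
            + extra_days * extra_day_fee
            + data_gb * egress_per_gb
            + shipping_fee)


# AWS scenario 2: shipping is billed separately, so shipping_fee stays at 0.
aws = offline_transfer_cost(400, 40, days_held=18, shipping_days=2,
                            data_gb=51_200, egress_per_gb=0.08)
# Azure scenario 2: includes the $95 standard shipping fee.
azure = offline_transfer_cost(250, 15, days_held=18, shipping_days=2,
                              data_gb=51_200, egress_per_gb=0.02,
                              shipping_fee=95)
print(round(aws, 2), round(azure, 2))
```

Rounded to the cent, the two results match the scenario totals of $4,736 (plus shipping) for AWS and $1,459 for Azure.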
To calculate costs for your specific scenario, use the Azure Pricing Calculator.
How Resilio Pricing Is Calculated
Resilio Platform is software-only, agent-based file synchronization software sold on an annual subscription. Agents are licensed based on volume and capability and offered in a variety of packages.
For more information on Resilio Platform pricing for your specific use case, contact our team.
Using Resilio Platform for Continuous Ingest and Sync
Offline data transfer appliances like AWS Snowball and Azure Data Box have several downsides, such as:
- The complexity involved in handling, setting up, and managing devices. And if devices are lost or damaged, your data can be lost and you can be charged steep prices to replace the devices.
- The time it takes for your data to appear in a location where you can access it. You must wait to receive the device, copy your data onto it, ship it back, and wait for the provider to upload it.
- Their reliance on AWS DataSync or Azure File Sync for online data transfer to other cloud regions and on-prem endpoints. These solutions are limited by the network you’re using. They replicate slowly and inefficiently. And they create single points of failure — replication can be impeded by a slow network or downed device, delaying sync across your entire environment.
- In multi-cloud scenarios, you’ll need different online data transfer solutions to replicate your data across cloud regions and on-prem devices in each cloud. This requires you to buy, learn, and use more tools, increasing the complexity and cost of your data storage operation.
Resilio Platform overcomes each of these challenges and provides a superior solution for continuous data ingest at the edge.
Fully Utilize Any Network with WAN Acceleration
The primary reason organizations use data transfer devices rather than software solutions is that they operate in areas with poor, unreliable network connectivity.
But Resilio Platform utilizes a proprietary WAN acceleration protocol known as Zero Gravity Transport™ (ZGT). ZGT optimizes network traffic so it can fully utilize any network — such as WANs, VSATs, cell (3G, 4G, 5G), Wi-Fi, broadband, and any IP connection. It accomplishes this using:
- Congestion control: ZGT uses a congestion control algorithm that constantly probes the RTT of a network to calculate the ideal rate of packet transfer and maintain a uniform packet distribution over time.
- Interval acknowledgements: Rather than sending an acknowledgement after every packet receipt (as other protocols do), ZGT sends acknowledgements in groups to reduce traffic over the network.
- Delayed retransmission: Rather than retransmitting lost packets with each acknowledgement (as other protocols do), ZGT retransmits lost packets in groups once per RTT to reduce unnecessary retransmissions.
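ZGT is proprietary, so the following is not its implementation. It is only a toy sketch of the general idea behind grouped acknowledgements and batched retransmission: the receiver summarizes everything it has received as a few sequence-number ranges, and the sender works out the full list of missing packets to resend together. All function names are hypothetical.

```python
def ack_ranges(received: list[int]) -> list[tuple[int, int]]:
    """Collapse received sequence numbers into inclusive (start, end) ranges."""
    ranges: list[tuple[int, int]] = []
    for seq in sorted(set(received)):
        if ranges and seq == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], seq)   # extend the current range
        else:
            ranges.append((seq, seq))           # start a new range
    return ranges


def missing_packets(ranges: list[tuple[int, int]],
                    first_seq: int, last_seq: int) -> list[int]:
    """Sequence numbers in [first_seq, last_seq] not covered by any range."""
    covered: set[int] = set()
    for start, end in ranges:
        covered.update(range(start, end + 1))
    return [seq for seq in range(first_seq, last_seq + 1) if seq not in covered]


received = [0, 1, 2, 4, 5, 8, 9]
grouped_ack = ack_ranges(received)             # [(0, 2), (4, 5), (8, 9)]
to_resend = missing_packets(grouped_ack, 0, 9) # [3, 6, 7]
print(grouped_ack, to_resend)
```

In this toy example, one grouped message replaces seven per-packet acknowledgements, and the three lost packets are resent in a single batch rather than one at a time.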
Resilio Platform also works reliably over edge networks. Clients like Shifo use Resilio to sync files in countries with underdeveloped network infrastructure, such as Uganda.
Case Study: Northern Marine Group
Northern Marine Group is a company that provides shipping and marine services to customers across the globe. They use Resilio Platform to distribute updates to their fleet of sea vessels over VSAT connections.
“In the future, Resilio Platform and its API will vastly improve our database replication mechanism. More importantly, it will allow us to realistically increase the frequency of replication, so that the information passing back and forth comes far more in real-time and we’ll be able to automatically remediate the discrepancies.”
Reliable, Scalable Replication with P2P Architecture
While AWS DataSync and Azure File Sync replicate files using point-to-point replication, Resilio Platform replicates files using a proprietary P2P (peer-to-peer) replication architecture.
Point-to-point replication occurs in one of two models:
- Hub-and-spoke replication: This consists of a hub server and several remote servers. The remote servers can’t transfer data to each other. They must first transfer data to the hub server, which then transfers the data to each remote server one by one.
- Follow-the-sun replication: In this model, one server replicates data to another sequentially — i.e., Server 1 replicates to Server 2; Server 2 replicates to Server 3; Server 3 to Server 4; and so forth.
Point-to-point replication is unreliable and inefficient. Replication can only occur between two servers at a time, which slows the full synchronization of your environment (particularly in large environments with lots of servers, large files, and/or a large number of files). It also creates single points of failure — if one server fails or is on a slow network, it can delay replication for the rest of your environment.
In P2P replication, every server can communicate with every other server simultaneously, providing:
Fast Replication and Sync Speeds
In a P2P architecture, every server can work together concurrently to synchronize your entire environment. This occurs with the help of a process called file chunking, which breaks a file down into multiple chunks that can transfer independently of each other.
For example, imagine you want to sync a file across five servers. Resilio can break that file into five chunks. Server 1 can share the first chunk with Server 2. Server 2 can share the first chunk with another server immediately, even before it receives the rest of the file. With every server working together, Resilio can sync your entire system 3-10x faster than point-to-point solutions like DataSync and Azure File Sync.
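The five-server example can be explored with a small simulation. This is a toy model, not Resilio's scheduler: it assumes every link is equally fast, that each server can send one chunk and receive one chunk per round, and that a chunk can only be forwarded in the round after it arrives. The function names are hypothetical, and the ratio it produces is specific to this model rather than a reproduction of the 3-10x figure.

```python
def hub_and_spoke_rounds(servers: int, chunks: int) -> int:
    """The hub sends the whole file to each remote server, one server at a time."""
    return (servers - 1) * chunks


def p2p_rounds(servers: int, chunks: int) -> int:
    """Every server forwards chunks it already holds to a peer that lacks them."""
    have = [set(range(chunks))] + [set() for _ in range(servers - 1)]
    rounds = 0
    while any(len(held) < chunks for held in have):
        snapshot = [set(held) for held in have]   # what each server held at round start
        receiving: set[int] = set()               # servers already receiving this round
        for sender in range(servers):
            for receiver in range(servers):
                if receiver == sender or receiver in receiving:
                    continue
                wanted = snapshot[sender] - have[receiver]
                if wanted:
                    have[receiver].add(min(wanted))
                    receiving.add(receiver)
                    break                         # one upload per sender per round
        rounds += 1
    return rounds


print(hub_and_spoke_rounds(5, 5))  # 20
print(p2p_rounds(5, 5))            # 8
```

Under these assumptions, syncing a five-chunk file to five servers takes 20 rounds when the hub does all the sending and 8 rounds when servers share chunks with each other as they arrive.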
Organic Scalability
A P2P environment is organically scalable, as every server you add to your environment increases replication speed and resources. Resilio can sync hundreds of servers in roughly the same time that most point-to-point solutions can sync two.
Resilio accomplishes this through a process known as horizontal scale-out replication, which enables it to reach speeds of up to 100+ Gbps per cluster. In tests, our engineers successfully replicated a 1 Terabyte dataset across Azure regions in 90 seconds.
Resilio can also replicate files of any size or number (we successfully synced 450+ million files in a single job).
Omnidirectional Replication
P2P replication enables Resilio to sync in any direction, such as one-way, bidirectional, one-to-many, many-to-one, and N-way sync.
One-to-many replication is especially useful for software update distribution, and allows DevOps teams at software companies like VoiceBase to significantly reduce software distribution time.
N-way sync is great for remote and hybrid workforce scenarios. Geographically distributed teams can collaborate on files from anywhere in the world. Any changes made to a file can immediately sync to every other office and remote user, so everyone always has the most up-to-date version of files.
N-way sync also enables Resilio to provide Active-Active High Availability in hot-site disaster recovery use cases. By syncing data across your entire environment N-way, Resilio effectively turns every server into a backup server. File changes are synchronized in real-time and, in the event of a disaster, every server can work together to bring your applications back online — enabling Resilio to achieve sub-five-second RPOs and RTOs within minutes of an outage.
Reliability and Fault Tolerance
Resilio is an incredibly reliable and resilient solution. If any endpoint or network goes down, the necessary files or services can be retrieved from any other endpoint in your environment.
If a file transfer fails mid-way through, Resilio can perform a checksum restart to resume the transfer where it left off. It can also dynamically reroute around outages and will retry transfers until they’re complete — ensuring your data is always delivered to its destination.
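The general idea of resuming from checksums can be sketched as follows. This is a minimal illustration rather than Resilio's actual mechanism: it assumes fixed-size chunks hashed with SHA-256, uses a deliberately tiny chunk size so the example is easy to follow, and the function names are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; unrealistically small, chosen only for the demo


def chunk_digests(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """SHA-256 digest of each fixed-size chunk of data."""
    return [hashlib.sha256(data[i:i + chunk_size]).hexdigest()
            for i in range(0, len(data), chunk_size)]


def chunks_to_send(source: bytes, partial: bytes,
                   chunk_size: int = CHUNK_SIZE) -> list[int]:
    """Indexes of source chunks that are missing or corrupt at the destination."""
    src = chunk_digests(source, chunk_size)
    dst = chunk_digests(partial, chunk_size)
    return [i for i, digest in enumerate(src)
            if i >= len(dst) or dst[i] != digest]


source = b"abcdefghijklmnop"   # four chunks on the sender
partial = b"abcdefghij"        # transfer was cut off midway through chunk 2
print(chunks_to_send(source, partial))  # [2, 3]
```

Only the truncated chunk and the chunk that never arrived need to be sent again; the two chunks whose checksums already match are skipped.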
Case Study: Lindblad Expeditions
Lindblad Expeditions is an organization that provides ecotourism and nature photography. They use Resilio Platform to maintain their fleet software and sync data between ships and HQ.
“Resilio Platform has been a game changer. It’s proven to be reliable in file transfer. It’s proven to be reliable in database replication. Overall, Resilio Platform jobs are very easy to set up and they just work!”
Efficient File Transfer and Access
Resilio Platform optimizes resource management in order to maximize efficiency.
Some of these optimizations are built into Resilio. For example, Resilio’s engineers reduced the amount of physical memory needed per agent on replication jobs by 80%. Resilio also optimizes indexing, startup time, CPU usage, merging, storage I/O, and end-to-end transport, which is why it can efficiently sync millions of files in any direction.
Users can adjust replication parameters (such as hashing, buffer size, and more) to better suit their replication jobs. And you can even create bandwidth profiles that govern how much bandwidth each endpoint is allocated at certain times of the day and on certain days of the week.
Efficiency is one of the reasons Deutsche Aircraft replaced DFSR with Resilio Connect in order to sync business-critical files across their DFS Namespace. According to their IT manager Mathias Reitinger:
“We have a 10Gbps network but prefer to use under 1Gbps for data transfer and replication. With Resilio, we’re able to keep that down to 250Mbps during the day and at night move back up to 1Gbps.”
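A bandwidth profile like the one described in the quote boils down to a lookup by time of day and day of week. The sketch below is purely illustrative and is not Resilio's configuration format or API. The 250 Mbps and 1 Gbps limits come from the quote, while the 07:00 to 19:00 daytime window and the choice to treat weekends like nights are our assumptions.

```python
from datetime import datetime

DAY_LIMIT_MBPS = 250      # daytime cap from the quote
NIGHT_LIMIT_MBPS = 1000   # night-time cap from the quote (1 Gbps)
DAY_START_HOUR = 7        # assumed start of the working day
DAY_END_HOUR = 19         # assumed end of the working day


def bandwidth_limit_mbps(moment: datetime) -> int:
    """Return the bandwidth cap that applies at the given moment."""
    is_weekday = moment.weekday() < 5                        # Monday=0 ... Friday=4
    is_daytime = DAY_START_HOUR <= moment.hour < DAY_END_HOUR
    return DAY_LIMIT_MBPS if (is_weekday and is_daytime) else NIGHT_LIMIT_MBPS


print(bandwidth_limit_mbps(datetime(2024, 1, 3, 10, 0)))  # Wednesday morning
print(bandwidth_limit_mbps(datetime(2024, 1, 3, 23, 0)))  # Wednesday night
print(bandwidth_limit_mbps(datetime(2024, 1, 6, 10, 0)))  # Saturday morning
```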
You can also use Resilio Platform as an efficient object storage gateway that provides low-latency access to files stored in any cloud via SMB (Server Message Block) and NFS (Network File System) protocols. Remote and hybrid workforces can browse and download files from a unified interface that operates much like Microsoft OneDrive.
Resilio’s object storage gateway solution allows organizations to enhance productivity and efficiency through:
- Automated, selective syncs: You can automate synchronization so employees can focus on their tasks without the need to perform manual syncs. And you can control which files get synced to which endpoints.
- Selective file caching: You can control which specific files get stored on local devices. You can store frequently accessed files locally to reduce data egress costs and give employees quicker access to necessary files. And you can store infrequently accessed files in long-term cloud storage.
- Full and partial downloads: Employees can perform partial downloads to get quicker access to the portions of files they need and minimize data transfer.
- Unified access interface: Everyone can browse and download files from a unified interface that operates much like Microsoft OneDrive.
Case Study: Skywalker Sound
After the COVID-19 pandemic, Skywalker Sound switched to a remote work business model. They used Resilio Platform to enable their team of geographically distributed employees to collaborate on and synchronize files.
“I think the concept of people being more distributed and being able to work from home is here to stay. With Resilio in place and building in other aspects around it to make the content secure, we’ve achieved that in such a way that we can have a more diverse workforce and give our staff more flexibility with how they can work in the future.”
Learn more about how Resilio Platform helped Skywalker Sound support a remote work model.
Granular, Centralized Control over Your Entire Environment
You can install Resilio Platform agents on your existing IT infrastructure and begin replicating in as little as two hours. Resilio is a hardware and cloud-agnostic solution that works with just about any:
- Device: Install Resilio agents on servers, desktops, laptops, virtual machines, mobile devices, data centers, and NAS/DAS/SAN devices.
- Operating system: Resilio supports any operating system, such as Linux, Windows, macOS, Unix, FreeBSD, Ubuntu, OpenBSD, and more.
- S3-compatible cloud object storage provider: Resilio supports almost any cloud storage provider, such as Azure Blob Storage, Amazon Web Services, Google Cloud Platform, Wasabi, MinIO, Backblaze, and more.
Resilio provides you with granular control over your entire hybrid and multi-cloud replication environment from one centralized location. You can use Resilio’s Management Console to:
- Adjust key replication parameters like buffer size, bandwidth usage policies, and disk I/O threads.
- Set up metrics, rules, and notification parameters.
- Set user permissions that govern who can access what files and folders.
- Set up notifications to be delivered via email or Webhooks.
- Set bandwidth allocation.
- Script any type of automation or functionality with Resilio’s REST API.
Native Security Features
Resilio Platform offers built-in features that secure data end-to-end, such as:
- AES encryption: Data is encrypted at rest and in transit with AES 256-bit encryption.
- Mutual authentication: Before receiving any files, each endpoint must provide an authentication key. This prevents data from being delivered to unauthorized, unapproved locations.
- Forward secrecy: Resilio uses one-time session encryption keys.
- Cryptographic integrity validation: Resilio validates data end-to-end to ensure files remain intact and uncorrupted.
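To make the idea of mutual authentication more concrete, here is a generic challenge-response sketch built on Python's standard `hmac` module. It is not Resilio's protocol: it assumes both endpoints already share a secret key, and the function names are hypothetical. Each side proves it knows the key by signing a random challenge issued by the other side, so the key itself never crosses the network.

```python
import hashlib
import hmac
import secrets


def respond(shared_key: bytes, challenge: bytes) -> bytes:
    """Prove knowledge of the key by returning an HMAC of the challenge."""
    return hmac.new(shared_key, challenge, hashlib.sha256).digest()


def verify(shared_key: bytes, challenge: bytes, response: bytes) -> bool:
    """Check a peer's response using a constant-time comparison."""
    return hmac.compare_digest(respond(shared_key, challenge), response)


key = secrets.token_bytes(32)           # secret provisioned on both endpoints
challenge_from_a = secrets.token_bytes(16)
challenge_from_b = secrets.token_bytes(16)

# Endpoint B answers A's challenge, and endpoint A answers B's challenge.
b_is_trusted = verify(key, challenge_from_a, respond(key, challenge_from_a))
a_is_trusted = verify(key, challenge_from_b, respond(key, challenge_from_b))

# An endpoint holding a different key fails the check.
intruder_key = secrets.token_bytes(32)
intruder_is_trusted = verify(key, challenge_from_a,
                             respond(intruder_key, challenge_from_a))
print(b_is_trusted, a_is_trusted, intruder_is_trusted)
```

Files would only be exchanged once both checks pass, which mirrors the goal of keeping data away from unapproved endpoints.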
Ingest, Sync, and Replicate Data Reliably across Any Cloud with Resilio Connect
While Snowball and Data Box are useful solutions for transferring data from low-connectivity areas to the cloud, Resilio Platform provides a superior option because it:
- Uses WAN acceleration: Resilio overcomes the major problem with online data transfer from the edge — poor network connectivity. Its proprietary WAN acceleration technology enables it to fully utilize any network to transfer data from the edge to your cloud and across cloud regions and on-prem devices.
- Provides fast transfer and sync: Resilio’s P2P architecture enables it to sync data 3-10x faster than traditional replication solutions. It can also sync in any direction, including one-way, bidirectional, one-to-many, many-to-one, and N-way.
- Scales organically: As your environment grows, your replication speed and resources increase. Resilio’s horizontal scale-out replication can reach speeds of 100+ Gbps per cluster.
- Reliably syncs data: Resilio’s P2P architecture and WAN acceleration technology enable it to reliably sync data over any network. There’s no single point of failure, and it can route around downed networks and devices. It also enables Active-Active High Availability.
- Provides efficient transfer and access: You can use Resilio as an efficient object storage gateway that provides low-latency access to files. It includes features that enable you to reduce data migration and egress costs while increasing productivity.
- Provides granular control over your entire environment: Resilio works with just about any device, cloud, and operating system. You can use Resilio to replicate and manage data across your entire environment from one location.
- Keeps data secure: Resilio’s built-in security features encrypt data end-to-end and keep it secure.
Organizations use Resilio Platform to ingest, sync, and replicate data for media workflows (Turner Sports, Innovative), gaming (Wargaming, Larian Studios), remote operations (Mercedes-Benz, Buckeye Power Sales), and more. If you want to learn how Resilio Platform can help your business, schedule a demo with our team.