AWS Snowball helps companies copy their data into the Amazon Web Services (AWS) cloud via portable hardware devices.
These devices are shipped by AWS all over the world. Users then copy their data onto them and ship them back to Amazon. Once received, AWS staff transfers the data to an S3 bucket and securely erases it from the device.
This makes Snowball a great option if you’re:
- Operating in locations with non-existent internet connections, as it lets you move offline data or remote storage to the cloud.
- Tackling data migration projects by copying large amounts of data from a single source location into AWS.
However, if you’re continuously ingesting data from the edge in remote places, or have a subset of data at the edge that needs to remain there or be updated, the offline delivery model to AWS doesn’t work so well.
In these situations, you may want to consider a solution like Resilio Platform — a highly reliable file ingest software that runs on any device and fully utilizes any network. You can learn more about Resilio as an alternative to AWS Snowball Edge and AWS Snowcone here.
In this guide, you’ll learn how AWS Snowball works in detail — including its use cases, benefits, and workflow.
We’ll also explore the key downsides that teams face when using it, like:
- Having to handle, set up, and manage each device.
- Spending time copying data from existing devices or storage systems to a Snowball device.
- Waiting up to six business days to receive the device and then several more days for AWS to get it back.
- Relying on AWS DataSync for online data transfer. As a point-to-point solution, DataSync is unreliable as it creates single points of failure and it has transfer speeds that are limited to the slowest endpoint. Common issues in remote environments (like unreliable networks) can degrade DataSync’s performance, which makes it a poor choice if you’re going to leave a Snowball device in the field to collect and ingest data into AWS.
To help you overcome these downsides in cases where you’re looking to collect and control files on the edge on a recurring basis, we’ll also explore Resilio Connect, our reliable, efficient, and secure file ingest solution designed to work incredibly well in low-connectivity or extreme edge locations.
Our solution is:
- Software-only, so it works with your existing infrastructure and doesn’t require you to buy new hardware. Resilio can be installed on your current servers, desktops, mobile devices, or any edge device and you can start using it in as little as two hours.
- Flexible, with several use cases around data ingest, transfer, sync, and replication. For example, Resilio Platform can move data from the extreme edge to any other location, including edge to edge, edge to core (whether that’s in your data centers or in a cloud of your choice), or across cloud regions.
- Easy to manage, as everything is controlled through a Central Management Console (even in a multi or hybrid cloud scenario), so you don’t have to juggle a bunch of different tools.
- Extremely fast and reliable, due to its P2P architecture and UDP-based WAN optimization. Resilio Platform can collect and ingest data from many remote locations — from remote sites to fleets of vehicles at the extreme edge — to any number of other locations. It does not require a connection to a single server or for a single server to always be online. It can also fully utilize any network, like VSAT, cell, Wi-Fi, broadband, and more, making it ideal for extreme environments.
- Secure by default, as it comes with AES 256 encryption for data at rest and in transit, as well as other security features.
Below, we’ll provide a quick overview of how AWS Snowball works. Then, we’ll cover Resilio Platform (a software-only ingest solution) and three hardware-based cloud provider alternatives for AWS Snowball.
Customers rely on Resilio Platform to ingest, sync, and replicate data for media workflows (Turner Sports, Innovative), gaming (Wargaming, Larian Studios), remote operations (Mercedes-Benz, Buckeye Power Sales), and more. If you want to learn how Resilio Platform can help your business, schedule a demo with our team.
How AWS Snowball Works
To start using AWS Snowball, you need to open the AWS Snow Family console and select between:
- AWS Snowball Edge Storage Optimized devices that have either 40 vCPUs and 80 GiB of memory, or 104 vCPUs and 416 GB of memory. In terms of storage, they offer 80 TB or 210 TB of NVMe capacity for block volumes and Amazon S3-compatible object storage. This makes them a better option for big cloud migrations and recurring, large-scale data transfers.
- AWS Snowball Edge Compute Optimized devices that have 104 vCPUs, 416 GiB of memory, and an optional NVIDIA Tesla V100 GPU. For storage, the device offers 28 TB usable NVMe SSD capacity for Amazon S3-compatible storage or EBS-compatible block volumes. The more powerful compute capabilities make these devices better for use cases like machine learning, analytics, and video analysis.
To select and order your device of choice, you need to create a job with an Amazon S3 bucket. This is a simple task that you can perform by following the instructions in the AWS developer guides.
AWS prepares and ships the device to you in four to six business days, after which you can unlock it with AWS OpsHub and connect it to your LAN (local area network). AWS OpsHub is also used to manage the device, transfer data, and launch Amazon EC2 instances.
Once you’re done, you ship the device back to AWS. Their team uploads your data to an S3 storage bucket, after which the data is securely erased from the device.
In terms of pricing, you pay only for your use of the AWS Snowball Edge device and for data transfer out of AWS. You can learn more about Snowball’s pricing structure here.
AWS Snowball’s Benefits and Limitations
The biggest benefit of AWS Snowball is the ability to move offline data and local storage into the cloud from all over the world. This includes places with extreme conditions and little or no network connectivity.
Even if it takes up to two weeks to ship the Snowball device back and forth (a major downside, discussed below), for users in remote locations with no connectivity, it’s still worthwhile to have any means to transfer data to an S3 bucket.
This can include businesses in various industries, like architecture and construction, earth sciences, energy, and even gaming that collect data at the extreme edge — for example, via drones, cameras, vehicles, or other IoT devices.
As an ingest solution, Snowball also provides a few other benefits, including:
- End-to-end device tracking via an E-ink shipping label.
- Secure data erasure that follows the National Institute of Standards and Technology guidelines for media sanitization.
- A TPM (Trusted Platform Module) that provides a hardware root of trust, which helps protect the integrity of the device.
But despite these benefits, AWS Snowball has certain limitations that can’t be avoided.
Limitation #1: Shipping and Managing New Hardware in Extreme Environments
Shipping devices to and from remote locations is difficult, expensive, and stressful. The transport often has to take place over rough terrain at extreme temperatures, which creates legitimate risks for your data’s safety.
Limitation #2: Uncertainty around Data Availability
It can easily take over two weeks to receive the device from AWS, set it up, ship it back to AWS, and have someone from their team upload your data to the cloud. And this is assuming no interruptions to the process.
But there can easily be delays during the device preparation and shipping. Receiving, configuring, and staging data onto the appliance in a remote environment can also take much longer than expected. Plus, there’s the time needed to synchronize the source and destination datasets. All this makes it very difficult to know when your data will be available in a target location.
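To put the two-week shipping cycle in perspective, here is a rough back-of-the-envelope sketch that estimates how long the same data would take to move over a network link. The 80 TB dataset size (matching the smaller Snowball device mentioned above), the 80% link efficiency, and the 14-day round trip are illustrative assumptions, not measured figures.

```python
def transfer_days(size_tb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Days needed to move size_tb terabytes over a link of link_mbps megabits/s.

    efficiency is an assumed fraction of the nominal link speed that is
    actually usable for payload (hypothetical default of 80%).
    """
    if size_tb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, link speed > 0, efficiency in (0, 1]")
    bits = size_tb * 1e12 * 8                      # decimal terabytes to bits
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / 86_400                        # seconds per day


# Assumed round trip for a shipped device ("over two weeks" in the text).
SHIPPING_ROUND_TRIP_DAYS = 14

for mbps in (100, 1_000, 10_000):
    days = transfer_days(80, mbps)
    faster = "network" if days < SHIPPING_ROUND_TRIP_DAYS else "shipping"
    print(f"{mbps:>6} Mbps: {days:6.1f} days -> {faster} is faster")
```

Under these assumptions, a sustained 1 Gbps link moves 80 TB in a little over nine days, while a 100 Mbps link takes roughly three months. This is why shipping hardware still makes sense for sites with little or no connectivity.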
Limitation #3: An Unreliable Online Data Transfer Option
Lastly, AWS Snowball also relies on AWS DataSync for online data transfer. While DataSync can be useful for one-time data migrations from on-prem to the cloud, its point-to-point nature makes it very unreliable.
Its transfer speeds are limited to the slowest endpoint, with each device being a single point of failure. If a connection is lost for any reason, which is common in extreme environments, data transfers will stop and need to restart either from the beginning or (best case) where they left off.
Overall, DataSync and other TCP-based transfer solutions do a very poor job in scenarios with intermittent low bandwidth connectivity.
Read our full article here: AWS DataSync: Strengths, Weaknesses, Alternatives & More
Resilio Connect: The Best Software-Only AWS Snowball Alternative
Resilio Connect is our file ingest solution that uses a P2P (peer-to-peer) architecture and proprietary, UDP-based WAN optimization technology to deliver the fastest transfer, sync, and replication speeds in the industry.
Our solution is a superior alternative to AWS Snowball because it’s:
- Software-only, so you don’t need to worry about learning new devices or shipping them in extreme environments.
- Extremely fast, efficient, and reliable, thanks to its P2P architecture and WAN optimization technology (we’ve seen sync speeds of 100+ Gbps site to site).
- Simple to manage, since all data ingest, transfer, sync, and replication tasks are controlled from a Central Management Console.
- Flexible, as you can use it with your existing infrastructure, including on-prem, in the cloud (e.g., AWS, Azure, GCP, or any other provider), or in a hybrid cloud environment.
In the next sections, we’ll explore these benefits in more detail, so you can get a good understanding of how Resilio Platform can help your business.
Flexible, Software-Only Solution: Use Your Existing Devices without Needing New Hardware
Unlike AWS Snowball, Resilio Platform doesn’t require you to buy new hardware and train your team on it.
Instead, our solution is agent-based, allowing it to be deployed on just about any device or infrastructure. For example, you can set up Resilio on:
- Industry-standard servers, networks, desktops, and laptops, as well as DAS, NAS, and SAN storage volumes.
- Most popular operating systems, like Windows, Mac, Linux (a variety of distros), iOS, Android, FreeBSD, Unix, and more.
- Virtual machines and hypervisors, like VMware and Citrix.
- Any cloud storage service, like AWS, Google Cloud Platform, Azure Blobs, MinIO, Backblaze, Wasabi, and more.
- Mobile devices, and much more.
Because Resilio doesn’t require you to buy new hardware, you can set up our solution and begin syncing your data in as little as two hours. As a result, you get much better predictability in terms of when your data will be available in a certain region.
Resilio Platform is also a cloud-agnostic solution. While AWS Snowball, DataSync, and other proprietary ingest and sync solutions are built to keep you in the AWS ecosystem, Resilio gives you complete control over where your data is stored.
For example, you can use Resilio Platform to:
- Ingest your data into any AWS region.
- Browse and sync files on file, block, or object storage via popular tools on operating systems like Mac and Windows.
- Transfer, sync, or replicate data across a variety of file storage solutions and cloud storage services like AWS, Azure, GCP, Wasabi, Backblaze, and more.
In short, with Resilio Connect, you have the freedom to build and tear down projects in any cloud in an instant, without worrying about vendor lock-in.
Thanks to its versatility, you can even use Resilio Platform as an addition to AWS Snowball.
For example, if you’re operating in an area with no network connectivity, you’ll need a hardware solution like Snowball to get your data into the cloud. Once your data is in the cloud, you can use Resilio for all your transfer, sync, and replication needs. That way you won’t have to rely on AWS solutions (like S3 Replication or S3 Transfer Acceleration), which are complex to manage and limited in terms of transfer speeds.
P2P Architecture and WAN Optimization: Fast, Reliable, and Scalable Data Ingest in Any Environment
Traditional transfer, sync, and replication solutions, including DataSync, S3 Replication, and others offered by AWS, use one of two types of point-to-point transfers:
- Hub-and-spoke, in which one server acts as a hub, while the others are designated as clients. The clients can’t share data with each other directly. Instead, they must first send their data to the server, which can spread it across the entire environment.
- Follow-the-sun, in which servers can only share data sequentially (i.e., Server 1 sends data to Server 2; then Server 2 sends data to Server 3, and so on).
Both models have severe disadvantages that make them unsuitable for many use cases, including syncing large files and large numbers of files, or syncing to many endpoints.
In both models, each transfer takes place between only two devices at a time, which inevitably leads to slower transfer speeds. Each device is also a single point of failure, so common issues in extreme environments (like slow networks or power outages) can easily obstruct the transfer process.
AWS’ solutions also use TCP for transfer over WANs (wide area networks), which further impacts transfer speeds. Packet loss and latency, which are defining characteristics of WANs, both disrupt TCP, leading to delays and an inability to make the most out of expensive WAN connections.
Resilio Connect’s P2P Architecture and UDP-based WAN Optimization technology can help you overcome these issues.
P2P Architecture
Resilio Platform uses a unique P2P (peer-to-peer) architecture to let every device in your environment take part in data transfers. There are two key benefits of this approach compared to standard point-to-point architectures:
- Transfers aren’t limited to one device at a time.
- There’s no single point of failure.
Plus, Resilio Platform uses file chunking to separate files into different pieces and transfer them independently. This process results in transfer speeds that are 3–10 times faster than traditional solutions.
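To illustrate why letting every device take part matters, here is a deliberately simplified toy model. It counts how many "rounds" it takes to get one dataset to a group of devices when only one transfer can run at a time, versus when every device that already holds the data can pass it on. The function names and the one-transfer-per-device-per-round assumption are ours for illustration; this is not how Resilio schedules transfers, and it ignores chunking and differences in bandwidth.

```python
def sequential_rounds(receivers: int) -> int:
    """Rounds needed when a single source serves one receiver per round."""
    if receivers < 0:
        raise ValueError("receivers must be >= 0")
    return receivers


def swarm_rounds(receivers: int) -> int:
    """Rounds needed when every device holding the data serves one new device per round."""
    if receivers < 0:
        raise ValueError("receivers must be >= 0")
    holders, rounds = 1, 0          # start with the source only
    total = receivers + 1           # source plus all receivers
    while holders < total:
        holders *= 2                # each holder brings one more device up to date
        rounds += 1
    return rounds


for n in (7, 100, 1_000):
    print(f"{n:>5} receivers: sequential={sequential_rounds(n):>5}, swarm={swarm_rounds(n):>3}")
```

In this model the sequential approach grows linearly with the number of endpoints, while the swarm approach grows logarithmically (7 rounds for 100 receivers instead of 100).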
The unique P2P architecture and the ability to hash files in chunks allow Resilio Platform to sync files in any direction:
- One-to-one.
- One-to-many.
- Many-to-one.
- N-way (or many-to-many). This is essential if you want to transfer, sync, or replicate data across many distributed endpoints. For example, if you have a remote workforce, N-way synchronization helps you ensure every change to shared files is instantly distributed to every office.
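The idea of hashing files in chunks can be sketched in a few lines. The example below splits two versions of a file into fixed-size chunks, hashes each chunk with SHA-256, and reports which chunks would need to be transferred. The tiny 4-byte chunk size and the function names are hypothetical choices for readability; the chunk size and hashing scheme Resilio actually uses are not described here.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; unrealistically small so the example is easy to follow


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Return the SHA-256 hex digest of each fixed-size chunk of data."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def changed_chunks(old: bytes, new: bytes, chunk_size: int = CHUNK_SIZE) -> list[int]:
    """Indexes of chunks in new that are missing from, or differ from, old."""
    old_hashes = chunk_hashes(old, chunk_size)
    new_hashes = chunk_hashes(new, chunk_size)
    return [
        i for i, digest in enumerate(new_hashes)
        if i >= len(old_hashes) or old_hashes[i] != digest
    ]


old_version = b"aaaabbbbcccc"
new_version = b"aaaaBBBBccccdddd"
print(changed_chunks(old_version, new_version))  # prints [1, 3]
```

Only the modified chunk and the newly appended chunk are flagged, so unchanged parts of the file never need to cross the network again.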
WAN Optimization
Resilio Platform uses a proprietary, UDP-based transfer protocol called Zero Gravity Transport™ (ZGT) to maximize transfer speed across any network and overcome latency and packet loss.
ZGT is optimized for unreliable networks, allowing you to ingest, sync, and replicate data from the edge of a network to a centralized location. You can use any network (VSAT, broadband, Wi-Fi, cell, etc.) and any device while overcoming the most extreme conditions to collect and move many terabytes of data in predictable timeframes.
ZGT does this by:
- Maintaining a uniform rate of packet distribution over time. The rate is calculated with a congestion control algorithm that periodically probes Round Trip Time (RTT). This keeps our software informed about the transfer speed over any network.
- Reducing unnecessary retransmissions and sending out interval acknowledgements. Our solution sends out acknowledgements for a group of packets, instead of after receiving each packet. It also retransmits lost packets once per RTT. These techniques reduce unnecessary retransmissions and make the transfer and replication processes more efficient.
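The two techniques above can be made concrete with some simple arithmetic. The sketch below is not ZGT's implementation (which is proprietary); it only shows, under assumed numbers, how a uniform send rate translates into a fixed gap between packets and how acknowledging packets in groups cuts the number of acknowledgements. The packet size, rate, and group size are hypothetical.

```python
def pacing_interval_s(rate_bps: float, packet_bytes: int) -> float:
    """Seconds to wait between packets to hold a uniform send rate."""
    if rate_bps <= 0 or packet_bytes <= 0:
        raise ValueError("rate and packet size must be positive")
    return packet_bytes * 8 / rate_bps


def ack_count(packets: int, group_size: int) -> int:
    """Acknowledgements sent when one acknowledgement covers group_size packets."""
    if packets < 0 or group_size < 1:
        raise ValueError("packets must be >= 0 and group_size >= 1")
    return -(-packets // group_size)   # integer ceiling division


# Assumed values: a 100 Mbps target rate and 1,250-byte packets.
print(pacing_interval_s(100e6, 1_250))   # gap between packets, in seconds

# 10,000 packets acknowledged one by one versus in groups of 64.
print(ack_count(10_000, 1), ack_count(10_000, 64))
```

With these numbers, grouping acknowledgements reduces 10,000 acknowledgements to 157, which leaves more of a constrained link available for actual data.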
For a deeper dive into this topic, we have a detailed WAN optimization whitepaper.
You can also go to our speed calculator to estimate how much time Resilio’s technology can save your organization, based on your use case (server sync, remote work, fast file send, and so on).
Resilio Connect’s P2P architecture and UDP-based WAN Optimization also make it an ideal disaster recovery solution because:
- There’s no single point of failure in the P2P architecture. If one device in your environment fails, our solution can always access data from the others. This means you can utilize all of your servers to achieve sub-five-second RPOs (Recovery Point Objectives) and RTOs (Recovery Time Objectives) within minutes of an outage.
- The ZGT transfer protocol detects bandwidth changes in real time. As a result, data is dynamically routed around failures to overcome latency and network congestion.
In short, Resilio Platform is perfect for implementing various disaster recovery strategies, like hot-site DR, warm-site DR, cold DR, and offsite copy.
For a real-life example of Resilio Connect’s impact, check our case study with the Northern Marine Group.
Before using Resilio, the company had to mail physical CDs with software updates, so it took weeks to troubleshoot all the installs on the ships. Thanks to Resilio, they now distribute and synchronize updates across their fleet of vessels in a much faster, simpler, and more reliable way.
Specifically, the time needed to bring the fleet into compliance dropped from six months to just two weeks — a 92% improvement.
Here’s what Paul Clark, Head of IT at Northern Marine Group, had to say about working with Resilio:
“Being able to use the scripting engine meant that we could essentially have the updates distribute, install, and report back on status automatically, allowing us to avoid the installation going wrong because of user error.”
Security by Default: Enterprise-Grade Data Protection Capabilities
Transferring data from the edge to a central location always carries risks, especially when shipping hardware devices back and forth.
That’s why we’ve built state-of-the-art security features right into Resilio Connect. This means you don’t have to buy separate security software to ensure your data arrives at its destination intact.
These features include:
- AES 256 encryption: Resilio encrypts your data at rest and in transit.
- Mutually authenticated endpoints: This guarantees data only arrives at designated endpoints.
- Cryptographic data integrity validation: This ensures your data always arrives at its destination uncorrupted.
- Data immutability: Our solution stores copies of your data in the public cloud to protect you from data loss and ransomware.
- And much more.
Lastly, all Resilio Platform security features have been verified by 3rd-party security experts to guarantee they’re up to the highest data protection standards.
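As a general illustration of cryptographic data integrity validation, the sketch below computes a SHA-256 digest on the sending side and checks it on the receiving side. It uses only Python's standard library, and the function names are hypothetical. It shows the general technique rather than the specific mechanism built into Resilio.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of data."""
    return hashlib.sha256(data).hexdigest()


def verify(received: bytes, expected_digest: str) -> bool:
    """True if the received bytes match the digest computed by the sender."""
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(sha256_hex(received), expected_digest)


payload = b"survey-data-2024.bin contents"
digest = sha256_hex(payload)               # computed before the transfer

print(verify(payload, digest))             # intact copy
print(verify(payload + b"\x00", digest))   # corrupted copy
```

An intact copy verifies successfully, while a copy with even a single extra byte fails the check.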
Central Management Console: Simple, Centralized Management for All Data Ingest, Sync, and Replication Needs
Complexity has always been one of AWS’ biggest downsides. There are so many services for ingesting, transferring, syncing, and replicating data that it can be incredibly time-consuming to find the right option for your needs.
For example, you can use the AWS Snow Family (AWS Snowmobile for exabyte-scale migrations, Snowball, and Snowcone) to ingest data into the AWS cloud. From there, you can use S3 Replication to replicate it across one or more regions.
However, AWS offers other replication solutions that are more suitable to niche use cases, like DataSync, S3 Batch Operations Copy, and the S3 CopyObject API. If you want to speed up data transfers, you can also choose from a range of AWS services like S3 Transfer Acceleration. And this doesn’t even take into account auxiliary features for tracking and debugging these processes, as well as other services like AWS Lambda, EC2, and Direct Connect.
In contrast, Resilio Platform removes all of that complexity by letting you control all aspects of data ingest, transfer, sync, and replication from a Central Management Console.
The Console lets you easily set up:
- Key parameters like buffer size, bandwidth usage policies, and disk I/O threads.
- Metrics, rules, and notification parameters.
- User permissions.
- Webhooks.
You can also use it to control bandwidth allocation for each endpoint, collect logs, manage Resilio agents, and much more.
For example, MixHits Radio was able to massively reduce their troubleshooting times thanks to the ability to do everything from the Central Management Console:
“We have gone from spending 15 hours on average per week troubleshooting conflicts in the prior solution to spending no time at all with Resilio. We configure jobs once in the Resilio Platform Management Console and never have to look at it again.”
— Gary Hanna, CEO of MixHits Radio
Check out the full case study on our website for more details on how MixHits Radio’s team uses Resilio Connect.
You can also use the Central Management Console to increase efficiency and lower egress costs by:
- Downloading and synchronizing files on demand with Transparent Selective Sync (TSS). This capability lets you browse objects, select individual files, and download, partially download, or sync them according to your needs. That way, there’s no unnecessary data being transferred off-network or across regions.
- Choosing the optimal network for your traffic with Smart Routing. For example, you can move parts of your traffic to a remote edge network, so you don’t rely on expensive WAN connections all the time.
- Storing frequently accessed files on local devices. This results in lower AWS egress costs, as you don’t have to download the same files from the cloud every time you need them.
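To show why keeping frequently accessed files on local devices lowers egress, here is a minimal sketch of a least-recently-used cache that counts how many bytes would have to be downloaded from the cloud. The class name, the cache capacity, and the file sizes are illustrative assumptions, and this is not a description of how Resilio decides what to keep locally.

```python
from collections import OrderedDict


class LocalCache:
    """Toy least-recently-used file cache that tracks cloud egress in bytes."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1 file")
        self.capacity = capacity
        self._files = OrderedDict()   # file name -> size, oldest first
        self.egress_bytes = 0

    def read(self, name: str, size_bytes: int) -> bool:
        """Return True on a local hit; on a miss, count a cloud download."""
        if name in self._files:
            self._files.move_to_end(name)      # mark as most recently used
            return True
        self.egress_bytes += size_bytes        # file had to leave the cloud
        self._files[name] = size_bytes
        if len(self._files) > self.capacity:
            self._files.popitem(last=False)    # evict least recently used file
        return False


accesses = ["a", "b", "a", "a", "b", "c", "a"]
cache = LocalCache(capacity=2)
for name in accesses:
    cache.read(name, size_bytes=100)

uncached = 100 * len(accesses)
print(f"egress with cache: {cache.egress_bytes} bytes, without: {uncached} bytes")
```

In this small example, a two-file cache cuts egress from 700 bytes to 400 bytes, because repeated reads of the same files are served locally.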
In terms of simplicity and ease of use, Resilio Platform also gives users access to their files (whether they’re in AWS, another cloud provider, or on-premises) from a simple user interface. The interface operates much like Microsoft OneDrive, making it very familiar and intuitive for most users.
4 Top Alternatives to AWS Snowball
While AWS Snowball is among the most popular ways to ingest data from the edge to the cloud, it’s not the only tool for the job. Below, we’ll look at four other options, including our software-only ingest solution and three hardware-based services from other cloud providers.
1. Resilio Connect
As we explained earlier, Resilio Connect is an ideal AWS Snowball alternative for companies that need to transfer lots of data across large distances quickly and securely, without the hassle of handling new hardware.
Our solution is:
- Easy to set up due to its software-only nature that lets you use it on your existing infrastructure without needing to buy, learn, and ship new physical devices.
- Lightning-fast due to its P2P architecture and proprietary, UDP-based WAN optimization technology.
- Reliable due to the lack of single points of failure and the ability to route data dynamically around failures.
- Easy to manage due to the ability to manage all data ingest, sync, and replication tasks (even in a multi or hybrid cloud scenario) from a single place.
- Efficient due to the ability to store frequently accessed data locally, pin traffic to an optimal network, and more.
- More flexible due to its various deployment options (e.g., you can ingest data within or across AWS regions and services, or use other cloud providers and on-prem environments).
You can learn more about Resilio Platform by visiting our website and scheduling a live demo with our team.
2. Azure Data Box
Azure Data Box offers a series of different devices for moving terabytes to petabytes of data to the cloud using standard NAS protocols or tools like Robocopy.
These devices include:
- Data Box, which can copy data to 10 storage accounts and has 80 TB of usable capacity and a 1×1/10 Gbps RJ45, 2×10 Gbps SFP+ interface.
- Data Box Disk, which can copy data to one storage account and has 35 TB of usable capacity and a USB/SATA II, III interface.
- Data Box Heavy, which can copy data to 10 storage accounts and has 800 TB of usable capacity and a 4×1 Gbps RJ45, 4×40 Gbps QSFP+ interface.
All devices use AES encryption (either 128-bit or 256-bit) and are wiped clean after the data has been uploaded to the cloud.
3. Google Cloud Transfer Appliance
Google Cloud Transfer Appliance is a high-capacity storage device that lets businesses transfer on-premises data to Google Cloud Storage. Google’s team helps organizations pick the right device and ships it to them. Once received, teams upload their data to the device (Windows systems use SCP or SSH to upload data to the appliance) and ship it back to a Google upload facility where the data gets transferred to the cloud.
4. Backblaze B2 Fireball
Backblaze B2 Fireball is an ingest service for migrating large datasets from on-prem environments to B2 Cloud Storage (Backblaze’s S3-compatible storage service). The Fireball device has a 96 TB storage array with 10 Gigabit Ethernet connectivity. According to Backblaze’s website, the device can move around 80 TB of data in a day or two using the 10 Gigabit Ethernet connection.
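As a quick sanity check on the "80 TB in a day or two" figure, the sketch below computes the sustained throughput that such a copy would require. The function name is hypothetical, and the calculation assumes decimal terabytes and a perfectly steady transfer, which real copies rarely achieve.

```python
def required_gbps(size_tb: float, days: float) -> float:
    """Sustained throughput (Gbps) needed to move size_tb terabytes in the given days."""
    if size_tb < 0 or days <= 0:
        raise ValueError("size must be >= 0 and days must be > 0")
    bits = size_tb * 1e12 * 8          # decimal terabytes to bits
    return bits / (days * 86_400) / 1e9


for days in (1, 2):
    print(f"80 TB in {days} day(s): {required_gbps(80, days):.2f} Gbps sustained")
```

Both results (roughly 7.4 Gbps for one day and 3.7 Gbps for two) fit within a 10 Gigabit Ethernet link, so the quoted timeframe is plausible as long as the source storage can read data that fast.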
Use Resilio Platform to Collect and Manage Remote Data Quickly, Efficiently, and Reliably
Resilio Platform is the ideal software-only solution if you want to ingest data from the edge to any region in the cloud in a fast, reliable, and secure way.
Its UDP-based WAN Optimization technology helps you overcome latency and packet loss across any network, regardless of distance. Plus, the unique P2P Architecture makes it ideal for heavy workloads, such as transferring lots of data across geographically distributed endpoints.
Our solution is also:
- Flexible because it can be deployed on any cloud, on-prem, or hybrid cloud infrastructure. You can also use it to store and replicate data across any storage type — NAS, DAS, SAN, object storage, file storage, block storage, AWS storage, Azure storage, and much more.
- Easy to manage because you can set up, monitor, and control the entire replication process (even in multi or hybrid cloud scenarios) from one Central Management Console.
- Secure, thanks to the built-in AES 256 encryption and a plethora of other security features.
Ready for a live demo? Click here to schedule a demo with our team.