How to Configure and Speed Up S3 Cross-Account Replication

Eleanor Parker

Replicating objects across Amazon S3 buckets owned by the same AWS account is relatively straightforward. 

Things get trickier when dealing with cross-account replication, as you need to complete more steps (similar to how AWS S3 cross-region replication requires more considerations than same-region replication).

Specifically, you need to get the destination bucket’s owner account to grant the source bucket owner permission to replicate objects. There are also some auxiliary factors if you’re using Object Lock or AWS KMS (Key Management Service). 

In this guide, we’ll explain the main steps for configuring S3 replication when the destination bucket is owned by a different AWS account.

But before we dive in, it’s worth noting many users experience additional frustrations with S3 Replication beyond the extra setup required for use cases like cross-account and cross-region replication. Mainly:

  • S3 Replication can be slow and unreliable. Most objects should be replicated within 15 minutes, but delays of up to a few hours can occur — especially when replicating large S3 objects across AWS regions. These delays are unacceptable for organizations that depend on delivering accurate data across the globe.
  • It can be difficult to find out exactly what went wrong when S3 replication stops working. Since lots of factors affect the replication process — like replication rules, IAM users and role permissions, and ACLs — debugging replication failures is often very time-consuming.

That’s why, in the second part of this article, we’ll show you how you can overcome these issues with Resilio Platform — our agent-based replication solution. 

Note: You can learn more about Resilio Platform on our site and schedule a demo with our team.

Specifically, you’ll learn how Resilio Platform can help you:

  • Achieve fast and reliable replication speeds (as fast as you need to go) using Resilio’s proven P2P replication solution and WAN transfer technology. On average, Resilio Platform is 10-20x faster than conventional transfer, replication, and sync solutions. Our engineers have validated 100 Gbps+ replication speeds within and across cloud regions.
  • Overcome latency and packet loss across all cloud regions, regardless of the distance between them. Resilio enables full utilization of allocated bandwidth across multiple locations or regions, across any distance.
  • Replicate your data across AWS regions and AWS storage services (object, file, block), as well as any other cloud and on-prem environments. Replication can be in real-time, on-demand, or on a schedule.
  • Set up, manage, and control the entire replication process from a single, central console. It’s easy to visually track and monitor all jobs, centrally. All jobs can be automated via scripts and APIs. 

Companies like Blizzard, Microsoft, and Warner Brothers have used Resilio to achieve high-speed, reliable data transfer, sync, and replication. To learn how Resilio Platform can help your organization experience those benefits as well, schedule a demo with our team.

How to Configure S3 Cross-Account Replication

First, enable versioning for the source and destination buckets. 

This isn’t a cross-account consideration but it’s a key prerequisite for successful replication across AWS S3 buckets, so it’s always good to keep it in mind. Check out this AWS tutorial for details and examples of how to enable versioning using the console, REST API, SDKs, and Command Line Interface (AWS CLI).
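
As a minimal sketch of this step, assuming you script it in Python with boto3 (this guide doesn’t require that), the helper below enables versioning on one bucket and reads the status back. The function name enable_versioning is hypothetical; put_bucket_versioning and get_bucket_versioning are standard S3 client calls.

```python
def enable_versioning(s3_client, bucket_name):
    """Turn on versioning for one bucket and return the resulting status.

    `s3_client` is assumed to be a boto3 S3 client (or any object exposing
    the same two methods) created with the bucket owner's credentials.
    """
    s3_client.put_bucket_versioning(
        Bucket=bucket_name,
        VersioningConfiguration={"Status": "Enabled"},
    )
    response = s3_client.get_bucket_versioning(Bucket=bucket_name)
    # S3 omits "Status" entirely for buckets that were never versioned.
    return response.get("Status", "Unversioned")
```

You would call it once per bucket, each time with a client that belongs to the account owning that bucket, since the source and destination buckets live in different accounts.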

Next, you need two different credentialed profiles set up for the AWS CLI. Profiles are named collections of settings and credentials that are stored in the AWS CLI’s config and credentials files. You’ll use one profile for CLI commands against the source bucket and the other for commands against the destination bucket.
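
Before running any commands, it can help to confirm that both profiles actually exist. The sketch below is one way to check, assuming the standard credentials file layout in which each profile is an INI section named after the profile. The profile names source-acct and dest-acct and the key values are placeholders, not names this guide prescribes.

```python
import configparser

def missing_profiles(credentials_text, required):
    """Return the required profile names that have no section in the text."""
    # Interpolation is disabled so that '%' characters in secrets are harmless.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(credentials_text)
    return [name for name in required if not parser.has_section(name)]

# Placeholder content in the shape of ~/.aws/credentials.
SAMPLE_CREDENTIALS = """\
[source-acct]
aws_access_key_id = EXAMPLESOURCEKEYID
aws_secret_access_key = exampleSourceSecret

[dest-acct]
aws_access_key_id = EXAMPLEDESTKEYID
aws_secret_access_key = exampleDestSecret
"""

print(missing_profiles(SAMPLE_CREDENTIALS, ["source-acct", "dest-acct"]))
```

In practice you would pass in the contents of your own credentials file rather than the sample string. An empty list means both profiles are defined.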

Now, you have to add a bucket policy on the destination bucket that lets the owner of the source bucket replicate objects. AWS’ documentation offers the following example policy:

{
   "Version":"2012-10-17",
   "Id":"",
   "Statement":[
      {
         "Sid":"Set permissions for objects",
         "Effect":"Allow",
         "Principal":{
            "AWS":"arn:aws:iam::source-bucket-acct-ID:role/service-role/source-acct-IAM-role"
         },
         "Action":["s3:ReplicateObject", "s3:ReplicateDelete"],
         "Resource":"arn:aws:s3:::DOC-EXAMPLE-BUCKET/*"
      },
      {
         "Sid":"Set permissions on bucket",
         "Effect":"Allow",
         "Principal":{
            "AWS":"arn:aws:iam::source-bucket-acct-ID:role/service-role/source-acct-IAM-role"
         },
         "Action":["s3:List*", "s3:GetBucketVersioning", "s3:PutBucketVersioning"],
         "Resource":"arn:aws:s3:::DOC-EXAMPLE-BUCKET"
      }
   ]
}

Note: When editing this policy, make sure to replace the placeholders with the AWS account ID of the source bucket owner, the name of the IAM role used for replication, and the destination bucket name.
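
If you prefer to generate the policy rather than edit it by hand, a small Python helper can fill in the placeholders. This is only a sketch: the function name and the sample account ID, role name, and bucket name are made up, and the service-role/ path simply mirrors the example above (a role you created yourself may not have that path).

```python
import json

def destination_bucket_policy(source_account_id, source_role_name, dest_bucket):
    """Build the destination bucket policy from the example above."""
    # Assumption: the role lives under the service-role/ path, as in the example.
    role_arn = (
        f"arn:aws:iam::{source_account_id}:role/service-role/{source_role_name}"
    )
    bucket_arn = f"arn:aws:s3:::{dest_bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Set permissions for objects",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": ["s3:ReplicateObject", "s3:ReplicateDelete"],
                "Resource": f"{bucket_arn}/*",  # object-level actions
            },
            {
                "Sid": "Set permissions on bucket",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": [
                    "s3:List*",
                    "s3:GetBucketVersioning",
                    "s3:PutBucketVersioning",
                ],
                "Resource": bucket_arn,  # bucket-level actions
            },
        ],
    }

policy = destination_bucket_policy(
    "111122223333", "example-replication-role", "example-destination-bucket"
)
print(json.dumps(policy, indent=3))
```

The printed JSON is what the destination account would attach to its bucket. Keeping the object-level and bucket-level statements separate matters because the first applies to the bucket’s objects (the /* resource) and the second to the bucket itself.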

Also, if you’re using Object Lock to store objects using a write-once-read-many model and KMS keys for server-side encryption, you need to create a separate IAM policy with special permissions. The process is a bit more complex but described in detail in AWS’ documentation.
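
To give a rough idea of what those extra permissions look like, the sketch below builds the additional statements for the replication role’s IAM policy. It is a simplified illustration, not a complete policy: the function name, the Sid values, and the sample ARNs are hypothetical, AWS’ documented examples also scope the KMS statements with condition keys, and in a cross-account setup the destination account’s key policy must separately allow the role to use the destination key.

```python
def extra_replication_role_statements(source_bucket, source_key_arn, dest_key_arn):
    """Extra IAM statements for Object Lock and SSE-KMS replication (sketch)."""
    source_objects = f"arn:aws:s3:::{source_bucket}/*"
    return [
        {
            # Lets S3 read retention and legal-hold settings to copy them over.
            "Sid": "ReadObjectLockSettings",
            "Effect": "Allow",
            "Action": ["s3:GetObjectRetention", "s3:GetObjectLegalHold"],
            "Resource": source_objects,
        },
        {
            # Lets S3 decrypt source objects encrypted with the source KMS key.
            "Sid": "DecryptSourceObjects",
            "Effect": "Allow",
            "Action": ["kms:Decrypt"],
            "Resource": source_key_arn,
        },
        {
            # Lets S3 encrypt replicas with the destination KMS key.
            "Sid": "EncryptReplicas",
            "Effect": "Allow",
            "Action": ["kms:Encrypt"],
            "Resource": dest_key_arn,
        },
    ]

statements = extra_replication_role_statements(
    "example-source-bucket",
    "arn:aws:kms:us-east-1:111122223333:key/source-key-id",
    "arn:aws:kms:eu-west-1:444455556666:key/destination-key-id",
)
```

Treat the output as a starting point to compare against AWS’ documentation rather than something to deploy as-is.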

Finally, remember that by default, the owner of the source object also becomes the owner of the replicas, even if the destination bucket is owned by another account. 

If you want to change that, you’ll need to:

  • Add an owner override option to the replication configuration.
  • Let S3 change replica ownership. This is done by adding permissions for the s3:ObjectOwnerOverrideToBucketOwner action in the policy associated with the IAM role that lets S3 replicate objects.
  • Get the owner of the destination bucket to grant the owner of the source bucket permissions for the s3:ObjectOwnerOverrideToBucketOwner action.

Again, AWS has documentation with useful examples showing how to do this in more detail.
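
As a rough sketch of the first and third steps, the Python below builds a replication configuration with the owner override turned on, plus the statement the destination account would add to its bucket policy. The function names, rule ID, and sample ARNs are hypothetical. The dictionary keys follow the S3 replication configuration structure, in which AccessControlTranslation with Owner set to Destination is the override option.

```python
def replication_config_with_owner_override(role_arn, dest_bucket, dest_account_id):
    """Replication configuration that hands replica ownership to the destination."""
    return {
        "Role": role_arn,
        "Rules": [
            {
                "ID": "cross-account-owner-override",  # hypothetical rule name
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: match every object
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {
                    "Bucket": f"arn:aws:s3:::{dest_bucket}",
                    "Account": dest_account_id,
                    # The owner override: replicas belong to the destination owner.
                    "AccessControlTranslation": {"Owner": "Destination"},
                },
            }
        ],
    }

def owner_override_statement(source_principal_arn, dest_bucket):
    """Destination bucket policy statement that permits the ownership change."""
    return {
        "Sid": "AllowOwnerOverride",  # hypothetical Sid
        "Effect": "Allow",
        "Principal": {"AWS": source_principal_arn},
        "Action": ["s3:ObjectOwnerOverrideToBucketOwner"],
        "Resource": f"arn:aws:s3:::{dest_bucket}/*",
    }

config = replication_config_with_owner_override(
    "arn:aws:iam::111122223333:role/service-role/example-replication-role",
    "example-destination-bucket",
    "444455556666",
)
```

With boto3, a dictionary like config would typically be passed as the ReplicationConfiguration argument of put_bucket_replication on the source bucket, using the source account’s profile.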

How Resilio Platform Can Help You Achieve Faster and More Reliable Replication

Resilio Platform is real-time replication software that uses P2P file transfer and proprietary WAN optimization technology to deliver the fastest replication speeds in the industry (up to 20+ Gbps per server).

Brands like Turner Sports, Blockhead, and Skywalker Sound use our software to achieve near-instant replication speeds, simplify the replication process, and handle massive data workloads, like replicating large objects over many endpoints.

You can also use Resilio Platform to:

  • Bring data into AWS and replicate it across regions, other AWS services, and even other cloud providers.
  • Overcome latency and packet loss across all cloud regions, irrespective of distance.
  • Get granular control and complete visibility over the replication process from a single place (as opposed to relying on 3-5 AWS services for different use cases).

In the next section, we discuss exactly how Resilio Platform works with a focus on its main advantages over AWS’ replication solutions.

P2P Replication Architecture & WAN Optimization: Overcome Replication Latency Across any Network

Resilio Platform uses a unique P2P (peer-to-peer) replication architecture to ensure the fastest-possible replication times. 

This is in stark contrast to most replication solutions (including AWS’) that use one of two point-to-point replication topologies:

  1. Client-server: With this topology, the servers in an environment are separated into clients and hubs. The hub can replicate data across all other servers, while the clients are limited to only sharing data with the hub. So, if a client wants to replicate data across the environment, it must first transfer that data to the hub, which is a performance bottleneck and a single point of failure.
  2. “Follow-the-sun”: Here, replication can only occur sequentially — from Server 1 to Server 2; then from Server 2 to Server 3; then to Server 4, and so on.

The replication process is serialized between at most two systems (or endpoints) at a time; each system is a bottleneck and a single point of failure. Replication delays worsen as you add more endpoints or as the volume of data grows.

In contrast, Resilio’s P2P architecture lets all servers share data with each other. In the Resilio model, all servers running a Resilio agent work together to process and replicate files. The load can be distributed across as many servers as needed. This offers built-in availability, data protection, and data integrity. As a result, replication becomes much faster and more efficient, as there’s no reliance on a single device.

Resilio Platform also uses file chunking to separate files into pieces and transfer them independently from each other. The combination of P2P replication and chunking lets every server replicate data at the same time, resulting in 3-10x faster replication speeds than traditional point-to-point solutions.
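
To see why chunking plus peer participation helps, here is a deliberately idealized back-of-the-envelope model in Python. It is not Resilio’s algorithm or a benchmark: it assumes every node uploads at the same rate, ignores protocol overhead, and stands in for P2P with a simple relay in which each receiver forwards a chunk as soon as it has it. The function names and the sample figures are illustrative assumptions.

```python
def hub_time(file_mbit, link_mbps, receivers):
    """Seconds for a hub to upload a full copy to each receiver in turn."""
    return receivers * file_mbit / link_mbps

def chunked_relay_time(file_mbit, link_mbps, receivers, chunks):
    """Seconds when receivers pass each chunk onward as soon as it arrives."""
    chunk_time = (file_mbit / chunks) / link_mbps
    # The source needs `chunks` steps to send everything once; the final
    # chunk then takes `receivers - 1` more hops to reach the last receiver.
    return (chunks + receivers - 1) * chunk_time

FILE_MBIT = 10 * 8 * 1000   # a 10 GB file expressed in megabits
LINK_MBPS = 10              # every node uploads at 10 Mbps
RECEIVERS = 10

hub = hub_time(FILE_MBIT, LINK_MBPS, RECEIVERS)
relay = chunked_relay_time(FILE_MBIT, LINK_MBPS, RECEIVERS, chunks=1000)
print(f"hub: {hub:.0f} s, chunked relay: {relay:.0f} s")
```

In this toy model the hub’s time grows in proportion to the number of receivers, while the chunked relay’s time stays close to the time needed to send the file once. Real-world gains depend on the network and workload.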

P2P vs Client-Server architecture

Additionally, Resilio Platform uses a proprietary UDP-based transfer protocol called Zero Gravity Transport (ZGT™) to maximize replication speed across any network and overcome the impact of latency and packet loss. 

ZGT is designed to optimize transfers over WANs (wide area networks) by using a congestion control algorithm that calculates the ideal send rate by periodically probing the time it takes for a packet to reach the destination and for its acknowledgement to come back (i.e., the Round Trip Time).

This means you get built-in WAN optimization technology for all data replication, sync, and transfer jobs across cloud regions, storage solutions, and cloud providers (not just AWS).

Plus, ZGT:

  • Creates a uniform packet distribution over time to avoid network overload.
  • Sends packet acknowledgements for a group of packets, as opposed to each individual packet.
  • Reduces unnecessary retransmissions by retransmitting lost packets once per Round Trip Time.
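
ZGT itself is proprietary, so the Python below is not its implementation. It is a generic, textbook-style sketch of two of the ideas mentioned above: spacing packets evenly in time (pacing) and nudging the send rate up or down based on a Round Trip Time probe. The function names, the 5% step, and the 10% tolerance are arbitrary assumptions chosen for illustration.

```python
def pacing_interval(packet_bytes, rate_bps):
    """Seconds to wait between packets so they are spread evenly over time."""
    return packet_bytes * 8 / rate_bps

def next_rate(rate_bps, rtt_sample, base_rtt, step=0.05, tolerance=1.1):
    """Adjust the send rate after one RTT probe.

    An RTT well above the uncongested baseline suggests queues are building
    up along the path, so the sender backs off; otherwise it probes upward.
    """
    if rtt_sample > base_rtt * tolerance:
        return rate_bps * (1 - step)
    return rate_bps * (1 + step)

rate = 10_000_000  # start at 10 Mbps
for rtt in [0.100, 0.101, 0.150, 0.102]:  # made-up RTT probes, in seconds
    rate = next_rate(rate, rtt, base_rtt=0.100)
    gap_ms = pacing_interval(1250, rate) * 1000
    print(f"rtt={rtt:.3f}s rate={rate / 1e6:.2f} Mbps gap={gap_ms:.3f} ms")
```

The point of the sketch is the shape of the feedback loop: the sender reacts to rising delay rather than waiting for packet loss, and pacing keeps it from sending bursts that overload the network.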

Resilio Platform vs Competitors: 10GB file to 10 endpoints over 10 Mbps link

One of the world’s largest marine construction companies uses Resilio Platform to transfer data reliably over any connection. 

Before Resilio, the company was using Microsoft SCCM with Distribution Points (DPs) on each one of their vessels. But with frequent network disruptions, this setup couldn’t get the systems to reliably update, resulting in mission-critical software being left vulnerable to attack. 

Once their team started using Resilio Connect, they were able to completely overcome intermittent internet connections. The company now uses Resilio on 30 of its vessels to guarantee high-speed, reliable data transfers from shore to ship and back. 

“Prior to Connect, we were using SCCM. Our vessel environment operates over extremely slow and unreliable connections and SCCM couldn’t deliver updates and wasted costly satellite bandwidth in such an environment. With Connect, our speed of deployments goes much faster than SCCM, its post-transfer scripts are powerful and execute instantly. Now we have a bullet-proof solution that can overcome our complex environment.” 

For more details on their challenges and results, check out the complete case study.

Note: We also have a special calculator for calculating how much time you can save with Resilio Connect, depending on your use case.

Efficiency & N-Way Replication

Resilio Platform’s P2P architecture allows every server in your environment to take part in replicating and transferring data. As a result, you can:

  • Replicate and sync data in any direction — one-to-one, two-way, one-to-many, many-to-one, and N-way. 
  • Replicate objects across AWS services and regions (and other cloud providers) quickly and reliably.

Resilio is Peer-to-Peer (P2P)

You can also make the replication process more efficient and lower your S3 replication costs with:

  • Transparent Selective Sync (TSS): You can use TSS to reduce the amount of data being replicated (and, as a result, your AWS bill) by replicating only the objects you’ve selected.
  • Smart Routing: This feature gives you the ability to keep all your traffic on the AWS network or move data to a remote edge network. That way, you can always choose between keeping your traffic on a more expensive WAN or a cheaper LAN.
  • Local file storage: With Resilio, you can store frequently accessed data locally. That way, you don’t have to constantly download the same data from the cloud, which can drastically reduce your AWS egress charges.

Lastly, our engineering team is always working on improving Resilio’s efficiency. For example, in a recent update, we managed to reduce the average memory footprint required for replication jobs by 80% by optimizing time, merging, CPU usage, indexing, storage I/O, and end-to-end transport.

Organic Scalability

Scaling up traditional client-server environments is costly. As you add more clients, you need to buy more hub servers. You also need to constantly find ways to balance the network load between the hubs.

And as we said, performance tends to degrade when you add endpoints and try to replicate more (or bigger) objects over large distances.

In contrast, Resilio Platform is an organically scalable solution. The more servers you add, the better our software performs. For example, Resilio can synchronize data 50% faster than point-to-point solutions in a 1:2 scenario and 500% faster in a 1:10 scenario.

Put simply, Resilio Platform lets you easily scale out omnidirectional replication and move data across AWS regions and services, as well as your data centers and even other cloud providers.

Perfect for Disaster Recovery

Traditional data transfer and replication topologies have a single point of failure. 

For instance, in a client-server scenario, the hub server is the only one that can replicate data to clients. Plus, the hub must finish transferring data to a specific client before it can begin transferring to others. As a result, the entire replication process can be slowed down or interrupted if:

  • The hub goes down or experiences a delay.
  • Any of the other devices are impeded by network issues, outages, or any other problem. 

Thanks to its P2P architecture, Resilio Platform doesn’t have a single point of failure. If one device goes down, our software can still replicate or retrieve objects from the nearest available device by automatically routing around the outage. 

This makes Resilio ideal hot-site disaster recovery software for organizations that need to meet sub-five-second RPOs (Recovery Point Objectives) and achieve RTOs (Recovery Time Objectives) within minutes of an outage.

It also lets companies make their data globally available by quickly replicating files across cloud regions, services, and on-prem storage. End-users can always access files from the server closest to them, ensuring data can be retrieved as quickly as possible.

AES 256 Encryption: Get Built-in, End-to-End Security

When using Resilio Platform, you don’t need to worry about purchasing separate security software. Instead, our solution comes with built-in security features for keeping your data safe, like:

  • AES 256 encryption for data at rest and in transit.
  • Data immutability, meaning Resilio stores copies of your data in the public cloud, protecting you from data loss and ransomware.
  • Cryptographic data integrity validation to ensure your data always arrives at its destination uncorrupted.
  • Mutual endpoint authentication to ensure data only arrives at designated endpoints.
  • Granular access control from the Central Management Console, which lets you easily control who can access your data.

These and all other Resilio Platform security features have been verified by 3rd-party security experts to guarantee they’re up to the strictest data safety standards.

Central Management Console: Set Up, Manage, and Debug Replication from One Interface

Complexity has always been one of the biggest challenges when working with Amazon Web Services.

For example, there are four different options just for replicating data across S3 buckets: S3 Replication, AWS DataSync, S3 Batch Operations Copy, and the S3 CopyObject API. Things get even messier when you look into all the other features and services for monitoring and speeding up replication, like S3 Replication Time Control (RTC), CloudWatch, or AWS Elastic Disaster Recovery (DRS).

With Resilio Platform, you don’t need to worry about choosing the correct service or feature for each replication use case. Instead, you can track, manage, and debug the entire replication process (whether it’s between buckets in the same or different regions, or even between cloud providers) from a single Central Management Console.

You can host the console anywhere that suits your needs, e.g., in EC2 or on any Windows or Linux instance (virtual or physical), located in the cloud or on-premises.

As you can see below, the console gives you a centralized view of the entire replication environment, including the total number of files, bytes, maximum speed, and much more. 

Resilio Platform Overview, General Info, Statistics

You can use the console to set up:

  • Key replication parameters like buffer size, bandwidth usage policies, and disk I/O threads.
  • Metrics, rules, and notification parameters. 
  • User permissions.
  • Webhooks.

MixHits Radio is one of the companies that experienced massive time savings thanks to the ability to manage and troubleshoot everything from a single place. 

Here’s what their CEO Gary Hanna had to say about working with Resilio Platform:

“We have gone from spending 15 hours on average per week troubleshooting conflicts in the prior solution to spending no time at all with Resilio. We configure jobs once in the Resilio Platform Management Console and never have to look at it again.” 

For a deep dive into their challenges and how Resilio helped solve them, check out the full case study on our website.

Cloud-Agnostic Software: Deploy Resilio Platform on Any Infrastructure and Replicate Without Limitations

The Amazon S3 replication solutions are built to keep users within the AWS ecosystem. They don’t work natively with other cloud providers and there are hefty charges for moving your data out of the AWS cloud.

Conversely, Resilio Platform is a cloud-agnostic solution that helps you avoid vendor lock-in. This means you can use our software to ingest, move, and replicate data across any cloud provider, service, or region.

For example, you can:

  • Ingest your data into AWS and replicate it across S3 buckets (including different storage classes) in geographically dispersed regions and other AWS services with minimal latency.
  • Sync and replicate data across other cloud providers — like Azure, GCP, Wasabi, Backblaze, or any other S3-compatible storage — from a single interface.
  • Use any mix of hybrid and on-prem storage, like DAS, NAS, or SAN.

Put simply, Resilio Platform can be deployed on any infrastructure: in the cloud, on the edge, on-premises, or in a hybrid cloud environment. And since Resilio is a software-only solution, there’s no need to buy new hardware and train your team on it.

You can continue using your servers, networks, and desktops, as well as storage solutions like NAS devices, hard drives, SSDs, and so on. Our software also works with popular operating systems, like Mac, Android, Linux, and Windows file servers, and virtualization platforms like VMware and Citrix.

The flexibility and wide technology support allow Resilio Platform to be deployed on your existing infrastructure and begin replicating in as little as two hours.

Get Industry-Leading Replication Speed and Reliability with Resilio Platform

Resilio Platform is the ideal replication and sync solution for companies that depend on data being replicated incredibly fast across locations all over the world.

Our software is:

  • Lightning-fast, thanks to its unique P2P architecture and WAN optimization technology.
  • Organically scalable as it performs better after you add more endpoints. 
  • Highly resilient and fault-tolerant, as there’s no single point of failure. When one device goes down, our software can automatically route around the outage and replicate or retrieve objects from the nearest available device.
  • Secure, thanks to AES 256 encryption and other security features that have been verified by 3rd-party security experts.
  • Easy to manage since you can control all aspects of data replication, transfer, and access (even in multi or hybrid cloud scenarios) from a single interface.
  • Flexible, because you can deploy it on your existing infrastructure, without needing to buy new hardware or migrate data. Plus, you can use it to replicate data across any cloud provider and on-prem storage.

Ready for a live demo? We’d love to hear from you! Please schedule a demo with our team.
