A North American Engineering and Construction Leader Improved VDI User Experience with Resilio Platform and Hybrid Cloud
Introduction
No job is too challenging for this North American engineering and construction leader. What began in 1884 with two hardworking brothers has grown into an employee-owned, forward-thinking, engineering powerhouse—today with revenues over $10 billion annually.
The company now employs about 25,000 workers. Some employees work remotely in the field supporting a variety of extreme engineering projects, from building stadiums to bridges to power plants. Wherever they are located, these users need timely and continuous access to mission-critical applications and digital assets such as building plans, CAD applications and drawings, 3D designs, Microsoft Office, and a variety of other apps and data sets.
To meet the requirements of remote workers, the company’s Edge Computing Team, a strategic IT infrastructure team within the company, developed a multi-site Citrix and Microsoft FSLogix solution for Virtual Desktop Infrastructure (VDI), built on Azure NetApp Files, on-premise NetApp storage, Microsoft Azure—and powered by Resilio Platform for real-time profile synchronization.
Challenges
When a construction project gets underway, remote workers need timely access to hundreds of the latest applications and 3D design tools. All of these applications and digital assets are stored centrally on NetApp storage systems in a primary data center located in Omaha, NE, and also on Azure NetApp Files in two Microsoft Azure data centers located in the East and South Central US regions.
The Edge Computing Team’s goals were both to continuously improve the VDI end-user experience and to meet Service Level Agreements (SLAs) for high availability (HA) and disaster recovery (DR). Other goals for the VDI deployment included:
Improving customer satisfaction and end-user adoption of VDI.
Supporting continuous active-active high availability across all sites, spanning their on-premise core and the two Microsoft Azure cloud sites.
Reducing recovery times for both planned and unplanned outages to meet the company’s recovery time objectives (RTOs).
Reducing replication times across sites to improve recovery points and ensure VDI user profiles are continuously available to all users from all sites.
Bringing strategic applications and larger data sets closer to users, where possible.
Working with the existing on-premise and cloud storage vendor, NetApp, and virtualization platform vendors, Citrix and Microsoft FSLogix.
According to the VDI project leader, a Citrix virtualization and enterprise storage veteran with over 15 years of experience: “There was simply no way to achieve true active-active high availability (HA) across sites, prior to Resilio.”
Early in the planning, the project leader and his IT team researched a variety of approaches to VDI. They tested solutions ranging from global file systems to caching appliances. At first, the company looked into deploying global file systems on top of their NAS infrastructure. With that approach, the latency between a user logging in and their VDI profile being delivered and loaded could be measured in minutes.
“Our VDI solution could not tolerate a 5-minute delay waiting for user profile data to be shipped across the wire and made available to end-users,” stated the project lead. “In some cases the Virtual Hard Disk (VHDx) files were considerably large. Prior to Resilio, all users had to be redirected back to our on-premise core in Omaha. This was cost prohibitive and in some cases impractical.” Caching user profile data in Omaha was also not a good option, he added, in cases where the user profile data had to be replicated on demand: “Shipping a large multi-GB VHDx user profile file across the wire at login time was impractical.”
Reducing the time-to-desktop was one essential goal. Features such as Microsoft FSLogix Cloud Cache had been in production for about a year. While Cloud Cache offered simplicity, the feature incurred an unacceptable penalty in latency, or time-to-desktop. In some cases, FSLogix Cloud Cache, although backed by high-performance enterprise storage arrays, incurred a time-to-desktop of between 60 and 90 seconds, depending on the use case. The company’s goal was to reduce this to 30 seconds on average.
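To illustrate why shipping a profile at login time is impractical, here is a rough back-of-the-envelope sketch in Python. It assumes the 500 Mbps dedicated inter-site links described in the Solution section and uses hypothetical profile sizes (1, 5, and 10 GB); the function name and the `efficiency` parameter are illustrative, and real transfers would also be affected by protocol overhead, contention, and storage latency.

```python
def transfer_seconds(size_gb: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Estimate seconds needed to move `size_gb` (decimal GB) over a link.

    `efficiency` is a hypothetical fudge factor (0 < efficiency <= 1) for
    protocol overhead; 1.0 means the full line rate is available.
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be > 0 and efficiency in (0, 1]")
    size_megabits = size_gb * 8 * 1000  # 1 GB = 8,000 megabits (decimal units)
    return size_megabits / (link_mbps * efficiency)


# Hypothetical VHDx profile sizes; the case study only says "multi-GB".
for size_gb in (1, 5, 10):
    print(f"{size_gb} GB profile: {transfer_seconds(size_gb, 500):.0f} s at 500 Mbps")
```

Under these assumptions, even a 1 GB profile takes about 16 seconds at full line rate, which is more than half of a 30-second time-to-desktop budget before any other logon work is done.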
Solution
The team deployed a hybrid cloud solution optimized for VDI, powered by Resilio Platform and Microsoft FSLogix. Resilio Platform is used for VDI Profile Sync with Citrix User Profile Management (UPM), Microsoft FSLogix, and Azure NetApp Files for hybrid cloud storage.
The solution supports up to 2,500 concurrent users (on average) spanning 3 all-active sites. “We want the server side apps as close to users as possible. On the front-end, we use Citrix Global Server Load Balancing (GSLB) to geo-locate users to the right data center,” stated the project lead. “We use Resilio Platform to keep all sites (on-premise and in Azure) synchronized with all user profiles. A user may geo-locate to any of the active sites at any time. The recovery point objective (RPO) for the synchronized user profiles and applications is about 5 minutes.”
“When we first evaluated Resilio Platform, we thought the product was too good to be true for how well it worked!” Upon further testing, the solution proved to be well suited to VDI profile replication and as a DFS-R alternative used in combination with their storage and cloud vendors.
“With Resilio,” he added, “we can keep our edge synchronized with our core on-premise data center and 2 cloud locations. Resilio keeps all data updated and replicated. We found no other way to implement an active-active 3-node VDI host cluster across sites.”
The Resilio solution for VDI profile sync is well-suited to enterprises requiring faster, more predictable replication performance, active-active high availability, and faster user logon times, using the same on-premise or hybrid cloud hardware infrastructure that is deployed on site.
The solution enabled a hybrid cloud using Microsoft Azure Files and Blob storage, Azure NetApp Files, and the Microsoft Distributed File System (DFS). The IT team uses the Microsoft DFS namespace combined with Resilio Platform for replication across file servers. Dedicated 500 Mbps network links run between the two Microsoft Azure cloud locations and the company’s on-premise data center in Omaha.
Impact
Since moving to Resilio Platform, the company is improving the VDI user experience and expanding user adoption. “A successful VDI deployment is all about customer satisfaction and user adoption. The fact we are expanding our VDI user base and not seeing many support calls testifies to how well the deployment is going. The more people adopting and using our VDI environment the better,” added the project leader.
The company also needed to meet its goals for time-to-desktop, high availability, DR, and cost reduction where possible. The company is now able to achieve true active-active high availability across all sites spanning on-premise and cloud, improve recovery times, and reduce the time-to-desktop. The team also measures user experience quality by monitoring support calls, measuring VDI latency over the network (their goal is 200 ms or lower), and tracking user adoption.
With Resilio Platform, the company is seeing a 2-5x improvement in time-to-desktop. “Our goal is 30 seconds from user click to desktop,” stated the project leader. “We were seeing 45 to 90 seconds using FSLogix Cloud Cache alone and it was taking up a lot of storage IO. Moving to Resilio in combination with Microsoft FSLogix VHDLocations and Citrix, we were able to both achieve all-active HA and get our user login times down.”
He added: “The time it takes from when a user logs on to being able to use their desktop is now about 15 to 30 seconds or so, depending on the use case.”
Prior to Resilio, achieving always-on active-active HA across all sites had been an elusive goal. With Resilio Platform, the company is able both to achieve true active-active HA across all sites and to reduce recovery times for planned and unplanned outages.
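As a quick sanity check on the time-to-desktop figures, the following Python sketch computes the range of speedups implied by the quoted before and after ranges (45 to 90 seconds with FSLogix Cloud Cache alone, 15 to 30 seconds with Resilio Platform). The function name is illustrative, and the calculation assumes any "before" value may pair with any "after" value, which the case study does not state.

```python
def speedup_range(before: tuple[float, float], after: tuple[float, float]) -> tuple[float, float]:
    """Return (smallest, largest) speedup implied by two (min, max) time ranges."""
    smallest = min(before) / max(after)  # best "before" vs. worst "after"
    largest = max(before) / min(after)   # worst "before" vs. best "after"
    return smallest, largest


before_s = (45, 90)  # seconds, FSLogix Cloud Cache alone (as quoted)
after_s = (15, 30)   # seconds, with Resilio Platform (as quoted)

low, high = speedup_range(before_s, after_s)
print(f"Implied speedup: {low:.1f}x to {high:.1f}x")
```

The resulting envelope of 1.5x to 6.0x contains the 2-5x improvement cited above; comparing like-for-like endpoints (45 s to 15 s, or 90 s to 30 s) gives 3x in both cases.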
“We went from a 30-minute replication window using Microsoft DFS-R to a 5-minute lead time to have all data synchronized across all sites with Resilio Platform,” said the project lead. When users log off, all of their updates will be propagated and available across all sites within 5 minutes.
“If we have to recover, we need that to be as low of a recovery time as possible. Our target is 5 minutes from being offline to being live again on a backup copy. This is something we could not have done with our existing hybrid on-premise and cloud infrastructure alone.”
Another significant benefit of using Resilio Platform is location-independent, scale-out performance: the company is seeing a 3x performance boost since moving to Resilio Platform.
“We were averaging 2.5 Gbps site-to-site replication performance before Resilio. Using Resilio Platform, our site-to-site performance is nearly 8 Gbps. We do profile redirection (of 500 to 600 tiny files) and Resilio reliably moves all files to the other end in a predictable way. It’s very consistent,” stated the project lead.
This solution combines the best of the enterprise. Resilio Platform is fully compatible with Citrix, Microsoft FSLogix (using VHDLocations), NetApp storage and Azure NetApp Files, as well as Azure Files and Blobs. The on-premise Microsoft DFS infrastructure could also be synchronized with the Microsoft Azure cloud and backended by the on-premise NetApp gear. No new global file systems or hardware were needed.
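For readers unfamiliar with the FSLogix VHDLocations setting, the sketch below renders a Windows `.reg` fragment that points FSLogix profile containers at a single SMB share. The registry key and the `Enabled`, `VHDLocations`, and `VolumeType` value names come from Microsoft's FSLogix documentation; the share paths, site names, and the idea of giving each site its own local share are assumptions for illustration only, not details of this company's deployment.

```python
def fslogix_profile_reg(vhd_location: str) -> str:
    """Render a .reg fragment that enables FSLogix profiles on one share."""
    # String values in .reg files escape each backslash as a double backslash.
    escaped = vhd_location.replace("\\", "\\\\")
    return "\n".join([
        "Windows Registry Editor Version 5.00",
        "",
        r"[HKEY_LOCAL_MACHINE\SOFTWARE\FSLogix\Profiles]",
        '"Enabled"=dword:00000001',
        f'"VHDLocations"="{escaped}"',
        '"VolumeType"="VHDX"',
    ])


# Hypothetical per-site shares; a replication layer is assumed to keep
# their contents in sync so each site's hosts can use the local copy.
SITE_SHARES = {
    "omaha": r"\\omaha-filer\profiles",
    "azure-east": r"\\anf-east\profiles",
    "azure-southcentral": r"\\anf-southcentral\profiles",
}

print(fslogix_profile_reg(SITE_SHARES["omaha"]))
```

In this sketch each site's VDI hosts would read profiles from a nearby share rather than pulling them across the WAN at logon, which is the general pattern the case study describes.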
“With Resilio, all data is synchronized in real-time across all of our sites,” says the project lead. “We use Azure Files and Blobs combined with NetApp filers on-premise. Now we can integrate with Azure Files cold storage and drop VHDs into an archive repository for user research. We can keep multiple blobs and disks in sync. And users have fast access to their VDI profiles, irrespective of where they are located.”
“From a DR perspective, we likely saved $8.6 million in remote access to our graphics-intensive 3D workstations,” stated the project lead. “These workstations are expensive and mission-critical. We cannot afford to be offline for 96 hours if the applications are unavailable.”
The project lead also described how Resilio Platform changed the approach to 3D workloads: “Instead of having to move the entire 3D platform out to the edge for 90 users, we moved the users to the platform. Users are efficiently re-directed back to the data center housing the 3D applications. The only other option would have been to duplicate the entire 3D infrastructure (all of the hardware and software). We did not want to do that for 90 users when you can use Resilio Platform and bring the users to the 3D apps.”
“This deployment was also about making VDI possible for our incredibly innovative business. It’s about making end users’ lives better; and, in turn, increasing adoption. The fact that we are seeing day-over-day growth speaks to users being satisfied and spreading the (good) word across the company. Our users are liking VDI more and more. We really like working with Resilio!”
Overview
Innovative Fortune 500 company delivers hybrid cloud Virtual Desktop Infrastructure (VDI) solution, powered by Resilio Platform, Citrix, Microsoft Azure, Azure NetApp Files, and NetApp storage.
Resilio Platform for VDI benefits summary:
3x performance boost in site-to-site replication
2-5x speedup in time-to-desktop (Resilio defines time-to-desktop (TTD) as the total time required for a user to remotely log on, load, and access their virtual desktop)
$8.6 million saved in access to on-premise 3D applications and data