Guest Column | February 9, 2023

Getting To The Bottom Of Cloud Tiering

By Rohn Noirot, Nasuni


Marketing hype notwithstanding, cloud tiering isn’t based on any technological innovations, and it isn’t a new, improved approach to managing data. Cloud tiering is similar to Information Lifecycle Management (ILM), a strategy that’s been around for years. In the current iteration, storage vendors are simply layering cloud features onto conventional hardware, then calling the combination by a new name. But cloud tiering, like ILM before it, is fundamentally flawed in the way it manages capacity.

Enterprises that have bought into cloud tiering are now realizing that cloud solutions relying on tiering data — rather than syncing it — are inherently inefficient at scale.   

What’s The Difference? 

The concept of tiering data, or organizing it according to certain attributes, is nothing new. Historically, before the shift to the cloud, businesses prioritized data that was frequently accessed or mission-critical. This data, which some organizations label “hot,” front-end, or production data, must be immediately and reliably available, with little or no tolerance for latency or downtime. Maintaining that level of availability for everything is expensive, so data that’s less active or less important is typically housed in lower-cost storage systems that don’t offer instant, on-demand access.

Traditionally, organizations have archived data by moving it from the front end toward the back end onto drives, disks, and tapes. Many companies refer to this type of tiering as Information Lifecycle Management, which comprises the policies, processes, practices, and tools used to align the business value of information with the most appropriate and cost-effective IT infrastructure, from the time information is created through its final disposition. Historically, that meant moving, or tiering, data to a slower storage class such as tape.

Today, companies are increasingly moving data to the cloud, which is obviously light years ahead of on-premises storage devices and policies. Yet they’re still using storage strategies originally devised for decades-old hardware, albeit with a few tweaks. Coupling state-of-the-art technology with stale strategies makes little sense, but it’s exactly what’s happening with tiered cloud storage. It’s impossible to optimize the cloud’s revolutionary capabilities — limitless capacity and availability, on-demand and at low cost — in that scenario. To get the most from the cloud, organizations should be syncing data rather than tiering it.

ILM, tiering to a lower storage class, moving data to the cloud, and syncing to the cloud all share a common goal: addressing the capacity limitations of network-attached local storage. But there’s a critical difference. Tiering is finite, and users must monitor their data tiers to avoid bumping up against the limits. Syncing, on the other hand, was designed for unlimited object storage and therefore offers effectively infinite capacity. Tiering to the cloud does let users tap into the cloud’s unlimited, cost-effective capacity, but it’s not an easy operation, and it can’t be done incrementally.

The Cloud Made Easy

Syncing to the cloud takes the opposite approach. Everything is streamed to the cloud and maintained there, on the back end. The cloud becomes the single source of truth, and automatic caching ensures that the front end stays “hot.” To generalize the two approaches: with tiering, companies create storage in the cloud and move data to hot or standard storage tiers, administrators must run storage policies that push less critical data to less expensive stores, and users with multiple devices reach that data over a single, slow HTTP(S) transport protocol. With syncing, data moves to the cloud automatically in the background, the gold copy of the data and all of its versions live in the cloud, and users get fast, versatile access from multiple devices over more robust protocols such as SMB, CIFS, NFS, or SMB over QUIC (which runs on UDP). Synced data can also be placed in cold or standard storage tiers.
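To make the tiering half of that comparison concrete, below is a minimal sketch of the kind of lifecycle policy an administrator has to write and maintain when tiering data in the cloud. It assumes an AWS S3 bucket managed with the boto3 SDK; the bucket name, prefix, and transition windows are illustrative placeholders rather than recommendations, and this one rule stands in for whatever policy engine a given vendor exposes.

import boto3

s3 = boto3.client("s3")

# One lifecycle rule: objects under a given prefix move to a cheaper
# storage class after 30 days and to an archive class after 180 days.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-corp-archive",                      # placeholder bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-closed-projects",
                "Filter": {"Prefix": "projects/closed/"},  # placeholder prefix
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 180, "StorageClass": "GLACIER"},
                ],
            }
        ]
    },
)

Every rule like this is a decision someone has to make, monitor, and revisit as data grows, which is exactly the ongoing policy work that syncing is meant to eliminate.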

With syncing, data needed at the edge flows back the instant a user clicks on a file. Everything happens in real time. Furthermore, it’s automatic, so there’s no need to decide what to move to which tier. Removing data from the edge is also quick and safe because the cloud copy has already been designated the source of truth. That enables operators to run an effectively infinite volume with uniform performance across all of it. There’s no need to decide what lives on the back end and what lives on the edge; the system handles it automatically, which is an enormous improvement.
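The caching behavior described above can be pictured with a short sketch. This is not any vendor’s implementation, just a generic cache-on-read pattern in Python, assuming a hypothetical object-store client with a get_object(key) method and a local cache directory; all names are illustrative.

from pathlib import Path

class EdgeCache:
    """Cache-on-read sketch: the cloud object store holds the gold copy,
    and the edge keeps only copies of recently accessed files."""

    def __init__(self, object_store, cache_dir: str):
        self.object_store = object_store           # hypothetical client with get_object(key) -> bytes
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def open_file(self, key: str) -> bytes:
        local = self.cache_dir / key.replace("/", "_")
        if local.exists():                         # already hot at the edge
            return local.read_bytes()
        data = self.object_store.get_object(key)   # pull from the cloud on first access
        local.write_bytes(data)                    # cache it so the next read is local
        return data

    def evict(self, key: str) -> None:
        # Deleting from the edge is always safe: the source of truth stays in the cloud.
        (self.cache_dir / key.replace("/", "_")).unlink(missing_ok=True)

The point of the pattern is that eviction is a cheap, local operation, which is why a synced system can keep edge performance uniform while the volume behind it grows without bound.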

While traditional storage vendors may label their tiering technology with the term “cloud,” that’s not entirely accurate. Some storage companies excel at spinning up their controllers in the cloud, but their scale is limited because those systems were designed to run on local storage. A true cloud system should scale without limit, without adding complexity or forcing users to continually add capacity and migrate data from tier to tier. And that scaling should happen automatically to avoid capacity and data protection issues.

Complexity can undermine even the best technology, and if a system becomes more complex as it scales, that system will eventually break down. Cloud-native file systems that are architected to sync data can scale up without an increase in complexity, operating just as smoothly with an international enterprise as they do with one site. Organizations seeking the best solution for their data storage needs should settle for nothing less.

About The Author

Rohn Noirot is Senior Global Project Executive at Nasuni.