A New Cloud Storage Tier?

When you are paying around $10–$20 per GB per month for the fastest in-memory cache layer in the cloud, $0.015 per GB per month for Nearline object storage might seem cheap as chips. But as anyone managing a petabyte-scale storage estate knows, the older and colder data becomes, the less frequently it is accessed, and the more sensitive its storage economics become.

Unlike data held on expensive NVMe or DRAM tiers, which is typically short-lived, cold data persists for years, and it accumulates. Because cloud archive data is often measured in petabytes (PBs) rather than gigabytes, costs mount quickly. Even at a fraction of a cent per GB, storing multiple petabytes over many years becomes a strategic cost centre. Add retrieval costs, egress fees, and the power, cooling and operational management behind spinning disks, and what looks cheap on paper can balloon in practice.


S3 to Cold Cloud Storage

As data ages, organisations weigh up two priorities: minimising cost and maintaining accessibility. Hyperscaler deep-archive tiers (e.g., Amazon Glacier Deep Archive or Azure Archive) are often positioned as the cheapest cloud option, at as little as $0.001 per GB per month. But these tiers are typically used for insurance backups, data not expected to be touched again, because retrieving anything substantial can be slow, operationally awkward, and surprisingly costly. In short, hyperscaler deep-archive offerings are not designed to serve up production data.

The reality is that production data doesn’t vanish after its “hot” phase. Old production datasets, scientific experiments, digital media assets, compliance records, clinical archives, or AI training datasets may only be accessed occasionally; however, when needed, fast access without punitive egress fees is essential.

As Gartner noted in a 2023 report on cloud storage economics, “egress and retrieval charges remain a significant source of unplanned expenditure, often exceeding initial storage budgets for archival data projects” (Gartner, 2023).

So organisations typically keep infrequently accessed production data on the HDD-based nearline storage tier rather than on the lower-cost, more energy-efficient cold storage tier.
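Within a hyperscaler, that tiering decision is usually expressed as a lifecycle policy. The boto3 sketch below is a rough illustration only, assuming an AWS-style bucket; the bucket name and day thresholds are placeholders, not recommendations.

```python
import boto3

# Minimal sketch: transition objects to cheaper storage classes as they age.
# Bucket name and day thresholds are illustrative placeholders.
s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="example-production-archive",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "age-out-production-data",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # apply to the whole bucket
                "Transitions": [
                    # Nearline-style tier: infrequently accessed, quickly retrievable
                    {"Days": 90, "StorageClass": "STANDARD_IA"},
                    # Deep archive: cheapest to hold, slow and costly to retrieve
                    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    },
)
```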


Real-World Use Cases

Media & Entertainment: Studios like A+E Networks generate petabytes of broadcast content each year. Archiving to hyperscaler storage may be cost-effective short term, but production demands quick access to old footage for remastering or licensing. As A+E’s CTO once said: “Access is everything… we don’t just want to store our content, we want to monetise it later” (Broadcast Tech, 2022).

Research & Academia: CERN produces over a petabyte of physics data every day. Much of this must be archived, but scientists still need fast retrieval for future analysis. In such environments, retrieval penalties aren’t just financial but can slow discovery (CERN Annual Report).

Healthcare & Life Sciences: Hospitals are required to retain patient imaging data for decades. A 2022 study in Applied Radiology highlighted how rising retrieval costs made AI-driven diagnostic model training prohibitively expensive for some institutions.

Financial Services: Compliance regulations often require firms to retain records for 7–10 years. A major European bank noted in 2021: “Our cloud storage costs tripled in three years, largely due to retrieval fees during audits” (The Banker).

AI & ML: AI training pipelines often handle terabytes to petabytes of historical data—from medical images to AI-generated datasets. Traditional hyperscaler archive storage incurs not only lengthy restore delays but also substantial egress costs, making experimentation, iterative model retraining and inference workloads expensive. (arXiv)

These industries highlight a shared pain point: long-term data retention with unpredictable retrieval costs if data is archived.


The On-Premises Parallel

Solutions such as Spectra Logic BlackPearl pioneered on-premises nearline S3 interfaces in front of complex, high-capacity cold storage. Enterprises can keep petabytes of production data on low-cost archive back-end infrastructure that needs little to no power at rest, all accessible via S3-like APIs and, crucially, without retrieval fees. For some organisations, this on-premises model still makes perfect sense.

The barrier, however, has always been scale and skills. High upfront hardware costs, ongoing technology refreshes, and shrinking pools of storage administrators mean these complex systems are out of reach for many mid-sized and scaling organisations.


Cloud Cold Storage Becomes Nearline

Cloud storage providers are now challenging this status quo by offering the same enterprise-grade cold object storage behind a simple, seamless nearline S3 user experience, entirely in the cloud:

  • $0.0015 per GB per month storage costs, comparable to deep archive pricing.

  • Access to streamed data in minutes, without rehydration delays.

  • No egress or retrieval fees, removing the budget shock of restores.

For example:

  • Wasabi offers hot cloud storage at around $7 per TB per month, chosen by organisations like the Boston Red Sox to manage historical video and analytics data (Wasabi Case Study).

  • Customers like Verizon Media leverage Backblaze B2 Cloud Storage to handle large-scale nearline workloads at predictable costs (Backblaze Customers).

  • CloudColdStorage stores data at deep archive prices and then streams that data back in minutes without expensive fees.
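Because these services expose S3-compatible APIs, existing tooling can often be repointed rather than rewritten. The boto3 sketch below is illustrative only; the endpoint URL, credentials, bucket and object names are placeholders, and each provider documents its own endpoints and regions.

```python
import boto3

# Sketch: reuse standard S3 tooling against an S3-compatible provider by
# overriding the endpoint. Endpoint URL and credentials are placeholders.
s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.example-cold-storage-provider.com",
    aws_access_key_id="YOUR_ACCESS_KEY",
    aws_secret_access_key="YOUR_SECRET_KEY",
)

# Archive an object and read it back later; no separate rehydration step is
# assumed here, which is the behaviour attributed to these nearline-style services.
s3.upload_file("broadcast_master_2012.mxf", "media-archive", "masters/broadcast_master_2012.mxf")
s3.download_file("media-archive", "masters/broadcast_master_2012.mxf", "restored_master.mxf")
```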

This allows IT teams to design data pipelines where:

  • Active workloads stay on high-performance SSD/NVMe.

  • Nearline, low-latency data moves to non-hyperscaler cloud storage (~$7 per TB per month) with hot-storage performance.

  • Cold production archives shift to more efficient cloud cold storage (~$1.55 per TB per month), without any restore penalties.

The result: significant long-term savings and greater predictability across multi-tier cloud storage strategies.
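As a rough illustration of such a pipeline, the placement rule below maps a dataset's access recency to a tier. The day thresholds are assumptions, and the per-TB prices are the approximate figures above rather than provider quotes.

```python
# Illustrative placement rule for the pipeline sketched above. Day thresholds are
# assumptions; $/TB/month figures are the article's approximate numbers, not quotes.
USD_PER_TB_MONTH = {
    "nearline": 7.00,  # non-hyperscaler hot object storage (~$7/TB)
    "cold": 1.55,      # cloud cold storage (~$1.55/TB)
}

def choose_tier(days_since_last_access: int) -> str:
    """Map access recency to a tier; the 30/365-day cut-offs are assumed, not prescribed."""
    if days_since_last_access <= 30:
        return "hot_ssd"   # active workloads stay on SSD/NVMe
    if days_since_last_access <= 365:
        return "nearline"
    return "cold"

def monthly_object_storage_cost(tier: str, size_tb: float) -> float:
    """Monthly cost for the object tiers priced above (hot SSD is priced separately)."""
    return USD_PER_TB_MONTH[tier] * size_tb

# Example: a 200 TB dataset untouched for two years lands on the cold tier.
tier = choose_tier(days_since_last_access=730)
print(tier, monthly_object_storage_cost(tier, size_tb=200))  # -> cold 310.0
```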

Cloud cold storage: nearline access to production data at archive costs.

Store older production data for cents and use less energy retaining it.


The Numbers at Petabyte Scale

Here’s how costs stack up over 10 years for a 1 PB dataset:

  • Hyperscaler general-purpose SSD block storage (low-millisecond-latency hot tier): $22,800,000 (plus egress fees)

  • Hyperscaler Nearline storage: $1,800,000 (plus egress fees and retrieval fees of around $1M per year to retrieve 1PB)

  • Hyperscaler Deep Archive storage: $120,000 to $300,000 (plus retrieval and egress fees of around $6M per year to retrieve 1PB)

  • Non-hyperscaler hot storage: $840,000 (limited or no egress or API call fees)

  • Cloud cold storage: $186,000 (no retrieval or egress fees)

At this scale, avoiding unpredictable retrieval charges is as impactful as reducing raw $/GB pricing.
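These storage totals follow directly from the per-GB rates quoted earlier; the short calculation below makes the arithmetic explicit. Retrieval and egress estimates are left out because they depend entirely on access patterns, and the hot-tier rate is back-derived from the total above.

```python
# Reproduce the 10-year, 1 PB storage figures above from $/GB/month rates.
# Rates are the article's quoted or implied numbers; retrieval/egress excluded.
GB_PER_PB = 1_000_000
MONTHS = 10 * 12

def ten_year_cost(usd_per_gb_month: float, petabytes: float = 1.0) -> float:
    return usd_per_gb_month * petabytes * GB_PER_PB * MONTHS

print(ten_year_cost(0.19))     # hyperscaler SSD hot tier (implied rate) -> 22,800,000
print(ten_year_cost(0.015))    # hyperscaler nearline                    ->  1,800,000
print(ten_year_cost(0.001))    # hyperscaler deep archive (low end)      ->    120,000
print(ten_year_cost(0.007))    # non-hyperscaler hot (~$7/TB/month)      ->    840,000
print(ten_year_cost(0.00155))  # cloud cold storage (~$1.55/TB/month)    ->    186,000
```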


Summary

  • Cloud archive vs nearline is no longer a binary choice. Emerging cloud storage tiers blur the line by combining archive pricing with nearline access.

  • Egress and retrieval fees are the silent budget killer; eliminating them is as important as lowering storage rates.

  • Industries like healthcare, finance, media, and research all need affordable, predictable long-term storage that still enables access for compliance, monetisation, and AI training datasets.


Next Steps

1. Audit Your Data Estate: Identify how much of your current “hot” data genuinely needs low-latency access; you might be surprised by how much can be kept cooler (a rough audit sketch follows this list).

2. Model Retrieval Patterns: Estimate retrieval demand over 3–5 years; this is where hidden costs emerge.

3. Compare Cloud Archive vs Nearline Providers: Assess hyperscaler storage tiers alongside alternatives like Wasabi, Backblaze and CloudColdStorage.

4. Factor in AI and Analytics Growth: Retrieval demands are rising, not falling.

5. Run a Pilot: Benchmark access speeds, costs, and API compatibility before committing at scale.
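For step 1, a rough first pass can be as simple as scanning object ages in an existing bucket. The boto3 sketch below treats last-modified timestamps as a crude proxy for last access; the bucket name and age buckets are placeholders.

```python
from datetime import datetime, timezone
import boto3

# Rough data-estate audit: group objects by age using last-modified timestamps.
# Last-modified is only a crude proxy for last access; bucket name is a placeholder.
s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")

now = datetime.now(timezone.utc)
age_buckets = {"<90d": 0, "90d-1y": 0, ">1y": 0}

for page in paginator.paginate(Bucket="example-production-archive"):
    for obj in page.get("Contents", []):
        age_days = (now - obj["LastModified"]).days
        if age_days < 90:
            age_buckets["<90d"] += obj["Size"]
        elif age_days < 365:
            age_buckets["90d-1y"] += obj["Size"]
        else:
            age_buckets[">1y"] += obj["Size"]

for label, size_bytes in age_buckets.items():
    # Data untouched for over a year is a candidate for a colder tier.
    print(f"{label}: {size_bytes / 1e12:.2f} TB")
```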


Conclusion

The economics of cloud storage are shifting. What was once “deep archive only” is becoming viable for nearline production data. By adopting cloud cold storage with no egress or restore penalties, organisations can finally align long-term data strategy with predictable budgets.

Whether it’s a hospital safeguarding decades of scans, a studio monetising its film library, or a research lab mining historic datasets for AI, the ability to store at deep archive prices and retrieve at nearline speeds is a genuine new tier in the storage landscape, one that could redefine cloud storage economics over the next decade.
