Digital Preservation
Key Challenges in Sustainable Digital Preservation
Digital preservation faces mounting obstacles: increasing data volumes, rising maintenance costs, and reliance on specialist staff. Traditional systems involve proprietary storage, specialised workflows, and high licensing and maintenance costs. These factors threaten long-term sustainability, especially when future budgets, staffing, and technologies are uncertain.
The sustainability problem for digital preservation is compounded by:
Dependency on niche expertise.
Costly, inflexible systems.
Unpredictable scalability and exit strategies.
Risks of data inaccessibility or loss over time.
Sustainability, in this context, means minimising dependence on uncontrollable variables, enabling organisations to adapt to change without risking preservation integrity.
A Sustainable Digital Preservation Workflow Model
A more inclusive, open, and scalable digital preservation model is required. The core strategy is to embed the preservation of digital assets into normal data workflows, making preservation activities accessible to all stakeholders regardless of their digital preservation expertise.
What should a digital preservation workflow look like?
Ingest
Files are submitted through a Deposit Service, where they are checked for duplicates and viruses and their file formats are identified.
Metadata is extracted and verified.
A Fedora repository can store files in OCFL (Oxford Common File Layout) structure in the cloud, supporting versioning and keeping metadata alongside the binary content.
Fedora helps ensure data integrity through checksums (SHA-256 and MD5) and transactional processes.
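The duplicate and checksum checks at ingest can be illustrated with a short Python sketch. This is not the Deposit Service's or Fedora's actual code; the function names compute_digests and find_duplicates are hypothetical, and the sketch simply assumes that two files with the same SHA-256 digest count as duplicates.

```python
import hashlib


def compute_digests(path, chunk_size=1024 * 1024):
    """Return the SHA-256 and MD5 hex digests of a file, reading it only once."""
    sha256 = hashlib.sha256()
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large deposits do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha256.update(chunk)
            md5.update(chunk)
    return {"sha256": sha256.hexdigest(), "md5": md5.hexdigest()}


def find_duplicates(paths):
    """Group paths by SHA-256 digest and return only groups with more than one file.

    Assumption for this sketch: identical digest means duplicate content.
    """
    by_digest = {}
    for path in paths:
        digest = compute_digests(path)["sha256"]
        by_digest.setdefault(digest, []).append(str(path))
    return {digest: group for digest, group in by_digest.items() if len(group) > 1}
```

Computing both digests in a single pass keeps ingest costs down for large files, and the stored values can later be compared against whatever the repository reports for the same content.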
Preservation
Verified files are then stored in a Fedora repository, with preservation copies sent to Cold Cloud Storage for the long term.
A Chain of Integrity then tracks and verifies checksums from source to preservation storage.
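One way to picture a Chain of Integrity is as a log of the checksum observed at each stage, compared against the checksum declared at the source. The sketch below is a minimal illustration under that assumption; the IntegrityChain class and the stage names are hypothetical and are not taken from any particular implementation.

```python
import hashlib
from dataclasses import dataclass, field


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class IntegrityChain:
    """Hypothetical record of the checksum seen at each stage for one file."""

    expected: str  # digest declared at the source
    events: list = field(default_factory=list)

    def record(self, stage: str, data: bytes) -> bool:
        """Hash the data seen at this stage and log whether it matches the source."""
        digest = sha256_of(data)
        ok = digest == self.expected
        self.events.append({"stage": stage, "sha256": digest, "ok": ok})
        return ok

    def first_failure(self):
        """Return the first stage whose checksum did not match, or None."""
        for event in self.events:
            if not event["ok"]:
                return event["stage"]
        return None


source = b"example payload"
chain = IntegrityChain(expected=sha256_of(source))
chain.record("deposit", source)
chain.record("repository", source)
chain.record("cold-storage", source[:-1])  # simulated corruption in transit
print(chain.first_failure())  # cold-storage
```

Because every stage is logged rather than only the final one, a mismatch can be traced to the step where the content first diverged from the source.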
Access & Management
The Workbench UI offers a central access point for users to monitor workflows, appraise content, update metadata, flag issues, generate reports, and manage users.
Access platforms like CUDL (Cambridge University Digital Library) can provide public availability as needed.
Architecture & Tools
This approach to digital preservation could be built using microservices, so that the system can scale up or down easily.
Components are open source and standards-compliant, ensuring transparency and flexibility. Storage hardware and platforms are designed for ease of access and costed for the long haul.
Key tools include:
Fedora 6 as the repository platform.
OCFL on S3 for file structure.
Workbench UI for management.
Cloud Cold Storage for low-cost, sustainable, read-optimised long-term storage.
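To make the "OCFL for file structure" item more concrete: an OCFL object carries an inventory that maps content digests to stored files and records which files make up each version. The Python sketch below builds a minimal first-version inventory that loosely follows that structure. It is an illustration only: build_inventory is a hypothetical helper, the output has not been validated against the OCFL specification, and in practice Fedora writes these inventories itself.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_inventory(object_id, files, message="Initial deposit"):
    """Build a minimal OCFL-style inventory for a first version (v1).

    `files` maps logical paths to bytes content. Identical content is stored
    once in the manifest and referenced from every logical path in the state.
    """
    manifest = {}  # digest -> content paths physically stored in the object
    state = {}     # digest -> logical paths visible in this version
    for logical_path, content in sorted(files.items()):
        digest = hashlib.sha512(content).hexdigest()
        manifest.setdefault(digest, [f"v1/content/{logical_path}"])
        state.setdefault(digest, []).append(logical_path)
    return {
        "id": object_id,
        "type": "https://ocfl.io/1.1/spec/#inventory",
        "digestAlgorithm": "sha512",
        "head": "v1",
        "manifest": manifest,
        "versions": {
            "v1": {
                "created": datetime.now(timezone.utc).isoformat(),
                "message": message,
                "state": state,
            }
        },
    }


inventory = build_inventory(
    "urn:example:object-1",  # hypothetical identifier
    {"a.txt": b"abc", "copy/a.txt": b"abc", "b.txt": b"xyz"},
)
print(json.dumps(inventory, indent=2))
```

Keying everything on content digests is what lets the layout support versioning and fixity checking without depending on any particular repository software to interpret the files.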
A Future-Proof, Inclusive Model
This model eliminates the bottleneck of requiring a “digital preservation person” to access or manage preserved content. Instead, by making the repository a shared workspace accessible through the Workbench, it democratises access and responsibility. This inclusive, automation-driven system, backed by Cloud Cold Storage, provides a resilient solution to the sustainability challenge. By providing dedicated media for each workflow, tenant or dataset, Cloud Cold Storage can recover preserved digital assets without extensive delay.
In summary, by prioritising openness, automation, shared responsibility, low cost, environmentally aware storage and adaptability, this approach to digital preservation infrastructure offers a blueprint for how institutions can sustainably manage and preserve digital assets for the long haul in the face of escalating scale, costs and complexity.
This approach to a sustainable and open digital preservation model is similar to that deployed by Cambridge University Library; you can read about their implementation here.
Managed digital preservation models can also be set up by companies such as Arkivum and Penwern.