Archiving & Preservation
Digital preservation, archiving tools, web archiving, and data hoarding
15 Tools
ArchiveBox
27K · Self-hosted web archiving tool that saves pages as HTML, PDF, screenshots, and WARC files from bookmarks, history, or RSS feeds.
CKAN
5K · CKAN is a self-hosted open data portal platform and an alternative to Socrata.
Papra
4K · Self-hosted document management platform.
Wayback
2.2K · Wayback is a self-hosted toolkit for archiving webpages to the Internet Archive, archive.today, IPFS,...
Open Archiver
1.8K · Open Archiver lets you run an email archiving solution with full-text and eDiscovery search features entirely on your own server.
mail-archiver
1.7K · mail-archiver is a C#-based web application for archiving email.
Bichon
1.5K · Bichon lets you run a lightweight email archiver entirely on your own server.
Ganymede
928 · Ganymede is a Go-based Twitch VOD and live stream archiving platform that includes a rendered chat for each archive.
Omeka S
475 · Released under GPL-3.0, Omeka S provides a web publishing platform for institutions interested in connecting digital cultural heritage on self-hosted...
ArchivesSpace
415 · ArchivesSpace is an archives information management application for managing and providing web access to archives on your own infrastructure.
Collective Access - Providence
362 · Collective Access - Providence lets you run a highly configurable web-based framework for collection management entirely on your own server.
Piler
279 · Piler gives you a feature-rich email archiving solution on your own infrastructure.
Eonvelope
198 · Released under AGPL-3.0, Eonvelope provides email archiving software on self-hosted infrastructure.
Webarchive
188 · Self-hosted tool that provides a lightweight "wayback machine", creating HTML and PDF files from your bookmarks.
Webcap
Webcap is a self-hosted digital archiving tool that provides web archiving and change detection.
Why Self-Host Your Archiving Infrastructure?
Digital preservation and archiving address a problem that commercial platforms are not built to solve, namely long-term, reliable access to content on your own terms. Web content disappears constantly: pages are deleted, sites go offline, and social media posts are removed. Email held by Gmail or Outlook is accessible only as long as the provider keeps your account active. Research institutions, journalists, libraries, and privacy-conscious individuals all have legitimate needs for archiving infrastructure that does not depend on a commercial service staying available.
ArchiveBox is among the most capable self-hosted web archiving tools. It saves snapshots of web pages as HTML, screenshots, PDFs, WARC files, and media files in a searchable local archive. It integrates with browser extensions to archive pages on demand and supports scheduled archiving of bookmarks or RSS feeds. Parts of the Internet Archive’s open source software stack, such as the Heritrix web crawler, are also available for institutional deployment, so organizations can run crawling infrastructure built on the same tools the Internet Archive uses.
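To make the idea of archiving from an RSS feed concrete, here is a minimal Python sketch of the first step: pulling the item links out of a feed and skipping the ones already archived. It is not how ArchiveBox or any other tool listed here is implemented. The function names, the sample feed, and the `already_archived` set are all hypothetical and exist only for illustration, and the sketch assumes a plain RSS 2.0 feed.

```python
import xml.etree.ElementTree as ET


def extract_feed_links(rss_xml: str) -> list[str]:
    """Return the unique item links from an RSS 2.0 document, in feed order."""
    root = ET.fromstring(rss_xml)
    seen: set[str] = set()
    links: list[str] = []
    # RSS 2.0 stores each entry as <channel><item><link>URL</link></item>.
    for link in root.iterfind("./channel/item/link"):
        url = (link.text or "").strip()
        if url and url not in seen:
            seen.add(url)
            links.append(url)
    return links


def new_links(feed_links: list[str], already_archived: set[str]) -> list[str]:
    """Keep only the links that are not yet in the local archive."""
    return [url for url in feed_links if url not in already_archived]


# Hypothetical sample feed; a real run would download the feed instead.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>A</title><link>https://example.com/a</link></item>
    <item><title>B</title><link>https://example.com/b</link></item>
    <item><title>A again</title><link>https://example.com/a</link></item>
  </channel>
</rss>"""

if __name__ == "__main__":
    links = extract_feed_links(SAMPLE_FEED)
    # Pretend /b was archived on an earlier run.
    for url in new_links(links, already_archived={"https://example.com/b"}):
        print(url)
```

Running this on a schedule (for example from cron) and passing the printed URLs to whichever archiver you use gives a rough equivalent of scheduled feed archiving. How URLs are handed to a specific tool depends on that tool's documented interface.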
For institutional and research archives, ArchivesSpace and CollectiveAccess provide professional-grade collection management systems used by museums, libraries, and archives to catalog physical and digital collections. CKAN is an open source data portal platform used by governments and research institutions to publish and manage open data. Email archiving tools like Bichon and Piler keep organizational communication history in a searchable format, independent of whatever email platform you currently use. Self-hosted archiving infrastructure helps ensure that important content remains accessible regardless of what happens to the original source or commercial platform.