Support Long-term Thinking

Long Now Partners with GitHub on its Long-term Archive Program for Open Source Code

by Ahmed Kabil on November 13th, 02019

Long Now is pleased to announce that we have partnered with GitHub on its new archive program to preserve open source software for future generations. 

The archive represents a significant step in averting a potential future digital dark age, when much of the software that powers modern civilization could be lost to bit rot. Taking its lessons from past examples when crucial cultural knowledge was lost, such as the Great Library of Alexandria (which was burned multiple times between 48 BCE and 00640 CE) and the Roman recipe for concrete, the GitHub Archive is employing a LOCKSS (“Lots Of Copies Keep Stuff Safe”) approach to preserving open source code for the future.
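
The intuition behind “lots of copies” can be illustrated with a minimal Python sketch. It is not GitHub's implementation or the LOCKSS software itself; the function names and the simple majority-vote rule are assumptions made purely for illustration. Each copy is fingerprinted with a checksum, and a copy that disagrees with the majority is treated as damaged and overwritten from a good one.

```python
import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """Return a SHA-256 fingerprint of one stored copy."""
    return hashlib.sha256(data).hexdigest()


def repair_copies(copies: dict[str, bytes]) -> dict[str, bytes]:
    """Replace minority (presumed damaged) copies with the majority content.

    `copies` maps a hypothetical storage location name to the bytes held there.
    Raises ValueError when no strict majority of copies agree.
    """
    if not copies:
        return {}
    votes = Counter(digest(content) for content in copies.values())
    winner, count = votes.most_common(1)[0]
    if count <= len(copies) // 2:
        raise ValueError("no majority among copies; cannot repair safely")
    good = next(c for c in copies.values() if digest(c) == winner)
    return {location: good for location in copies}
```

With three copies of which one has suffered bit rot, the two intact copies outvote the damaged one. The more independent copies exist, the less likely it is that a majority decay in the same way.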

“We will protect this priceless knowledge by storing multiple copies, on an ongoing basis, across various data formats and locations,” GitHub says, “including a very-long-term archive designed to last at least 1,000 years.”

That long-term archive is the GitHub Arctic Code Vault, in the Arctic World Archive in Svalbard, Norway, an archival facility 250 meters beneath the Arctic permafrost. The Arctic World Archive is adjacent to the Svalbard Global Seed Vault, and aims to preserve the world’s data in much the same way the Seed Vault preserves plant seeds. GitHub intends to store every public GitHub repository on film reels coated with iron oxide powder, which can remain readable for 1,000 years using either a computer or a magnifying glass. Those who wish to add their code to the vault have until February 2nd, 02020 to do so. At that point, GitHub will take a snapshot of every public repository and add it to the storage vault. GitHub plans to update the library every five or more years.

Microsoft Research’s Project Silica storage device.

Another archival medium is Microsoft Research’s newly announced Project Silica quartz glass. Similar to the Rosetta Disk, Project Silica is designed to be a durable, long-term storage device.

Femtosecond lasers “encode data in [the] glass by creating layers of three-dimensional nanoscale gratings and deformations at various depths and angles,” Microsoft Research said in a press release. “Machine learning algorithms read the data back by decoding images and patterns that are created as polarized light shines through the glass.” GitHub intends to archive all public repositories on Microsoft’s Project Silica, which it believes could last for over 10,000 years. As with the Arctic Code Vault, GitHub plans to update this library every five or more years.

Stewart Brand’s Pace Layers.

The GitHub archive program has adopted Long Now co-founder Stewart Brand’s pace layers framework for their code-archiving strategy. “This approach,” says GitHub, “is designed to maximize both flexibility and durability by providing a range of storage solutions, from real-time to long-term storage.”

GitHub’s Pace Layers approach to code-archiving.

Brand’s fast and slow layers are reconceptualized as hot, warm and cold. The hot layers (GitHub, GHTorrent, and GH Archive) update in near-real time. The warm layers (the Internet Archive and the Software Heritage Foundation) update monthly to yearly. The cold layers (Oxford University’s Bodleian Library, the Arctic World Archive in Svalbard, and Microsoft Research’s Project Silica storage) update every five or more years.
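
One way to picture the three layers is as a small table of update cadences. The Python sketch below is only an illustration. The `Layer` class, the `layers_due` function, and the numeric intervals are assumptions chosen from within the ranges described above, not part of GitHub's program.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    min_interval_days: int  # assumed shortest gap between updates


# Illustrative values only: the article gives ranges, not exact intervals.
LAYERS = (
    Layer("hot", 0),         # near-real time
    Layer("warm", 30),       # "monthly to yearly"; 30 days assumed
    Layer("cold", 5 * 365),  # every five or more years
)


def layers_due(days_since_update: int) -> list[str]:
    """Return the names of layers whose assumed minimum interval has elapsed."""
    return [
        layer.name
        for layer in LAYERS
        if days_since_update >= layer.min_interval_days
    ]
```

Under these assumptions, a change made 45 days ago would already be reflected in the hot and warm layers, while the cold layers would not pick it up until their next multi-year snapshot. That gap is the trade-off the framework accepts: the slowest layers are the least current but the most durable.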

To ensure the future can use the software in its archive, GitHub has convened an Archive Program advisory panel of experts in technology and the humanities, including Long Now Executive Director Alexander Rose. The archive will include technical guides and a Tech Tree, “a roadmap and Rosetta Stone for future curious minds inheriting the archive’s data.”

An overview of the archive and how to use it, the Tech Tree will serve as a quickstart manual on software development and computing, bundled with a user guide for the archive. It will describe how to work backwards from raw data to source code and extract projects, directories, files, and data formats.

Inspired by Long Now’s Manual For Civilization, the archive will also include information on how to rebuild technologies from scratch.  

“It’s our hope,” GitHub says, “that [the Archive] will, both now and in the future, further publicize the worldwide open source movement; contribute to greater adoption of open source and open data policies worldwide; and encourage long-term thinking.”