Columbia University Libraries
Columbia University Libraries ran a large, multi-phase migration of nearly 4,000 records — describing over 70,000 linear feet of material — into a Lyrasis-hosted ArchivesSpace instance, drawn from a mix of Archivists' Toolkit, MARC records and homegrown XML. The team used Python, XSLT and the Google Sheets API to report on, merge and remediate descriptive data that had diverged over decades; responses gathered in collaborative spreadsheets fed straight back into automated remediation, minimising manual work. The project established ArchivesSpace as the single source of truth feeding downstream discovery layers. Documented in the Code4Lib Journal, it is a benchmark for large research-library migrations.