The problem did not arrive overnight. Across Milan's network of publicly funded cultural institutions, from the Pinacoteca di Brera to the Civico Archivio Fotografico housed within the Castello Sforzesco complex, digital catalogues accumulated duplicate image files at a pace that outran any coherent management strategy. By early 2026, administrators at several institutions were confronting repositories in which the same photograph or artwork scan was sometimes stored under three or four separate file names, at different resolutions, and occasionally tagged with conflicting metadata.
The timing matters. Milan is less than six months from hosting events tied to the Milan-Cortina 2026 Winter Olympics, and city tourism authorities have committed to upgrading public-facing digital portals — including those run by Palazzo Reale and the Museo del Novecento on Piazza del Duomo — to handle significantly higher international traffic. Serving up duplicate or conflicting images to visitors searching collections online is not a minor embarrassment; it actively undermines the city's projection of itself as a world-class cultural capital.
The roots of the duplication crisis trace back to a digitisation push that accelerated after 2015, when the Comune di Milano launched its first coordinated open-data cultural heritage programme. Individual institutions digitised their own holdings on their own timelines, using different scanning contractors, different file-naming conventions and different content management systems. When the city later tried to aggregate those collections into shared platforms — most visibly through the Milano Digital Week infrastructure and the Fondazione Giangiacomo Feltrinelli's collaborative archive projects on Viale Pasubio — the lack of standardisation meant that automated import tools frequently created copies rather than matching records.
A Problem Compounded by Growth and Good Intentions
Porta Nuova's emergence as a technology and creative district through the 2010s brought additional pressure. Startups and design studios working out of buildings along Via Pirelli and in the Isola neighbourhood began licensing images from institutional archives for commercial projects. Each licensing transaction often generated a fresh derivative file that found its way back into the originating archive, sometimes deposited by well-meaning partners trying to return enhanced scans, sometimes simply duplicated during file transfer. The fashion economy added its own layer: the Camera Nazionale della Moda Italiana, based in Milan, maintains its own image library of runway documentation, and cross-referencing with municipal archives has historically been informal at best.
The scale of the redundancy is significant. A 2025 audit conducted under the city's Piano Triennale per la Cultura 2024–2026 identified more than 340,000 duplicate or near-duplicate image files across six major institutional repositories — a figure that represented roughly 18 percent of total stored image assets reviewed. Storage costs alone for maintaining those redundant files were estimated in the audit at approximately €280,000 annually, before accounting for staff time spent managing conflicting records.
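The audit figures above imply some numbers the text does not state directly. The following is a back-of-envelope sketch in Python, not a calculation from the audit itself: it treats "more than 340,000" and "roughly 18 percent" as exact values, and the variable names are illustrative.

```python
# Figures quoted from the 2025 audit (treated as exact for this rough estimate).
duplicate_files = 340_000        # duplicate or near-duplicate image files
duplicate_share = 0.18           # share of reviewed image assets
annual_storage_cost_eur = 280_000

# Implied size of the reviewed pool of image assets.
implied_total_assets = duplicate_files / duplicate_share

# Implied yearly storage cost per redundant file.
cost_per_duplicate_eur = annual_storage_cost_eur / duplicate_files

print(f"Implied assets reviewed: about {implied_total_assets:,.0f}")
print(f"Implied storage cost per duplicate: about EUR {cost_per_duplicate_eur:.2f} a year")
```

On those assumptions the audit reviewed somewhere close to 1.9 million image assets, and each redundant file costs a little over 80 euro cents a year to keep.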
The Cleanup and What Comes Next
A working group convened by the Assessorato alla Cultura del Comune di Milano has been piloting a deduplication protocol since March 2026. It is testing perceptual hashing software, which identifies visually identical or near-identical images regardless of file name or format, starting with the Civico Archivio Fotografico holdings because that institution's collection is relatively contained and well documented. The pilot covers approximately 60,000 image records. Results from that initial phase are expected before September 2026.
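For readers unfamiliar with perceptual hashing, the idea can be sketched in a few lines of Python. This illustrates one simple variant, usually called an average hash, and is not the software used in the pilot, which the working group has not named. The function names are hypothetical, images are assumed to be already decoded into grids of grayscale values at least 8 pixels on each side, and the synthetic gradient stands in for a real scan.

```python
from typing import List

Grid = List[List[int]]  # grayscale pixels (0-255), one inner list per row


def shrink(pixels: Grid, size: int = 8) -> List[List[float]]:
    """Downscale to size x size by averaging rectangular blocks of pixels.

    Assumes the image is at least `size` pixels wide and tall.
    """
    height, width = len(pixels), len(pixels[0])
    small = []
    for by in range(size):
        y0, y1 = by * height // size, (by + 1) * height // size
        row = []
        for bx in range(size):
            x0, x1 = bx * width // size, (bx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def average_hash(pixels: Grid, size: int = 8) -> int:
    """Return a size*size-bit fingerprint: one bit per cell, set if brighter than the mean."""
    cells = [value for row in shrink(pixels, size) for value in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


def horizontal_gradient(width: int, height: int) -> Grid:
    """Synthetic test image: dark on the left, bright on the right."""
    return [[(x * 255) // (width - 1) for x in range(width)] for _ in range(height)]


# The same picture at two resolutions, as if stored twice under different names.
large_scan = horizontal_gradient(64, 64)
small_scan = horizontal_gradient(16, 16)
print(hamming(average_hash(large_scan), average_hash(small_scan)))  # 0 bits differ
```

Because the fingerprint depends on the broad pattern of light and dark rather than on the exact bytes in the file, two copies of one scan at different resolutions produce the same or nearly the same hash. A real pipeline would flag pairs whose fingerprints differ by only a few bits for human review; where to set that cutoff is a tuning decision, which is presumably part of what the pilot's accuracy testing is meant to settle.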
The broader rollout, if the pilot delivers acceptable accuracy rates, is scheduled to extend to Palazzo Reale's digital catalogue and the Museo del Novecento through the winter of 2026–27. Institutions have been advised to freeze new cross-institutional imports until the deduplication framework is in place — a practical step that will also require updating agreements with external digitisation contractors operating in the city.
For anyone currently relying on Milan's public image archives for research, design work or journalism, the practical advice is straightforward: treat any image sourced from the city's shared portals before September 2026 as potentially unverified against the cleaned master catalogue. Cross-check file metadata against the originating institution's own database directly, and flag any obvious duplicates to the relevant archive. The cleanup is real and it is moving — but it is not finished.