Milan's municipal digital archive holds tens of thousands of photographs — and nobody is quite sure how many of them are the same image stored twice. That is the uncomfortable starting point for a remediation effort now underway across several city departments, driven in part by the pressure of Milan-Cortina 2026 Winter Olympic preparation and the need for a coherent, searchable visual identity before the Games open this coming February.
The problem did not appear overnight. It accumulated across roughly fifteen years of piecemeal digitisation, as individual agencies, cultural institutions and urban development bodies each ran their own capture and storage programmes with little coordination between them. The result is a sprawling patchwork of overlapping repositories rather than a single authoritative library.
The Road to Redundancy
The origins trace back to the mid-2000s, when the Comune di Milano began digitising the historical photographic holdings of the Archivio Fotografico Civico in the Castello Sforzesco complex. That effort ran in parallel with, but separately from, digitisation work commissioned by the Fondazione Fiera Milano for its own promotional archives. When the Porta Nuova regeneration district came online after 2012, the consortium behind it, Coima REM, produced its own voluminous image sets documenting construction and public art installations. None of these workflows was designed to talk to the others.
By 2019, Palazzo Marino — Milan's city hall on Piazza della Scala — had at least three internal content management systems holding overlapping image sets: one legacy system managed by the communications directorate, a newer cloud-based repository introduced under a digital transformation contract, and a separate archive maintained by the tourism and events office. Staff would routinely upload the same photograph of, say, the Navigli canal district or the Duomo forecourt to whichever system was most immediately accessible, without cross-checking whether the file already existed elsewhere.
The fashion economy added another layer of complexity. During Milan Fashion Week — held twice yearly at venues including Fieramilanocity in the Portello district and scattered showrooms across Brera and Tortona — press office teams from both the Comune and Camera Nazionale della Moda Italiana generated thousands of images per edition. Licensing restrictions and brand sensitivity meant these files were rarely consolidated with the civic archive, creating yet another silo.
The Olympic Deadline Focuses Minds
The urgency sharpened in late 2024 when the Milan-Cortina 2026 organising committee, Fondazione Milano Cortina, began assembling a unified media portal for rights-holding broadcasters and journalists. Technical staff tasked with populating that portal reported duplicate image rates that, in some subject categories, exceeded forty percent of the total file count. In those categories, at least two in every five stored files were redundant copies of existing assets.
Addressing duplication at that scale is not a minor housekeeping exercise. Cloud storage under enterprise contracts of the kind the city holds with its infrastructure providers is typically billed per gigabyte, and municipal IT procurement records lodged with the city council in 2025 noted that annual storage expenditure across the main civic digital platforms had risen to figures comfortably above €2 million. Eliminating verified duplicates would directly reduce that overhead.
The technical approach now being piloted is perceptual hashing, a method that generates a compact fingerprint for each image from its visual content rather than its file name or metadata. Because visually similar images produce similar fingerprints, near-identical photographs can be identified even after they have been re-exported, resized or lightly edited. The trial is starting with the Porta Nuova and Navigli documentation sets, two of the most heavily duplicated subject areas.
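Which hashing algorithm the pilot uses has not been specified. As a rough illustration of the general idea only, the sketch below assumes a difference hash (dHash), one common perceptual-hashing variant, applied to images already decoded into grids of grayscale values. The function names, the synthetic test images and the 10-bit match threshold are all illustrative assumptions, not details of the city's tool.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale values (assumed input format)


def shrink(pixels: Gray, width: int, height: int) -> List[List[float]]:
    """Downsample by averaging the block of source pixels under each target cell."""
    src_h, src_w = len(pixels), len(pixels[0])
    out = []
    for y in range(height):
        y0 = y * src_h // height
        y1 = max((y + 1) * src_h // height, y0 + 1)
        row = []
        for x in range(width):
            x0 = x * src_w // width
            x1 = max((x + 1) * src_w // width, x0 + 1)
            block = [pixels[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def dhash(pixels: Gray, size: int = 8) -> int:
    """Difference hash: one bit per horizontally adjacent pair of cells
    in a (size + 1) x size thumbnail, giving size * size bits in total."""
    small = shrink(pixels, size + 1, size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def likely_duplicates(a: Gray, b: Gray, max_distance: int = 10) -> bool:
    """Flag two images as probable duplicates. The threshold is a guess
    and would need tuning against a real archive."""
    return hamming(dhash(a), dhash(b)) <= max_distance


# Synthetic stand-ins for photographs: a 72 x 64 left-to-right gradient,
# a 2x enlarged copy, a slightly brightened copy, and a mirrored version.
original = [[200 - 2 * x for x in range(72)] for _ in range(64)]
resized = [[original[y // 2][x // 2] for x in range(144)] for y in range(128)]
brightened = [[value + 10 for value in row] for row in original]
mirrored = [row[::-1] for row in original]

for name, image in [("resized", resized), ("brightened", brightened), ("mirrored", mirrored)]:
    print(name, hamming(dhash(original), dhash(image)), likely_duplicates(original, image))
```

In this toy case the resized and brightened copies hash identically to the original, while the mirrored image differs in every bit. Real photographs would land somewhere in between, which is why any match threshold has to be validated by hand before files are deleted.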
For anyone working with Milan's civic image collections in the coming months, the practical implication is disruption before improvement. Archivists at the Castello Sforzesco and communications staff at Palazzo Marino have been advised to expect periods when access to certain image categories is restricted while deduplication runs are completed and canonical versions are designated. The organising committee's media portal is expected to go fully live in October 2025, giving the remediation project roughly three months of operational runway. That is tight, but workable if the piloted approach holds up at scale.