Milan's major public institutions and cultural bodies are deep into a coordinated cleanup of their digital image archives, targeting a backlog of duplicate photographs that accumulated steadily from roughly 2015 onward — a period that coincided with the most intensive phase of urban redevelopment the city has seen since the postwar reconstruction.
The problem did not arrive overnight. It is the direct result of how Milan documented itself during an era of almost relentless transformation. When the Porta Nuova district finished reshaping the skyline above Piazza Gae Aulenti, when the Fondazione Prada opened its permanent Largo Isarco campus in May 2015, and when the city simultaneously ran Expo 2015 across 110 hectares in Rho-Pero, the volume of institutional imagery (commissioned shoots, press photographs, architects' renders, contractor progress images) expanded faster than any single archive team could manage.
The Pipeline That Created the Mess
Multiple departments, often working in parallel under deadline pressure, uploaded the same source files to different servers. The city's communications office at Palazzo Marino on Piazza della Scala, the municipal planning directorate, and individual project contractors each maintained separate repositories with no shared taxonomy. Identical images — sometimes the same RAW file processed differently, sometimes genuine pixel-for-pixel copies — ended up catalogued under different filenames and assigned contradictory metadata.
The situation compounded through the subsequent years. The Fondazione Milano school network's digital transition after 2018 added thousands of event photographs across its 24 campuses. The Triennale di Milano on Viale Alemagna, which hosted the 2019 World Design Summit, generated its own sprawling image cache. Each institution followed its own ingestion protocol. By the time archivists began seriously auditing the problem in 2023, preliminary internal assessments across three mid-sized Milanese cultural bodies found that duplicate or near-duplicate images accounted for somewhere between 30 and 40 percent of stored assets — a figure consistent with what digital asset management consultancies have reported across comparable European municipal archives.
Preparations for the Milan-Cortina 2026 Winter Olympics have added urgency. The Fondazione Milano Cortina 2026, headquartered in Via Albricci near Piazza Diaz, has been producing documentation of venue construction and cultural programming at a pace that risks replicating the same archival disorder if left unmanaged. Sports photography from the newly built PalaItalia Santa Giulia arena and the Cortina alpine venues has been flowing into systems that were not designed to flag duplicates at the point of upload.
What Deduplication Actually Involves
The technical solution has existed for over a decade: perceptual hashing, which generates a short numerical fingerprint for each image and flags pairs whose fingerprints are more similar than a set threshold. The difficulty was never purely technological. It was institutional. Getting the Comune di Milano, the autonomous foundations, the Olympics organising body, and commercial partners to agree on a shared metadata standard and a single deduplication workflow required the kind of cross-institutional negotiation that moves slowly in a city where the centre-left municipal government and the centre-right Lombardy regional administration have maintained persistent friction over overlapping jurisdictions.
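For readers who want to see what a perceptual fingerprint looks like in practice, the sketch below implements one simple variant, the average hash, in plain Python. It is an illustration under stated assumptions, not the workflow any Milanese institution is known to use: images are assumed to be already decoded into rows of grayscale values at least 8 pixels on each side, the function names are hypothetical, and the cut-off of 5 differing bits is an arbitrary example value. Similarity is expressed here as a Hamming distance, so "more similar than a threshold" becomes "no more than a set number of differing bits".

```python
from typing import List

Gray = List[List[float]]  # rows of grayscale values in the range 0-255


def shrink(pixels: Gray, size: int = 8) -> Gray:
    """Downsample to size x size by averaging rectangular blocks.

    Assumes the image is at least `size` pixels in each dimension.
    """
    h, w = len(pixels), len(pixels[0])
    out = []
    for by in range(size):
        y0, y1 = by * h // size, (by + 1) * h // size
        row = []
        for bx in range(size):
            x0, x1 = bx * w // size, (bx + 1) * w // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(pixels: Gray, size: int = 8) -> int:
    """Return a size*size-bit fingerprint: 1 where a cell is brighter than the mean."""
    flat = [v for row in shrink(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Gray, b: Gray, max_distance: int = 5) -> bool:
    """Flag two images whose fingerprints differ in at most max_distance bits.

    max_distance=5 is an example value, not a recommended setting.
    """
    return hamming(average_hash(a), average_hash(b)) <= max_distance
```

Because the fingerprint records only whether each region is brighter or darker than the image's own average, a uniformly brightened copy of a photograph tends to hash to the same or a nearby value, while a different composition lands many bits away. Production systems usually rely on a maintained imaging library and a more robust hash; the threshold in particular needs tuning against a sample of the archive in question.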
A working group that includes representatives from the Comune's digital services unit and from the Politecnico di Milano's design faculty, which has researched digital archiving standards since at least 2020, formalised its recommendations in early 2026. The target completion date for the first full deduplication pass across the priority archives is September 2026, several months after the Winter Games, which open in February 2026.
For anyone managing image assets tied to Milan's fashion and design economy — the showrooms along Via della Spiga, the press offices handling Milan Fashion Week in February and September — the practical lesson from the institutional experience is straightforward: establish a single point of ingestion with automated hash-checking before files reach permanent storage. Retrofitting is expensive. The Triennale estimated that its own manual deduplication audit in 2024 consumed approximately 400 staff hours before automated tools took over. Catching duplicates at the door costs a fraction of that.
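A single point of ingestion with hash-checking can be sketched in a few lines. The example below is a minimal illustration, not a description of any institution's system: the `IngestGate` class and its `submit` method are hypothetical names, the index lives in memory rather than in a database, and the filenames are invented. It uses SHA-256 from Python's standard library, which catches only byte-for-byte copies; reprocessed versions of the same photograph would need a perceptual hash on top.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class IngestGate:
    """Hypothetical single entry point that every upload passes through."""

    # Maps a content digest to the filename under which it was first seen.
    seen: Dict[str, str] = field(default_factory=dict)

    def submit(self, filename: str, data: bytes) -> Optional[str]:
        """Return the filename of an earlier identical upload, or None if new."""
        digest = hashlib.sha256(data).hexdigest()
        existing = self.seen.get(digest)
        if existing is not None:
            return existing           # duplicate: the caller should not store it again
        self.seen[digest] = filename  # first sighting: record it before storage
        return None


# Usage with invented filenames and placeholder bytes.
gate = IngestGate()
first = gate.submit("porta_nuova_001.tif", b"example image bytes")
second = gate.submit("PN-skyline-final.tif", b"example image bytes")
print(first, second)  # None porta_nuova_001.tif
```

The point of the design is that the check depends on file content rather than on filename or metadata, which is exactly where the separate repositories disagreed.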