The Daily Milan

Milan news, every day


How Milan's Visual Archives Got Flooded With Duplicate Images — And Why Fixing It Took Years

From the Porta Nuova construction boom to the Milan-Cortina 2026 build-up, a decade of rapid documentation left the city's institutional photo libraries tangled in redundancy.

By Milan News Desk · Published 4 July 2026, 9:41 pm


Photo: Marco Ottaviano on Pexels

Milan's major public institutions and cultural bodies are deep into a coordinated cleanup of their digital image archives, targeting a backlog of duplicate photographs that accumulated steadily from roughly 2015 onward — a period that coincided with the most intensive phase of urban redevelopment the city has seen since the postwar reconstruction.

The problem did not arrive overnight. It is the direct result of how Milan documented itself during an era of almost relentless transformation. When the Porta Nuova district finished reshaping the skyline above Piazza Gae Aulenti, when the Fondazione Prada opened its permanent Largo Isarco campus in May 2015, and when the city simultaneously ran Expo 2015 across 110 hectares in Rho-Pero, the volume of institutional photography — commissioned shoots, press releases, architect renders, contractor progress images — expanded faster than any single archive team could manage.

The Pipeline That Created the Mess

Multiple departments, often working in parallel under deadline pressure, uploaded the same source files to different servers. The city's communications office at Palazzo Marino on Piazza della Scala, the municipal planning directorate, and individual project contractors each maintained separate repositories with no shared taxonomy. Identical images — sometimes the same RAW file processed differently, sometimes genuine pixel-for-pixel copies — ended up catalogued under different filenames and assigned contradictory metadata.

The situation compounded through the subsequent years. The Fondazione Milano school network's digital transition after 2018 added thousands of event photographs across its 24 campuses. The Triennale di Milano on Viale Alemagna, which hosted the 2019 World Design Summit, generated its own sprawling image cache. Each institution followed its own ingestion protocol. By the time archivists began seriously auditing the problem in 2023, preliminary internal assessments across three mid-sized Milanese cultural bodies found that duplicate or near-duplicate images accounted for somewhere between 30 and 40 percent of stored assets — a figure consistent with what digital asset management consultancies have reported across comparable European municipal archives.

Preparations for the Milan-Cortina 2026 Winter Olympics added urgency. The Fondazione Milano Cortina 2026, headquartered in Via Albricci near Piazza Diaz, has been producing documentation of venue construction and cultural programming at a pace that risks replicating the same archival disorder if left unmanaged. Sports photography from the newly built PalaItalia Santa Giulia arena and the Cortina alpine venues has been flowing into systems that were not designed to flag duplicates at the point of upload.

What Deduplication Actually Involves

The technical solution — perceptual hashing, which generates a short numerical fingerprint for each image and flags near-matches above a set similarity threshold — has existed for over a decade. The difficulty was never purely technological. It was institutional. Getting Comune di Milano, the autonomous foundations, the Olympics organising body, and commercial partners to agree on a shared metadata standard and a single deduplication workflow required the kind of cross-institutional negotiation that moves slowly in a city where the centre-left municipal government and the centre-right Lombardy regional administration have maintained persistent friction over overlapping jurisdictions.
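To make the idea concrete, here is a minimal sketch of one perceptual-hashing variant, a difference hash, written in Python. It is an illustration only, not the tooling any Milanese institution is known to use. It assumes images have already been decoded and downscaled to small grayscale grids of equal size, and the names `difference_hash`, `hamming_distance`, `is_near_duplicate` and `max_distance` are hypothetical. The article's similarity threshold is expressed here in inverted form, as a maximum number of differing bits.

```python
from typing import List

Grid = List[List[int]]  # rows of grayscale pixel values, 0-255


def difference_hash(pixels: Grid) -> int:
    """Build a fingerprint with one bit per horizontally adjacent pixel pair."""
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            # The bit is 1 when brightness falls from left to right.
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, max_distance: int = 2) -> bool:
    """Flag two same-sized grids whose fingerprints differ in few bits."""
    return hamming_distance(difference_hash(a), difference_hash(b)) <= max_distance


# Illustrative data: a tiny "image", a uniformly brightened copy, and a mirror image.
original = [[10, 20, 30, 40], [40, 30, 20, 10], [10, 50, 10, 50]]
brighter = [[p + 15 for p in row] for row in original]
mirrored = [list(reversed(row)) for row in original]

print(is_near_duplicate(original, brighter))
print(is_near_duplicate(original, mirrored))
```

Because the fingerprint records only whether brightness rises or falls between neighbouring pixels, a uniformly brightened copy produces the same bits as its source, while a mirrored image does not. A production system would typically rely on an imaging library to decode and resize files before hashing.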

A working group that includes representatives from the Comune's digital services unit and from the Politecnico di Milano's design faculty, which has researched digital archiving standards since at least 2020, formalised its recommendations in early 2026. The first full deduplication pass across the priority archives is targeted for completion in September 2026, some seven months after the Winter Games opened in February.

For anyone managing image assets tied to Milan's fashion and design economy — the showrooms along Via della Spiga, the press offices handling Milan Fashion Week in February and September — the practical lesson from the institutional experience is straightforward: establish a single point of ingestion with automated hash-checking before files reach permanent storage. Retrofitting is expensive. The Triennale estimated that its own manual deduplication audit in 2024 consumed approximately 400 staff hours before automated tools took over. Catching duplicates at the door costs a fraction of that.
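A single point of ingestion with automated hash-checking could look something like the following Python sketch. It is a simplified illustration under stated assumptions: the `IngestionGate` class, its `submit` method and the filenames are hypothetical, the index lives in memory rather than in a database, and the SHA-256 check catches only byte-identical copies. Reprocessed versions of the same photograph would need a perceptual fingerprint as well.

```python
import hashlib
from typing import Dict, Optional


class IngestionGate:
    """Hypothetical single entry point that rejects byte-identical uploads."""

    def __init__(self) -> None:
        # Maps a SHA-256 digest to the filename under which it was first stored.
        self._seen: Dict[str, str] = {}

    def submit(self, filename: str, data: bytes) -> Optional[str]:
        """Accept a new file and return None, or return the existing copy's name."""
        digest = hashlib.sha256(data).hexdigest()
        existing = self._seen.get(digest)
        if existing is not None:
            return existing  # duplicate: the caller should not store it again
        self._seen[digest] = filename
        return None


# Illustrative use: the same bytes arrive under two different filenames.
gate = IngestionGate()
payload = b"example image bytes"
first = gate.submit("porta_nuova_001.raw", payload)
second = gate.submit("skyline_final.raw", payload)
print(first, second)
```

The design choice is that every upload passes through one gate before reaching permanent storage, so a copy submitted under a new filename is traced back to the file already held instead of being catalogued a second time.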


About this article

Published by The Daily Milan

This article was produced by The Daily Milan editorial desk and covers news in Milan. See our editorial standards for how we use AI.
