Milan's public and private institutions collectively hold tens of millions of digitised images — fashion lookbooks, architectural renders, Olympic venue schematics, civic heritage scans — and a growing share of that library is redundant. Duplicate image proliferation, long dismissed as a storage nuisance, has become a measurable operational problem. The city is now in the middle of a serious, if unglamorous, reckoning with it.
The trigger is the Milan-Cortina 2026 Winter Olympics. The Fondazione Milano Cortina 2026, which manages communications and logistics for the Games, began a systematic digital-asset audit in late 2025 after internal reviews found that image databases used by contractors, venues and press offices across the Lombardy region contained significant duplication — slowing approval workflows and creating version-control failures on time-sensitive construction documents. The foundation launched a deduplication programme in January 2026, working with Milan-based technology firm Var Group on automated hash-matching and perceptual-similarity scanning across its centralised content management system.
Why Milan Is Further Along Than Paris or New York
The comparison with other major cities is instructive. Paris, still managing legacy projects from its 2024 Olympics, has dealt with similar archive bloat inside the Délégation interministérielle aux Jeux Olympiques. But the French bureaucracy kept digital-asset management siloed by ministry, so deduplication happened department by department rather than system-wide. New York's Department of Cultural Affairs manages digitised collections across five boroughs but has not, as of mid-2026, announced a unified deduplication standard for its partner institutions.
Milan's advantage is structural. The city's fashion and design economy runs on image discipline. Brands headquartered in the Quadrilatero della Moda, the rectangle bounded by Via Montenapoleone, Via Manzoni, Via della Spiga and Corso Venezia, have enforced strict digital asset management (DAM) protocols since at least 2018, when Prada Group and Moncler both adopted centralised DAM platforms to cut production costs in global campaign rollouts. Those private-sector habits have quietly influenced how civic institutions approach the same problem.
The Pinacoteca di Brera, on Via Brera in the Brera district, began its own deduplication project in March 2026 as part of a broader digitisation programme co-funded by the Ministero della Cultura. Staff there found that roughly 18 percent of images in the working archive were duplicates or near-duplicates — prints scanned multiple times over successive digitisation rounds since 2009. The museum partnered with the Politecnico di Milano's Image and Sound Processing Group, based on the Leonardo campus near Città Studi, to build a detection pipeline using perceptual hashing combined with metadata cross-referencing.
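What "perceptual hashing combined with metadata cross-referencing" can look like in practice is sketched below in Python. This is not the Brera or Politecnico pipeline, whose details are not public in this account. It uses an average hash, one of the simplest perceptual hashes, and the record fields `inventory_no` and `phash`, the threshold of 5 bits and the synthetic test images are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence

Gray = Sequence[Sequence[int]]  # rows of 0-255 grayscale pixel values


def downscale(pixels: Gray, size: int = 8) -> List[List[float]]:
    """Shrink an image to size x size by averaging rectangular blocks."""
    height, width = len(pixels), len(pixels[0])
    small = []
    for by in range(size):
        y0 = by * height // size
        y1 = max((by + 1) * height // size, y0 + 1)
        row = []
        for bx in range(size):
            x0 = bx * width // size
            x1 = max((bx + 1) * width // size, x0 + 1)
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def average_hash(pixels: Gray, size: int = 8) -> int:
    """Average hash: one bit per cell, set when the cell is brighter than the mean."""
    flat = [v for row in downscale(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def likely_same_work(a: Dict, b: Dict, max_distance: int = 5) -> bool:
    """Flag a pair only when the hashes are close AND the metadata agrees.

    The field names and the 5-bit threshold are hypothetical.
    """
    close = hamming(a["phash"], b["phash"]) <= max_distance
    return close and a["inventory_no"] == b["inventory_no"]


# Synthetic stand-ins for scans: a gradient, a slightly brighter rescan
# of the same gradient, and an unrelated image.
scan_first = [[x * 8 for x in range(32)] for _ in range(32)]
scan_later = [[min(255, v + 3) for v in row] for row in scan_first]
unrelated = [[y * 8 for _ in range(32)] for y in range(32)]

rec_a = {"inventory_no": "INV-0001", "phash": average_hash(scan_first)}
rec_b = {"inventory_no": "INV-0001", "phash": average_hash(scan_later)}
rec_c = {"inventory_no": "INV-0002", "phash": average_hash(unrelated)}

print(likely_same_work(rec_a, rec_b), likely_same_work(rec_a, rec_c))
```

Requiring both signals is a deliberate trade-off in this sketch: the hash catches rescans that differ in brightness or compression, while the metadata check guards against visually similar images of different works being merged.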
The Numbers and the Next Steps
Eighteen percent is consistent with figures from comparable heritage institutions globally. The Europeana digital library, which aggregates cultural-heritage content from across the European Union, published guidance in 2024 estimating that member institutions carried duplication rates ranging from 12 percent to 25 percent depending on how many legacy digitisation rounds they had completed. On the Brera's evidence, Milan's civic institutions appear to sit near the middle of that range.
The practical cost is real. Enterprise cloud storage for image-heavy archives typically runs at roughly €0.02 to €0.05 per gigabyte per month at scale, depending on provider and redundancy tier. At those rates, a 100-terabyte archive with 20 percent duplication is paying to store about 20 terabytes of redundant copies: roughly €400 to €1,000 a month, or €4,800 to €12,000 a year, in preventable overhead. Beyond cost, duplicate images degrade search precision in public-facing collections, frustrate journalists pulling assets on deadline, and create legal exposure when rights metadata conflicts across copies.
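The storage arithmetic above can be rerun for an archive of any size with a few lines of Python. This is a back-of-envelope sketch: the function name is illustrative, 1 terabyte is taken as 1,000 gigabytes, and egress, request and retrieval fees are ignored.

```python
def duplicate_overhead_eur_per_month(
    archive_tb: float,
    duplicate_share: float,
    eur_per_gb_month: float,
    gb_per_tb: int = 1000,  # assumption: decimal terabytes
) -> float:
    """Monthly cost of storing the redundant share of an archive."""
    duplicate_gb = archive_tb * gb_per_tb * duplicate_share
    return duplicate_gb * eur_per_gb_month


# The article's example: 100 TB, 20 percent duplication, both ends of the price range.
low = duplicate_overhead_eur_per_month(100, 0.20, 0.02)
high = duplicate_overhead_eur_per_month(100, 0.20, 0.05)
print(f"EUR {low:,.0f} to {high:,.0f} per month")
print(f"EUR {low * 12:,.0f} to {high * 12:,.0f} per year")
```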
For institutions still at the beginning of this process, the path forward involves three concrete steps. First, commission a hash-based duplicate scan before renewing any storage contract; many cloud providers now include basic deduplication tools in enterprise agreements. Second, establish a single master asset register rather than letting departments maintain parallel libraries. Third, build deduplication checkpoints into ingest workflows so the problem does not regenerate. The Brera's partnership with the Politecnico is a replicable model: academic computing resources, practical institutional data, shared publication of results. Other cities should be watching Porta Nuova and the Brera quarter closely. Milan is, for once, ahead of the curve on something nobody thought to race.
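For a sense of scale, the first and third steps fit in a short Python sketch. It assumes exact, byte-for-byte duplicates identified by SHA-256 digest, so near-duplicates such as rescans need a perceptual method instead. The names `scan_directory` and `MasterRegister` are hypothetical and do not refer to any vendor tool.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable, List, Optional, Tuple


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large scans never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_duplicates(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (name, digest) pairs and keep only digests seen more than once."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name, digest in pairs:
        groups[digest].append(name)
    return {d: names for d, names in groups.items() if len(names) > 1}


def scan_directory(root: Path) -> Dict[str, List[str]]:
    """Step one: a one-off exact-duplicate scan of everything under root."""
    files = (p for p in sorted(root.rglob("*")) if p.is_file())
    return group_duplicates((str(p), sha256_of_file(p)) for p in files)


class MasterRegister:
    """Step three: an ingest checkpoint backed by one digest-keyed register."""

    def __init__(self) -> None:
        self._by_digest: Dict[str, str] = {}

    def ingest(self, name: str, data: bytes) -> Optional[str]:
        """Return None if the asset is new, else the name of the existing master."""
        digest = hashlib.sha256(data).hexdigest()
        existing = self._by_digest.get(digest)
        if existing is None:
            self._by_digest[digest] = name
        return existing


register = MasterRegister()
first = register.ingest("venue_plan_v1.tif", b"example image bytes")
second = register.ingest("venue_plan_copy.tif", b"example image bytes")
print(first, second)
```

Because every department ingests through the same register, it also serves as the single master list described in the second step. An in-memory dictionary stands in here for what would in practice be a database.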