The Daily Milan

Milan's Digital Archives Get a Reckoning: Duplicate Image Problem Finally Has a Fix

Institutions across the city are moving this week to adopt new automated tools that clean up years of redundant visual data clogging cultural and commercial databases.

By Milan News Desk · Published 4 July 2026, 8:45 pm

Photo by Mihaela Claudia Puscas on Pexels

A long-running headache for Milan's museums, design houses and civic photo archives got measurably smaller this week. On July 2, the Lombardy Regional Digital Culture Office confirmed it had completed a pilot run of AI-assisted duplicate image detection software across three participating institutions, cutting redundant files from a shared heritage database by what internal project documents describe as a substantial margin. The cleanup affects tens of thousands of digitised assets held across the Castello Sforzesco archival system, the Museo del Novecento's digital catalogue and the city's broader Cultura Milano digital platform.

The timing is not accidental. With the Milan-Cortina 2026 Winter Olympics opening ceremony now less than six months away, pressure has intensified on every public-facing institution in the city to have its digital inventory clean, searchable and ready for a surge in international traffic. A poorly indexed image archive is more than an aesthetic problem — duplicates inflate storage costs, slow search response times and routinely confuse rights-management systems, causing licensing disputes that can stall publications, broadcasts and official communications at exactly the wrong moment.

What Changed This Week

The project, operating under the regional initiative known as Digitale Lombardia, deployed a perceptual hashing algorithm to flag near-identical images that had accumulated over more than a decade of incremental scanning. Unlike basic file-matching tools, perceptual hashing can catch images that have been re-cropped, recoloured or saved at different resolutions — the most common forms of accidental duplication in archive workflows. Participating institutions were given until July 31 to review flagged items before any permanent deletions are made.
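To make the idea concrete, here is a minimal sketch of one simple perceptual hash, the "average hash". The project has not said which algorithm it deployed, so this is only an illustration of the general technique, and the function names (`shrink`, `average_hash`, `hamming`) are hypothetical. Images are assumed to be plain grids of 0-255 grayscale values at least 8 pixels on each side.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale pixel values


def shrink(img: Gray, size: int = 8) -> List[List[float]]:
    """Downsample to size x size by averaging rectangular blocks.

    Assumes the image is at least `size` pixels in each dimension.
    """
    h, w = len(img), len(img[0])
    out = []
    for by in range(size):
        y0, y1 = by * h // size, (by + 1) * h // size
        row = []
        for bx in range(size):
            x0, x1 = bx * w // size, (bx + 1) * w // size
            block = [img[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(img: Gray, size: int = 8) -> int:
    """Return a size*size-bit fingerprint: 1 where a cell is brighter than the mean."""
    flat = [v for row in shrink(img, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


# Synthetic demo: a diagonal gradient, an upscaled copy, and a brightened copy.
original = [[4 * x + 4 * y for x in range(32)] for y in range(32)]
upscaled = [[original[y // 2][x // 2] for x in range(64)] for y in range(64)]
brighter = [[p + 5 for p in row] for row in original]
inverted = [[255 - p for p in row] for row in original]

print(hamming(average_hash(original), average_hash(upscaled)))  # 0
print(hamming(average_hash(original), average_hash(brighter)))  # 0
print(hamming(average_hash(original), average_hash(inverted)))  # many bits differ
```

A byte-for-byte comparison would treat all four grids as different files. The fingerprint comparison instead reports a distance of zero for the upscaled and brightened copies and a large distance for the inverted one, so a reviewer could flag pairs whose distance falls under a chosen threshold. This simple variant copes with resizing and uniform tonal shifts; heavy re-cropping generally calls for more robust schemes than the one sketched here.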

The private sector moved on a parallel track. Fiera Milano, whose sprawling Rho exhibition campus hosts events that generate hundreds of thousands of press photographs every season, said this week it had completed integration of a duplicate-detection layer into its internal asset management system ahead of its autumn calendar. The fashion economy generates particularly large volumes of near-duplicate photography: at a single runway show, multiple photographers shoot the same look from similar angles, and without automated filtering, agencies and brand communications teams spend hours manually sorting near-identical frames. At Fiera Milano's scale (the complex covers roughly 345,000 square metres and hosts dozens of major trade fairs annually), even incremental efficiency gains translate into measurable cost savings.

In the Brera design district, several independent studios have taken a more self-directed approach. A cluster of design and architecture practices with offices around Via Solferino and Corso Garibaldi have in recent months adopted subscription-based duplicate detection services aimed at small creative businesses, with monthly pricing typically running between €15 and €60 depending on storage volume. The uptake has accelerated noticeably since May, when a well-circulated industry note from the Milan chapter of the Ordine dei Designer warned that redundant imagery in client deliverables had become a source of contractual friction.

Why the Problem Grew So Large

The root cause is straightforward: Milan spent the 2010s digitising aggressively and the 2020s struggling to govern what it had created. The Porta Nuova district redevelopment, for example, generated years of construction documentation, press imagery and architectural photography stored across multiple municipal systems with no unified naming convention. Similar fragmentation exists in the fashion sector, where global brands headquartered along Via Montenapoleone maintain separate regional and global digital asset platforms that sync imperfectly.

The European Union's shift toward stricter digital asset governance under the 2023 Data Act has added regulatory incentive to get archives in order, since organisations that cannot demonstrate clean provenance chains for stored assets face increased exposure in cross-border data audits.

For individuals and small organisations still working through the problem manually, the regional Digitale Lombardia office has published an open guidance document on its website, updated as of June 30, covering both free and commercial tool options. Institutions with fewer than 10,000 digital assets are being encouraged to complete their own reviews before October 1, ahead of the Olympic-period visibility window. The Castello Sforzesco archive, which holds material dating to the 19th century, is expected to publish its cleaned public catalogue by late September.

About this article

Published by The Daily Milan

This article was produced by The Daily Milan editorial desk and covers news in Milan. See our editorial standards for how we use AI.
