Across Milan's public and private institutions, duplicate image files now account for roughly 30 to 40 percent of all stored digital assets, according to estimates cited by archival technology specialists working with Italian cultural bodies. That figure, unremarkable to the casual observer, is costing the city's major organisations millions of euros annually in unnecessary storage, licensing fees and staff hours, and the problem is accelerating ahead of the Milan-Cortina 2026 Winter Olympics, which open in February.
The timing matters because the past 18 months have seen an extraordinary burst of digitisation activity across the city. The Comune di Milano's urban planning directorate has been converting decades of paper records — everything from the original Porta Nuova rezoning documents to current construction permits around the Santa Giulia district — into digital repositories. Simultaneously, the fashion sector, anchored in the Quadrilatero della Moda around Via Montenapoleone and Via della Spiga, has been archiving runway photography, look-book assets and brand-identity files at a pace that shows no sign of slowing.
The Scale of the Problem
The duplication crisis has a straightforward cause: when organisations migrate files between servers, merge legacy databases, or transfer assets from one vendor platform to another, image files get copied rather than moved. A single RAW photograph from a 2023 Milan Fashion Week show can exist in four or five versions — original, compressed JPEG, web thumbnail, retouched master, archived backup — each flagged as a distinct asset in a catalogue system. Multiply that by tens of thousands of images and you have a storage bill that swells quarter by quarter.
Fondazione Prada, which maintains an extensive digital archive covering both its Venice venue at Ca' Corner della Regina and its permanent Milan venue on Largo Isarco, has publicly acknowledged the scale of its digitisation programme, though it has not released specific duplication figures. The Pinacoteca di Brera, whose collection spans more than 38,000 catalogued works, began a structured deduplication audit in late 2024 as part of its broader participation in the national digitisation initiative run by the Ministero della Cultura (MiC). Staff working on that project have described the process as identifying and tagging redundant files before any deletion, a step that can take three to six months for a collection of that size.
Storage costs in enterprise-grade cloud environments used by Italian cultural institutions currently run between €0.02 and €0.05 per gigabyte per month, depending on the provider and redundancy tier. A mid-sized institutional archive holding 200 terabytes of image data, not unusual for a major fashion house or a civic museum, could therefore be paying between roughly €48,000 and €120,000 a year in pure storage costs, with a meaningful fraction of that spend covering files that are bit-for-bit identical copies of something already catalogued elsewhere in the same system.
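The arithmetic behind that estimate is easy to check. The short sketch below assumes decimal units (1 terabyte = 1,000 gigabytes) and, purely for illustration, applies the 30 to 40 percent duplication estimate quoted at the top of this article to the storage bill; the function name is hypothetical, and real invoices will vary with provider and tier.

```python
GB_PER_TB = 1_000  # assumption: decimal terabytes, 1 TB = 1,000 GB


def annual_storage_cost(terabytes, eur_per_gb_month):
    """Yearly storage cost in euros for an archive of the given size."""
    return terabytes * GB_PER_TB * eur_per_gb_month * 12


# Figures from the article: 200 TB at EUR 0.02 to EUR 0.05 per GB per month.
low = annual_storage_cost(200, 0.02)
high = annual_storage_cost(200, 0.05)
print(f"Annual storage cost: EUR {low:,.0f} to EUR {high:,.0f}")

# Illustrative assumption: 30 to 40 percent of stored bytes are duplicates.
wasted_low = low * 0.30
wasted_high = high * 0.40
print(f"Spent on duplicates: EUR {wasted_low:,.0f} to EUR {wasted_high:,.0f}")
```

At the quoted rates a 200-terabyte archive costs about €48,000 to €120,000 a year, so €120,000 is the top of the range rather than a floor.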
What Milan's Institutions Are Doing About It
The practical response divides broadly into two camps. Larger organisations — including the Milan-based Camera Nazionale della Moda Italiana, which coordinates the city's four annual fashion weeks — are investing in AI-assisted deduplication tools that use perceptual hashing algorithms to identify near-duplicate images even when file names and metadata differ. These tools can process several hundred thousand files per hour and flag matches for human review before any permanent deletion occurs. Smaller institutions, including several design studios clustered around the Tortona district in the Navigli area, are relying on open-source scripts and manual audits, a slower but lower-cost approach.
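For readers unfamiliar with perceptual hashing, the sketch below shows one common variant, a difference hash. It is a minimal illustration and not the method used by any tool or institution named here. To stay self-contained it treats an image as a plain grid of grayscale values; a real pipeline would decode files with an imaging library first. The function names and the 10-bit review threshold are assumptions.

```python
def shrink(pixels, width, height):
    """Downsample a 2D grayscale grid to width x height by averaging blocks."""
    src_h, src_w = len(pixels), len(pixels[0])
    out = []
    for y in range(height):
        y0 = y * src_h // height
        y1 = max((y + 1) * src_h // height, y0 + 1)
        row = []
        for x in range(width):
            x0 = x * src_w // width
            x1 = max((x + 1) * src_w // width, x0 + 1)
            block = [pixels[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def dhash(pixels, size=8):
    """Difference hash: one bit per left/right comparison on a small grid."""
    small = shrink(pixels, size + 1, size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (left > right)
    return bits


def hamming(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


# Synthetic test images: a gradient, a brightened copy, and a gradient
# running the other way (standing in for a different picture).
original = [[x * 2 + y for x in range(64)] for y in range(64)]
brightened = [[value + 10 for value in row] for row in original]
different = [[(63 - x) * 2 + y for x in range(64)] for y in range(64)]

REVIEW_THRESHOLD = 10  # hypothetical cut-off, in differing bits

for name, image in [("brightened", brightened), ("different", different)]:
    distance = hamming(dhash(original), dhash(image))
    verdict = "flag for review" if distance <= REVIEW_THRESHOLD else "distinct"
    print(f"{name}: {distance} bits differ -> {verdict}")
```

Because the hash records only whether each region is brighter than its neighbour, a uniformly brightened or resized copy lands close to the original even though its bytes are entirely different. That is why matches are flagged for human review rather than deleted automatically.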
The Milan-Cortina 2026 organising committee, Fondazione Milano Cortina 2026, faces a particular version of this challenge. It is generating thousands of accreditation photographs, venue survey images, and promotional assets every week, all flowing into a centralised digital asset management system that draws on contributions from multiple partner agencies. Getting deduplication protocols embedded before the Games open, specifically before the February 6 opening ceremony at Milan's San Siro stadium, is now a stated operational priority for the committee's technology team.
For any organisation in Milan dealing with this issue, archivists recommend three immediate steps: conducting a hash-based audit before the next server migration, establishing a single-source naming convention for all incoming image files, and setting retention policies that distinguish between master files and derivative exports. The cleanup is unglamorous work. The bill for ignoring it is not.
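A hash-based audit of the kind archivists describe can be sketched in a few lines. The example below is an assumption-laden illustration, not any institution's actual procedure: it walks a folder, groups files by size, and computes a SHA-256 digest only for files whose sizes collide. It reports bit-for-bit duplicates and deletes nothing. The function names are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large image files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Return {digest: [paths]} for every group of byte-identical files."""
    # Files of different sizes cannot be identical, so group by size first
    # and hash only the candidates that share a size.
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    by_digest = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_digest[sha256_of(path)].append(path)

    return {d: sorted(p) for d, p in by_digest.items() if len(p) > 1}
```

Calling `find_exact_duplicates("archive/")` on a hypothetical archive folder returns one entry per set of identical files, which can then be tagged for review. Derivative exports such as thumbnails and retouched masters will not match, because their bytes differ; catching those requires a perceptual comparison or a retention policy that tracks them explicitly.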