Fashion-focused Creative Commons Social dataset

Author: TU Delft

Partner: No




Total: 4810


This fashion-focused Creative Commons dataset is designed to contain a mix of general images and a large component of images focused on fashion (i.e., relevant to particular clothing items or fashion accessories). The dataset contains 4810 images and related metadata. A ground truth for the image tags is also provided. Ground truth generation for large-scale datasets is a necessary but expensive task, and traditional expert-based approaches are costly and do not scale. For this reason, we turn to crowdsourcing to collect ground truth labels; in particular, we use the commercial crowdsourcing platform Amazon Mechanical Turk (AMT). Two groups of annotators (trusted annotators known to the authors and crowdsourcing workers on AMT) participated in the ground truth creation. Annotation agreement between the two groups is analyzed, and applications of the dataset in different contexts are discussed. This dataset contributes to research areas such as crowdsourcing for multimedia, multimedia content analysis, and the design of systems that can elicit fashion preferences from users.
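As an illustration of how agreement between the two annotator groups could be quantified, the sketch below computes Cohen's kappa over paired labels. This page does not state which agreement measure was used for the dataset, so the choice of Cohen's kappa is an assumption, and the function name and the example labels are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items on which both groups agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: computed from each group's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if expected == 1.0:
        # Both groups used one identical label throughout; kappa is
        # undefined here, and this sketch reports perfect agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for six images (illustrative only, not dataset values).
trusted = ["yes", "yes", "no", "yes", "no", "no"]
crowd = ["yes", "no", "no", "yes", "no", "yes"]
kappa = cohen_kappa(trusted, crowd)
print(f"kappa = {kappa:.3f}")
```

A kappa near 1 indicates agreement well above chance, while a value near 0 indicates agreement no better than chance. When each group contributes several annotators per image, the labels would first need to be aggregated (for example by majority vote) or a multi-rater measure used instead.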


The files are available for download via HTTP.

References and Citation

Use of the dataset in published work should be acknowledged by a full citation to the paper [LMG13] presented at the MMSys conference (Proceedings of ACM MMSys '13, February 27 - March 1, 2013, Oslo, Norway).


  • LMG13: Babak Loni, Maria Menendez, Mihai Georgescu, Luca Galli, Claudio Massari, Ismail Sengor Altingovde, Davide Martinenghi, Mark Melenhorst, Raynor Vliegendhart, Martha Larson, Fashion-focused Creative Commons social dataset, Proceedings of the 4th ACM Multimedia Systems Conference (MMSys), Oslo, Norway, February 27 - March 1, 2013.