In this paper we introduce Div150Cred, a new dataset with accompanying evaluation tools, designed to support shared evaluation of diversification techniques in social media photo retrieval and related areas. The dataset comes with relevance and diversity assessments performed by human annotators. The data consists of 300 landmark locations represented via 45,375 Flickr photos, 16M photo links for around 3,000 users, metadata, Wikipedia pages, and content descriptors for the text and visual modalities. To facilitate distribution, only Creative Commons content was included in the dataset. The proposed dataset was validated during the 2014 Retrieving Diverse Social Images Task at the MediaEval Benchmarking Initiative.
The files are available for download via HTTP. Link: http://traces.cs.umass.edu/index.php/Mmsys/Mmsys
References and Citation
Use of the dataset in published work should be acknowledged by a full citation to the authors' paper [IPL15] at the MMSys conference (Proceedings of ACM MMSys '15, Portland, Oregon, March 18-20, 2015).
IPL15: B. Ionescu, A. Popescu, M. Lupu, A. Gînscă, B. Boteanu, H. Müller. Div150Cred: A Social Image Retrieval Result Diversification with User Tagging Credibility Dataset, Proceedings of ACM MMSys '15, Portland, Oregon, March 18-20, 2015.