AVT-AI-Image-Dataset - Appeal and Quality Assessment for AI-Generated Images
This dataset examines the appeal and quality of AI-generated images, addressing the question of how realistic and photographically appealing such images are, and how users perceive them. It was published alongside research in IEEE Access (2023) and at QoMEX 2023, and was developed as part of research funded by the Deutsche Forschungsgemeinschaft (DFG-437543412).
Description
The dataset comprises 135 images generated using five different AI text-to-image generators (including DALL-E-2, Midjourney, and Craiyon) based on 27 different text prompts. Some prompts were derived from the DrawBench benchmark, ensuring diversity in scene complexity and description specificity. The generated images were combined with real photographs for comparison in subjective evaluation studies.
An online crowdsourcing test was conducted to collect subjective ratings on appeal, realism, and text prompt matching from participants. The raw annotation data is stored in the repository’s evaluation/subjective/ directories, along with evaluation scripts to reproduce the research results. Subjective ratings were compared with state-of-the-art image quality models and features to assess the validity of objective quality metrics for AI-generated content.
The primary finding indicates that some AI generators can produce realistic and highly appealing images, with Midjourney performing particularly strongly, reaching appeal ratings of 4.5. However, image quality and appeal depend significantly on both the generator used and the text prompt provided: AI-generated images may appear artificial, exhibit low quality, or be less appealing than real photographs, with these limitations varying by generator and prompt complexity.
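Per-generator results such as the appeal ratings above come from aggregating individual ratings into mean scores. A minimal pandas sketch of that aggregation follows; the column names ("generator", "appeal") and the rating values are assumptions for illustration, and the repository's actual annotation schema may differ:

```python
import pandas as pd

# Minimal sketch: aggregate per-image appeal ratings into a mean score
# per generator. Values are illustrative, not from the dataset.
ratings = pd.DataFrame({
    "generator": ["midjourney", "midjourney", "dalle2", "dalle2", "craiyon"],
    "appeal":    [4.6, 4.4, 3.9, 4.1, 2.8],
})

# Mean appeal rating per generator, the basis for comparisons
# like the one reported above.
mean_appeal = ratings.groupby("generator")["appeal"].mean()
```

The same pattern extends to the realism and prompt-matching dimensions by aggregating over those rating columns instead.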
A companion paper at QoMEX 2023 (DOI: 10.1109/QoMEX58391.2023.10178486) extends this analysis by evaluating quality and appeal through crowdsourcing tests and correlating subjective ratings with objective quality assessment models. The research has attracted 13 citations and over 1,100 downloads, contributing to the understanding of synthetic media quality assessment and informing the development of improved AI image generation systems. The dataset and evaluation methodology are publicly available following an Open Science approach, enabling reproducible research on AI-generated image quality and appeal.
Access
Openly available for download from the GitHub repository
License
GNU General Public License v3.0