Abstract
Analyzing large, complex, unannotated image collections in domains such as forensics, accident investigation, and social media analysis requires interpreting overlapping relationships among images: an image may belong to multiple content- or context-based groupings simultaneously. Domain experts such as forensic investigators, accident investigators, investigative journalists, and social media analysts need a way to make well-informed, high-impact decisions without necessarily being specialists in analyzing such collections. Traditional clustering assigns each image to a single cluster and so cannot represent overlapping relationships, while supervised and multi-label classification require annotations and often rely on generic pre-trained models that do not capture the domain-specific semantics of complex real-world image collections. Hypergraphs effectively capture overlapping relationships, but constructing them from raw, unannotated image data and translating their complexity into information and insights for domain experts remain challenging. We propose an interactive visual analytics approach specifically designed for constructing, exploring, and analyzing hypergraphs. Core contributions include: (1) a framework for constructing and evaluating hypergraphs from raw image data, (2) CoverEdge Similarity (CES), a scalable measure for comparing constructed hypergraphs with ground truth, (3) scalable visual analytics integrating coordinated spatial, grid, and matrix visualizations, and (4) practical domain insights from evaluation with real-life image collections. To determine which construction algorithm can create meaningful hypergraphs, we designed and validated a similarity measure to evaluate constructed hypergraphs against ground truth. Across annotated benchmark collections, our TEMI adaptation performed best overall as a construction method compared with alternatives such as fuzzy c-means, and produced overlaps that were qualitatively useful for analysis.
A qualitative think-aloud study with eight domain experts on real-life accident investigation image collections containing several thousand to tens of thousands of images suggests that the system supports iterative exploration and search, with participants completing most tasks within minutes. A video demo is available in the supplemental materials.
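To make the notion of overlapping groupings concrete, the following is a minimal Python sketch of a hypergraph stored as a mapping from hyperedges to sets of images. The image IDs, hyperedge labels, and helper functions are hypothetical and chosen only for illustration; the sketch shows neither the construction method nor the CES measure described above.

```python
from itertools import combinations

# Hypothetical hypergraph: each hyperedge label maps to the set of image IDs
# it contains. Unlike a clustering, one image may appear in several hyperedges.
hyperedges = {
    "vehicle": {"img1", "img2", "img3"},
    "fire": {"img2", "img4"},
    "night": {"img3", "img4", "img5"},
}


def memberships(hyperedges):
    """Invert the hypergraph: image ID -> set of hyperedge labels it belongs to."""
    result = {}
    for label, images in hyperedges.items():
        for image in images:
            result.setdefault(image, set()).add(label)
    return result


def overlaps(hyperedges):
    """Return the shared images for every pair of hyperedges that intersect."""
    shared = {}
    # Sort by label so that pair keys are produced in a stable, alphabetical order.
    for (a, imgs_a), (b, imgs_b) in combinations(sorted(hyperedges.items()), 2):
        common = imgs_a & imgs_b
        if common:
            shared[(a, b)] = common
    return shared


if __name__ == "__main__":
    print(memberships(hyperedges)["img2"])  # labels of an image in two groupings
    print(overlaps(hyperedges))             # pairwise overlaps between hyperedges
```

In this toy example, `img2` belongs to both the `vehicle` and `fire` hyperedges, which a single-cluster assignment could not express.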
