Evaluation Methodologies for Multilabel Classification
This event took place on Friday 18 December 2009 at 11:30
Stefanie Nowak
Semantic indexing of multimedia content is a key research challenge in the multimedia community, and several benchmarking campaigns exist to assess the performance of such systems. My PhD thesis deals with approaches for annotating images with multiple visual concepts and with evaluation methodologies for assessing annotation performance.

After a short outline of the different parts of my thesis, I will illustrate in more detail three case studies based on the results of a recent benchmarking event in ImageCLEF. In ImageCLEF 2009, we conducted a task aimed at detecting 53 visual concepts in consumer photos. These concepts are structured in an ontology covering the scene depicted in a photo, the representation of its content, and the photo quality. For performance assessment, a recently proposed ontology-based measure was used that takes the hierarchy and relations of the ontology into account and generates a score per photo.

Starting from this benchmark, three case studies related to evaluation methodologies were conducted. The first deals with ground-truth assessment for benchmark datasets: we investigate how much annotations from experts differ from one another, how different sets of annotations influence the ranking of systems, and whether such annotations can be obtained through crowdsourcing. The second examines the behaviour of different evaluation measures for multilabel evaluation and points out their strengths and weaknesses; concept-based and example-based evaluation measures are compared by the system rankings they produce. In the third, the ontology-based evaluation measure is extended with semantic relatedness metrics: we apply several semantic relatedness measures based on web search engines, WordNet and Wikipedia, and evaluate their stability and ranking behaviour.
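The distinction between concept-based and example-based evaluation comes down to the axis along which per-photo label decisions are averaged. A minimal sketch (not from the thesis; label names and data are illustrative) of the two averaging strategies for F1 in multilabel annotation:

```python
# Illustrative sketch: concept-based (macro) vs example-based F1
# for multilabel image annotation. Labels are sets of concept names.

def f1(tp, fp, fn):
    """F1 from counts; defined as 0 when there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def example_based_f1(truth, pred):
    """Average the F1 of predicted vs true label sets over examples (photos)."""
    scores = []
    for t, p in zip(truth, pred):
        scores.append(f1(len(t & p), len(p - t), len(t - p)))
    return sum(scores) / len(scores)

def concept_based_f1(truth, pred, concepts):
    """Compute F1 per concept over all photos, then macro-average."""
    scores = []
    for c in concepts:
        tp = sum(1 for t, p in zip(truth, pred) if c in t and c in p)
        fp = sum(1 for t, p in zip(truth, pred) if c not in t and c in p)
        fn = sum(1 for t, p in zip(truth, pred) if c in t and c not in p)
        scores.append(f1(tp, fp, fn))
    return sum(scores) / len(scores)

# Hypothetical annotations: a frequent concept ("sky") and a rare one ("rare").
truth = [{"sky"}, {"sky"}, {"rare"}]
pred  = [{"sky"}, {"sky"}, {"sky"}]

print(example_based_f1(truth, pred))                  # ~0.667
print(concept_based_f1(truth, pred, {"sky", "rare"})) # 0.4
```

The toy data shows why the two measures can rank systems differently: the example-based score stays high because most photos are annotated correctly, while the concept-based macro average is dragged down by the single rare concept the system never predicts.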