Full Seminar Details
João Magalhães
Imperial College London, and KMi, The Open University
This event took place on Wednesday 27 June 2007 at 11:30
To solve the problem of indexing collections with diverse text documents, image documents, or documents with both text and images, one needs to develop a model that supports heterogeneous types of documents.
In this paper, we show how information theory supplies us with the tools necessary to develop a unique model for text, image, and text/image retrieval. In our approach, for each possible query keyword we estimate a maximum entropy model based on exclusively continuous features that were pre-processed. The unique continuous feature-space of text and visual data is constructed by using a minimum description length criterion to find the optimal feature-space representation (optimal from an information theory point of view). We evaluate our approach in three
experiments: only text retrieval, only image retrieval, and text combined with image retrieval.
Maven of the Month
We are also inviting top experts in AI and Knowledge Technologies to discuss major socio-technological topics with an audience that comprises both members of the Knowledge Media Institute, as well as the wider staff at The Open University. Differently from our seminar series, these events follow a Q&A format.