KMi Publications

Tech Reports

Tech Report KMI-06-01 Abstract


Exploiting Semantic Association To Answer Vague Queries
Techreport ID: KMI-06-01
Date: 2006
Author(s): Jianhan Zhu, Marc Eisenstadt, Dawei Song, Chris Denham

Although today's web search engines are very powerful, they still fail to provide intuitively relevant results for many types of queries, especially ones that are only vaguely formed in the user's own mind. We argue that associations between the terms in a search query can reveal the user's underlying information need and should be taken into account in search. We propose a multi-faceted approach to detect and exploit such associations. The CORDER method measures the association strength between query terms, and a query whose terms have low association strength with each other is treated as a 'vague query'. For a vague query, we use WordNet to find terms related to the query terms and compose extended queries, relying especially on the role of least common subsumers (LCS). We use the relation strength between terms, as calculated by the CORDER method, to refine these extended queries. Finally, we use the Hyperspace Analogue to Language (HAL) model and the information flow (IF) method to expand the refined queries. Our initial experimental results on a corpus of 500 books from Amazon show that our approach can find the right books for users given authentic vague queries, even in cases where Google and Amazon's own book search fail.
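The first two steps of the approach can be illustrated with a minimal Python sketch. It is not the CORDER method or the WordNet-based procedure from the report: the association score is a plain Jaccard co-occurrence measure standing in for CORDER, the hypernym table is a toy stand-in for WordNet, and the corpus, the threshold of 0.3, and all function names are hypothetical choices made for illustration only.

```python
from itertools import combinations

# Toy corpus: each document is reduced to a set of terms (hypothetical data).
DOCS = [
    {"python", "programming", "code"},
    {"python", "programming", "tutorial"},
    {"snake", "reptile", "python"},
    {"garden", "flowers", "soil"},
]

def association(a, b, docs=DOCS):
    """Jaccard co-occurrence of two terms (a stand-in for CORDER, not CORDER itself)."""
    with_a = {i for i, d in enumerate(docs) if a in d}
    with_b = {i for i, d in enumerate(docs) if b in d}
    union = with_a | with_b
    return len(with_a & with_b) / len(union) if union else 0.0

def is_vague(query_terms, threshold=0.3, docs=DOCS):
    """Flag a query as vague when its mean pairwise association is below the threshold.

    Assumption: a single-term query has no pairs and is treated as not vague.
    """
    pairs = list(combinations(query_terms, 2))
    if not pairs:
        return False
    mean = sum(association(a, b, docs) for a, b in pairs) / len(pairs)
    return mean < threshold

# Toy hypernym links, child -> parent (a stand-in for WordNet).
HYPERNYM = {
    "python": "snake",
    "snake": "reptile",
    "lizard": "reptile",
    "reptile": "animal",
    "dog": "animal",
}

def ancestors(term):
    """Return the chain from the term up to the root, starting with the term itself."""
    chain = [term]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain

def least_common_subsumer(a, b):
    """Return the most specific concept that subsumes both terms, or None."""
    seen = set(ancestors(a))
    for node in ancestors(b):
        if node in seen:
            return node
    return None
```

In this toy setting, a query such as "python garden" is flagged as vague because its terms never co-occur, and the least common subsumer of "python" and "lizard" is "reptile", which could then serve as a candidate term for an extended query.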

Publication(s):

To appear in Proc. of The Fourth International Conference on Active Media Technology (AMT 2006), June 2006, Brisbane, Australia.
 
 



Multimedia and Information Systems
Our research is centred around the theme of Multimedia Information Retrieval, i.e., Video Search Engines, Image Databases, Spoken Document Retrieval, Music Retrieval, Query Languages and Query Mediation.

We focus on content-based information retrieval over a wide range of data, from unstructured text and unlabelled images, through spoken documents and music, to videos. This encompasses modelling human perception of relevance and similarity, learning from user actions, and the up-to-date presentation of information. We are currently building an integrated multimedia information retrieval system, MIR, to be used as a research prototype. We aim for a system that understands the user's information need and successfully links it to the appropriate information sources, be it a report or a TV news clip. This work is guided by the vision that an automated knowledge extraction system ultimately empowers people to make efficient use of information sources without the burden of filing data into specialised databases.
