KMi Seminars
All You Can Eat Ontology-Building: Feeding Wikipedia to Cyc
This event took place on Wednesday 22 April 2009 at 11:30

 
Dr Catherine Legg, Department of Philosophy and Religious Studies, University of Waikato, New Zealand

In order to achieve a genuinely intelligent World Wide Web, it seems that building some kind of general machine-readable ontology is an inescapable task. Yet the past 20 years have shown that hand-coding formal ontologies is not practicable. A recent explosion of free user-supplied knowledge on the Web has led to great strides in automatic ontology-building (e.g. YAGO, DBpedia), but here quality control is still a major issue. Ideally one should automatically build onto an already intelligent base. I suggest that the long-running Cyc project can finally come into its own here, and describe methods developed at the University of Waikato over the past summer whereby 35K new concepts mined from Wikipedia were added to appropriate Cyc collections, and automatically categorized as instances or subcollections. Most importantly, Cyc itself was leveraged for ontological quality control by ‘feeding’ assertions to it one by one, allowing it to ‘regurgitate’ those that are ontologically unsound. Cyc is arguably the only ontology currently sophisticated enough to be able to perform such a ‘digestive’ function, using its principled taxonomic structure and purpose-built inference engine. It is suggested that a traditional fixation of AI researchers on realizing the intelligence of the brain has perhaps caused us to overlook more humble yet genuine steps towards the AI vision which might be gained by realizing the intelligence of the stomach.
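The abstract only sketches this pipeline. As a rough, hypothetical illustration of the one-assertion-at-a-time filtering it describes, the Python fragment below assumes a CycClient-style wrapper with assert_fact, is_consistent and retract_fact methods; these names, and the candidate-triple format, are invented for illustration and are not the actual Cyc or Wikipedia-mining interfaces used in the work.

# Hypothetical sketch: 'feed' mined assertions to the ontology one by one
# and let its inference engine 'regurgitate' the ontologically unsound ones.
# The cyc object and its methods are invented stand-ins, not the real Cyc API.
def feed_candidates(cyc, candidates):
    """Keep only the mined (concept, relation, collection) triples that the
    ontology accepts without becoming inconsistent."""
    accepted, rejected = [], []
    for concept, relation, collection in candidates:
        # Tentatively assert e.g. an instance (isa) or subcollection (genls)
        # link between the new Wikipedia-derived concept and a collection.
        cyc.assert_fact(relation, concept, collection)
        if cyc.is_consistent():
            accepted.append((concept, relation, collection))
        else:
            # Retract the assertion that clashed with the existing taxonomy.
            cyc.retract_fact(relation, concept, collection)
            rejected.append((concept, relation, collection))
    return accepted, rejected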

 
Multimedia and Information Systems
Our research is centred on the theme of Multimedia Information Retrieval, i.e. Video Search Engines, Image Databases, Spoken Document Retrieval, Music Retrieval, Query Languages and Query Mediation.

We focus on content-based information retrieval over a wide range of data, from unstructured text and unlabelled images, through spoken documents and music, to videos. This encompasses the modelling of human perception of relevance and similarity, learning from user actions, and the up-to-date presentation of information. We are currently building MIR, an integrated multimedia information retrieval system, as a research prototype. We aim for a system that understands the user's information need and successfully links it to the appropriate information sources, be it a report or a TV news clip. This work is guided by the vision that an automated knowledge extraction system ultimately empowers people to make efficient use of information sources without the burden of filing data into specialised databases.
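The MIR prototype itself is described only at this high level. Purely as an illustration of the content-based retrieval idea, the short Python sketch below ranks items by cosine similarity between precomputed feature vectors; the feature extraction for text, images, audio or video is assumed to have been done elsewhere, and none of the names correspond to MMIS software.

import numpy as np

# Illustrative only: rank items by cosine similarity between a query feature
# vector and a matrix of item feature vectors (one row per item).
def rank_by_similarity(query_vec, item_vecs, top_k=5):
    q = query_vec / np.linalg.norm(query_vec)
    items = item_vecs / np.linalg.norm(item_vecs, axis=1, keepdims=True)
    scores = items @ q                     # cosine similarity per item
    return np.argsort(-scores)[:top_k]     # indices of the best matches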

Visit the MMIS website