KMi Publications

Tech Report kmi-04-19 Abstract


Ontosophie: A Semi-Automatic System for Ontology Population from Text
Techreport ID: kmi-04-19
Date: 2004
Author(s): David Celjuska, Maria Vargas-Vera

This paper describes a system for the semi-automatic population of ontologies with instances extracted from unstructured text. The system is based on supervised learning: it learns extraction rules from annotated text and then applies those rules to new articles. In doing so, it classifies stories and populates a hand-crafted ontology with new instances of the classes defined in it. It is built on three components: Marmot, a natural language processor; Crystal, a dictionary induction tool; and Badger, an information extraction tool. A user is part of the cycle, accepting, rejecting or modifying the newly extracted instances that the system suggests for population. Experiments performed on a text corpus of 91 articles are then described. The paper closes with the results, which support the presented hypothesis that assigning a rule confidence value to each extraction rule improves performance.
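The idea of attaching a confidence value to each extraction rule can be illustrated with a minimal sketch. It is not Ontosophie's implementation, and it does not use the Marmot, Crystal or Badger interfaces. The `Rule` class, the `suggest` function, the precision-style confidence estimate, the 0.5 threshold and the sample data are all assumptions made for illustration. The sketch scores each rule by how often its extractions were correct on annotated text, then passes on to the user only the candidate instances produced by sufficiently reliable rules.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str     # hypothetical rule identifier
    correct: int  # extractions judged correct on annotated text
    total: int    # all extractions the rule produced on annotated text

    @property
    def confidence(self) -> float:
        # Assumed estimate: fraction of correct extractions (0.0 if the rule never fired).
        return self.correct / self.total if self.total else 0.0


def suggest(extractions, rules, threshold=0.5):
    """Keep candidates whose producing rule meets the threshold, most confident first."""
    by_name = {rule.name: rule for rule in rules}
    kept = [
        (value, by_name[rule_name].confidence)
        for value, rule_name in extractions
        if by_name[rule_name].confidence >= threshold
    ]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Invented sample data: two rules and the candidate instances they extracted.
rules = [Rule("visitor-rule", 9, 10), Rule("date-rule", 2, 8)]
extractions = [("Jane Smith", "visitor-rule"), ("Monday", "date-rule")]

# Only candidates from reliable rules are shown to the user for review.
suggestions = suggest(extractions, rules)
print(suggestions)
```

In a semi-automatic setting such as the one the abstract describes, the surviving suggestions would still go to the user, who accepts, rejects or modifies each one before it enters the ontology.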
 


Future Internet
With over a billion users, today's Internet is arguably the most successful human artifact ever created. The Internet's physical infrastructure, software, and content now play an integral part in the lives of everyone on the planet, whether they interact with it directly or not. Now nearing its fifth decade, the Internet has shown remarkable resilience and flexibility in the face of ever-increasing numbers of users, data volume, and changing usage patterns, but it faces growing challenges in meeting the needs of our knowledge society. Globally, many major initiatives are underway to address the need for more scientific research, physical infrastructure investment, better education, and better utilisation of the Internet. In Japan, the USA and Europe, major new initiatives have begun in this area.

To succeed, the Future Internet will need to address a number of cross-cutting challenges, including:

  • Scalability in the face of peer-to-peer traffic, decentralisation, and increased openness

  • Trust when government, medical, financial and personal data are increasingly entrusted to the cloud, and middleware increasingly uses dynamic service selection

  • Interoperability of semantic data and metadata, and of services which will be dynamically orchestrated

  • Pervasive usability for users of mobile devices and for users of different languages, cultures and physical abilities

  • Mobility for users who expect a seamless experience across spaces, devices, and velocities