Full Seminar Details
Computing Research Centre, The Open University, UK
This event took place on Wednesday 26 July 2006 at 11:30
A necessary precondition of the Semantic Web initiative is the availability of semantic data. Information that is currently intended for human readers must be translated into a machine-readable format (RDF), a process called semantic annotation. The volume of information on the Web makes manual annotation infeasible, so automatic information extraction algorithms are essential. These algorithms use various natural language processing and machine learning techniques to extract information from text.
The information extracted from different sources must then be integrated into a knowledge base so that it can be queried in a uniform way. This integration process is called knowledge fusion. Knowledge fusion, however, faces a number of problems, which originate from the following:
1. Existing information extraction algorithms are inaccurate and produce incorrect annotations.
2. Information on web pages can be imprecise, incomplete or vague.
3. Multiple sources can contradict each other.
Thus, large-scale automatic annotation requires a knowledge fusion procedure that is able to deal with these problems.