KMi Seminars
Experiments in understanding and QA of a very large Ontology
This event took place on Thursday 23 September 2010 at 12:00

 
Prof. Alan Rector, University of Manchester

SNOMED-CT is a very large (450,000-concept) terminology based on a subset of description logic. Until recently, it was published only in "classified" form in a set of distribution tables. Although everybody knows the hierarchies contain many anomalies, it has been almost impossible to comment on them. Recently the "stated form" has been published, along with a script for transforming it into OWL. At the same time a group of hospitals has published a list of the most commonly used codes for "problems": the Core Problem List Subset. Using the module extraction mechanism in the OWL API, with the subset as a signature, a module can be extracted from the stated form that is guaranteed to classify the subset's concepts in the same way as they would be classified in the full SNOMED, but in an ontology of only 35,000 concepts. The newly released SNOROCKET (an optimised EL++ classifier) classifies the subset in about 30 seconds, making iterative exploration and modification possible.
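
As a rough illustration of what extracting a module from a signature means, the sketch below computes a simple dependency closure over stated definitions: it starts from the signature and keeps pulling in the definition of every concept mentioned so far. This is a toy stand-in and not the locality-based extraction that the OWL API performs; the concept names, the `stated` dictionary and the `extract_module` function are all made up for the example.

```python
from collections import deque

# Hypothetical stated definitions: concept -> concepts its definition mentions.
# These names are illustrative only and are not real SNOMED-CT content.
stated = {
    "Pneumonia": {"LungDisease", "Inflammation"},
    "LungDisease": {"Disease", "Lung"},
    "Inflammation": {"Disease"},
    "Lung": {"BodyStructure"},
    "Fracture": {"Injury", "Bone"},
    "Bone": {"BodyStructure"},
    "Disease": set(),
    "Injury": set(),
    "BodyStructure": set(),
}


def extract_module(stated, signature):
    """Return every concept reachable from the signature via stated definitions."""
    module = set()
    queue = deque(signature)
    while queue:
        concept = queue.popleft()
        if concept in module:
            continue  # already included, do not expand twice
        module.add(concept)
        # Everything the definition mentions must also be in the module.
        queue.extend(stated.get(concept, ()))
    return module


module = extract_module(stated, {"Pneumonia"})
print(sorted(module))
```

With the signature `{"Pneumonia"}`, the closure keeps the concepts its definition depends on and leaves out unrelated ones such as `Fracture`, which is the sense in which a module can be much smaller than the full terminology.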

Using this subset we have begun to develop methods to explore the core subset in combination with two projects. We have begun by taking common key concepts of importance for users and looking up the hierarchies to see how they were classified, then looking for analogies to any problems found. We call the method "analysis by repair". Issues discovered range from simple omissions to gross errors in the ontology schemas for anatomy. Only a few are evident locally without classification.

We have found the Protege inferred class hierarchy the best screening tool for looking up hierarchies and the OWLViz tool the best definitive tool. Usually, but not always, a complex, tangled upwards hierarchy indicates problems. We are just starting to explore OPPL (the Ontology Pre-Processor Language) to find patterns. Performing the task on a large scale requires improved tools.
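
The idea of screening for tangled upwards hierarchies can be sketched in a few lines. The example below assumes the inferred hierarchy has been exported as a mapping from each concept to its direct parents, then counts ancestors and distinct upward paths for each concept. The hierarchy, the function names and the threshold of two paths are all assumptions for illustration. As noted above, a high count is only a hint that something may be wrong, not proof.

```python
# Hypothetical inferred hierarchy: concept -> direct parents.
# The names are illustrative only and are not taken from SNOMED-CT.
parents = {
    "FractureOfFemur": ["FractureOfLowerLimb", "InjuryOfThigh", "DisorderOfFemur"],
    "FractureOfLowerLimb": ["Fracture", "InjuryOfLowerLimb"],
    "InjuryOfThigh": ["InjuryOfLowerLimb"],
    "DisorderOfFemur": ["DisorderOfBone"],
    "Fracture": ["Injury", "DisorderOfBone"],
    "InjuryOfLowerLimb": ["Injury"],
    "DisorderOfBone": ["Disorder"],
    "Injury": ["Disorder"],
    "Asthma": ["Disorder"],
    "Disorder": [],
}


def ancestors(parents, concept):
    """Collect all concepts reachable by walking up from the given concept."""
    seen = set()
    stack = list(parents.get(concept, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def count_upward_paths(parents, concept, memo=None):
    """Count distinct paths from the concept up to any root (acyclic input assumed)."""
    if memo is None:
        memo = {}
    if concept in memo:
        return memo[concept]
    direct = parents.get(concept, [])
    # A root counts as one path; otherwise sum the paths through each parent.
    result = 1 if not direct else sum(
        count_upward_paths(parents, p, memo) for p in direct
    )
    memo[concept] = result
    return result


# Arbitrary screening threshold: flag concepts with more than two upward paths.
tangled = sorted(c for c in parents if count_upward_paths(parents, c) > 2)
print(tangled)
```

Concepts flagged this way would still need to be inspected by hand, for instance in OWLViz, to decide whether the tangle reflects a genuine modelling error.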

While this sub-project focuses on an ontology used for terminology, the wider context is that we wish to use such terminologies as just one small piece of a much larger programme: a hybrid ontology-based architecture that clearly distinguishes domain ontologies, such as SNOMED, from ontologies describing the use of information and from the data structures for that information, and that uses a variety of reasoning techniques.

(Due to unforeseen circumstances we were unable to record or webcast this event. We apologise to those who were unable to attend in person.)

 


Future Internet
With over a billion users, today's Internet is arguably the most successful human artifact ever created. The Internet's physical infrastructure, software, and content now play an integral part in the lives of everyone on the planet, whether they interact with it directly or not. Now nearing its fifth decade, the Internet has shown remarkable resilience and flexibility in the face of ever-increasing numbers of users, data volume, and changing usage patterns, but faces growing challenges in meeting the needs of our knowledge society. Globally, many major initiatives are underway to address the need for more scientific research, physical infrastructure investment, better education, and better utilisation of the Internet. Within Japan, the USA and Europe major new initiatives have begun in the area.

To succeed, the Future Internet will need to address a number of cross-cutting challenges, including:

  • Scalability in the face of peer-to-peer traffic, decentralisation, and increased openness

  • Trust when government, medical, financial, and personal data are increasingly entrusted to the cloud, and middleware will increasingly use dynamic service selection

  • Interoperability of semantic data and metadata, and of services which will be dynamically orchestrated

  • Pervasive usability for users of mobile devices, different languages, cultures and physical abilities

  • Mobility for users who expect a seamless experience across spaces, devices, and velocities