CLUK06
  Webcast archive will be available for at least 1 year after the colloquium
 
  Invited Speakers

Professor Anne DeRoeck [Presentation date: 8th March]
The Open University

Title: Dataset Profiles: investigating the role of data in experimental NLP

Abstract: It has been known for a long time that the performance of Information Retrieval and Natural Language Processing techniques in the context of a particular task is very sensitive to the characteristics of the data on which they are used. Though widely accepted, this fact has never been taken to its logical conclusion, and in evaluation, for instance, experimental results are reported without reference to the impact of the underlying datasets or collections. This raises some very serious methodological and practical issues around replicability. These could be addressed if we had reliable ways of profiling datasets, using measures that highlight relevant differences between collections. A first step would be to investigate what such measures might look like for a given range of tasks or techniques.

In this talk, I will show that even standard textual datasets such as the TIPSTER collection differ in ways that challenge widely accepted assumptions about the general applicability of techniques, and that similar differences in data profile will show up between texts in the same genre but in different languages. In exploring what might be suitable profiling measures, I will set out some desirable properties that such measures should have. I will then introduce our work on modelling term burstiness, and explore what term distribution, and variations in burstiness patterns in the occurrence of a term, can tell us about genres and datasets.

Professor Jon Oberlander [Presentation date: 9th March]
University of Edinburgh

Title: The computational linguistics of affect: a personal view

Knowledge Media Institute
 
EPSRC
 
CRC
 
The Open University