Primary aim: Linking multimedia resources based on shared visual material. Examples include finding the entry point of lecture videos that use the slide you are grappling with, or identifying YouTube, TV or Twitter streams that use the same images or videos as the news story you are looking at.
Brief description: This PhD project researches near-duplicate detection in visual material. The main challenges are scaling and incremental indexing, together with robustness to cropping, brightness and contrast change, re-coding and rescaling, which are the most likely kinds of representational change. Novel research will be carried out on creating fingerprints directly from compressed representations, eg, from the cosine transform coefficients of JPEG images and of the I-frames of MPEG video, and from the motion vectors used for temporal compression. This research can also be deployed to link images to videos and vice versa. This is particularly useful for linking discussion threads and media content that share the same material but come from groups that differ in political disposition, geographical spread or language.
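As a rough illustration of the fingerprinting idea, the following Python sketch thresholds the low-frequency cosine transform coefficients of a small grey-level block at their median. It is a toy under stated assumptions, not the method the project will develop: the block is assumed to be already decoded and scaled to 8x8 pixels, whereas the project aims to read coefficients from the compressed stream directly. The names dct_coefficient, fingerprint and hamming are hypothetical helpers introduced here.

```python
import math
import random
from statistics import median


def dct_coefficient(block, u, v):
    """Unnormalised 2D DCT-II coefficient (u, v) of a square grey-level block."""
    n = len(block)
    return sum(
        block[y][x]
        * math.cos((2 * x + 1) * u * math.pi / (2 * n))
        * math.cos((2 * y + 1) * v * math.pi / (2 * n))
        for y in range(n)
        for x in range(n)
    )


def fingerprint(block, size=4):
    """One bit per low-frequency AC coefficient: 1 if above their median."""
    coeffs = [
        dct_coefficient(block, u, v)
        for v in range(size)
        for u in range(size)
        if (u, v) != (0, 0)  # skip the DC term, which carries overall brightness
    ]
    threshold = median(coeffs)
    return [1 if c > threshold else 0 for c in coeffs]


def hamming(a, b):
    """Number of positions in which two equally long bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


# Synthetic 8x8 block (made-up data) and an idealised brightness/contrast change
# without clipping to the valid pixel range.
rng = random.Random(0)
original = [[rng.randint(0, 255) for _ in range(8)] for _ in range(8)]
adjusted = [[1.2 * p + 10 for p in row] for row in original]
distance = hamming(fingerprint(original), fingerprint(adjusted))
print(distance)
```

A uniform brightness offset only changes the DC term, and a positive contrast factor scales all AC coefficients alike, so the bit pattern is unchanged in this idealised case. Cropping, rescaling and re-coding are not handled by this sketch and are part of what makes the research problem hard.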
Useful skills to work on this PhD project: PhD research is about generating new knowledge, hence we are looking for creative individuals, able to generate and pursue original ideas and to organise their own work with minimal day-to-day supervision. It is also desirable to have strong programming skills in Java/C++, experience with machine learning and the corresponding mathematics, and excellent verbal and written communication skills.
S Rüger, 2010: Multimedia information retrieval. Lecture notes in the series Synthesis Lectures on Information Concepts, Retrieval, and Services. Morgan and Claypool Publishers
For informal enquiries please contact:
Prof Stefan Rueger
Professor of Knowledge Media
Tel. +44 (0)1908 655945.
Primary aim: Media mining of large media repositories to discover significant aspects of our culture.
Brief description of the PhD project: If we read one book every day in our reading life (age 5 to 80, say) we can read about 27,000 books over a lifetime. There is a similar limit for watching video, yet there are millions more books and visual resources available. This project asks what computers can learn by reading one million books, by watching one million hours of news, by looking at one million images on Flickr or by reading every Wikipedia page there is. The example image above represents the number of references to locations in the English Wikipedia as a cartogram, which, together with similar analyses of Wikipedias in other languages, has allowed proposing a model for human perception of geography. Media mining offers a raft of interesting questions: Can we identify the meaning of words by seeing them in a particular context? Can we infer which areas of the world are particularly interesting by analysing the frequency of geo-coordinates of Flickr images with particular tags? Can we create a world map of spheres of influence by classifying news stories and linking them to geographic areas? There are countless other similar questions. The main task of the PhD project is to look at a few interesting ones and develop techniques to answer these questions automatically.
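To make the geo-coordinate question a little more concrete, here is a minimal Python sketch that counts tagged photo locations per latitude/longitude grid cell. Everything in it is an assumption for illustration: the photo tuples are made-up data rather than a real Flickr export, and grid_cell and hotspot_counts are hypothetical names. Cells of equal angular size do not cover equal areas, so a serious analysis would need a more careful spatial model.

```python
import math
from collections import Counter


def grid_cell(lat, lon, cell_deg=1.0):
    """Index of the grid cell (south-west corner, in cell units) containing a point."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def hotspot_counts(photos, tag, cell_deg=1.0):
    """Count photos carrying the given tag per grid cell."""
    counts = Counter()
    for lat, lon, tags in photos:
        if tag in tags:
            counts[grid_cell(lat, lon, cell_deg)] += 1
    return counts


# Made-up sample records: (latitude, longitude, set of tags).
photos = [
    (51.50, -0.12, {"bridge"}),
    (51.51, -0.08, {"bridge", "river"}),
    (48.86, 2.35, {"bridge"}),
    (40.70, -74.00, {"skyline"}),
]
busiest = hotspot_counts(photos, "bridge").most_common(1)
print(busiest)
```

With this sample, the two photos taken close together fall into the same one-degree cell, which is then reported as the busiest cell for the tag.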
Useful skills to work on this PhD project: PhD research is about generating new knowledge, hence we are looking for creative individuals, able to generate and pursue original ideas and to organise their own work with minimal day-to-day supervision. It is also desirable to have strong programming skills in Java/C++, experience with machine learning and the corresponding mathematics, and excellent verbal and written communication skills.
S Rüger, 2010: Multimedia information retrieval. Lecture notes in the series Synthesis Lectures on Information Concepts, Retrieval, and Services. Morgan and Claypool Publishers
S Overell and S Rüger: View of the world according to Wikipedia: are we all little Steinbergs? Journal of Computational Science, 2(3), pp 193–197, 2011. DOI: 10.1016/j.jocs.2011.05.006
For informal enquiries please contact:
Prof Stefan Rueger
Professor of Knowledge Media
Tel. +44 (0)1908 655945.
Primary aim: To explore boundaries and limits of automated analysis of personal visual food logs.
Brief description: This project aims to extract as much analysis as possible from visual food logs with little or no manual intervention.
There are any number of challenging problems in this setting: automated selection of all the food images from your raw photo stream (classification); identifying the type of food (vegetables, fruit, alcohol, dairy products, sandwich, ...); creating ground truth and training examples for machine learning algorithms by, eg, utilising recipe databases with pictures of food; estimating typical size or volume from EXIF data (focal length, distance, depth estimation and extent of “food pixels”); near-duplicate detection of packaged food (eg, sweets); creating a lifestyle analysis (frequency, times and locations of food/beverage intake); identifying repetitions of food intake (eg, favourite pizza every Friday night), thus enabling the algorithm to recognise previously marked-up meals; researching and developing ways to unobtrusively support the food log, eg, through a special ruler that you put next to the food when you take a picture or through a specific food-log app that allows you to both take the picture and mark up your food log with only a few interactions.
These are indicative ideas for the types of problems that you might wish to tackle within the scope of this project. You are welcome to develop your own avenues; for example, you might even want to explore ways to utilise or develop sensors that tell you about the carbohydrate, fat or protein content of food.
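As one example of how camera metadata could feed a size estimate, the Python sketch below applies the simple pinhole camera relation: real extent is roughly extent on the sensor times subject distance divided by focal length. It assumes the distance is much larger than the focal length and that all inputs are already known; in practice the subject distance is often missing from EXIF data and the sensor width usually has to be looked up for the camera model. The function name estimate_extent_mm and the numbers are hypothetical.

```python
def estimate_extent_mm(extent_px, image_width_px, sensor_width_mm,
                       focal_length_mm, distance_mm):
    """Approximate real-world extent of an object from its extent in pixels.

    Pinhole approximation: real extent = extent on sensor * distance / focal length.
    """
    pixel_pitch_mm = sensor_width_mm / image_width_px  # width of one pixel on the sensor
    extent_on_sensor_mm = extent_px * pixel_pitch_mm
    return extent_on_sensor_mm * distance_mm / focal_length_mm


# Made-up values: a plate spanning 1500 of 3000 pixels, 6 mm wide sensor,
# 4 mm focal length, photographed from 400 mm away.
plate_mm = estimate_extent_mm(1500, 3000, 6.0, 4.0, 400.0)
print(round(plate_mm))
```

For these illustrative numbers the estimate comes out at about 300 mm. A reference object of known size placed next to the food, such as the ruler mentioned above, would remove the need to know the distance at all.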
Useful skills to work on this PhD project: PhD research is about generating new knowledge, hence we are looking for creative individuals, able to generate and pursue original ideas and to organise their own work with minimal day-to-day supervision. It is also desirable to have strong programming skills in Java/C++, experience with machine learning and the corresponding mathematics, and excellent verbal and written communication skills.
Keigo Kitamura, Toshihiko Yamasaki, and Kiyoharu Aizawa. 2009. FoodLog: capture, analysis and retrieval of personal food images via web. In Proceedings of the ACM multimedia 2009 workshop on Multimedia for cooking and eating activities (CEA '09). ACM, New York, NY, USA, 23-30. DOI: 10.1145/1630995.1631001
Your favourite textbooks on Machine Learning in Computer Vision
S Rüger, 2010: Multimedia information retrieval. Lecture notes in the series Synthesis Lectures on Information Concepts, Retrieval, and Services. Morgan and Claypool Publishers
For informal enquiries please contact:
Prof Stefan Rueger
Professor of Knowledge Media
Tel. +44 (0)1908 655945.