Jianhan Zhu's profile document
Description for Jianhan Zhu

Jianhan Zhu
Research Fellow

I have worked on the AKT, Dot.Kom, and ELeGI projects. Currently I am interested in expert search in organizational intranets. A novel two-stage language model has been proposed for effective expert search, and it achieved excellent results in the TREC Enterprise Track Expert Search task. I have led the development of the CORDER (COmmunity Relation Discovery by named Entity Recognition) system, which measures associations between named entities in a text corpus by integrating their occurrences, distances in text, and frequencies in documents. I authored the ESpotter tool, which provides effective named entity recognition for a domain by adapting its lexicon and patterns to the domain through a URL tree.

The Open University account for Jianhan Zhu: jz232

Jianhan Zhu's membership at KMi

Jianhan Zhu's participation in AKT
AKT, 2000-10-01 to 2006-09-30
Advanced Knowledge Technologies
The AKT project aims to develop the next generation of knowledge technologies to support organizational knowledge management. AKT will look at all aspects of knowledge management, from acquiring and maintaining knowledge to publishing and sharing it. We intend to address all these closely related issues in an integrated approach, making use of recent developments in artificial intelligence, psychology, linguistics, multimedia and Internet technology. The AKT consortium comprises five UK universities and is funded by a 7M GBP, 6-year EPSRC grant in the context of the Interdisciplinary Research Collaborations programme.

Jianhan Zhu's participation in DOT.KOM
DOT.KOM, 2002-10-01 to 2005-03-31
Designing adaptive infOrmation exTraction from text for KnOwledge Management
DotKom aims to support knowledge management within large corporations through a combination of information extraction and knowledge management technologies.
A current problem with both of these technologies is that they are hard to use and require extensive expertise. DotKom will provide user-friendly *adaptive* information extraction tools which give instantaneous feedback on the current status of the information extraction learning process and the automatically constructed knowledge acquisition mechanisms.

Jianhan Zhu's participation in CORDER
CORDER, 2005-12-22
COmmunity Relation Discovery by named Entity Recognition
CORDER (COmmunity Relation Discovery by named Entity Recognition) is an unsupervised machine learning algorithm that exploits named entity recognition and co-occurrence data to associate individuals in a community with their expertise and associates. CORDER discovers relations from the Web pages of the community. Its approach is based on co-occurrences of named entities (NEs) and the distances between them. For a given NE, there are a number of co-occurring NEs. We assume that NEs that are closely related to each other tend to appear together more often and closer to each other in Web pages. We calculate a relation strength for each co-occurring NE based on its co-occurrences and distances from the given NE. The co-occurring NEs are ranked by their relation strengths.

Jianhan Zhu's participation in The KMi semantic web
The KMi semantic web, 2005-01-19
The automated semantic data integration service
The KMi semantic web generates and maintains semantic markup extracted from a variety of sources, including both departmental databases and HTML pages. In contrast with most other semantic web sites, the maintenance of the KMi semantic web is fully automated, relying on the integration of a number of technologies, including data integration and information extraction.
A semantic portal provides integrated access to the KMi semantic web and offers a number of mechanisms to browse, search and query the semantic web, including the AquaLog question answering system and a variety of semantic search engines.

Jianhan Zhu's participation in ESpotter
ESpotter, 2005-12-22
Adaptive Named Entity Recognition for Web Browsing
Named entity recognition (NER) systems are commonly designed with a "one-size-fits-all" philosophy. Lexicons and patterns, manually crafted or learned from a training set of documents, are applied to any other document without taking into account its background and user needs. However, when applying NER to Web pages, one size frequently does not fit all, because both the pages and the user needs are so diverse. We present a system called ESpotter, which improves NER on the Web by adapting lexicons and patterns to domains on the Web and to user preferences. Our results show that ESpotter provides more accurate and efficient NER on Web pages from various domains than current NER systems.

Jianhan Zhu's participation in BuddyFinder-CORDER
BuddyFinder-CORDER, 2005-09-01
Find the right people with the right knowledge in the right place at the right time
Online social networking tools are extremely popular, but can miss potential discoveries latent in the social 'fabric'. Matchmaking services can do naive profile matching with old database technology, and modern ontological markup, though powerful, can be onerous at data-input time. BuddyFinder-CORDER can automatically produce a ranked list of buddies to match a user's search requirements specified in a term-based query, even in the absence of stored user profiles. We integrate an online social networking search tool called BuddyFinder with a text mining method called CORDER to rank a list of online users based on 'inferred profiles' of these users in the form of scavenged Web pages.
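
The ranking idea described above (score each user by how often and how closely their name co-occurs with a query term in Web pages, then sort by that score) can be illustrated with a minimal Python sketch. This is not the published CORDER formula or the BuddyFinder implementation: the scoring rule (each page containing both terms contributes 1 / (1 + d), where d is the smallest token distance), the function names, and the toy pages are all assumptions made for illustration, and whitespace tokenization stands in for real named entity recognition.

```python
from collections import defaultdict


def positions(tokens, term):
    """Return the token indices at which `term` occurs."""
    return [i for i, tok in enumerate(tokens) if tok == term]


def relation_strength(pages, query_term, candidates):
    """Score each candidate by co-occurrence with and distance to `query_term`.

    Assumed scoring (illustrative only, not the published CORDER formula):
    every page that contains both terms contributes 1 / (1 + d), where d is
    the smallest token distance between them on that page.
    """
    scores = defaultdict(float)
    for text in pages:
        # Naive whitespace tokenization stands in for real NER.
        tokens = text.lower().split()
        query_pos = positions(tokens, query_term)
        if not query_pos:
            continue
        for cand in candidates:
            cand_pos = positions(tokens, cand)
            if not cand_pos:
                continue
            d = min(abs(q - c) for q in query_pos for c in cand_pos)
            scores[cand] += 1.0 / (1.0 + d)
    return dict(scores)


def rank_candidates(pages, query_term, candidates):
    """Return (candidate, score) pairs with a non-zero score, strongest first."""
    scores = relation_strength(pages, query_term, candidates)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy pages; the names and text are made up.
    pages = [
        "alice works on ontology tools with bob",
        "bob gave a talk and later mentioned ontology",
        "carol studies video streaming",
    ]
    users = ["alice", "bob", "carol"]
    for name, score in rank_candidates(pages, "ontology", users):
        print(f"{name}: {score:.3f}")
```

In this sketch a user who appears with the query term on more pages, or nearer to it, receives a higher score, and a user who never co-occurs with it is left out of the ranking. A fuller version would also need multi-word names and a weighting for how frequent each term is across documents.
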
Jianhan Zhu's participation in Vague Query Responder
Vague Query Responder, 2006-02-01
Bookshop owners can outperform Amazon and Google when the queries are vague - so can our software
Although today's web search engines are very powerful, they still fail to provide intuitively relevant results for many types of queries, especially ones that are vaguely formed in the user's own mind. We argue that associations between terms in a search query can reveal the underlying information needs in the user's mind and should be taken into account in search. Our initial experimental results on a corpus of 500 books from Amazon show that our approach can find the right books for users given authentic vague queries, even in those cases where Google and Amazon's own book search fail.