Member
Petr Knoth
Professor of Data Science
Petr Knoth is Professor of Data Science at the Knowledge Media Institute, The Open University. He leads the Big Scientific Data and Text Analytics Group (BSDTAG), which conducts research and develops new AI-powered technologies for the machine processing of scientific information. He is the Founder and Head of CORE (core.ac.uk), a large not-for-profit full-text indexing system for open access papers with millions of monthly active users. CORE makes research papers available for people to freely discover and access, and for machines to text-mine.
In this capacity, Petr has been involved in numerous knowledge exchange collaborations with enterprises, funders and not-for-profit organisations, supporting a wide variety of use cases that require scalable access to research content.
Petr has a deep interest in the use of AI to improve research workflows. He has been involved, as a researcher and as a PI, in over 25 research projects funded by the European Commission and by national and international bodies, in the areas of NLP, AI, Open Science and Technology Enhanced Learning.
Keys: Natural Language Processing, Text and data mining, Open Access, Open Science, Scholarly communication, Information Retrieval, Information Extraction, Recommendation systems, Scientometrics
Team: Valeriy Budko, Matteo Cancellieri, Viktoriia Pavlenko, David Pride, Anton Zhuk
Projects
Technologies
Frictionless Data Exchange Across Research Data, Software and Scientific Paper Repositories
Publications
Ghafourian, Y., Hanbury, A. and Knoth, P. (2025) Ranking To Learn: Human Experts, Search Engines, or LLMs for Learning Guidance, TPDL 2025: The 29th International Conference on Theory and Practice of Digital Libraries, Tampere, Finland
Cancellieri, M., El-Ebshihy, A., Fink, T., Fröbe, M., Galuščáková, P., Goeuriot, L., Iommi, D., Keller, J., Knoth, P., Mulhem, P., Piroi, F., Pride, D. and Schaer, P. (2025) Extended Abstract of LongEval at CLEF 2025: Longitudinal Evaluation of IR Systems on Web and Scientific Data, 16th International Conference of the CLEF Association (CLEF 2025), Madrid, Spain
Cancellieri, M., Docekal, M., Pride, D., Gruenpeter, M., Douard, D. and Knoth, P. (2025) Interoperable verification and dissemination of software assets in repositories using COAR Notify, The 20th International Conference on Open Repositories, Chicago, Illinois, USA
Cancellieri, M., Pride, D. and Knoth, P. (2025) Identifying and extracting Data Access Statements from full-text academic articles, Open Repositories 2025, Chicago, Illinois, USA
Knoth, P., Walk, P., Cancellieri, M., Upshall, M., Torchylo, H., Beamer, J., Shearer, K. and Joseph, H. (2025) USRN Discovery Pilot: Increasing the Discoverability of Open Access Content Through a National Network, The 20th International Conference on Open Repositories, Chicago, Illinois, USA