Petr Knoth

Professor of Data Science
Telephone: +44 (0)1908 654548

Petr Knoth is Professor of Data Science at the Knowledge Media Institute, The Open University. He leads the Big Scientific Data and Text Analytics Group (BSDTAG), which conducts research and develops new AI-powered technologies for the machine processing of scientific information. He is the Founder and Head of CORE (core.ac.uk), a large not-for-profit full-text indexing system for open access papers with millions of monthly active users. CORE makes research papers available for people to freely discover and access, and for machines to text-mine.

In this capacity, Petr has been involved in numerous knowledge exchange cooperations with enterprises, funders and not-for-profit organisations, supporting a wide variety of use cases requiring scalable access to research content.

Petr has a deep interest in the use of AI to improve research workflows. He has been involved, as a researcher and as a PI, in over 25 research projects funded by the European Commission and by national and international bodies, in the areas of NLP, AI, Open Science and Technology Enhanced Learning.

Keywords: Natural Language Processing, Text and data mining, Open Access, Open Science, Scholarly communication, Information Retrieval, Information Extraction, Recommendation systems, Scientometrics

Team: Matteo Cancellieri, David Pride

Publications

Cancellieri, M., El-Ebshihy, A., Fink, T., Fröbe, M., Galuščáková, P., Gonzalez-Saez, G., Goeuriot, L., Iommi, D., Keller, J., Knoth, P., Mulhem, P., Piroi, F., Pride, D. and Schaer, P. (2025). LongEval at CLEF 2025: Longitudinal Evaluation of IR Systems on Web and Scientific Data. In: 16th International Conference of the CLEF Association, CLEF 2025, 9-12 Sep 2025, Madrid, Spain. https://oro.open.ac.uk/107796/.

Pride, D., Guenci, M., Docekal, M., Peroni, S. and Knoth, P. (2025). Identifying and classifying software mentions in full text scholarly documents. In: 25th ACM/IEEE Joint Conference on Digital Libraries (JCDL 2025), 15-19 Dec 2025, Online. https://oro.open.ac.uk/107241/.

Staudinger, M., Kusa, W., Cancellieri, M., Pride, D., Knoth, P. and Hanbury, A. (2025). Compare: A Framework for Scientific Comparisons. In: CIKM ’25, 10-14 Nov 2025, Seoul, Republic of Korea. https://oro.open.ac.uk/107232/.

Ghafourian, Y., Hanbury, A. and Knoth, P. (2025). Ranking to Learn: Human Experts, Search Engines, or LLMs for Learning Guidance. In: TPDL 2025 – 29th International Conference on Theory and Practice of Digital Libraries, 23 to 26 Sep 2025, Tampere, Finland. https://oro.open.ac.uk/106883/.

Cancellieri, M., Klein, M., Prater, S., Sherrick, A. and Knoth, P. (2025). Managing Access to Open Repositories in the Age of Generative AI. In: 20th International Conference on Open Repositories, 15-18 Jun 2025, Chicago, Illinois, USA. https://oro.open.ac.uk/105562/.

Cancellieri, M., El-Ebshihy, A., Fink, T., Fröbe, M., Galuščáková, P., Goeuriot, L., Iommi, D., Keller, J., Knoth, P., Mulhem, P., Piroi, F., Pride, D. and Schaer, P. (2025). Extended Abstract of LongEval at CLEF 2025: Longitudinal Evaluation of IR Systems on Web and Scientific Data. In: 16th International Conference of the CLEF Association (CLEF 2025), 9-12 Sep 2025, Madrid, Spain. https://oro.open.ac.uk/105538/.

Ghafourian, Y., Hanbury, A. and Knoth, P. (2025). Ranking To Learn: Human Experts, Search Engines, or LLMs for Learning Guidance. In: TPDL 2025: The 29th International Conference on Theory and Practice of Digital Libraries, 23-26 Sep 2025, Tampere, Finland. https://oro.open.ac.uk/105535/.

Knoth, P., Cancellieri, M. and Pride, D. (2025). Open Access Repositories Tracking Project. In: The 20th International Conference on Open Repositories, 15-18 Jun 2025, Chicago, Illinois, USA. https://oro.open.ac.uk/105533/.

Knoth, P., Walk, P., Cancellieri, M., Upshall, M., Torchylo, H., Beamer, J., Shearer, K. and Joseph, H. (2025). USRN Discovery Pilot: Increasing the Discoverability of Open Access Content Through a National Network. In: The 20th International Conference on Open Repositories, 15-18 Jun 2025, Chicago, Illinois, USA. https://oro.open.ac.uk/105531/.

Cancellieri, M., Docekal, M., Pride, D., Gruenpeter, M., Douard, D. and Knoth, P. (2025). Interoperable verification and dissemination of software assets in repositories using COAR Notify. In: The 20th International Conference on Open Repositories, 18-21 Jun 2025, Chicago, Illinois, USA. https://oro.open.ac.uk/105529/.

Cancellieri, M., Pride, D. and Knoth, P. (2025). Identifying and extracting Data Access Statements from full-text academic articles. In: Open Repositories 2025, 15-18 Jun 2025, Chicago, Illinois. https://oro.open.ac.uk/105487/.

Cancellieri, M., El-Ebshihy, A., Fink, T., Galuščáková, P., Gonzalez-Saez, G., Goeuriot, L., Iommi, D., Keller, J., Knoth, P., Mulhem, P., Piroi, F., Pride, D. and Schaer, P. (2025). LongEval at CLEF 2025: Longitudinal Evaluation of IR Model Performance. In: Hauff, Claudia; Macdonald, Craig; Jannach, Dietmar; Kazai, Gabriella; Nardini, Franco Maria; Pinelli, Fabio; Silvestri, Fabrizio and Tonellotto, Nicola eds. Advances in Information Retrieval: 47th European Conference on Information Retrieval, ECIR 2025, Lucca, Italy, April 6–10, 2025, Proceedings, Part V. Lecture Notes in Computer Science (LNCS), 15576. Cham, CH: Springer, pp. 382–388. https://oro.open.ac.uk/104944/.

Kusa, W., E. Mendoza, O., Samwald, M., Knoth, P. and Hanbury, A. (2024). CSMED: Bridging the Dataset Gap in Automated Citation Screening for Systematic Literature Reviews. In: 37th Conference on Neural Information Processing Systems (NeurIPS 2023): Track on Datasets and Benchmarks, 10 Dec 2023, New Orleans, USA. https://oro.open.ac.uk/102450/.

Knoth, P., Klein, M., Macgregor, G., Cancellieri, M. and Walk, P. (2024). How to make repository content indexed and discoverable. In: The 19th International Conference on Open Repositories, 03-06 Jun 2024, Göteborg, Sweden. https://oro.open.ac.uk/102474/.

George, M., Knoth, P., Walk, P., Dowson, N., Eadie, M., Jones, B. and Martínez-García, A. (2024). Exploring the concept of 'custodianship' in harvesting repository resources and graphing their relations: Rioxx version 3.0. In: The 19th International Conference on Open Repositories, 03-06 Jun 2024, Göteborg, Sweden. https://oro.open.ac.uk/102475/.

Knoth, P., Laurent, R., Lopez, P., Di Cosmo, R., Smrz, P., Umerle, T., Harrison, M., Monteil, A., Cancellieri, M. and Pride, D. (2025). Making Software FAIR: A machine-assisted workflow for the research software lifecycle. In: 19th International Conference on Open Repositories (OR2024), 3-6 Jun 2024, Göteborg, Sweden. https://oro.open.ac.uk/102429/.

Ross-Hellauer, T., Klebel, T., Knoth, P. and Pontika, N. (2024). Value dissonance in research(er) assessment: individual and perceived institutional priorities in review, promotion, and tenure. Science and Public Policy, 51(3), pp. 337–351. https://oro.open.ac.uk/94443/.

Mendoza, Ó.E., Kusa, W., El-Ebshihy, A., Wu, R., Pride, D., Knoth, P., Herrmannova, D., Piroi, F., Pasi, G. and Hanbury, A. (2022). Benchmark for Research Theme Classification of Scholarly Documents. In: COLING 2022: 29th International Conference on Computational Linguistics, 12-17 Oct 2022, Gyeongju, South Korea. https://oro.open.ac.uk/94380/.

Ghafourian, Y., Hanbury, A. and Knoth, P. (2023). Ranking for Learning: Studying Users’ Perceptions of Relevance, Understandability, and Engagement. In: International Conference on Theory and Practice of Digital Libraries TPDL 2023: Linking Theory and Practice of Digital Libraries, 26-29 Sep 2023, Zadar, Croatia. https://oro.open.ac.uk/94410/.

Ghafourian, Y., Hanbury, A. and Knoth, P. (2023). Readability Measures as Predictors of Understandability and Engagement in Searching to Learn. In: Linking Theory and Practice of Digital Libraries 27th International Conference on Theory and Practice of Digital Libraries, TPDL 2023, 26-29 Sep 2023, Zadar, Croatia. https://oro.open.ac.uk/94411/.

CONTACT US

Knowledge Media Institute
The Open University
Walton Hall
Milton Keynes
MK7 6AA
United Kingdom

Tel: +44 (0)1908 653800

Fax: +44 (0)1908 653169

Email: KMi Support
