Petr Knoth's profile document
Description for Petr Knoth
Petr Knoth
Petr
Knoth
Professor of Data Science
petrknoth
Petr Knoth is a full professor at the Knowledge Media Institute, The Open University. He leads the Big Scientific Data and Text Analytics Group (BSDTAG), which conducts research and develops new AI-powered technologies for the machine processing of scientific information. He is the Founder and Head of CORE (core.ac.uk), a large not-for-profit full-text indexing system for open access papers with millions of monthly active users. CORE makes research papers available for people to freely discover and access, and for machines to text-mine.
In this capacity, Petr has been involved in numerous knowledge exchange cooperations with enterprises, funders and not-for-profit organisations, supporting a wide variety of use cases requiring scalable access to research content.
Petr has a deep interest in the use of AI to improve research workflows. He has been involved as a researcher and as a PI in over 25 European Commission, national and international funded research projects in the areas of NLP, AI, Open Science and Technology Enhanced Learning.
298ab13fda37cacecc2fb46271349e2a2d4db7e6
The Open University account for Petr Knoth
pk3295
Petr Knoth's membership at KMi
Petr Knoth on LinkedIn
Petr Knoth on SlideShare
@petrknoth (Petr Knoth on Twitter)
Petr Knoth's participation in UK Aggregation 2
UK Aggregation 2
2014-07-14
2015-08-31
Large scale aggregation of open access research papers
UK Aggregation 2 aims to maintain the CORE aggregation system developed through a series of projects and currently operated in the Knowledge Media Institute, The Open University (OU). UK Aggregation 2 is a follow-on from the UK Aggregation project, which analysed the cost implications and scenarios for running the CORE aggregation system and resulted in discussions between Jisc and the OU about creating a sustainable service to be jointly delivered from July 2015. In addition to basic maintenance, the goals of the project are to continue:
- the harvesting of metadata and content from repositories
- the expansion of the supported repositories
- the serving of the existing CORE applications
- the integration with existing services in the repository ecosystem
- the monitoring of the progress of the aggregation activities through a set of benchmarks
- dissemination and promotion of the CORE services towards various target audiences including researchers, developers, text-miners, repository managers, funders, etc.
In addition, the project aims to provide:
- the implementation of the Jisc/OU recommendations on branding
- a new feature offering the UK view on an international aggregation
- the communication/collaboration with key projects, partners and stakeholders in the area, such as OpenAIRE, as needed or recommended by Jisc to ensure the main use cases around CORE can be or are effectively exploited.
- a service-ready aggregation solution by the end of the project with the capabilities to support multiple use cases.
Petr Knoth's participation in Eurogene
Eurogene
2007-10-01
2010-09-30
The first Pan-European Learning Service in the Field of Genetics
EuroGene is a European Commission supported e-ContentPlus project concerned with providing high quality semantically enriched educational content in genetics. The objective of the EUROGENE project is to move toward the more efficient development of high quality didactic material on genetics through the guided editing and assembly of educational packages based on the IMS learning design metadata framework and the sharing of different types of learning objects between content owners, in 14 languages. The EuroGene consortium brings together 16 partners in the field of genetics from 11 different countries.
The primary role of KMi within EuroGene is to apply tools and methods for content annotation, content authoring and assembly, and the navigation of different learning pathways through the available content.
Petr Knoth's participation in Tech-It-Easy
Tech-It-Easy
2009-06-01
2011-05-31
information system, based on analytical and knowledge-based tools
The TECH-IT-EASY project will develop an information system, based on analytical and knowledge-based tools, able to support European electromechanical SMEs in structuring and systematising the internal product innovation process. The result will be a fully operating information system, consisting of:
- A methodological toolbox to structure and define SMEs' technology products, abstracting them from the specific industrial context, to allow the use of external knowledge for technology innovation. This methodological toolbox will be based on the combined application of the QFD (Quality Function Deployment) market-pull approach with the technology-push potential of TRIZ (Theory of Inventive Problem Solving).
- An information agent that analyses digital information within the enterprise, at information providers, and on the Web. This system will be context-sensitive, considering the role and experience of the person carrying out the search as well as the current product under study. The information agent will use an ontology as a symbolic knowledge representation of the overall innovation knowledge domain.
- An innovation-process support tool to guide users through the whole innovation process, connecting the technology system related to the product under study with the knowledge base constituted by market and technology information (gathered through the information agent) and by a set of pre-acquired knowledge (such as the so-called "trends of evolution" of technological systems), allowing SMEs to identify innovation opportunities from their internal knowledge.
The main scientific achievement will be the development of an information system that is able to deal with weakly structured content as well as formalised databases, such as those related to the overall innovation process, and the structuring of the overall innovation process through the use of ontologies as a formal, machine-understandable way of representing knowledge.
Petr Knoth's participation in CORE - COnnecting REpositories
CORE - COnnecting REpositories
Linking semantically similar publications from Open Access repositories using text mining from full-text and representing the relations as Linked Data
CORE (core.ac.uk) aims to aggregate all open access research outputs from repositories and journals worldwide and make them available to the public. In this way CORE facilitates free unrestricted access to research for all.
Petr Knoth's participation in REF 2021 Predictions
REF 2021 Predictions
2017-01-01
2021-01-01
Web-scale research analytics for identifying high performance and trends: data-driven approaches to Scientometrics.
In recent years, there has been a growing interest in developing new scientometric measures that go beyond the traditional citation-based bibliometric measures. This interest is motivated on one side by the wider availability or even emergence of new information evidencing research performance, such as article downloads, views, and Twitter mentions, and on the other side by the continued frustrations and problems surrounding the application of citation-based metrics to evaluate research performance in practice.
The research looks into new ways of utilising the full texts of research papers to evaluate research impact at the granularity of individual papers, researchers and institutions. It will consider the evolution over time of evidence influencing research metrics, and the emergence of new trends and new research communities as valuable signals.
Petr Knoth's participation in Eurogene Software
Eurogene Software
Eurogene e-learning system in the domain of genetics
Eurogene is an e-learning system in the domain of genetics that provides free multimedia learning resources in nine languages for statistical, medical and molecular genetics and delivers them to students and professionals. The Eurogene content includes presentations, reviewed research articles, images, videos and learning packages submitted by world-leading geneticists.
An essential part of the Eurogene system is a multilingual search engine that allows users to search for content in one language while retrieving results in other languages. This is complemented by the use of a machine translation system fine-tuned for genetic terminology. The search engine uses a query language similar to that of PubMed.
Eurogene also aims at providing intelligent ways of navigation through the e-Learning system. As new learning resources are being continuously submitted to the system, it is not possible to maintain links between them manually. Eurogene automatically links resources that are semantically similar using natural language processing.
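The idea of linking semantically similar resources can be illustrated with a minimal sketch. It assumes a plain bag-of-words representation compared by cosine similarity, which is a common baseline and not necessarily the method Eurogene uses. The function names, the threshold and the sample resource texts are all hypothetical.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lower-case the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def link_similar(resources, threshold=0.3):
    """Return (id, id, score) for every pair scoring at or above the threshold."""
    vectors = {rid: term_counts(text) for rid, text in resources.items()}
    ids = sorted(vectors)
    links = []
    for i, left in enumerate(ids):
        for right in ids[i + 1:]:
            score = cosine_similarity(vectors[left], vectors[right])
            if score >= threshold:
                links.append((left, right, round(score, 3)))
    return links


# Hypothetical resource descriptions, invented for illustration only.
resources = {
    "r1": "Mendelian inheritance of genetic disorders",
    "r2": "Inheritance patterns of genetic disorders in families",
    "r3": "Statistical methods for linkage analysis",
}
links = link_similar(resources)
```

With these sample texts only r1 and r2 share vocabulary, so they are the only pair linked. A production system would more plausibly add stop-word removal, term weighting and multilingual handling.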
Petr Knoth's participation in DECIPHER
DECIPHER
2011-01-01
2013-12-31
Digital Environment for Cultural Interfaces; Promoting Heritage, Education and Research
Digital heritage and semantic web technologies hold out the promise of nearly unlimited access to cultural knowledge. The problem is that cultural meaning does not reside in individual objects but in the patterns of knowledge and events, belief and thought that link them to each other and to the observer. This is why story is so important to the communication, and meaningful understanding, of culture.
DECIPHER is developing new solutions to the whole range of narrative construction, knowledge visualisation and display problems. It will change the way people access digital heritage by combining much richer, event-based metadata with causal reasoning models.
This will result in a reasoning engine, virtual environment and interfaces that can help curators and visitors to present digital heritage objects as part of a coherent narrative that is directly related to the user's interests. This will allow the user to interactively assemble, visualise and explore, not just collections of objects, but the knowledge structures that connect and give them meaning.
Petr Knoth's participation in RETAIN
RETAIN
2011-03-01
2013-08-31
Retaining Students through Intelligent Interventions
The RETAIN project aims to extend the Open University’s existing Business Intelligence systems, with a particular focus on improving student retention. RETAIN will make it possible to integrate additional data sources, such as data from VLEs, with existing statistical methods and to further extend the functionality by using predictive modelling to identify students who are at risk of not completing their courses. This will allow for better targeted interventions towards these students. A demonstrator will be developed for visualising retention data, allowing both aggregated and individual student data to be viewed on selected dimensions. This will be usable by tutors and programme managers for determining strategy in both the short term and the long term, at individual, course and faculty levels. The developed tools and methods will be trialled with a view to longer-term uptake and further extensions to Business Intelligence functionality. The predicted benefits are improved retention and progression, leading to financial cost savings for the OU and a better student experience.
Petr Knoth's participation in CORE
CORE
2011-02-01
2020-07-31
The world's largest collection of open access research papers
CORE hosts the world's largest collection of open access research outputs, which are used and referenced by people globally, including researchers, libraries, software developers, funders and many more. CORE delivers a number of key measurable benefits to institutions, repositories, and researchers through its services. The value of CORE is not only provided by its services, but mostly by helping others in the delivery of their use cases. This makes CORE an enabling infrastructure, allowing for text mining, business intelligence, compliance monitoring and research analytics.
Benefits
The CORE services:
* provide real-time machine access to metadata and full texts of research papers in CORE
* help users download CORE data and run processes on their own infrastructure, access data across all of our data providers, and prototype new methods for data analysis and text mining
* recommend papers to read based on users' interests; support users in discovering articles of interest from across the network of open access repositories
* increase the visibility of content in open access repositories and journals
* assist users in finding freely accessible copies of research papers that are often behind a paywall
* provide an online interface offering valuable technical information and statistics to content providers
We aim to:
* Support the right of citizens to access the results of research towards which they contributed by paying taxes
* Provide support to both content consumers and content providers by working collaboratively with them
* Contribute to a cultural change by promoting open access, a fast-growing movement for good
* Make use of artificial intelligence and machine learning techniques to enrich and organise research content and support users in discovering knowledge of their interest
Petr Knoth's participation in ServiceCORE
ServiceCORE
2011-11-01
2012-07-31
Services for Connected Repositories
The ServiceCORE project aims to develop a new nation-wide aggregation service that will improve the discovery of research publications stored across British Open Access repositories. The ServiceCORE project will extend the solution provided by the CORE system, developed in the first stage of the Resource Discovery programme. CORE is a pilot system that harvests both content and metadata from British repositories and makes them accessible through three applications - a Web portal, a Mobile application and a Plugin for institutional repositories. The ServiceCORE project will extend this system with:
(a) a new Web Service layer working on top of the CORE Linked Data repository, providing programmable access to both content and metadata,
(b) an enhanced related resource discovery system based on text-mining,
(c) a pilot tool for automatic subject-based classification of content using text categorisation techniques.
ServiceCORE will also increase the CORE repository coverage to at least 80% of British OAI-PMH compliant repositories and will improve the policies for content updating.
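As a rough illustration of what harvesting from an OAI-PMH compliant repository involves, the sketch below builds the request URLs for the protocol's standard `ListRecords` verb. The verb and argument names (`metadataPrefix`, `set`, `from`, `resumptionToken`) come from the OAI-PMH specification. The helper names and the repository base URL are hypothetical, and this is not a description of how CORE itself harvests.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build the first ListRecords request for a selective harvest."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    if from_date:
        # Only records created or changed since this date (incremental update).
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


def resume_url(base_url, token):
    """Build a follow-up request; resumptionToken must be the only other argument."""
    params = {"verb": "ListRecords", "resumptionToken": token}
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint, used for illustration only.
BASE = "https://repository.example.org/oai"
first_request = list_records_url(BASE, from_date="2012-01-01")
next_request = resume_url(BASE, "abc123")
```

A harvester would fetch `first_request`, parse the XML response, and keep issuing `resume_url` requests until the repository returns no further resumption token.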
Petr Knoth's participation in DiggiCORE
DiggiCORE
2012-01-01
2014-03-31
Digging into Connected Repositories
The goal of DiggiCORE is to analyse a vast set of research publications from the Open Access domain using natural language processing and social network analysis methods to identify patterns in the behaviour of research communities, to recognise trends in research disciplines, to learn new insights about the citation behaviours of researchers etc.
Petr Knoth's participation in FOSTER
FOSTER
2014-02-01
2016-01-31
Facilitate Open Science Training for European Research
FOSTER aims to support different stakeholders, especially young researchers, in adopting open access in the context of the European Research Area (ERA) and in complying with the open access policies and rules of participation set out for Horizon 2020 (H2020).
FOSTER will establish a European-wide training programme on open access and open data, consolidating training activities at downstream level and reaching diverse disciplinary communities and countries in the ERA.
The training programme will include different approaches and delivery options: e-learning, blended learning, self-learning, dissemination of training materials/contents, helpdesk, face-to-face training (especially training-the-trainers), summer schools, seminars, etc.
OBJECTIVES
- Support different stakeholders, especially young researchers, in adopting open access in the context of the European Research Area (ERA) and in complying with the open access policies and rules of participation set out for Horizon 2020;
- Integrate open access principles and practice in the current research workflow by targeting the young researcher training environment;
- Strengthen the institutional training capacity to foster compliance with the open access policies of the ERA and Horizon 2020 (beyond the FOSTER project);
- Facilitate the adoption, reinforcement and implementation of open access policies from other European funders, in line with the EC's recommendation.
MAIN ACTIVITIES
1. Identifying already existing contents that can be reused in the context of the training activities, repackaging and reformatting them for use within FOSTER, and developing, creating or enhancing contents if and where they are needed.
2. Creation of the FOSTER Portal to support e-learning, blended learning, self-learning, dissemination of training materials/contents and Helpdesk.
3. Delivery of face-to-face training, especially training trainers/multipliers that can carry on further training and dissemination activities, within their institutions, countries and/or disciplinary communities.
Project number: 612425
Start Date: 01/02/2014
Duration: 24 months
Funding from the EC: €1,499,860.00
Petr Knoth's participation in Europeana Cloud
Europeana Cloud
2013-02-01
2016-02-01
eCloud
Europeana Cloud is a Best Practice Network, submitted under Objective 2.1.a and coordinated by the Europeana Foundation, designed to establish a cloud-based system for Europeana and its aggregators. Europeana Cloud will provide new content, new metadata, a new linked storage system, new tools and services for researchers and a new platform - Europeana Research. Content providers and aggregators, across the European information landscape, urgently need a cheaper, more sustainable infrastructure that is capable of storing both metadata and content. Researchers require a digital space where they can undertake innovative exploration and analysis of Europe's digitised content. Europeana needs to get closer to the target of 30 million items by 2015. Europeana Cloud meets these needs.
The key objectives of Europeana Cloud are:
1. To provide access, at Europeana, to 1.1m new metadata records and 5m research focussed items from across European Universities, libraries, data centres and publishers;
2. To create a cloud based infrastructure capable of delivering cost-efficient content and metadata storage for stakeholders across Europe;
3. To understand and incorporate the legal, strategic and economic issues of a cloud-based system for content for cultural heritage institutions and domain aggregators;
4. To achieve a broad consensus among European content aggregators and research networks on the advantages of a cloud based solution;
5. To develop a digital platform, named Europeana Research, to discover and use Europeana research content;
6. Via this cloud to provide tools and services for researchers that permit innovative research that exploits digitised content in Europeana.
This is a vital project for the Europeana network of content providers and aggregators, moving to an infrastructure that can deal not just with descriptive metadata but actual digitised content as well.
Petr Knoth's participation in UK Aggregation
UK Aggregation
2013-10-01
2014-02-28
Developing the UK Aggregator of Open Access metadata and content from repository systems
UK Aggregation will provide technology for aggregating metadata and content from UK institutional repositories to be used as a component of the Jisc Repository Shared Services Infrastructure. The resulting aggregation will be used as a building block to satisfy various use cases, such as (a) the provision of a centrally managed cache of metadata and content for search engine optimisation, (b) the delivery of source data for text-mining from open access papers, (c) the monitoring of metadata (OpenAIRE, RIOXX) or open access policy compliance (HEFCE, RCUK) or (d) the analysis of growth and usage of repository data. In technical terms, the project will build on the existing CORE aggregator, which is already largely in a service-ready state. Therefore, the project will largely focus on integration and interoperability of the aggregation service with relevant services maintained by different stakeholders. It will consult these stakeholders and propose an integration plan that will be realised in the next stage (after the end of this project). Finally, the project team will, based on consultations with Jisc, decide on the economic and governance structure under which the service will operate in the future.
Petr Knoth's participation in OARR
OARR
2013-10-01
2014-08-31
Open Access Repository Registry
The OARR project will provide a data-driven open access repository registry infrastructure which will allow services to share or reuse underlying data. The project will take what we have learned from building and maintaining OpenDOAR to the next level to provide an advanced, data-driven infrastructure which will maximise the potential for use with 3rd party services such as aggregators, cross-search tools, multiple-deposit interfaces, etc. by making available authoritative and quality controlled data through a RESTful API.
Petr Knoth's participation in OpenMinTeD
OpenMinTeD
2016-06-01
2018-05-31
Open Mining Infrastructure for Text & Data
OpenMinTeD sets out to create an open, service-oriented e-Infrastructure for Text and Data Mining (TDM) of scientific and scholarly content. Researchers can collaboratively create, discover, share and re-use knowledge from a wide range of text-based scientific sources in a seamless way.
Petr Knoth's participation in FIT4RRI
FIT4RRI
2017-05-01
2020-04-30
Fostering Improved Training Tools for Responsible Research and Innovation
Bridging the gap between RRI and Open Science to manage the rapid transformation processes affecting science requires a critical mass of experts that is available at the European level. Acknowledged experts from both universities and institutes have multi-disciplinary expertise.
Petr Knoth's participation in FOSTER Plus
FOSTER Plus
2017-05-01
2019-04-30
The FOSTER Plus project focuses on promoting the practical implementation of Open Science, with activities targeting academic staff and young scientists.
FOSTER Plus (Fostering the practical implementation of Open Science in Horizon 2020 and beyond) is a 2-year, EU-funded project, carried out by 11 partners across 6 countries. The primary aim is to contribute to a real and lasting shift in the behaviour of European researchers to ensure that Open Science (OS) becomes the norm.
Research communities, research performing institutions, and research funders have each recognised that OS skills are increasingly essential for researchers to undertake responsible research and innovation. While there is increasing agreement around the need to improve OS skills amongst all stakeholders, the adoption of OS approaches has been quite limited to date. General awareness of OS approaches has indeed improved among EU researchers. However, there is still a lack of practical guidance and training to help researchers learn how to open up their research within a particular domain or research environment. For this reason, FOSTER Plus places specific emphasis on creating discipline-specific guidance and is partnering with expert organisations representing the scientific areas of life science, social science and humanities.
FOSTER Plus will enhance existing materials and co-produce new training content. The resources will be discipline-specific, and their practical and tangible outcomes can be applied directly in researchers’ daily practice. The training activities will be addressed to all relevant stakeholders in the European Research Area, with a focus on young scientists, academic staff and policy makers. A strong train-the-trainer approach and a network of open science trainers acting as ambassadors will help to reach a wide audience.
Petr Knoth's participation in ON-MERRIT
ON-MERRIT
2019-10-01
2022-04-30
Observing and Negating Matthew Effects in Responsible Research and Innovation Transition
ON-MERRIT targets an equitable scientific system that rewards based on merit rather than the "Matthew Effect" of cumulative advantage. The project aims to analyse the role of the Matthew Effect in Open Science/RRI, and to identify and test new and more equitable OS/RRI indicators. Responsible Research and Innovation (RRI), including elements like Open Science and Gender Equality, promises to fundamentally transform scholarship to bring greater transparency and participation to research processes, and increase the impact of outputs. Yet just making processes open will not per se drive re-use or participation unless also accompanied by the capacity (in terms of knowledge, skills, motivation and technological readiness) to do so. ON-MERRIT will hence investigate how existing inequalities drive outcomes in the uptake of Open Science and Responsible Research and Innovation across academia, industry and policy-making. Once this evidence has been gathered, alternative policy proposals to counteract any negative effects will be tested through modelling, and final recommendations made to policy-makers, funders and institutions.
Petr Knoth's participation in FOSTER (fosteropenscience.eu)
FOSTER (fosteropenscience.eu)
2014-02-01
FOSTER was a coordination initiative that promoted the integration of open access principles and practice in the current research workflow.
FOSTER's Objectives:
- support a culture change, whereby the practical aspects of Open Science are fully implemented and ultimately rewarded, by providing an advanced-level, outcome-oriented training programme based on courses and activities for which participants can attain digital badges;
- consolidate and sustain a training support network comprised of Open Science ambassadors from a range of research performing organisations and research infrastructures;
- strengthen the training capacity by addressing the current skills and content gaps, both at community/discipline and institutional levels, on the practical implementation of Open Science.
Petr Knoth's participation in Frictionless Data Exchange Across Research Data, Software and Scientific Paper Repositories
Frictionless Data Exchange Across Research Data, Software and Scientific Paper Repositories
2018-02-01
This project was funded under the EOSC Pilot H2020 project as a demonstrator.
The work produced the following outputs:
- Implementation and a deployed online version of the demonstrator service exhibiting fast and highly scalable exchange of metadata and content across repositories storing research data, papers and scientific software.
- A piece of evidence and argument for modernising existing communication mechanisms routinely used by repositories using our solution. This was delivered in the form of an empirical evaluation, comprising the source code, the experimental data and a formal research publication describing the experiment.
The impact of this work was:
- A clear path to go beyond the current state-of-the-art in efficient and effective information exchange between EOSC data providers and services.
- Scalable client/server implementation(s) of the ResourceSync protocol for easy adoption at the side of data providers.
- Raised awareness of existing problems and the offered solution
Petr Knoth's participation in VocTeach
VocTeach
Making it quick and easy to locate quality vocational teaching resources.
The VocTeach platform enables vocational educators to share information about curriculum resources for their blended teaching. Initially, VocTeach supports educators by curating and recommending the resources they need to teach English and Maths Functional Skills. More skillsets will be added in due course. The project is funded by UFI and The Open University.