FAIR DATA Fund use case: Catalyzing plant natural product discovery with a community-driven knowledge base
Authors: Elena Del Pup (4th edition FAIR Data Fund grantee); Justin J. J. van der Hooft, Marnix H. Medema; Wageningen University & Research. Editor: Iulia Popescu
Plant specialized metabolism is a major source of medicines, novel foods, and natural products. With the rapid growth of multi-omics data and paired plant transcriptomics–metabolomics datasets, there are unprecedented opportunities to uncover the biosynthetic steps that produce these molecules. However, systematic exploration of these data remains difficult because information about plant chemistry, biosynthesis, and experiments is scattered across individual datasets and publications, which makes it hard to use for predictions and AI methods.
In this project, we are building a linked, queryable knowledge base that integrates paired multi-omics data, curated pathway knowledge, and community annotations. To achieve this, we are developing Plant WikiPathways, a queryable knowledge base of experimentally characterized plant pathways, and linking it to repositories across the biosynthesis field. Users will be able to query across experimental and public data to extract promising pathway predictions for further validation. The computational strategies and tools developed in this project will support a communal effort to enable large-scale, reproducible analysis of plant metabolic diversity and natural product biosynthesis, building on linked-data approaches used in the life sciences, such as Wikidata. This open-source knowledge base will serve as a Linked Open Data community resource for collaborative hypothesis generation on plant specialized biosynthesis, prioritization of promising candidates, and participatory annotation, advancing the discovery and characterization of plant natural products.
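To give a sense of what querying linked open data looks like in practice, here is a minimal sketch in Python. It does not use the project's knowledge base, whose query interface is not described here. Instead it assumes the public Wikidata SPARQL endpoint and the Wikidata properties P225 (taxon name), P703 (found in taxon), and P235 (InChIKey). The function names are hypothetical and chosen for illustration only.

```python
from urllib.parse import urlencode

# Assumption: the public Wikidata Query Service, not the project's own endpoint.
WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_compound_query(taxon_name: str, limit: int = 10) -> str:
    """Build a SPARQL query listing compounds recorded as found in a taxon.

    Hypothetical helper for illustration. It relies on the Wikidata
    properties P225 (taxon name), P703 (found in taxon), and P235 (InChIKey).
    """
    # Escape backslashes and double quotes so the name is a valid string literal.
    safe_name = taxon_name.replace("\\", "\\\\").replace('"', '\\"')
    return f"""
SELECT ?compound ?compoundLabel ?inchikey WHERE {{
  ?taxon wdt:P225 "{safe_name}" .
  ?compound wdt:P703 ?taxon ;
            wdt:P235 ?inchikey .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {int(limit)}
""".strip()


def build_request_url(query: str) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    return WIKIDATA_SPARQL_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


if __name__ == "__main__":
    query = build_compound_query("Arabidopsis thaliana", limit=5)
    print(query)
    print(build_request_url(query))
```

The sketch only builds the query and the request URL. Sending the request (for example with `urllib.request`) is left out so that the example runs without network access.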
Why now?
- Data abundance: rapid growth of paired plant transcriptomics–metabolomics datasets unlocks pathway inference at scale.
- Fragmented knowledge: pathway, chemistry, and genomics data live in silos, slowing prediction, validation, and AI-driven discovery.
- Linked open data momentum: community standards (Wikidata/WikiPathways) and SPARQL-queryable graphs make reusable, interoperable science practical.

Figure from F.C. Wolters, E. Del Pup, et al. Pairing omics to decode the diversity of plant specialized metabolism, Current Opinion in Plant Biology 82 (2024) 102657. https://doi.org/10.1016/j.pbi.2024.102657
What we built
To achieve this vision, we developed several building blocks of the knowledge base:
- Updated key repositories of plant biosynthesis;
- Collaborated on a pipeline to process and annotate paired multi-omics data in plants.
Catalyzing a community
During the FAIR Data Fund project, we gathered a community of researchers and hosted the symposium “Knowledge Graphs for Plant and Microbiome Multiomics” (14th October 2025) in Wageningen. With nearly 130 registered participants, the symposium brought together users, data generators, and tool developers in natural product discovery, all sharing the vision of building a linked knowledge base for natural product biosynthesis. Its program aimed to connect international speakers working at the interface of plant biology, computational metabolomics, and semantic data modeling and integration. The central question of the day was: “How can we use linked open data and knowledge graphs to generate hypotheses in natural product biosynthesis?” The recording of the symposium is available on YouTube.
Targeted roadmap session
After the symposium, we hosted a targeted discussion, “Powering a Unified Knowledge Base for Plant Natural Product Discovery”, to establish collaborations, find synergies between research groups and tool developers, and shape the design of the pathway knowledge base. By aligning our work with the natural product research community, we aim to ensure good project governance and long-term sustainability.
Open community hub
The targeted brainstorming resulted in a research community that collects the individual efforts and projects connected to this collaborative vision. It is now publicly hosted on GitHub as the Pathway Linked Open Data Community.


The Symposium “Knowledge Graphs for Plant and Microbiome Multiomics” (14th October 2025) at Wageningen University & Research. Check the LinkedIn wrap-up post.
Building prototypes
The FAIR Data Fund allowed us to hire two Bioinformatics student assistants at Wageningen University (Max Muller and Ariël Komen) to work on essential building blocks of the project.
The grant also allowed us to host the symposium “Knowledge Graphs for Plant and Microbiome Multiomics” in Wageningen and the follow-up targeted brainstorming that established the research community working towards this vision. Additionally, Elena pitched the project at the OPEN and FAIR in Natural and Engineering Sciences (NES) event in Utrecht on 22 May 2025.

Elena Del Pup pitching her project “Catalyzing plant natural product discovery with a community-driven linked knowledge base” at the OPEN and FAIR in Natural and Engineering Sciences (NES) event in Utrecht on 22 May 2025.