FAIR Data Fund use case: Mining figures from scientific publications

We interview the author, Artur Schweidtmann, about his project funded by the 4TU.ResearchData FAIR Data Fund.

What is your project about?

The goal of our research is to make figures from peer-reviewed open-access scientific publications FAIR. Figures can be found in almost all scientific publications and contain a lot of information. For example, graphs commonly show relevant correlations between measurements. Another example is engineering diagrams, which show the connectivity of equipment in plants. However, this information is difficult to find. We automatically extract figures from scientific publications and classify them. In the future, we will make these accessible to the scientific community via a FAIR data platform. This will allow scientists to search directly for relevant figures. In the end, it can be thought of as a clever search engine for scientific images. Moreover, this has great potential for training machine learning models in different domains.
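To make the idea of searching directly for figures concrete, here is a minimal sketch of what a searchable figure record could look like. It is not the project's actual software or data model: the `FigureRecord` fields, the `search` helper, the class labels, and the placeholder DOIs are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FigureRecord:
    """Hypothetical metadata for one mined figure (all field names are assumptions)."""
    figure_id: str
    source_doi: str    # publication the figure was extracted from
    license: str       # reuse terms of the open-access source
    figure_class: str  # label assigned by a classifier, e.g. "flowsheet"
    caption: str


def search(records, figure_class=None, keyword=None):
    """Return records matching an optional class label and caption keyword."""
    hits = []
    for rec in records:
        if figure_class is not None and rec.figure_class != figure_class:
            continue
        if keyword is not None and keyword.lower() not in rec.caption.lower():
            continue
        hits.append(rec)
    return hits


# Made-up example records; the DOIs are placeholders, not real publications.
records = [
    FigureRecord("fig-001", "10.0000/example.1", "CC-BY-4.0", "flowsheet",
                 "Process flow diagram of a distillation column"),
    FigureRecord("fig-002", "10.0000/example.2", "CC-BY-4.0", "graph",
                 "Correlation between temperature and yield"),
    FigureRecord("fig-003", "10.0000/example.3", "CC-BY-4.0", "flowsheet",
                 "Flowsheet of a methanol synthesis plant"),
]

flowsheets = search(records, figure_class="flowsheet")
distillation = search(records, figure_class="flowsheet", keyword="distillation")
print([r.figure_id for r in flowsheets])
print([r.figure_id for r in distillation])
```

Keeping the source DOI and license next to each figure is one simple way to keep a mined image findable and reusable; a real platform would likely need richer metadata and a proper search index.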

What are some key results that you can share?

We have developed software that can automatically extract images and classify them. This work increased the robustness of the approach and also contributed to a publication [1]. In this publication, we automatically identified over 1,000 figures that show specific (chemical) engineering diagrams called “flowsheets”. In the future, we will extend this approach to different types of figures and create an open platform where everyone can access the images. Moreover, we envision training machine learning algorithms on the mined flowsheet data to ultimately support the design of sustainable chemical processes.

[1] Balhorn, L. S., Gao, Q., Goldstein, D., & Schweidtmann, A. M. (2022). Flowsheet recognition using deep convolutional neural networks. In Computer Aided Chemical Engineering (Vol. 49, pp. 1567–1572). Elsevier.

How has the FAIR Data Fund helped you with your project? What is the added value?

The FAIR Data Fund has helped us significantly by co-financing a student assistant and an external software developer. This helped us to further develop the algorithms and improve the robustness and quality of our code.
