An original template solution for FAIR scientific text mining

Niels A. Zondervan, Frazen Tolentino-Zondervan

Research output: Contribution to journal › Article › Academic › peer-review


Abstract

This method paper presents a template solution for text mining of scientific literature using the R tm package. The literature to be analyzed can be collected manually, or automatically using the code provided with this paper. Once the literature is collected, text mining proceeds in three steps:

• loading and cleaning of text from articles,
• processing, statistical analysis, and clustering, and
• presentation of results using generalized and tailor-made visualizations.
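The three steps above can be sketched in a few lines. The example below is a minimal illustration in plain Python, not the authors' R code and not the tm package API. The stop-word list, the sample documents, and the function names `clean`, `term_matrix`, and `cosine` are all assumptions made for this sketch, and pairwise cosine similarity stands in for the clustering step.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption); real analyses use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "on", "with"}


def clean(text):
    """Step 1: lowercase, keep alphabetic tokens only, remove stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def term_matrix(docs):
    """Step 2a: build a document-term count matrix over a sorted vocabulary."""
    counts = [Counter(clean(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    return vocab, [[c[t] for t in vocab] for c in counts]


def cosine(u, v):
    """Step 2b: cosine similarity, a common basis for clustering documents."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical sample documents, invented for this sketch.
docs = [
    "Text mining of scientific literature.",
    "Mining text: clustering scientific articles!",
    "Fish stocks in the North Sea.",
]

vocab, matrix = term_matrix(docs)
sim_01 = cosine(matrix[0], matrix[1])  # documents sharing several terms
sim_02 = cosine(matrix[0], matrix[2])  # documents sharing no terms

# Step 3: a plain-text presentation of the results.
print("vocabulary:", vocab)
print("similarity doc0-doc1:", round(sim_01, 2))
print("similarity doc0-doc2:", round(sim_02, 2))
```

In this sketch the first two documents score as more similar to each other than to the third, which is the kind of signal a clustering step would group on.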

The text mining steps can be applied to a single group of documents, to multiple groups, or to groups that form a time series.
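As a rough illustration of the time-series case, documents can be grouped by period before terms are counted. This is a hedged sketch in plain Python rather than the tm-based code that accompanies the paper. The `(year, text)` records and the function name `top_terms_by_period` are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical (year, text) records, invented for this sketch.
records = [
    (2019, "Text mining of policy documents"),
    (2019, "Policy analysis with text mining"),
    (2021, "Clustering of scientific literature"),
]


def top_terms_by_period(records, n=2):
    """Count terms per year and return the n most frequent terms for each year."""
    grouped = defaultdict(Counter)
    for year, text in records:
        grouped[year].update(re.findall(r"[a-z]+", text.lower()))
    return {year: counts.most_common(n) for year, counts in sorted(grouped.items())}


result = top_terms_by_period(records)
for year, terms in result.items():
    print(year, terms)
```

Treating each period as its own group lets the same counting step run unchanged on one group, several groups, or a sequence of periods.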

References are provided to three published, peer-reviewed articles that use the presented text mining methodology. The main advantages of our method are: (1) its suitability for both research and educational purposes, (2) its compliance with the Findable, Accessible, Interoperable, and Reusable (FAIR) principles, and (3) the availability of code and example data on GitHub under the open-source Apache License 2.0.

Original language: English
Article number: 102145
Pages (from-to): 1-5
Journal: MethodsX
Volume: 10
Publication status: Published - 2023
