Mining Inter-Relationships in Online Scientific Articles and its Visualization: Natural Language Processing for Systems Biology Modeling

Authors

  • Nidheesh Melethadathil Amrita School of Biotechnology, Amrita Vishwa Vidyapeetham, Clappana PO, Kollam, Kerala.
  • Jaap Heringa Vrije Universiteit, De Boelelaan 1081A, 1081 HV Amsterdam.
  • Bipin Nair Amrita School of Biotechnology, Amrita Vishwa Vidyapeetham, Clappana PO, Kollam, Kerala.
  • Shyam Diwakar Amrita School of Biotechnology, Amrita Vishwa Vidyapeetham, Clappana PO, Kollam, Kerala. https://orcid.org/0000-0003-1546-0184

DOI:

https://doi.org/10.3991/ijoe.v15i02.9432

Keywords:

online scientific databases, natural language processing, systems biology, automated clustering, visualization, bioNLP

Abstract


With the rapid growth in the number of scientific publications in domains such as neuroscience and medicine, visually interlinking documents in online databases such as PubMed to indicate the context of query results can improve the multi-disciplinary relevance of those results. Translational medicine and systems biology rely on studies relating basic sciences to applications, often spanning multiple disciplinary domains. This paper focuses on the design and development of a new scientific document visualization platform, which allows inferring translational aspects in biosciences within published articles using machine learning and natural language processing (NLP) methods. Using abstracts retrieved from online databases for a user query, the platform extracted relationships between multiple sub-domains within neuroscience. In our current implementation, the document visualization platform employs two clustering algorithms, namely Suffix Tree Clustering (STC) and LINGO. Clustering quality was improved by mapping top-ranked cluster labels derived from the UMLS Metathesaurus using a scoring function. To avoid non-clustered documents, an iterative scheme called auto-clustering was developed; it maps documents left uncategorized during the initial grouping to relevant clusters. The efficacy of this document clustering and visualization platform was evaluated by expert-based validation of clustering results obtained with unique search terms. Compared to normal clustering, auto-clustering demonstrated better efficacy by generating larger numbers of unique and relevant cluster labels. Using this implementation, a Parkinson’s disease systems theory model was developed, and studies based on user queries related to neuroscience and oncology are showcased as applications.
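The auto-clustering idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the iteration works, so the sketch assumes that documents left unassigned after one pass are re-clustered with a relaxed minimum cluster size. The function names (`cluster_by_shared_terms`, `auto_cluster`), the `min_size` parameter, and the toy documents are hypothetical, and a simple shared-term grouping stands in for STC and LINGO.

```python
from collections import defaultdict


def cluster_by_shared_terms(docs, min_size):
    """Toy stand-in for STC/LINGO (assumption): group documents that share a term.

    Returns a mapping of label (the shared term) to the set of document ids
    containing it, keeping only groups with at least `min_size` members.
    """
    term_to_docs = defaultdict(set)
    for doc_id, text in docs.items():
        for term in set(text.lower().split()):
            term_to_docs[term].add(doc_id)
    return {t: ids for t, ids in term_to_docs.items() if len(ids) >= min_size}


def auto_cluster(docs, min_size=3):
    """Iteratively re-cluster unassigned documents (hypothetical scheme).

    After each pass, documents that joined a cluster are removed and the
    remaining ones are clustered again with a smaller minimum cluster size.
    The loop stops when nothing is left or the size threshold drops below 2.
    """
    clusters = {}
    remaining = dict(docs)
    while remaining and min_size >= 2:
        found = cluster_by_shared_terms(remaining, min_size)
        for label, ids in found.items():
            clusters.setdefault(label, set()).update(ids)
        assigned = set().union(*found.values())
        remaining = {d: t for d, t in remaining.items() if d not in assigned}
        min_size -= 1  # relax the threshold for the next pass
    return clusters, set(remaining)


# Toy abstracts (invented for illustration only).
docs = {
    "d1": "dopamine neuron loss",
    "d2": "dopamine receptor signaling",
    "d3": "dopamine pathway model",
    "d4": "cerebellum granule cell",
    "d5": "cerebellum purkinje firing",
    "d6": "tumor angiogenesis",
}
clusters, unassigned = auto_cluster(docs, min_size=3)
```

In this toy run, the first pass groups the three dopamine documents, the second pass (with the relaxed threshold) groups the two cerebellum documents that a single pass would have left uncategorized, and one document stays unassigned. A real pipeline would replace the shared-term grouping with an actual clustering algorithm and a label-scoring step.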

Author Biographies

Nidheesh Melethadathil, Amrita School of Biotechnology, Amrita Vishwa Vidyapeetham, Clappana PO, Kollam, Kerala.

Nidheesh Melethadathil is currently working as Assistant Professor at Amrita Vishwa Vidyapeetham, India. (e-mail: nidheesh@am.amrita.edu)

Jaap Heringa, Vrije Universiteit, De Boelelaan 1081A, 1081 HV Amsterdam.

Jaap Heringa is the Professor of Bioinformatics and the Director of the Centre for Integrative Bioinformatics VU (IBIVU), Vrije Universiteit, Amsterdam, The Netherlands. (e-mail: heringa@few.vu.nl)

Bipin Nair, Amrita School of Biotechnology, Amrita Vishwa Vidyapeetham, Clappana PO, Kollam, Kerala.

Bipin Nair is the Professor and Dean of Faculty of Sciences, Amrita Vishwa Vidyapeetham, India. (e-mail: bipin@amrita.edu)

Shyam Diwakar, Amrita School of Biotechnology, Amrita Vishwa Vidyapeetham, Clappana PO, Kollam, Kerala.

Shyam Diwakar is an Associate Professor and the Lab Director of the Computational Neuroscience Laboratory at the School of Biotechnology, Amrita University, India. He is a Young Faculty Fellow under the Sir Visvesvaraya PhD scheme, Ministry of Electronics and IT, Government of India. He is a Co-investigator of VALUE (Virtual and Accessible Laboratories for Universalizing Education), a virtual labs initiative supported by the Sakshat mission of MHRD, Government of India, and Principal Investigator of other projects funded by the Department of Science and Technology, Government of India. (e-mail: shyam@amrita.edu)

Published

2019-01-31

How to Cite

Melethadathil, N., Heringa, J., Nair, B., & Diwakar, S. (2019). Mining Inter-Relationships in Online Scientific Articles and its Visualization: Natural Language Processing for Systems Biology Modeling. International Journal of Online and Biomedical Engineering (iJOE), 15(02), pp. 39–59. https://doi.org/10.3991/ijoe.v15i02.9432

Section

Papers