I am a PhD Candidate in Biomedical Informatics at Stanford University School of Medicine. I graduated from the Indian Institute of Technology, Kharagpur under the Dual Degree Programme - Bachelor (Honours) and Master of Technology, in Biotechnology and Biochemical Engineering. During my time at IIT, I qualified for the prestigious Google Summer of Code Program for three successive years. I have contributed to Drupal, an open-source content management platform, and to the Genome Informatics - Reactome Project, a knowledgebase of biological pathways.
I am interested in research at the intersection of Biosciences, Big Data and the Web. After graduating, I joined the Digital Enterprise Research Institute (DERI), Ireland, in its Health Care and Life Sciences Unit. I was responsible for developing user-driven platforms that facilitate intuitive data exploration for the EU FP7 GRANATUM Project, the Linked TCGA Project and Ireland's Open Data Initiative (Data.gov.ie). I was part of the team that won the Best Paper Award at CSHALS 2014 and the Semantic Web Challenge Award (Big Data Prize) at ISWC 2013.
Honors & Awards
Student Best Resource Paper Award, 16th International Semantic Web Conference (October 2017)
NSF Travel Award, 16th International Semantic Web Conference (October 2017)
Best Poster Award (Graphic Design), Stanford Biomedical Informatics Program (September 2017)
NIH/NLM Travel Award, 21st Pacific Symposium on Biocomputing (January 2016)
Best Project Award (Graphic Design), Stanford Biomedical Informatics Program (September 2015)
Best Paper Award, 7th Conference on Semantics in Healthcare and Life Sciences (February 2014)
Semantic Web Challenge Award (Big Data Prize), 12th International Semantic Web Conference (October 2013)
Best Project Award, 10th Summer School on Ontology Engineering and the Semantic Web (July 2013)
Best Poster Award, 10th Summer School on Ontology Engineering and the Semantic Web (July 2013)
Google Summer of Code Student (Genome Informatics - Reactome), Google Inc. (August 2012)
Honourable Mention in Technology, Indian Institute of Technology (IIT), Kharagpur (April 2012)
Best Outgoing Student (Technology), Meghnad Saha Hall of Residence, IIT Kharagpur (April 2012)
Google Summer of Code Student (Genome Informatics - Reactome), Google Inc. (August 2011)
Google Summer of Code Student (Drupal), Google Inc. (August 2010)
Xavierite Super Award, St. Xavier’s High School, Ahmedabad (February 2007)
BiOnIC: A Catalog of User Interactions with Biomedical Ontologies
International Semantic Web Conference
View details for DOI 10.1007/978-3-319-68204-4_13
PhLeGrA: Graph Analytics in Pharmacology over the Web of Life Sciences Linked Open Data
International Conference on World Wide Web
View details for DOI 10.1145/3038912.3052692
Analyzing user interactions with biomedical ontologies: A visual perspective
JOURNAL OF WEB SEMANTICS
2018; 49: 16–30
Biomedical ontologies are large: Several ontologies in the BioPortal repository contain thousands or even hundreds of thousands of entities. The development and maintenance of such large ontologies is difficult. To support ontology authors and repository developers in their work, it is crucial to improve our understanding of how these ontologies are explored, queried, reused, and used in downstream applications by biomedical researchers. We present an exploratory empirical analysis of user activities in the BioPortal ontology repository by analyzing BioPortal interaction logs across different access modes over several years. We investigate how users of BioPortal query and search for ontologies and their classes, how they explore the ontologies, and how they reuse classes from different ontologies. Additionally, through three real-world scenarios, we not only analyze the usage of ontologies for annotation tasks but also compare it to the browsing and querying behaviors of BioPortal users. For our investigation, we use several different visualization techniques. To inspect large amounts of interaction, reuse, and real-world usage data at a glance, we make use of and extend PolygOnto, a visualization method that has been successfully used to analyze reuse of ontologies in previous work. Our results show that exploration, query, reuse, and actual usage behaviors rarely align, suggesting that different users tend to explore, query and use different parts of an ontology. Finally, we highlight and discuss differences and commonalities among users of BioPortal.
View details for DOI 10.1016/j.websem.2017.12.002
View details for Web of Science ID 000428090300002
View details for PubMedID 29657560
View details for PubMedCentralID PMC5895104
Mechanism-based Pharmacovigilance over the Life Sciences Linked Open Data Cloud.
AMIA ... Annual Symposium proceedings. AMIA Symposium
2017; 2017: 1014–23
Adverse drug reactions (ADR) result in significant morbidity and mortality in patients, and a substantial proportion of these ADRs are caused by drug-drug interactions (DDIs). Pharmacovigilance methods are used to detect unanticipated DDIs and ADRs by mining Spontaneous Reporting Systems, such as the US FDA Adverse Event Reporting System (FAERS). However, these methods do not provide mechanistic explanations for the discovered drug-ADR associations in a systematic manner. In this paper, we present a systems pharmacology-based approach to perform mechanism-based pharmacovigilance. We integrate data and knowledge from four different sources using Semantic Web Technologies and Linked Data principles to generate a systems network. We present a network-based Apriori algorithm for association mining in FAERS reports. We evaluate our method against existing pharmacovigilance methods for three different validation sets. Our method has AUROC statistics of 0.7-0.8, similar to current methods, and event-specific thresholds generate AUROC statistics greater than 0.75 for certain ADRs. Finally, we discuss the benefits of using Semantic Web technologies to attain the objectives for mechanism-based pharmacovigilance.
View details for PubMedID 29854169
PRISM: A DATA-DRIVEN PLATFORM FOR MONITORING MENTAL HEALTH.
Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing
2016; 21: 333-344
Neuropsychiatric disorders are the leading cause of disability worldwide and there is no gold standard currently available for the measurement of mental health. This issue is exacerbated by the fact that the information physicians use to diagnose these disorders is episodic and often subjective. Current methods to monitor mental health involve the use of subjective DSM-5 guidelines, and advances in EEG and video monitoring technologies have not been widely adopted due to invasiveness and inconvenience. Wearable technologies have surfaced as a ubiquitous and unobtrusive method for providing continuous, quantitative data about a patient. Here, we introduce PRISM (Passive, Real-time Information for Sensing Mental Health). This platform integrates motion, light and heart rate data from a smart watch application with user interactions and text entries from a web application. We have demonstrated a proof of concept by collecting preliminary data through a pilot study of 13 subjects. We have engineered appropriate features and applied both unsupervised and supervised learning to develop models that are predictive of user-reported ratings of their emotional state, demonstrating that the data has the potential to be useful for evaluating mental health. This platform could allow patients and clinicians to leverage continuous streams of passive data for early and accurate diagnosis as well as constant monitoring of patients suffering from mental disorders.
View details for PubMedID 26776198
- A Systematic Analysis on Term Reuse and Term Overlap across Biomedical Ontologies Semantic Web - Interoperability, Usability, Applicability 2016
An Ebola virus-centered knowledge base.
Database : the journal of biological databases and curation
Ebola virus (EBOV), of the family Filoviridae viruses, is a NIAID category A, lethal human pathogen. It is responsible for causing Ebola virus disease (EVD) that is a severe hemorrhagic fever and has a cumulative death rate of 41% in the ongoing epidemic in West Africa. There is an ever-increasing need to consolidate and make available all the knowledge that we possess on EBOV, even if it is conflicting or incomplete. This would enable biomedical researchers to understand the molecular mechanisms underlying this disease and help develop tools for efficient diagnosis and effective treatment. In this article, we present our approach for the development of an Ebola virus-centered Knowledge Base (Ebola-KB) using Linked Data and Semantic Web Technologies. We retrieve and aggregate knowledge from several open data sources, web services and biomedical ontologies. This knowledge is transformed to RDF, linked to the Bio2RDF datasets and made available through a SPARQL 1.1 Endpoint. Ebola-KB can also be explored using an interactive Dashboard visualizing the different perspectives of this integrated knowledge. We showcase how different competency questions, asked by domain users researching the druggability of EBOV, can be formulated as SPARQL Queries or answered using the Ebola-KB Dashboard.
View details for DOI 10.1093/database/bav049
View details for PubMedID 26055098
View details for PubMedCentralID PMC4460400
- Investigating Term Reuse and Overlap in Biomedical Ontologies 6th International Conference on Biomedical Ontology (ICBO) 2015
ReVeaLD: A user-driven domain-specific interactive search platform for biomedical research
JOURNAL OF BIOMEDICAL INFORMATICS
2014; 47: 112-130
View details for DOI 10.1016/j.jbi.2013.10.001
View details for Web of Science ID 000333004500012
View details for PubMedID 24135450
- GenomeSnip: Fragmenting the Genomic Wheel to augment discovery in cancer research 7th Conference on Semantics in Healthcare and Life Sciences (CSHALS) 2014
Functional Characterization of Two Structurally Novel Diacylglycerol Acyltransferase2 Isozymes Responsible for the Enhanced Production of Stearate-Rich Storage Lipid in Candida tropicalis SY005.
2014; 9 (4)
Diacylglycerol acyltransferase (DGAT) activity is an essential enzymatic step in the formation of neutral lipid i.e., triacylglycerol in all living cells capable of accumulating storage lipid. Previously, we characterized an oleaginous yeast Candida tropicalis SY005 that yields storage lipid up to 58% under a specific nitrogen-stress condition, when the DGAT-specific transcript is drastically up-regulated. Here we report the identification, differential expression and function of two DGAT2 gene homologues--CtDGAT2a and CtDGAT2b of this C. tropicalis. Two protein isoforms are unique with respect to the presence of five additional stretches of amino acids, besides possessing three highly conserved motifs known in other reported DGAT2 enzymes. Moreover, the CtDGAT2a and CtDGAT2b are characteristically different in amino acid sequences and predicted protein structures. The CtDGAT2b isozyme was found to be catalytically 12.5% more efficient than CtDGAT2a for triacylglycerol production in a heterologous yeast system i.e., Saccharomyces cerevisiae quadruple mutant strain H1246 that is inherently defective in neutral lipid biosynthesis. The CtDGAT2b activity rescued the growth of transformed S. cerevisiae mutant cells, which are usually non-viable in the medium containing free fatty acids by incorporating them into triacylglycerol, and displayed preferential specificity towards saturated acyl species as substrate. Furthermore, we document that the efficiency of triacylglycerol production by CtDGAT2b is differentially affected by deletion, insertion or replacement of amino acids in five regions exclusively present in two CtDGAT2 isozymes. Taken together, our study characterizes two structurally novel DGAT2 isozymes, which are accountable for the enhanced production of storage lipid enriched with saturated fatty acids inherently in C. tropicalis SY005 strain as well as in transformed S. cerevisiae neutral lipid-deficient mutant cells. 
These two genes certainly will be useful for further investigation on the novel structure-function relationship of DGAT repertoire, and also in metabolic engineering for the enhanced production of lipid feedstock in other organisms.
View details for DOI 10.1371/journal.pone.0094472
View details for PubMedID 24732323
View details for PubMedCentralID PMC3986092
- LinkedPPI: Enabling Intuitive, Integrative Protein-Protein Interaction Discovery 4th Workshop on Linked Science co-located with 13th International Semantic Web Conference 2014: 48–59
- A Roadmap for navigating the Life Sciences Linked Open Data Cloud 4th Joint International Semantic Technology (JIST) Conference 2014
- Open Data Ireland: Data Audit Report Open Data Ireland Support Project 2014
Linked Biomedical Dataspace: Lessons Learned Integrating Data for Drug Discovery
13th International Semantic Web Conference (ISWC)
View details for DOI 10.1007/978-3-319-11964-9_8
Big linked cancer data: Integrating linked TCGA and PubMed
Web Semantics: Science, Services and Agents on the World Wide Web
View details for DOI 10.1016/j.websem.2014.07.004
The Reactome pathway knowledgebase.
Nucleic acids research
2014; 42 (Database issue): D472-7
Reactome (http://www.reactome.org) is a manually curated open-source open-data resource of human pathways and reactions. The current version 46 describes 7088 human proteins (34% of the predicted human proteome), participating in 6744 reactions based on data extracted from 15 107 research publications with PubMed links. The Reactome Web site and analysis tool set have been completely redesigned to increase speed, flexibility and user friendliness. The data model has been extended to support annotation of disease processes due to infectious agents and to mutation.
View details for DOI 10.1093/nar/gkt1102
View details for PubMedID 24243840
View details for PubMedCentralID PMC3965010
Identification of an Extracellular Antifungal Protein from the Endophytic Fungus Colletotrichum sp DM06
PROTEIN AND PEPTIDE LETTERS
2013; 20 (2): 173-179
An extracellular antifungal protein of 28 kDa (exAFP-C28) was identified from an endophytic fungus Colletotrichum sp. DM-06. After purification, the MIC value of exAFP-C28 against Candida albicans, a well-known human pathogenic fungus was found to be 32 μg/mL that unaffected the human red blood cells. The antifungal activity associated with exAFP-C28 was manifested by the increased membrane permeability of C. albicans cells followed by disruption. Proteomics and bioinformatics analyses revealed that several peptide fragments of exAFP-C28 have identity with the bacterial 50S ribosomal protein L10, and a stretch of 55 amino acids of two peptide fragments corresponding to the N-terminus of L10 protein is capable of forming amphipathic helix required for membrane penetration. Taken together, our results suggest that the exAFP-C28 protein from Colletotrichum sp. DM-06 is a promising therapeutic agent in controlling candidiasis disease in animals including humans.
View details for Web of Science ID 000316859400008
View details for PubMedID 22894154
- Fostering Serendipity through Big Linked Data 12th International Semantic Web Conference (ISWC) 2013