Identifying phenotypic signatures of neuropsychiatric disorders from electronic medical records

J Am Med Inform Assoc. 2013 Dec;20(e2):e297-305. doi: 10.1136/amiajnl-2013-001933. Epub 2013 Aug 16.

Abstract

Objective: Mental illness is the leading cause of disability in the USA, but boundaries between different mental illnesses are notoriously difficult to define. Electronic medical records (EMRs) have recently emerged as a powerful new source of information for defining the phenotypic signatures of specific diseases. We investigated how EMR-based text mining and statistical analysis could elucidate the phenotypic boundaries of three important neuropsychiatric illnesses: autism, bipolar disorder, and schizophrenia.

Methods: We analyzed the medical records of over 7000 patients at two facilities. An automated text-processing pipeline annotated the clinical notes with Unified Medical Language System (UMLS) codes, and we then searched for enriched codes, and associations among codes, that were representative of the three disorders. We used dimensionality-reduction techniques on individual patient records to understand individual-level phenotypic variation within each disorder, as well as the degree of overlap among disorders.
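
As an illustration of what searching for enriched codes can look like, the following is a minimal sketch and not the authors' pipeline. The abstract does not name the statistical test used; a one-sided hypergeometric test is assumed here. The function names, the placeholder code labels (`CODE_A`, `CODE_B`), and the toy records are hypothetical, and a real analysis would also correct for multiple testing.

```python
from math import comb

def hypergeom_enrichment_p(k, n, K, N):
    """One-sided p-value P(X >= k) for a hypergeometric draw.

    N: total patients, K: patients whose notes carry the code,
    n: patients with the target diagnosis, k: of those, how many carry the code.
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)

def enriched_codes(records, target, alpha=0.05):
    """Return {code: p} for codes over-represented in the target diagnosis.

    records is a list of (diagnosis, set_of_codes) pairs (hypothetical layout).
    No multiple-testing correction is applied in this sketch.
    """
    N = len(records)
    cohort = [codes for dx, codes in records if dx == target]
    n = len(cohort)
    all_codes = set().union(*(codes for _, codes in records))
    results = {}
    for code in all_codes:
        K = sum(code in codes for _, codes in records)
        k = sum(code in codes for codes in cohort)
        p = hypergeom_enrichment_p(k, n, K, N)
        if p < alpha:
            results[code] = p
    return results

# Toy data with placeholder labels; these are not real UMLS identifiers.
records = (
    [("autism", {"CODE_A", "CODE_B"})] * 4
    + [("schizophrenia", {"CODE_B"})] * 2
    + [("bipolar disorder", {"CODE_B"})] * 2
)

hits = enriched_codes(records, "autism")
print(sorted(hits))
```

In this toy data `CODE_A` appears only in the autism notes and is flagged, while `CODE_B` appears in every record and is not. The same counts could feed an odds ratio or a Fisher exact test; the choice of test here is an assumption made for illustration.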

Results: We demonstrate that automated EMR mining can be used to extract relevant drugs and phenotypes associated with neuropsychiatric disorders and characteristic patterns of associations among them. Patient-level analyses suggest a clear separation between autism and the other disorders, while revealing significant overlap between schizophrenia and bipolar disorder. They also enable localization of individual patients within the phenotypic 'landscape' of each disorder.

Conclusions: Because EMRs reflect the realities of patient care rather than idealized conceptualizations of disease states, we argue that automated EMR mining can help define the boundaries between different mental illnesses, facilitate cohort building for clinical and genomic studies, and reveal how clear expert-defined disease boundaries are in practice.

Keywords: Autism; Bipolar Disorder; Data Mining; Electronic Medical Records; Network Analysis; Schizophrenia.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Aged, 80 and over
  • Autistic Disorder / diagnosis*
  • Autistic Disorder / genetics
  • Bipolar Disorder / diagnosis*
  • Bipolar Disorder / genetics
  • Child
  • Child, Preschool
  • Data Mining*
  • Diagnosis, Differential
  • Electronic Health Records*
  • Female
  • Humans
  • Male
  • Middle Aged
  • Phenotype*
  • Psychotropic Drugs / therapeutic use
  • Schizophrenia / diagnosis*
  • Schizophrenia / genetics
  • Unified Medical Language System
  • Young Adult

Substances

  • Psychotropic Drugs