Unsupervised phenotyping of Severe Asthma Research Program participants using expanded lung data

Wei Wu, Eugene Bleecker, Wendy Moore, William W. Busse, Mario Castro, Kian Fan Chung, William Calhoun, Serpil Erzurum, Benjamin Gaston, Elliot Israel, Douglas Curran-Everett, Sally E. Wenzel

Research output: Contribution to journal > Article

124 Citations (Scopus)

Abstract

Background: Previous studies have identified asthma phenotypes based on small numbers of clinical, physiologic, or inflammatory characteristics. However, no studies have used a wide range of variables using machine learning approaches.

Objectives: We sought to identify subphenotypes of asthma by using blood, bronchoscopic, exhaled nitric oxide, and clinical data from the Severe Asthma Research Program with unsupervised clustering and then characterize them by using supervised learning approaches.

Methods: Unsupervised clustering approaches were applied to 112 clinical, physiologic, and inflammatory variables from 378 subjects. Variable selection and supervised learning techniques were used to select relevant and nonredundant variables and address their predictive values, as well as the predictive value of the full variable set.

Results: Ten variable clusters and 6 subject clusters were identified, which differed and overlapped with previous clusters. Patients with traditionally defined severe asthma were distributed through subject clusters 3 to 6. Cluster 4 identified patients with early-onset allergic asthma with low lung function and eosinophilic inflammation. Patients with later-onset, mostly severe asthma with nasal polyps and eosinophilia characterized cluster 5. Cluster 6 asthmatic patients manifested persistent inflammation in blood and bronchoalveolar lavage fluid and exacerbations despite high systemic corticosteroid use and side effects. Age of asthma onset, quality of life, symptoms, medications, and health care use were some of the 51 nonredundant variables distinguishing subject clusters. These 51 variables classified test cases with 88% accuracy compared with 93% accuracy with all 112 variables.

Conclusion: The unsupervised machine learning approaches used here provide unique insights into disease, confirming other approaches while revealing novel additional phenotypes.
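The workflow the abstract describes (cluster subjects without labels, select a reduced variable set, then check how well held-out cases can be assigned to the clusters with the full and the reduced set) can be illustrated with a small sketch. The abstract does not name the clustering algorithm, the variable-selection method, or the classifier, so everything below is an assumption made for illustration: k-means stands in for the clustering step, a ranking by the range of per-cluster means stands in for variable selection, and a nearest-centroid rule stands in for the supervised step. The data are synthetic, not Severe Asthma Research Program data, and the helper names (`kmeans`, `rank_variables`, `run_demo`) are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(rows):
    """Column-wise mean of a non-empty list of rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def kmeans(rows, k, rng, n_iter=100):
    """Plain k-means (an assumed stand-in for the unsupervised step)."""
    centers = [list(r) for r in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: sq_dist(r, centers[c])) for r in rows]
        updated = []
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            # Keep the old center if a cluster lost all its members.
            updated.append(centroid(members) if members else centers[c])
        if updated == centers:
            break
        centers = updated
    return labels


def rank_variables(rows, labels):
    """Rank columns by the range of their per-cluster means (crude, assumed selector)."""
    cents = [centroid([r for r, lab in zip(rows, labels) if lab == c])
             for c in sorted(set(labels))]
    spread = [max(col) - min(col) for col in zip(*cents)]
    return sorted(range(len(spread)), key=lambda j: spread[j], reverse=True)


def select_columns(rows, cols):
    """Keep only the listed column indices of every row."""
    return [[r[j] for j in cols] for r in rows]


def nearest_centroid_accuracy(train_x, train_y, test_x, test_y):
    """Fraction of test rows whose nearest training centroid matches their cluster label."""
    classes = sorted(set(train_y))
    cents = {c: centroid([r for r, y in zip(train_x, train_y) if y == c]) for c in classes}
    hits = sum(min(classes, key=lambda c: sq_dist(r, cents[c])) == y
               for r, y in zip(test_x, test_y))
    return hits / len(test_y)


def run_demo(seed=0, k=3, n_vars=12, n_informative=4, per_group=40):
    rng = random.Random(seed)
    # Synthetic subjects: the first n_informative variables differ by group, the rest are noise.
    rows = [[rng.gauss(3.0 * g if j < n_informative else 0.0, 1.0) for j in range(n_vars)]
            for g in range(k) for _ in range(per_group)]
    rng.shuffle(rows)
    labels = kmeans(rows, k, rng)          # unsupervised step: subject clusters
    split = int(0.7 * len(rows))           # hold out 30% as test cases
    train_x, test_x = rows[:split], rows[split:]
    train_y, test_y = labels[:split], labels[split:]
    selected = rank_variables(train_x, train_y)[:n_informative]
    return {
        "labels": labels,
        "selected": selected,
        "acc_full": nearest_centroid_accuracy(train_x, train_y, test_x, test_y),
        "acc_reduced": nearest_centroid_accuracy(
            select_columns(train_x, selected), train_y,
            select_columns(test_x, selected), test_y),
    }


if __name__ == "__main__":
    result = run_demo()
    print("selected variables:", result["selected"])
    print("accuracy, all variables:", result["acc_full"])
    print("accuracy, reduced set:", result["acc_reduced"])
```

Comparing `acc_full` with `acc_reduced` mirrors the kind of comparison reported in the abstract (all 112 variables versus the 51 nonredundant ones), but the numbers this toy example produces say nothing about the study's results.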

Original language: English (US)
Pages (from-to): 1280-1288
Number of pages: 9
Journal: Journal of Allergy and Clinical Immunology
Volume: 133
Issue number: 5
DOIs: https://doi.org/10.1016/j.jaci.2013.11.042
State: Published - 2014

Fingerprint

Asthma
Lung
Research
Cluster Analysis
Learning
Inflammation
Phenotype
Bronchoalveolar Lavage Fluid
Eosinophilia
Age of Onset
Adrenal Cortex Hormones
Nitric Oxide
Quality of Life
Delivery of Health Care

Keywords

  • Asthma phenotyping
  • supervised machine learning approaches
  • unsupervised approaches
  • variable analysis

ASJC Scopus subject areas

  • Immunology and Allergy
  • Immunology

Cite this

Wu, W., Bleecker, E., Moore, W., Busse, W. W., Castro, M., Chung, K. F., ... Wenzel, S. E. (2014). Unsupervised phenotyping of Severe Asthma Research Program participants using expanded lung data. Journal of Allergy and Clinical Immunology, 133(5), 1280-1288. https://doi.org/10.1016/j.jaci.2013.11.042

@article{7a58768fed2141bdb778c762efec47dc,
title = "Unsupervised phenotyping of Severe Asthma Research Program participants using expanded lung data",
keywords = "Asthma phenotyping, supervised machine learning approaches, unsupervised approaches, variable analysis",
author = "Wei Wu and Eugene Bleecker and Wendy Moore and Busse, {William W.} and Mario Castro and Chung, {Kian Fan} and William Calhoun and Serpil Erzurum and Benjamin Gaston and Elliot Israel and Douglas Curran-Everett and Wenzel, {Sally E.}",
year = "2014",
doi = "10.1016/j.jaci.2013.11.042",
language = "English (US)",
volume = "133",
pages = "1280--1288",
journal = "Journal of Allergy and Clinical Immunology",
issn = "0091-6749",
publisher = "Mosby Inc.",
number = "5",

}

TY - JOUR

T1 - Unsupervised phenotyping of Severe Asthma Research Program participants using expanded lung data

AU - Wu, Wei

AU - Bleecker, Eugene

AU - Moore, Wendy

AU - Busse, William W.

AU - Castro, Mario

AU - Chung, Kian Fan

AU - Calhoun, William

AU - Erzurum, Serpil

AU - Gaston, Benjamin

AU - Israel, Elliot

AU - Curran-Everett, Douglas

AU - Wenzel, Sally E.

PY - 2014

Y1 - 2014

KW - Asthma phenotyping

KW - supervised machine learning approaches

KW - unsupervised approaches

KW - variable analysis

UR - http://www.scopus.com/inward/record.url?scp=84899647833&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84899647833&partnerID=8YFLogxK

U2 - 10.1016/j.jaci.2013.11.042

DO - 10.1016/j.jaci.2013.11.042

M3 - Article

VL - 133

SP - 1280

EP - 1288

JO - Journal of Allergy and Clinical Immunology

JF - Journal of Allergy and Clinical Immunology

SN - 0091-6749

IS - 5

ER -