Theoretical Justification of Wavelength Selection in PLS Calibration

Development of a New Algorithm

Clifford H. Spiegelman, Michael J. McShane, Marcel J. Goetz, Massoud Motamedi, Qin Li Yue, Gerard L. Coté

Research output: Contribution to journal (Article)

213 Citations (Scopus)

Abstract

The mathematical basis of improved calibration through selection of informative variables for partial least-squares calibration has been identified. A theoretical investigation of calibration slopes indicates that including uninformative wavelengths negatively affects calibrations by producing both large relative bias toward zero and small additive bias away from the origin. These theoretical results hold regardless of the noise distribution in the data. Studies are performed to confirm this result by comparing a previously used selection method with a new method, which is designed to perform more appropriately on data with large outlying points by including estimates of spectral residuals. Three different data sets with varying noise distributions are tested. In the first data set, Gaussian and log-normal noise were added to simulated data that included a single peak. Second, near-infrared spectra of glucose in cell culture media taken with an FT-IR spectrometer were analyzed. Finally, dispersive Raman Stokes spectra of glucose dissolved in water were assessed. In every case considered here, selection improved prediction, but data with different noise characteristics showed varying degrees of improvement depending on the selection method used. The practical results showed that including residuals in the ranking criteria improves selection for data with noise distributions resulting in large outliers. It was concluded that careful design of a selection algorithm should include consideration of spectral noise distributions in the input data to increase the likelihood of successful and appropriate selection.
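The slope bias described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' algorithm: it assumes a one-component PLS model, a single Gaussian peak with additive Gaussian noise, and a simple ranking of wavelengths by absolute covariance with concentration (no spectral residuals are used). All function names, the peak position, the noise level, and the number of retained wavelengths are hypothetical choices made for illustration only.

```python
import math
import random


def simulate(n, n_wl, rng, peak=50, width=2.0, sigma=0.2):
    """Simulated spectra: one Gaussian peak scaled by concentration, plus Gaussian noise."""
    shape = [math.exp(-((j - peak) ** 2) / (2 * width ** 2)) for j in range(n_wl)]
    conc = [rng.random() for _ in range(n)]
    spectra = [[c * s + rng.gauss(0.0, sigma) for s in shape] for c in conc]
    return spectra, conc


def fit_pls1(X, y, cols):
    """One-component PLS on the chosen wavelength columns (mean-centered data)."""
    n = len(X)
    xm = [sum(col) / n for col in zip(*X)]
    ym = sum(y) / n
    yc = [v - ym for v in y]
    # Weight of each wavelength: covariance (up to scale) with concentration.
    w = {j: sum((row[j] - xm[j]) * c for row, c in zip(X, yc)) for j in cols}
    # Scores of the training samples on the single latent variable.
    t = [sum((row[j] - xm[j]) * wj for j, wj in w.items()) for row in X]
    # Regression of concentration on the scores.
    b = sum(ti * c for ti, c in zip(t, yc)) / sum(ti * ti for ti in t)
    return xm, ym, w, b


def predict(model, X):
    xm, ym, w, b = model
    return [ym + b * sum((row[j] - xm[j]) * wj for j, wj in w.items()) for row in X]


def slope_and_rmse(pred, truth):
    """Slope of predicted versus true concentration, and prediction RMSE."""
    n = len(truth)
    pm, tm = sum(pred) / n, sum(truth) / n
    sxy = sum((p - pm) * (t - tm) for p, t in zip(pred, truth))
    sxx = sum((t - tm) ** 2 for t in truth)
    rmse = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / n)
    return sxy / sxx, rmse


rng = random.Random(0)
n_wl = 600  # most of these wavelengths carry noise only
X_train, y_train = simulate(40, n_wl, rng)
X_test, y_test = simulate(200, n_wl, rng)

# Model using every wavelength, informative or not.
all_cols = list(range(n_wl))
full = fit_pls1(X_train, y_train, all_cols)

# Assumed ranking criterion: keep the 5 wavelengths with the largest |weight|.
ranked = sorted(all_cols, key=lambda j: abs(full[2][j]), reverse=True)
selected = ranked[:5]
reduced = fit_pls1(X_train, y_train, selected)

slope_all, rmse_all = slope_and_rmse(predict(full, X_test), y_test)
slope_sel, rmse_sel = slope_and_rmse(predict(reduced, X_test), y_test)

print("all wavelengths:      slope = %.3f, RMSE = %.3f" % (slope_all, rmse_all))
print("selected wavelengths: slope = %.3f, RMSE = %.3f" % (slope_sel, rmse_sel))
```

Under these assumptions, the model built on all wavelengths should give a test-set slope well below 1, while the model restricted to the top-ranked wavelengths should give a slope closer to 1 and a lower prediction error. A ranking based on covariance alone is sensitive to outliers, which is the situation the abstract says the new residual-based method is meant to handle.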

Original language: English (US)
Pages (from-to): 35-44
Number of pages: 10
Journal: Analytical Chemistry
Volume: 70
Issue number: 1
State: Published - Jan 1 1998

Fingerprint

Calibration
Wavelength
Glucose
Infrared spectrometers
Cell culture
Culture Media
Infrared radiation
Water

ASJC Scopus subject areas

  • Analytical Chemistry

Cite this

Spiegelman, C. H., McShane, M. J., Goetz, M. J., Motamedi, M., Yue, Q. L., & Coté, G. L. (1998). Theoretical Justification of Wavelength Selection in PLS Calibration: Development of a New Algorithm. Analytical Chemistry, 70(1), 35-44.

@article{6d5b646ef64c4116926b571602cbffab,
title = "Theoretical Justification of Wavelength Selection in PLS Calibration: Development of a New Algorithm",
author = "Spiegelman, {Clifford H.} and McShane, {Michael J.} and Goetz, {Marcel J.} and Motamedi, Massoud and Yue, {Qin Li} and Cot{\'e}, {Gerard L.}",
year = "1998",
month = "1",
day = "1",
language = "English (US)",
volume = "70",
pages = "35--44",
journal = "Analytical Chemistry",
issn = "0003-2700",
publisher = "American Chemical Society",
number = "1",

}

TY - JOUR
T1 - Theoretical Justification of Wavelength Selection in PLS Calibration
T2 - Development of a New Algorithm
AU - Spiegelman, Clifford H.
AU - McShane, Michael J.
AU - Goetz, Marcel J.
AU - Motamedi, Massoud
AU - Yue, Qin Li
AU - Coté, Gerard L.
PY - 1998/1/1
Y1 - 1998/1/1
UR - http://www.scopus.com/inward/record.url?scp=0003188418&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0003188418&partnerID=8YFLogxK
M3 - Article
VL - 70
SP - 35
EP - 44
JO - Analytical Chemistry
JF - Analytical Chemistry
SN - 0003-2700
IS - 1
ER -