Center for Computing Sciences (JMC), Institute for Defense Analyses, Bowie, Maryland;
Computational Sciences and Mathematics Research Department (TGK), Sandia
National Laboratories, Livermore, California; Computer Science Department
and Institute for Advanced Computer Studies (DPO), University of Maryland,
College Park, Maryland; and Department of Cellular Pathology (TJO), Armed
Forces Institute of Pathology, Washington, DC
The analysis of G-banded chromosomes remains the most important tool
available to the clinical cytogeneticist. The analysis is laborious when
performed manually, and the utility of automated chromosome identification
algorithms has been limited because their classification accuracy
seldom exceeds about 80% in routine practice. In this study,
we use four new approaches to automated chromosome identification - singular
value decomposition (SVD), principal components analysis (PCA), Fisher
discriminant analysis (FDA), and hidden Markov models (HMM) - to classify
three well-known chromosome data sets (Philadelphia, Edinburgh, and Copenhagen),
comparing these approaches with the use of neural networks (NN). We show
that the HMM is a particularly robust approach to identification that
attains classification accuracies of up to 97% for normal chromosomes
and retains classification accuracies of up to 95% when chromosome telomeres
are truncated or small portions of the chromosome are inverted. Compared with
NN-based methods, this represents a substantial improvement in classification
accuracy for normal chromosomes and a doubling of classification accuracy for
truncated chromosomes and those with inversions. HMMs thus appear
to be a promising approach for the automated identification of both normal
and abnormal G-banded chromosomes.
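
To make the idea of HMM-based classification concrete, the following is a minimal sketch and not the authors' implementation. It assumes (none of this is stated in the abstract) that each chromosome is reduced to a non-empty sequence of quantized band-density symbols, that one discrete HMM is available per chromosome class, and that a chromosome is assigned to the class whose model gives the highest likelihood. The function names, the two toy classes "A" and "B", and all probabilities are hypothetical illustrations.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM.

    obs   -- non-empty list of integer symbols (assumed quantized band densities)
    start -- start[i] is the probability of beginning in state i
    trans -- trans[i][j] is the probability of moving from state i to state j
    emit  -- emit[i][k] is the probability that state i emits symbol k
    """
    n = len(start)
    # Forward variables for the first observation.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    total = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step along the chromosome, then weight by emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        total += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return total


def classify(obs, models):
    """Return the name of the model with the highest log-likelihood for obs."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))


# Hypothetical two-state, left-to-right toy models (symbol 0 = light, 1 = dark).
left_to_right = [[0.8, 0.2], [0.0, 1.0]]
models = {
    "A": ([1.0, 0.0], left_to_right, [[0.9, 0.1], [0.1, 0.9]]),  # light, then dark
    "B": ([1.0, 0.0], left_to_right, [[0.1, 0.9], [0.9, 0.1]]),  # dark, then light
}

print(classify([0, 0, 0, 1, 1], models))
print(classify([1, 1, 0, 0, 0], models))
```

Because the likelihood is computed over whatever sequence is supplied, the same scoring applies unchanged to a shortened profile such as `[0, 0, 1, 1]`. The toy example makes no claim about the accuracy figures reported above.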