Characterization of lung tumor subtypes through gene expression cluster validity assessment
DSI – Dipartimento di Scienze dell'Informazione,
Università degli Studi di Milano,
via Comelico 39, Milano, Italy
The problem of assessing the reliability of clusters of patients identified by clustering algorithms is crucial for estimating the significance of disease subclasses detectable at the bio-molecular level and, more generally, for supporting bio-medical discovery of patterns in gene expression data. In this paper we present an experimental analysis of the reliability of clusters discovered in lung tumor patients using DNA microarray data. In particular, we investigate whether subclasses of lung adenocarcinoma can be detected with high reliability at the bio-molecular level. To this end, we apply cluster validity measures based on random projections recently proposed by Bertoni and coworkers. The results show that at least two subclasses of lung adenocarcinoma can be detected with relatively high reliability, confirming and extending previous findings reported in the literature.
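The general idea behind a random-projection-based validity measure can be illustrated with a minimal sketch. It is not the exact measure of Bertoni and coworkers: the Gaussian projection, the plain k-means base clusterer, the pairwise co-membership score, and all names (`kmeans`, `cluster_stability`, `n_proj`, `d`) are assumptions chosen for illustration. The data are repeatedly projected into a lower-dimensional subspace and reclustered, and a cluster found in the original space is scored by how often its pairs of patients are still grouped together.

```python
import numpy as np


def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd k-means (illustrative base clusterer); returns one label per row."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every center.
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's center unchanged
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def cluster_stability(X, k, d, n_proj=20, seed=0):
    """Score each reference cluster by pairwise co-membership under random projections.

    X: (n_samples, n_genes) expression matrix; d: dimension of the projected space.
    Returns the reference labels and a dict {cluster: score in [0, 1]}.
    """
    rng = np.random.default_rng(seed)
    ref = kmeans(X, k, rng)  # reference clustering in the original space
    n, n_genes = X.shape
    M = np.zeros((n, n))  # M[i, j]: fraction of projections grouping i with j
    for _ in range(n_proj):
        # Gaussian random projection (assumed here; other projection types exist).
        P = rng.normal(size=(n_genes, d)) / np.sqrt(d)
        labels = kmeans(X @ P, k, rng)
        M += labels[:, None] == labels[None, :]
    M /= n_proj

    scores = {}
    for c in range(k):
        idx = np.flatnonzero(ref == c)
        m = len(idx)
        if m < 2:
            scores[c] = float("nan")  # undefined for clusters with fewer than 2 samples
            continue
        sub = M[np.ix_(idx, idx)]
        # Mean co-membership over distinct pairs within the reference cluster.
        scores[c] = (sub.sum() - np.trace(sub)) / (m * (m - 1))
    return ref, scores
```

A score close to 1 indicates that the cluster is recovered under most projections, whereas a low score suggests that the grouping depends on the particular representation of the data.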
Mathematics Subject Classification: 62H30 / 62P10 / 92C50
Key words: Cluster validity / clustering algorithms / bio-molecular taxonomy of tumors / DNA microarray data analysis.
© EDP Sciences, 2006