Identification of cell-of-origin breast tumour subtypes in inflammatory breast cancer

To tackle the heterogeneity of breast cancer, Perou et al (Nature 2000) described a cell-of-origin classifier based on the expression of approximately 500 genes. This gene set, called the ‘intrinsic gene set’, was selected because the expression of its genes varies more between tumour samples from different patients than between tumour samples from the same patient. Unsupervised hierarchical clustering analysis was used to investigate the relationships between samples in a data set of 122 breast tumour specimens characterized by the intrinsic gene set. The 122 specimens separated into 5 clusters, namely a Luminal A, a Luminal B, a Basal-like, an ErbB2-overexpressing and a Normal-like cluster, collectively termed cell-of-origin subtypes. Each cluster was characterized by elevated expression of specific genes: the Luminal A and B clusters showed overexpression of the Estrogen Receptor (ER), expression of Cytokeratin 5/6 was elevated in the Basal-like cluster, ErbB2 expression was increased in the ErbB2-overexpressing cluster, and markers of normal breast epithelium were elevated in the Normal-like cluster. Metastasis-free and overall survival differ significantly between the cell-of-origin subtypes, indicating that the heterogeneity defined by this intrinsic gene set is clinically relevant. Unsupervised hierarchical clustering restricted to the genes of the intrinsic gene set was also performed on alternative data sets (Sorlie et al, PNAS 2002). Regardless of the data set used, the same cell-of-origin subtypes were identified, thereby validating the previous observations.

The different cell-of-origin subtypes in breast cancer have so far been reported only for non-Inflammatory Breast Cancer (non-IBC). Therefore, we investigated the presence of the different cell-of-origin subtypes in Inflammatory Breast Cancer (IBC). Of the approximately 500 genes that define the intrinsic gene set, only 144 were present on our cDNA chips. First, we tested how well these 144 genes identify each of the cell-of-origin subtypes in the original data set of 122 breast tumour specimens described in the original article by Sorlie et al (PNAS 2002). All of the cell-of-origin subtypes were recovered by unsupervised hierarchical clustering analysis, and 84% of the samples clustered in the same manner as described in the original manuscripts. Next, we calculated a centroid for each of the cell-of-origin subtypes. This was done by selecting the most representative samples for each subtype and then averaging the expression of each of the 144 genes across the corresponding samples. Using gene expression data for these 144 genes from 18 non-IBC and 16 IBC specimens, correlation coefficients were calculated between each centroid and each IBC and non-IBC sample. Each sample was assigned to the subtype whose centroid gave the highest correlation coefficient. In this way we found that IBC belongs significantly more often to the combined Basal-like and ErbB2-overexpressing clusters than non-IBC does (p=0.038). This observation agrees with the poor outcome of IBC patients, since the Basal-like and the ErbB2-overexpressing clusters are characterized by worse metastasis-free and overall survival than the other cell-of-origin subtypes. The centroid-based classification was validated in several ways. Unsupervised hierarchical clustering using gene expression data for the 144 selected intrinsic genes in our data set of 16 IBC and 18 non-IBC specimens revealed 4 subgroups related to the cell-of-origin subtypes. 
No ErbB2-overexpressing cluster was identified, probably because of the low number of samples that correlated with the ErbB2-overexpressing centroid. The robustness of the taxonomy was tested by selecting an alternative gene set of discriminating genes for each of the cell-of-origin subtypes, using a discriminating score combined with permutation testing to eliminate false positive results. Unsupervised hierarchical clustering with this alternative gene set again identified 4 major subgroups related to the cell-of-origin subtypes. In this analysis, the samples that correlated most closely with the ErbB2-overexpressing centroid were left out, since permutation testing for this subgroup was unstable due to the low number of samples. Finally, we tested the contribution of the cell-of-origin subtypes to the IBC phenotype by means of Principal Component Analysis. The different cell-of-origin subtypes account for only 30% of the difference between IBC and non-IBC. This indicates that the difference between the IBC and non-IBC phenotypes at the level of gene expression is also attributable to factors other than the presence of different cell-of-origin subtypes.
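
The centroid-based classification described above (average the representative samples of each subtype gene by gene, correlate every sample with every centroid, assign the subtype with the highest correlation) can be illustrated with a minimal sketch. It is not the code used in the study. It assumes the Pearson correlation coefficient, which the text does not specify, and it uses invented toy expression values for 4 hypothetical genes and 2 subtypes instead of the 144 intrinsic genes and 5 subtypes. The function names pearson, build_centroids and classify are illustrative only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles.

    Assumption: the source only says "correlation coefficient"; Pearson
    is used here for illustration.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def build_centroids(representatives):
    """Average the representative samples of each subtype, gene by gene."""
    return {
        subtype: [sum(values) / len(values) for values in zip(*profiles)]
        for subtype, profiles in representatives.items()
    }


def classify(sample, centroids):
    """Return the best-matching subtype and the correlation with each centroid."""
    correlations = {s: pearson(sample, c) for s, c in centroids.items()}
    best = max(correlations, key=correlations.get)
    return best, correlations


# Toy data (hypothetical values, 4 genes per profile, 2 samples per subtype).
representatives = {
    "Luminal A": [[2.0, 1.5, -1.0, -0.5], [1.8, 1.7, -1.2, -0.3]],
    "Basal-like": [[-1.5, -1.0, 2.0, 0.5], [-1.7, -0.8, 1.8, 0.7]],
}
centroids = build_centroids(representatives)

# Classify one hypothetical tumour sample.
subtype, correlations = classify([1.9, 1.2, -0.8, -0.6], centroids)
print(subtype, correlations)
```

In this toy example the sample correlates positively with the Luminal A centroid and negatively with the Basal-like centroid, so it is assigned to Luminal A. With real data, a sample whose highest correlation is still low might be better left unclassified; whether such a threshold was applied in the study is not stated.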