Identification of cell-of-origin breast tumour subtypes in inflammatory breast cancer
To tackle the heterogeneity of breast cancer, Perou et al (Nature
2000) described a cell-of-origin classifier, based on the
expression of approximately 500 genes. This gene set, called the
‘intrinsic gene set’, was selected because its genes varied more
in expression between tumour samples from different patients
than between tumour samples from the same patient. Unsupervised
hierarchical clustering analysis was used to investigate the
relationships between different samples in a data set of 122 breast
tumour specimens defined by the intrinsic gene set. The 122
specimens separated into 5 clusters, namely Luminal A, Luminal B,
Basal-like, ErbB2-overexpressing and Normal-like, collectively
termed cell-of-origin subtypes. Each cluster was characterized by
the elevated expression of specific genes: the Luminal A and B
clusters showed overexpression of the Estrogen Receptor (ER), the
Basal-like cluster showed elevated expression of Cytokeratin 5/6,
the ErbB2-overexpressing cluster showed increased ErbB2
expression, and the Normal-like cluster showed elevated markers
for normal breast epithelium. Metastasis-free and overall survival differ
significantly between the cell-of-origin subtypes, indicating that
heterogeneity defined by this intrinsic gene set is clinically
relevant. Unsupervised hierarchical clustering on the expression
of the intrinsic genes was repeated in alternative data sets
(Sorlie et al, PNAS 2002). Regardless of the data set used, the
same cell-of-origin subtypes were identified, thereby validating
the previous observations.
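As an illustration only, the unsupervised hierarchical clustering step
can be sketched in Python. The sketch assumes average linkage and a
distance of 1 minus the Pearson correlation; the text above does not
state which linkage or distance was used. The sample names, expression
values and function names are hypothetical.

```python
import math
from itertools import combinations


def pearson_distance(x, y):
    """Return 1 - Pearson correlation (0 for perfectly correlated profiles)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)


def average_linkage(profiles, n_clusters):
    """Agglomerative clustering: merge the two closest clusters
    until n_clusters remain.

    profiles: dict mapping sample name -> list of expression values.
    Returns a list of sets of sample names.
    """
    clusters = [{name} for name in profiles]

    def linkage(a, b):
        # Average pairwise distance between members of the two clusters.
        dists = [pearson_distance(profiles[i], profiles[j])
                 for i in a for j in b]
        return sum(dists) / len(dists)

    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a | b)
    return clusters


# Hypothetical toy data: 4 samples measured on 4 genes.
toy_profiles = {
    "T1": [2.0, 1.5, -1.0, -1.2],
    "T2": [1.8, 1.7, -0.8, -1.0],
    "T3": [-1.5, -1.0, 2.0, 1.0],
    "T4": [-1.3, -1.2, 1.8, 1.2],
}
toy_clusters = average_linkage(toy_profiles, n_clusters=2)
print(sorted(sorted(c) for c in toy_clusters))
```

In practice a full dendrogram is usually inspected rather than cutting
at a fixed number of clusters; the fixed cut here only keeps the
example short.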
So far, the cell-of-origin subtypes have only been reported for
non-inflammatory breast cancer (non-IBC). Therefore, we
investigated their presence in inflammatory breast cancer (IBC).
Of the approximately 500 genes defining the intrinsic gene set,
only 144 were present on our cDNA chips.
First, we tested the performance of these 144 genes in identifying
each of the cell-of-origin subtypes in the original data set of 122
breast tumour specimens described by Sorlie et al (PNAS 2002). All
of the cell-of-origin subtypes were recovered by unsupervised
hierarchical clustering analysis, and 84% of the samples clustered
as in the original publication. Next, we calculated centroids for
each of the
cell-of-origin subtypes. This was done by selecting the most
representative samples for each of the cell-of-origin subtypes and
then averaging the gene expression for each of the 144 genes in the
corresponding samples. Using gene expression data from 18 non-IBC
and 16 IBC specimens for these 144 genes, correlation coefficients
were calculated between each centroid and each IBC and non-IBC
sample. Each sample was assigned to the subtype whose centroid
gave the highest correlation coefficient. In this way we found
that IBC belongs significantly more often than non-IBC to the
combined basal-like and ErbB2-overexpressing clusters (p=0.038).
This observation agrees with the poor outcome of IBC patients,
since the basal-like and the
ErbB2-overexpressing clusters are characterized by a worse
metastasis-free and overall survival compared to the other
cell-of-origin subtypes. The centroid-based classification was
validated in several ways. Unsupervised hierarchical clustering
using gene expression data for the 144 selected intrinsic genes in
our data set of 16 IBC and 18 non-IBC specimens revealed 4
subgroups related to the cell-of-origin subtypes. No
ErbB2-overexpressing cluster was identified, probably due to the
low number of samples correlated with the ErbB2-overexpressing
centroid. The robustness of the taxonomy was tested by selecting an
alternative gene set with discriminating genes for each of the
cell-of-origin subtypes using a discriminating score combined with
permutation testing to eliminate false positive results.
Unsupervised hierarchical clustering with this alternative gene set
again identified 4 major subgroups related to
the cell-of-origin subtypes. In this analysis, the samples that
closely correlated with the ErbB2-overexpressing centroid were left
out, since permutation testing for this subgroup was unstable due
to the low number of samples. Finally, we tested the
contribution of the cell-of-origin subtypes to the IBC phenotype by
means of Principal Component Analysis. The cell-of-origin
subtypes account for only 30% of the difference between IBC and
non-IBC. This indicates that the difference between the IBC
and non-IBC phenotypes at the level of gene expression is also
attributable to factors other than the presence of different
cell-of-origin subtypes.
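The centroid-based classification described above (averaging the
representative samples of each subtype gene by gene, then assigning
each specimen to the centroid with the highest correlation
coefficient) can be sketched in Python as follows. This is a minimal
illustration, not the analysis code used for the study. It assumes
the Pearson correlation coefficient, which the text does not specify,
and all expression values, gene counts and function names are
hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def centroid(profiles):
    """Gene-wise mean of several expression profiles (one list per sample)."""
    return [sum(values) / len(values) for values in zip(*profiles)]


def classify(sample, centroids):
    """Return (subtype, r) for the centroid most correlated with the sample."""
    scores = {name: pearson(sample, c) for name, c in centroids.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical toy data: 4 genes, 2 representative samples per subtype.
representatives = {
    "Luminal A": [[2.0, 1.5, -1.0, -1.2], [1.8, 1.7, -0.8, -1.0]],
    "Basal-like": [[-1.5, -1.0, 2.0, 1.0], [-1.3, -1.2, 1.8, 1.2]],
}
centroids = {name: centroid(p) for name, p in representatives.items()}

# Classify one hypothetical specimen against the centroids.
subtype, r = classify([1.9, 1.4, -0.9, -1.1], centroids)
print(subtype, round(r, 3))
```

With real data, each profile would hold the 144 intrinsic genes and
there would be one centroid per cell-of-origin subtype.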

