Classification Models for Breast Cancer Molecular Subtyping: What is the Best Candidate for a Translation into Clinic?

Keywords

breast cancer; classification; gene expression; model robustness; personalized medicine; subtype; translational research

For decades, clinicians have been well aware that breast cancer (BC) is a clinically heterogeneous disease. Tumor size, lymph-node involvement, histological type, grade, and both estrogen receptor (ER) and HER2 receptor status all influence prognosis and response to systemic therapies, but they do not fully capture the varied clinical course of BC [1].

These clinical variables have been combined into multivariate prediction models, such as the Nottingham Prognostic Index [2] and Adjuvant! Online [3] for prognosis, or the nomogram published by Rouzier et al. for the prediction of response to preoperative chemotherapy [4]. Regardless of the clinicopathologic model used, there remains substantial variability in disease outcome within each prediction category. This is probably due to the poor reproducibility of key clinical parameters as defined by immunohistochemistry (IHC) assays (e.g., ER or histological grade) [5], and to the transcriptional heterogeneity found among breast tumors. The hope in the community has been that genomic approaches will allow us to overcome these limitations so that ‘quantitative molecular analysis of breast cancer could yield diagnostic tests that might be more accurate than existing clinical prediction models, or complement them’ [6].

High-throughput technologies, such as gene expression profiling, provide us with a unique opportunity to explore the molecular basis for BC by simultaneously analyzing thousands of genes. Microarray-based gene expression studies have revealed that, in addition to being clinically heterogeneous, BC is also a molecularly heterogeneous disease. These studies highlight the presence of distinct molecular subtypes that exhibit different gene expression patterns and clinical outcomes [7–14].

The relevance of these molecular subtypes in terms of basic and translational research has led to the progressive incorporation of such molecular profiles into prognostic assessments [14,15], the prediction of therapeutic efficacy [16] and the design of clinical trials [17–19].

During the past decade, several classification models have been published that enable BC molecular subtypes to be identified using gene expression data. In their seminal work, Perou et al. highlighted four breast tumor subtypes: the ‘basal-like’ (characterized by cytokeratins 5 and 17); the ‘HER2-enriched’ (most, but not all, of which are HER2 amplified); the ‘luminal’ (expressing luminal cytokeratins 8 and 18 and often differentiated into two or three subgroups); and the ‘normal-like’ tumors [7].

These molecular subtypes were first identified through hierarchical clustering of a small dataset of breast tumor gene expression profiles, using a large set of highly variably expressed genes referred to as ‘intrinsic’ genes [7]. The authors then designed a classification model, called the Single Sample Predictor (SSP), that identifies the subtype of a single tumor using a nearest centroid classifier based on the initial hierarchical clustering [9]. This first SSP has since been refined using different versions of the intrinsic gene list [11,14].
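The nearest centroid idea behind an SSP can be illustrated with a minimal sketch. This is not the published SSP: the six genes, the centroid values and the choice of Pearson correlation as the similarity measure are all assumptions made for illustration. A real SSP would use centroids derived from the hierarchical clustering of a training dataset over the full intrinsic gene list.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def nearest_centroid(profile, centroids):
    """Return the subtype whose centroid correlates best with the profile."""
    return max(centroids, key=lambda subtype: pearson(profile, centroids[subtype]))

# Toy gene order (assumption): ESR1, ERBB2, KRT5, KRT17, KRT8, KRT18.
# Centroid values are invented for illustration, not taken from any dataset.
CENTROIDS = {
    "basal-like":    [-1.5, -0.5,  1.8,  1.6, -0.7, -0.6],
    "HER2-enriched": [-0.8,  2.0, -0.2, -0.1,  0.1,  0.2],
    "luminal":       [ 1.6, -0.3, -1.0, -0.9,  1.2,  1.1],
    "normal-like":   [ 0.2, -0.2,  0.6,  0.5, -0.3, -0.2],
}

# Two hypothetical tumor profiles measured on the same six genes.
tumor_a = [-1.2, -0.4, 1.5, 1.9, -0.5, -0.8]
tumor_b = [1.4, -0.1, -0.8, -1.1, 1.0, 1.3]

print(nearest_centroid(tumor_a, CENTROIDS))
print(nearest_centroid(tumor_b, CENTROIDS))
```

Because each tumor is compared only with fixed centroids, a single sample can be classified on its own. The sketch also shows where the dependence on training data arises: the centroids, and hence the calls, change if the clustering that produced them changes.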

Despite their value, SSPs have severe limitations. Pusztai et al. demonstrated that small changes in the initial set of breast tumors may have a dramatic impact on the hierarchical clustering used in the SSPs, raising some doubt about the stability of the method [20]. Kapp et al. challenged the use of hundreds of intrinsic genes, and their results suggested that only genes related to the ER and HER2 phenotypes led to a stable identification of three main subtypes: ER⁻/HER2⁻ (basal-like tumors), HER2⁺ and ER⁺/HER2⁻ (luminal tumors) [21]. Weigelt et al. reported that the subtype classifications depended on the list of intrinsic genes, since SSPs were only moderately concordant [22].

In an attempt to address these issues, Sotiriou et al. developed a novel classification model called the Subtype Classification Model (SCM). The SCM is based on a parametric clustering technique (a mixture of Gaussians) in a low-dimensional space defined by three gene modules, that is, lists of genes specifically correlated with ER, HER2 and AURKA. These modules are used to robustly quantify the main discriminators of BC: the ER, HER2 and proliferation phenotypes, respectively. Two versions of these gene modules have been published thus far [12,13].
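As a rough illustration of how a mixture of Gaussians assigns a tumor in module-score space, the sketch below computes a signed-average module score and then the posterior probability of each subtype. Everything numeric here is hypothetical: the component weights, means and variances, the diagonal covariance, the placeholder gene names and the signed-average scoring rule are assumptions for illustration, not the fitted parameters or exact procedure of the published SCMs.

```python
from math import exp, log, pi

def module_score(expression, module):
    """Signed average of the module genes (module maps gene -> +/- weight)."""
    total = sum(weight * expression[gene] for gene, weight in module.items())
    return total / sum(abs(weight) for weight in module.values())

def log_gaussian(x, mean, var):
    """Log density at x of a Gaussian with diagonal covariance."""
    return sum(
        -0.5 * (log(2 * pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def posterior(x, components):
    """Posterior probability of each mixture component given point x."""
    log_joint = {
        name: log(c["weight"]) + log_gaussian(x, c["mean"], c["var"])
        for name, c in components.items()
    }
    top = max(log_joint.values())  # subtract the max for numerical stability
    unnorm = {name: exp(value - top) for name, value in log_joint.items()}
    total = sum(unnorm.values())
    return {name: value / total for name, value in unnorm.items()}

# Hypothetical components in (ER, HER2, AURKA) module-score space.
COMPONENTS = {
    "basal-like":    {"weight": 0.20, "mean": [-1.0, -0.5, 0.8], "var": [0.3, 0.3, 0.3]},
    "HER2-enriched": {"weight": 0.15, "mean": [-0.3, 1.5, 0.5], "var": [0.4, 0.4, 0.4]},
    "luminal":       {"weight": 0.65, "mean": [1.0, -0.3, -0.2], "var": [0.4, 0.3, 0.5]},
}

# Hypothetical module scores for one tumor: ER low, HER2 low, AURKA high.
tumor_scores = [-1.1, -0.4, 0.9]
probabilities = posterior(tumor_scores, COMPONENTS)
print(max(probabilities, key=probabilities.get))
```

Working with three module scores rather than hundreds of individual genes is what keeps the space low-dimensional, and the posterior gives a graded membership for each subtype rather than only a hard label.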

The complex nature of molecular classification using transcriptional profiling has led to numerous efforts to develop IHC markers that can reproduce this molecular subtyping. Combinations of various IHC markers, including cytokeratins, ER and HER2 status and proliferation-related proteins, have been proposed to define the subtypes of BC [23–25]. However, in this context, the use of IHC is questionable owing to its poor reproducibility when compared with gene expression profiling [26], its semiquantitative nature and its weak concordance with the molecular subtypes defined by gene expression.

Although the molecular taxonomy of BC, as defined by these approaches, has had a significant impact on the way clinicians perceive the disease, we still know surprisingly little about the concordance between these classification models, their prognostic or predictive value and the robustness of the classification algorithms. In addition, the availability of multiple models could lead to confusing results, since investigators might select different models and consequently assign different subtypes to the same tumor sample. Because subtype classification is increasingly being incorporated into clinical trials [17–19], and efforts are being made to adapt molecular subtyping to routine clinical use [14], it is critically important to adopt standardized methodologies in BC classification.

In a recent meta-analysis of BC studies that included gene expression data obtained from 4607 patients, Haibe-Kains et al. highlighted the advantages and disadvantages of the existing classification models for molecular subtype identification [27]. Concurring with the results of Weigelt et al. [22], the authors showed that the published SSPs were only moderately consistent with one another, and that the subtype classification depended strongly on the intrinsic genes used in the classification models.

In contrast, SCMs were highly consistent and yielded the best concordance with the traditional clinical parameters (such as ER and HER2 status and histological grade). Interestingly, none of the classification models was concordant with progesterone receptor status, thereby challenging its relevance for molecular subtyping.

Haibe-Kains and colleagues also assessed the robustness of the various classification models; that is, their ability to assign the same tumors to the same subtypes regardless of the gene expression data used to build the models [27]. In other words, if the molecular subtypes are real, a classification model should not depend on the data used to fit it; otherwise, the model is considered unreliable. The authors showed that SCMs were statistically more robust than SSPs for identifying the three main BC subtypes (basal-like, HER2-enriched and luminal), and that they provided better discrimination between the low- and high-proliferative luminal tumors (referred to as luminal A and B, respectively). The authors also confirmed the clinical relevance of the subtype classifications for prognostic purposes in a large series of 1315 untreated node-negative patients with BC.
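One way to make this notion of robustness concrete is to fit the same type of model on two different training datasets, classify a common set of tumors with both fits, and measure how well the two sets of calls agree. The sketch below uses Cohen's kappa for that comparison. The choice of kappa and the subtype calls are assumptions for illustration; the source does not specify which agreement statistic was used.

```python
from collections import Counter

def cohen_kappa(calls_a, calls_b):
    """Chance-corrected agreement between two label lists for the same tumors."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("both call lists must be non-empty and of equal length")
    n = len(calls_a)
    # Fraction of tumors given the same subtype by both models.
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    # Agreement expected by chance from each model's subtype frequencies.
    freq_a, freq_b = Counter(calls_a), Counter(calls_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / n ** 2
    if expected == 1.0:  # both models always emit the same single label
        return 1.0
    return (observed - expected) / (1.0 - expected)

# Hypothetical subtype calls for the same eight tumors from one model type
# fitted on two different training datasets (invented labels).
fit_on_dataset_1 = ["basal", "basal", "HER2", "luminal",
                    "luminal", "luminal", "luminal", "HER2"]
fit_on_dataset_2 = ["basal", "basal", "HER2", "luminal",
                    "luminal", "luminal", "HER2", "luminal"]

kappa = cohen_kappa(fit_on_dataset_1, fit_on_dataset_2)
print(round(kappa, 2))
```

A kappa of 1 means the two fits assign every tumor to the same subtype, while values near 0 mean they agree no more often than chance. Under this reading, a robust model is one whose kappa stays high whichever dataset is used for fitting.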

Although the consistency and robustness of the SCMs make these models promising candidates for translation into the clinic, they still use a large number of genes, making their application in clinical routine both costly and technically challenging. To address this limitation, Haibe-Kains et al. developed a three-gene SCM that uses only the ER, HER2 and AURKA genes, yet proved to be as robust as the original SCMs [27].

These results suggest that we may soon have a robust and reliable approach to BC molecular subtype classification, in a form that can be readily implemented in a clinical laboratory. Such a test, if widely used in a standardized fashion, could dramatically change the way in which patients are managed in a clinical setting and, hopefully, could lead to substantial improvements in outcome and survival.

Footnotes

Acknowledgements

The author would like to thank Professors Christos Sotiriou, Gianluca Bontempi and John Quackenbush for making this research possible, as well as Mary Kalamaras for her editorial assistance.

The author has no relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.

No writing assistance was utilized in the production of this manuscript.

References

1. Henderson, Patek: The relationship between prognostic and predictive factors in the management of breast cancer. Breast Cancer Res. Treat. 52(1–3), 261–288 (1998).

2. Goldhirsch, Wood, Gelber, Coates, Thurlimann, Senn: Meeting highlights: updated international expert consensus on the primary therapy of early breast cancer. J. Clin. Oncol. 21(17), 3357–3365 (2003).

3. Olivotto, Bajdik, Ravdin: Population-based validation of the prognostic model Adjuvant! for early breast cancer. J. Clin. Oncol. 23(12), 2716–2725 (2005).

4. Rouzier, Perou, Symmans: Breast cancer molecular subtypes respond differently to preoperative chemotherapy. Clin. Cancer Res. 11(16), 5678–5685 (2005).

5. Rhodes, Jasani, Barnes, Bobrow, Miller: Reliability of immunohistochemical demonstration of oestrogen receptors in routine practice: interlaboratory variance in the sensitivity of detection and evaluation of scoring systems. J. Clin. Pathol. 53(2), 125–130 (2000).

6. Andre, Pusztai: Molecular classification of breast cancer: implications for selection of adjuvant chemotherapy. Nat. Clin. Pract. Oncol. 3(11), 621–632 (2006).

7. Perou, Sorlie, Eisen: Molecular portraits of human breast tumours. Nature 406(6797), 747–752 (2000).

8. Sorlie, Perou, Tibshirani: Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc. Natl Acad. Sci. USA 98(19), 10869–10874 (2001).

9. Sorlie, Tibshirani, Parker: Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc. Natl Acad. Sci. USA 100(14), 8418–8423 (2003).

10. Sotiriou, Neo, McShane: Breast cancer classification and prognosis based on gene expression profiles from a population-based study. Proc. Natl Acad. Sci. USA 100(18), 10393–10398 (2003).

11. Fan: The molecular portraits of breast tumors are conserved across microarray platforms. BMC Genomics 7, 96 (2006).

12. Wirapati, Sotiriou, Kunkel: Meta-analysis of gene expression profiles in breast cancer: toward a unified understanding of breast cancer subtyping and prognosis signatures. Breast Cancer Res. 10(4), R65 (2008).

13. Desmedt, Haibe-Kains, Wirapati: Biological processes associated with breast cancer clinical outcome depend on the molecular subtypes. Clin. Cancer Res. 14(16), 5158–5165 (2008).

14. Parker, Mullins, Cheang: Supervised risk predictor of breast cancer based on intrinsic subtypes. J. Clin. Oncol. 27(8), 1160–1167 (2009).

15. Haibe-Kains, Desmedt, Rothe, Piccart, Sotiriou, Bontempi: A fuzzy gene expression-based computational approach improves breast cancer prognostication. Genome Biol. 11(2), R18 (2010).

16. Liedtke, Mazouni, Hess: Response to neoadjuvant therapy and long-term survival in patients with triple-negative breast cancer. J. Clin. Oncol. 26(8), 1275–1281 (2008).

17. Pusztai, Broglio, Andre, Symmans, Hess, Hortobagyi: Effect of molecular disease subsets on disease-free survival in randomized adjuvant chemotherapy trials for estrogen receptor-positive breast cancer. J. Clin. Oncol. 26(28), 4679–4683 (2008).

18. Sorlie: Introducing molecular subtyping of breast cancer into the clinic? J. Clin. Oncol. 27(8), 1153–1154 (2009).

19. Peppercorn, Perou, Carey: Molecular subtypes in breast cancer evaluation and management: divide and conquer. Cancer Invest. 26(1), 1–10 (2008).

20. Pusztai, Mazouni, Anderson, Symmans: Molecular classification of breast cancer: limitations and potential. Oncologist 11(8), 868–877 (2006).

21. Kapp, Jeffrey, Langerod: Discovery and validation of breast cancer subtypes. BMC Genomics 7, 231 (2006).

22. Weigelt, Mackay, A'Hern: Breast cancer molecular profiling with single sample predictors: a retrospective analysis. Lancet Oncol. 11(4), 339–349 (2010).

23. Abd El-Rehim, Ball, Pinder: High-throughput protein expression analysis using tissue microarray technology of a large well-characterised series identifies biologically distinct classes of breast cancer confirming recent cDNA expression analyses. Int. J. Cancer 116(3), 340–350 (2005).

24. Nielsen, Hsu, Jensen: Immunohistochemical and clinical characterization of the basal-like subtype of invasive breast carcinoma. Clin. Cancer Res. 10(16), 5367–5374 (2004).

25. van de Rijn, Perou, Tibshirani: Expression of cytokeratins 17 and 5 identifies a group of breast carcinomas with poor clinical outcome. Am. J. Pathol. 161(6), 1991–1996 (2002).

26. Gong, Yan, Lin: Determination of oestrogen-receptor status and ERBB2 status of breast carcinoma: a gene-expression profiling study. Lancet Oncol. 8(3), 203–211 (2007).

27. Haibe-Kains, Culhane, Desmedt, Bontempi, Quackenbush, Sotiriou: Robustness of breast cancer molecular subtypes identification. Presented at: IMPAKT Breast Cancer Conference. Brussels, Belgium, 5–8 May (2010).