2009 SupervisedRiskPredictorofBreast

(Parker et al., 2009) ⇒ Joel S Parker, Michael Mullins, Maggie CU Cheang, Samuel Leung, David Voduc, Tammi Vickery, Sherri Davies, Christiane Fauron, Xiaping He, Zhiyuan Hu, and others. (2009). “Supervised Risk Predictor of Breast Cancer based on Intrinsic Subtypes.” In: Journal of Clinical Oncology. doi:10.1200/JCO.2008.18.1370

Subject Headings: risk modeling; multivariable Cox model; breast cancer

Notes

Cited By

http://scholar.google.com/scholar?q=%22Supervised+risk+predictor+of+breast+cancer+based+on+intrinsic+subtypes%22+2009

Quotes

Abstract

Purpose

To improve on current standards for breast cancer prognosis and prediction of chemotherapy benefit by developing a risk model that incorporates the gene expression–based “intrinsic” subtypes luminal A, luminal B, HER2-enriched, and basal-like.

Methods

A 50-gene subtype predictor was developed using microarray and quantitative reverse transcriptase polymerase chain reaction data from 189 prototype samples. Test sets from 761 patients (no systemic therapy) were evaluated for prognosis, and 133 patients were evaluated for prediction of pathologic complete response (pCR) to a taxane and anthracycline regimen.

Results

The intrinsic subtypes as discrete entities showed prognostic significance (P = 2.26E-12) and remained significant in multivariable analyses that incorporated standard parameters (estrogen receptor status, histologic grade, tumor size, and node status). A prognostic model for node-negative breast cancer was built using intrinsic subtype and clinical information. The C-index estimate for the combined model (subtype and tumor size) was a significant improvement on either the clinicopathologic model or subtype model alone. The intrinsic subtype model predicted neoadjuvant chemotherapy efficacy with a negative predictive value for pCR of 97%.

Conclusion

Diagnosis by intrinsic subtype adds significant prognostic and predictive information to standard parameters for patients with breast cancer. The prognostic properties of the continuous risk score will be of value for the management of node-negative breast cancers. The subtypes and risk score can also be used to assess the likelihood of efficacy from neoadjuvant chemotherapy.

INTRODUCTION

Breast cancer is a heterogeneous disease with respect to molecular alterations, cellular composition, and clinical outcome. This diversity creates a challenge in developing tumor classifications that are clinically useful with respect to prognosis or prediction. Gene expression profiling by microarray has given us insight into the complexity of breast tumors and can be used to provide prognostic information beyond standard clinical assessment.1–7 For example, the 21-gene OncotypeDx assay (Genomic Health Inc, Redwood City, CA) can be used to risk stratify early-stage estrogen receptor (ER)-positive breast cancer.4,5 Another strong predictor of outcome in ER-positive disease is proliferation or genomic grade.7–9 In addition, the 70-gene MammaPrint (Agendia, Huntington Beach, CA) microarray assay has shown prognostic significance in ER-positive and ER-negative early-stage node-negative breast cancer.2,3

The “intrinsic” subtypes luminal A (LumA), luminal B (LumB), HER2-enriched, basal-like, and normal-like have been extensively studied by microarray and hierarchical clustering analysis.1,6,10–12 Here, we study the utility of these subtypes alone and as part of a risk of relapse predictor in two cohorts: (1) patients receiving no adjuvant systemic therapy, and (2) patients undergoing paclitaxel, fluorouracil, doxorubicin, and cyclophosphamide (T/FAC) neoadjuvant chemotherapy. The risk of relapse models were compared with standard models using pathologic stage, grade, and routine biomarker status (ER and HER2).

…

Sample Subtype Prediction

The 50 gene set was compared for reproducibility of classification across three centroid-based prediction methods: Prediction Analysis of Microarray (PAM),24 a simple nearest centroid,6 and Classification of Nearest Centroid.25 In all cases, the subtype classification is assigned based on the nearest of the five centroids. Because of its reproducibility in subtype classification, the final algorithm consisted of centroids constructed as described for the PAM algorithm24 and distances calculated using Spearman's rank correlation. The centroids of the training set using the 50-gene classifier (henceforth called PAM50) are shown in Appendix Figure A3 (online only).
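The nearest-centroid assignment described above can be sketched as follows. This is a minimal illustration, not the PAM50 implementation: the centroid values, the five-gene vectors, and all function names are hypothetical, and only two of the five subtypes are shown. Each sample is assigned to the centroid with which it has the highest Spearman rank correlation.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def nearest_centroid(sample, centroids):
    """Return the best-correlated subtype and all subtype correlations."""
    correlations = {name: spearman(sample, c) for name, c in centroids.items()}
    best = max(correlations, key=correlations.get)
    return best, correlations

# Hypothetical toy centroids over five genes (not the published centroids).
TOY_CENTROIDS = {
    "LumA": [2.0, 1.5, -1.0, -2.0, 0.5],
    "Basal-like": [-2.0, -1.0, 2.5, 1.5, 0.0],
}

toy_sample = [1.8, 1.0, -0.5, -1.9, 0.3]
label, corrs = nearest_centroid(toy_sample, TOY_CENTROIDS)
```

Because the distance is rank based, the assignment depends only on the ordering of expression values within a sample, not on their absolute scale.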

Prognostic and Predictive Models Using Clinical and Molecular Subtype Data

Univariate and multivariable analyses were used to determine the significance of the intrinsic subtypes (LumA, LumB, HER2-enriched, and basal-like) in untreated patients and in patients receiving neoadjuvant chemotherapy. For prognosis, subtypes were compared with standard clinical variables (tumor size [T], node status [N], ER status, and histologic grade), with time to relapse (ie, any event) as the end point. Subtypes were compared with grade and molecular markers (ER, progesterone receptor [PR], HER2) for prediction in the neoadjuvant setting because pathologic staging is not applicable. Likelihood ratio tests were done to compare models of available clinical data, subtype data, and combined clinical and molecular variables. Categoric survival analyses were performed using a log-rank test and visualized with Kaplan-Meier plots.
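For readers unfamiliar with the Kaplan-Meier plots mentioned above, the underlying estimator can be sketched in a few lines. This is a generic illustration with made-up follow-up times, not the authors' analysis code; the function name and the toy data are assumptions. At each time with at least one observed relapse, the running survival estimate is multiplied by (1 - events / number at risk), and censored patients simply leave the risk set.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed event.

    times:  follow-up time for each patient
    events: 1 if relapse was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # relapses observed at time t
        leaving = 0   # patients leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed > 0:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical follow-up data: five patients, two of them censored.
toy_times = [1, 2, 2, 3, 4]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
```

Running the estimator separately for each subtype gives the per-group curves that a log-rank test then compares.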

Developing Risk Models With Clinical and Molecular Data

The subtype risk model was trained with a multivariable Cox model using Ridge regression fit to the node-negative, untreated subset of the van de Vijver cohort (van de Vijver et al., 2002). A ROR score was assigned to each test case using correlation to the subtype alone (1) (ROR-S) or using subtype correlation along with tumor size (2) (ROR-C):

…

The sum of the coefficients from the Cox model is the ROR score for each patient. To classify samples into specific risk groups, we chose thresholds from the training set that required no LumA sample to be in the high-risk group and no basal-like sample to be in the low-risk group. Thresholds were determined from the training set and remained unchanged when evaluating test cases. SiZer analysis was performed to characterize the relationship between the ROR score and relapse-free survival26 (Appendix Fig A4, online only). The 95% CIs for the ROR score are local versions of binomial CIs, with the local sample size computed from a Gaussian kernel density estimator based on the Sheather-Jones choice of window width.27
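The scoring and threshold rule described above can be sketched as follows, under stated assumptions. The weights below are placeholders, not the published Cox coefficients (which are elided in this excerpt), and the function names and training scores are hypothetical. The sketch reads the score as a weighted sum of a sample's subtype correlations, and reads the threshold rule as: the high-risk cutoff is the largest LumA training score, and the low-risk cutoff is the smallest basal-like training score.

```python
# Illustrative only: names, weights, and scores below are hypothetical.

def ror_score(correlations, weights):
    """Weighted sum of a sample's correlations to each subtype centroid."""
    return sum(weights[name] * correlations[name] for name in weights)

def choose_thresholds(train_scores, train_subtypes):
    """Pick cutoffs so that no LumA training sample is high risk and no
    basal-like training sample is low risk."""
    luma = [s for s, t in zip(train_scores, train_subtypes) if t == "LumA"]
    basal = [s for s, t in zip(train_scores, train_subtypes) if t == "Basal-like"]
    low_cut = min(basal)   # scores strictly below this are low risk
    high_cut = max(luma)   # scores strictly above this are high risk
    return low_cut, high_cut

def risk_group(score, low_cut, high_cut):
    """Assign a risk group using cutoffs fixed on the training set."""
    if score > high_cut:
        return "high"
    if score < low_cut:
        return "low"
    return "medium"

# Placeholder weights (NOT the published coefficients).
PLACEHOLDER_WEIGHTS = {"LumA": -1.0, "Basal-like": 1.0}
example_score = ror_score({"LumA": 0.5, "Basal-like": -0.25}, PLACEHOLDER_WEIGHTS)

# Hypothetical training scores and subtype calls.
train_scores = [10, 20, 35, 30, 50, 60]
train_subtypes = ["LumA", "LumA", "LumA", "Basal-like", "Basal-like", "LumB"]
low_cut, high_cut = choose_thresholds(train_scores, train_subtypes)
```

As in the text, the cutoffs are computed once from the training set and then applied unchanged to test cases, for example `risk_group(40, low_cut, high_cut)`.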
