A Step Forward for Two-Step Screening for Ovarian Cancer

Stage for stage, ovarian cancers and breast cancers carry a similar prognosis: more than 90% of women diagnosed with stage I disease survive at least 5 years, but only 30% of women who present with distant disease are alive at 5 years. Breast and ovarian cancer differ markedly, however, in their stage distributions at presentation: whereas more than 60% of patients with breast cancer have localized disease at diagnosis, fewer than 20% of patients with ovarian cancer do.1 This apparent link of early stage at diagnosis to longer overall survival provides a compelling argument for screening to detect early-stage disease. In fact, the development of a highly sensitive and specific test to detect early-stage epithelial ovarian cancer has long been considered the key to changing ovarian cancer from the silent killer it has been named to a detectable and highly curable cancer.

In an evaluation of the clinical utility of a proposed ovarian cancer screening test, there are four important questions to ask: First, does the test find early-stage, invasive ovarian cancers? The test should have high sensitivity for early-stage cancers. Finding borderline ovarian cancers,2 or late-stage cancers before they become symptomatic, is highly unlikely to lead to longer survival. Second, does the test find only ovarian cancers rather than benign disease? The test should be highly specific for ovarian cancer. If the specificity is too low, an unacceptable number of women will undergo unnecessary laparotomy/laparoscopy and oophorectomy. Third, how will the test perform in the population to be screened? The positive predictive value (PPV) of the test should remain high when it is used in the lower-prevalence screening population, and when it is used repeatedly over the years. Potential ovarian cancer screening tests are developed in sample populations that are artificially enriched for ovarian cancers. In such a laboratory setting, PPVs are high. When the screening test is applied to the normal-risk population, the PPV will fall substantially. With repeated testing over years, the risk of false-positive screening tests will increase further. Finally, will use of the screening test improve survival among the women who are screened? This is the gold standard test for all screening interventions. It is insufficient to demonstrate that use of a screening test finds early-stage cancers. It is possible that there are inherent biologic differences between “early-stage-presenting” and “late-stage-presenting” ovarian cancers. It is not certain that every advanced-stage ovarian cancer was once an early-stage, detectable, and curable ovarian cancer.
If screening is successful, an increase in the number of early-stage cancers diagnosed will be offset by a decrease in the number of late-stage cancers, and overall survival will be longer among the women assigned to screening. As has recently been discussed regarding nearly two decades of breast and prostate cancer screening, if screening detects only lower-risk cancers at an earlier time-point, true survival is not enhanced.3

In this issue of Journal of Clinical Oncology, Yurkovetsky et al4 report the performance of a multimarker screening assay proposed as the first step of a two-step screening strategy for early detection of epithelial ovarian cancer in postmenopausal women. In the two-step strategy, women with a positive multimarker screening assay would undergo subsequent transvaginal sonogram (TVS). Because the prevalence of ovarian cancer is low (approximately 0.03%), it is estimated that a screening strategy should have greater than 75% sensitivity for early-stage ovarian cancer and specificity of 99.6%. Applied to the normal-risk population of women, this would yield a positive predictive value of 10%,5-7 which means that 10 women would undergo surgery for every one ovarian cancer found. In a two-step screening strategy, in which TVS is offered only to women with positive serum assays and only women with suspicious findings on TVS undergo surgical evaluation, the specificity of the first-step assay should be at least 98% to keep the specificity of the two-step strategy at 99.6%. The goal of the work reported here was to develop an assay with high sensitivity for early-stage ovarian cancer while preserving 98% specificity. The work has several important strengths that reflect the depth of expertise and experience of the research team. This is multi-institutional, international, collaborative work using state-of-the-art immunoassays, statistical design and analyses, and an impressively large number of serum samples from patients with early- and late-stage ovarian cancer, benign and nonovarian malignant disease, and healthy controls. An important caveat for the generalizability of the findings is that the study population comprised only postmenopausal women.
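
The relationship between the 98% first-step target and the 99.6% overall target can be illustrated with a little arithmetic. The sketch below assumes, purely for illustration, that a woman without ovarian cancer is a false positive for the two-step strategy only if both steps are falsely positive, and that the two steps err independently. The function name is hypothetical and does not come from the study.

```python
def required_second_step_specificity(first_step_spec, target_spec):
    """Specificity the second step needs among first-step false positives.

    Illustrative assumptions: a cancer-free woman is falsely positive overall
    only if BOTH steps are falsely positive, and the steps err independently.
    """
    first_step_fp = 1.0 - first_step_spec  # false-positive rate of the serum assay
    target_fp = 1.0 - target_spec          # false-positive rate allowed overall
    if not 0.0 < first_step_fp <= 1.0:
        raise ValueError("first_step_spec must be in [0, 1)")
    if target_fp > first_step_fp:
        raise ValueError("target is already met by the first step alone")
    # overall_fp = first_step_fp * second_step_fp  =>  solve for second_step_fp
    return 1.0 - target_fp / first_step_fp


# Figures quoted in the editorial: 98% first-step, 99.6% overall.
print(f"{required_second_step_specificity(0.98, 0.996):.0%}")
```

Under these assumptions, TVS would need to clear about 80% of the cancer-free women flagged by the serum assay for the combined strategy to reach 99.6% specificity.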

The authors report that, from a pool of 96 candidate serum biomarkers, a panel of four markers (CA125, HE4, CEA, and VCAM-1) showed the best diagnostic power.8 In a training set of samples, the four-biomarker assay detected early-stage ovarian cancer with 86% sensitivity and 98% specificity. The same high sensitivity and specificity were preserved when the four-biomarker panel was applied to an independent validation set of serum samples. In the same samples, CA125 alone had a sensitivity of only 61% for early-stage ovarian cancer.

Now, the questions: First, does the test find early-stage cancers? Yes, with 86% sensitivity, meaning that 14% of the early-stage cancers are missed. The four-biomarker assay would appear to be superior to CA125 alone for finding early-stage cancer. Second, does the test find only ovarian cancers, rather than benign disease? Yes, with 98% specificity. The four-biomarker panel identified 33% of benign pelvic disease, 6% of breast cancers, 0% of colon cancers, and 36% of lung cancers as positive for ovarian cancer. On the surface, this four-biomarker panel would appear to be a clear advance in the serologic detection of early-stage ovarian cancer.

That leaves the two hard questions: First, how will the test perform in the population to be screened? In the training set of samples in which the four-biomarker assay was selected, the prevalence of ovarian cancer (all stages) was 26%, and the prevalence of early-stage cancers was 12.6%. In the validation set, the prevalence of all-stage ovarian cancer was 18% and the prevalence of early-stage disease was 4.7%. Even the latter value is still 157-fold higher than the prevalence of early-stage ovarian cancer in the general population of women. Furthermore, by design, all serum samples came only from postmenopausal women who had serum available for study. Biomarker profiles from banked serum may not be fully representative of biomarker profiles from the normal-risk, asymptomatic population. For example, of necessity, the serum from early-stage ovarian cancer cases in the study population came from women with pelvic pathology deemed sufficiently suspicious to merit surgical resection and staging. It is plausible that the serum biomarker profile of such a patient may differ from that of a woman with no symptoms and no known pelvic abnormalities. There may be a temptation to apply the four-biomarker assay to premenopausal women and to high-risk women, such as carriers of deleterious BRCA1 or BRCA2 mutations. Such a generalization of these data would be premature. For cancer antigen 125 (CA125), at least, false-positive screening results are likely more common among premenopausal, high-risk women.9,10 We do not have any information about how the four-biomarker assay performs outside the postmenopausal-with-serum-available sample population.
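
The effect of prevalence on PPV can be made concrete with Bayes' rule. The sketch below is illustrative only: it assumes that the reported 86% sensitivity and 98% specificity carry over unchanged to a screening population (the very assumption questioned above), it takes the 0.03% prevalence figure quoted earlier in this editorial, and it describes the first-step assay alone, not the full two-step strategy. The function name is hypothetical.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV from Bayes' rule: true positives / all positives."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Sensitivity and specificity reported for the four-biomarker assay.
SENSITIVITY, SPECIFICITY = 0.86, 0.98

# Early-stage prevalence in the validation set versus the 0.03% figure
# quoted in this editorial for the general population.
for label, prevalence in [("validation set", 0.047),
                          ("general population", 0.0003)]:
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"{label}: prevalence {prevalence:.2%}, PPV {ppv:.1%}")
```

With these inputs, the PPV of the serum assay alone drops from roughly two thirds in the enriched validation set to a little over 1% at general-population prevalence, which is why a highly specific second step is needed.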

Finally, will use of the screening test improve survival among the women who are screened? The authors correctly conclude that their promising findings require additional validation. Furthermore, the intention of the work was for the biomarker assay to serve as a first-step screen that could be applied to postmenopausal women, to identify women who required second-stage evaluation with transvaginal sonogram. When 10,958 postmenopausal women underwent three annual CA125 tests as the first step of such a screening strategy, the positive predictive value for all-stage ovarian cancers was 20.7%, there was no significant difference in the stage distribution of the ovarian cancers between the screened patients and the control patients, and screen-detected ovarian cancers were of lower grade than ovarian cancers found in the no-screen control group.11 When 50,640 postmenopausal women were screened by annual CA125, interpreted by a risk of ovarian cancer score, followed by transvaginal sonogram for women with positive CA125 score results, the sensitivity for all-stage, invasive ovarian cancer was 89.4%, and the positive predictive value for the strategy was 35.1%.12 The possibly improved sensitivity of the four-biomarker assay for early-stage disease would hopefully translate into an improvement in the detection performance of the full two-step strategy.

If the four-biomarker assay is to become a standard for screening postmenopausal women for early-stage ovarian cancer, the next step needs to be a giant one: a prospective, randomized trial in postmenopausal normal-risk women that compares four-biomarker assay screening, followed by transvaginal sonogram for women with an abnormal assay, with no screening. The end point, ideally, should be survival. Given the questions about potential biologic differences in behavior of early-stage– versus late-stage–presenting ovarian cancer, an end point of greater detection of early-stage cancer, although attractive, is unlikely to be definitive. Particularly in ovarian cancer, overdiagnosis has the potential to do harm. We all hope to be able to tell women that ovarian cancer screening can increase survival. We are not there yet, but we may be one step closer.
