Cancer Prevalence Statistics: Approaches to Estimation Using Cancer Registry Data

Counting Method is used to estimate prevalence from tumor registry data (Feldman et al., 1986; Gail et al., 1999). Cases known to be alive on the prevalence date are counted directly, and an adjustment is made for cases lost to follow-up by estimating how many of them would have survived to the prevalence date. This expected number is computed using conditional survival curves for specified cohorts. For example, for 45-year SEER prevalence estimates at January 1, 2020, computations use survival cohorts defined by site, sex, race (white, black, other, and unspecified/unknown), calendar year grouping, and age (<60, 60-69, and 70+). Calendar years are split into five-year groups; the earliest group varies in length but is never shorter than three years. Because it can only count cases diagnosed within the available cancer incidence series, the counting method provides limited-duration prevalence, with the duration depending on the length of that series. Because people can be diagnosed with more than one tumor, different methods can be used to determine which tumors to include in the count. The counting method is implemented in the prevalence session of the SEER*Stat software.
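The counting logic described above can be sketched in a few lines. This is a minimal illustration, not the SEER*Stat implementation: the `Case` record, its field names, and the single `survival` function are assumptions made for the example, whereas the actual method uses separate conditional survival curves for each cohort.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Case:
    # Hypothetical record layout, chosen only for this sketch.
    status: str                # "alive" (known alive at the prevalence date), "dead", or "lost"
    years_followed: float      # time from diagnosis to last contact
    years_to_prev_date: float  # time from diagnosis to the prevalence date


def counting_method(cases: Iterable[Case], survival: Callable[[float], float]) -> float:
    """Count cases known alive and add the expected survivors among cases lost to follow-up."""
    total = 0.0
    for case in cases:
        if case.status == "alive":
            total += 1.0
        elif case.status == "lost":
            # Conditional probability of surviving to the prevalence date,
            # given survival up to the time of last contact.
            total += survival(case.years_to_prev_date) / survival(case.years_followed)
        # Cases known to be dead contribute nothing.
    return total


# Illustrative use with an assumed survival curve S(t) = 0.9 ** t.
example_cases = [
    Case("alive", 4.0, 4.0),
    Case("dead", 1.0, 4.0),
    Case("lost", 2.0, 4.0),
]
estimate = counting_method(example_cases, lambda t: 0.9 ** t)
```

Each case lost to follow-up contributes a fraction between 0 and 1 rather than a whole count, which is why the resulting estimate is generally not an integer.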

Standard Error for the Counting Method is based on the Poisson method (see Clegg et al., 2002).
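As a rough illustration of what a Poisson-based standard error looks like: if a count is treated as a Poisson variable, its variance equals its mean, so the standard error is approximately the square root of the count. The sketch below assumes only this simple form with a normal-approximation interval; the function names are hypothetical, and the published method may include refinements (for example, for the lost-to-follow-up adjustment) that are not reproduced here.

```python
import math
from typing import Tuple


def poisson_se(count: float) -> float:
    """Standard error of a count under a simple Poisson assumption (variance = mean)."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return math.sqrt(count)


def approximate_interval(count: float, z: float = 1.96) -> Tuple[float, float]:
    """Normal-approximation interval around the count, truncated at zero."""
    se = poisson_se(count)
    return max(0.0, count - z * se), count + z * se
```

The normal approximation is reasonable for large counts; for small counts an exact Poisson interval would be the more cautious choice.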

Completeness Index is a statistical model which estimates complete prevalence from limited-duration prevalence (Capocaccia & De Angelis, 1997; Merrill et al., 2000). The completeness index method is implemented in the ComPrev software and is used to provide U.S. cancer prevalence for the Cancer Statistics Review.
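The relationship between the two prevalence measures can be sketched as follows, assuming the completeness index is expressed as the proportion of complete prevalence that is captured by limited-duration prevalence. The function name is hypothetical, and the index is supplied here as an input, whereas in practice it is estimated from modeled incidence and survival.

```python
def complete_prevalence(limited_duration_count: float, completeness_index: float) -> float:
    """Scale a limited-duration prevalence count up to complete prevalence.

    Assumes completeness_index = limited-duration prevalence / complete prevalence.
    """
    if not 0.0 < completeness_index <= 1.0:
        raise ValueError("completeness_index must be in (0, 1]")
    return limited_duration_count / completeness_index
```

An index close to 1 means the registry's incidence series is long enough to capture nearly all prevalent cases; a smaller index implies a larger share of long-term survivors diagnosed before the registry began.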

Estimating U.S. Cancer Prevalence. U.S. cancer prevalence is estimated by projecting SEER prevalence proportions by age group, date of prevalence, sex and race to the respective U.S. population. The ProjPrev software can be used to estimate U.S. prevalence from U.S. populations and SEER prevalence proportions.
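The projection step amounts to multiplying each stratum-specific prevalence proportion by the matching U.S. population and summing. The sketch below assumes strata keyed by (age group, sex, race) for a single prevalence date; the function name, stratum labels, and numbers are illustrative and are not taken from ProjPrev.

```python
from typing import Dict, Tuple

Stratum = Tuple[str, str, str]  # (age group, sex, race); labels are illustrative


def project_prevalence(
    proportions: Dict[Stratum, float],
    populations: Dict[Stratum, float],
) -> float:
    """Apply stratum-specific prevalence proportions to a target population."""
    total = 0.0
    for stratum, population in populations.items():
        # A missing proportion raises KeyError rather than silently counting zero.
        total += proportions[stratum] * population
    return total


# Made-up inputs for illustration only.
seer_proportions = {("60-69", "F", "white"): 0.02, ("60-69", "M", "white"): 0.01}
us_populations = {("60-69", "F", "white"): 1000.0, ("60-69", "M", "white"): 2000.0}
projected = project_prevalence(seer_proportions, us_populations)
```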

Back Calculation Methods can be used to estimate cancer prevalence when cancer incidence data are not available and to project prevalence into the future. The MIAMOD/PIAMOD software implements a back-calculation approach, similar to those used for AIDS, that can be applied to calculate the incidence and prevalence of cancer from mortality and survival.

The MIAMOD method (Verdecchia et al., 1989; De Angelis et al., 1994) estimates incidence and prevalence from mortality and survival. Since mortality data are available for the entire nation, and survival for known areas (e.g., SEER) can be extrapolated to other areas (Mariotto et al., 2002), the MIAMOD method can be used to calculate regional and national estimates of incidence and prevalence. The MIAMOD method has been used to estimate and project breast cancer prevalence at the state level (De Angelis et al., 2008).

The PIAMOD method (Verdecchia et al., 2002) estimates prevalence from incidence and survival by fitting a parametric incidence model to incidence data. This method is useful for projecting prevalence forward in time. It has been used to estimate and project U.S. prevalence by phase of care for colorectal cancer and for all cancer sites combined.
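The basic relationship that links incidence and survival to prevalence can be sketched as a sum: each year's new cases contribute to prevalence in proportion to the probability of surviving from diagnosis to the prevalence date. The code below is only an illustration of that relationship with hypothetical names and made-up inputs; it does not fit a parametric incidence model and is not the PIAMOD software.

```python
from typing import Callable, Dict


def prevalence_from_incidence(
    incidence_by_year: Dict[int, float],
    survival: Callable[[float], float],
    prevalence_year: int,
) -> float:
    """Sum incident cases weighted by survival from diagnosis year to the prevalence year."""
    total = 0.0
    for year, n_cases in incidence_by_year.items():
        elapsed = prevalence_year - year
        if elapsed >= 0:  # cases diagnosed after the prevalence date are ignored
            total += n_cases * survival(elapsed)
    return total


# Made-up inputs: 100 cases per year and an assumed survival curve S(t) = 0.5 ** t.
count_2020 = prevalence_from_incidence({2018: 100, 2019: 100}, lambda t: 0.5 ** t, 2020)
# count_2020 is 75.0 (100 * 0.25 + 100 * 0.5)
```

Supplying modeled future incidence in `incidence_by_year` and a later `prevalence_year` is, in spirit, how such a relationship supports projection.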

The MIAMOD/PIAMOD software can be obtained from https://www.iss.it/en/-/miamod-piamod

Transition Rate Method. MIAMOD/PIAMOD are part of a broader class of transition rate methods that estimate prevalence using a three-state stochastic process:

alive and cancer free
alive with cancer
dead

After the transition rates between these states have been estimated, the stochastic model is run to simulate cancer prevalence under a set of specified conditions (e.g., constant transition rates over time) (Gail et al., 1999).
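A three-state process of this kind can be sketched as a deterministic cohort calculation in yearly steps. The version below assumes constant annual transition probabilities, tracks expected state fractions rather than simulating individuals, and ignores age; the function and parameter names are hypothetical, and the values are made up.

```python
from typing import List


def simulate_three_state(
    years: int,
    p_incidence: float,     # annual probability: cancer free -> alive with cancer
    p_death_free: float,    # annual probability: cancer free -> dead
    p_death_cancer: float,  # annual probability: alive with cancer -> dead
) -> List[float]:
    """Return the prevalence proportion among the living at the end of each year."""
    if p_incidence + p_death_free > 1.0:
        raise ValueError("exits from the cancer-free state exceed probability 1")
    free, cancer, dead = 1.0, 0.0, 0.0
    prevalence = []
    for _ in range(years):
        new_cases = free * p_incidence
        deaths_free = free * p_death_free
        deaths_cancer = cancer * p_death_cancer
        free -= new_cases + deaths_free
        cancer += new_cases - deaths_cancer
        dead += deaths_free + deaths_cancer
        alive = free + cancer
        prevalence.append(cancer / alive if alive > 0 else 0.0)
    return prevalence


trajectory = simulate_three_state(years=30, p_incidence=0.01,
                                  p_death_free=0.01, p_death_cancer=0.10)
```

With constant probabilities the prevalence proportion rises from zero and levels off as deaths among people with cancer come to balance new diagnoses; time-varying rates would replace the three constants with year-specific values.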

Estimation of Childhood Cancer Prevalence
Childhood cancer prevalence is the prevalence of people who were diagnosed with cancer between ages 0 and 19 or ages 0 and 14. Since people diagnosed with childhood cancer can live for a very long time, limited-duration prevalence of childhood cancer can be zero by definition for some age groups. For example, when we consider childhood cancers (0-19 years), the 20-year prevalence includes only cases who are 39 years or younger at the prevalence date. Survivors older than 39 years at the prevalence date would have been diagnosed before the 20 years of incidence data used to estimate prevalence. Thus the 20-year prevalence is zero by definition for age groups 40-44, 45-49, and so on. A method similar to the completeness index method has been developed to estimate long-term survivors of childhood cancers (see Simonetti et al., 2008; Mariotto et al., 2009). This method is implemented in the ComPrev software. Other approaches, such as the back-calculation or the transition rate methods, can also be used to estimate complete prevalence of childhood cancers.
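The age arithmetic in the example above can be written out directly: the oldest attained age covered by limited-duration prevalence is the oldest age at diagnosis plus the duration. The helper names and the age-group tuples below are illustrative only.

```python
from typing import List, Tuple


def max_attained_age(max_diagnosis_age: int, duration_years: int) -> int:
    """Oldest age at the prevalence date that limited-duration prevalence can cover."""
    return max_diagnosis_age + duration_years


def zero_by_definition(
    age_groups: List[Tuple[int, int]],
    max_diagnosis_age: int,
    duration_years: int,
) -> List[Tuple[int, int]]:
    """Age groups (lower, upper) lying entirely above the oldest covered age."""
    limit = max_attained_age(max_diagnosis_age, duration_years)
    return [group for group in age_groups if group[0] > limit]


# Childhood cancers diagnosed at ages 0-19 with 20 years of incidence data.
limit = max_attained_age(19, 20)  # 39
empty_groups = zero_by_definition([(35, 39), (40, 44), (45, 49)], 19, 20)
# empty_groups is [(40, 44), (45, 49)]
```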

Other Methods Using National Survey Data

Cross-Sectional Population-Based Surveys can be used to estimate prevalence from self-reported diagnoses; however, such estimates are subject to underreporting and misclassification of disease (see Byrne et al., 1992).