# Cancer Prevalence Statistics: Approaches to Estimation Using Cancer Registry Data

**Counting Method** is used to estimate prevalence based on tumor registry data (Feldman et al., 1986; Gail et al., 1999). Cases still alive on the desired prevalence date are simply counted, and an adjustment is made for cases lost to follow-up by estimating how many of them would have survived to the prevalence date. The expected number of cases lost to follow-up who survive to the prevalence date is computed using conditional survival curves for specified cohorts. For example, for 42-year SEER prevalence estimates at January 1, 2017, computations are done using survival cohorts defined by site, sex, race (white, black, other, and unspecified/unknown), calendar year groupings (1975-1981, 1982-1986, ..., 2012-2016) and age (<60, 60-69, and 70+). Because the counting method can only include cases diagnosed within the available cancer incidence series, it provides limited-duration prevalence, with the duration set by the length of that series. Because people can be diagnosed with more than one tumor, there are different methods that can be used to determine which tumors to include in the counting method. The counting method is implemented in the prevalence session of the SEER*Stat software.
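The adjustment for cases lost to follow-up can be illustrated with a minimal sketch. It assumes a single hypothetical survival function `survival(t)` (in practice a separate curve is used for each cohort), and the names `Case` and `counting_prevalence` are illustrative only and do not come from SEER*Stat. Each case known to be alive counts as 1, and each case lost to follow-up contributes its conditional probability of surviving from last contact to the prevalence date.

```python
import math
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Case:
    # Years from diagnosis to last contact (hypothetical field name).
    t_last_contact: float
    # Years from diagnosis to the prevalence date (hypothetical field name).
    t_prevalence_date: float
    # Status at last contact: "alive" (followed to the prevalence date),
    # "dead", or "lost" (lost to follow-up before the prevalence date).
    status: str


def counting_prevalence(cases: Iterable[Case],
                        survival: Callable[[float], float]) -> float:
    """Observed survivors plus expected survivors among cases lost to follow-up."""
    total = 0.0
    for case in cases:
        if case.status == "alive":
            total += 1.0
        elif case.status == "lost":
            s_last = survival(case.t_last_contact)
            if s_last > 0:
                # Conditional survival: P(alive at prevalence date | alive at last contact).
                total += survival(case.t_prevalence_date) / s_last
        # Cases known to be dead contribute nothing.
    return total


# Illustrative use with an assumed exponential survival curve.
example_survival = lambda t: math.exp(-0.1 * t)
example_cases = [
    Case(10, 10, "alive"),
    Case(10, 10, "alive"),
    Case(3, 10, "dead"),
    Case(5, 10, "lost"),
]
estimate = counting_prevalence(example_cases, example_survival)
```

In this toy example the estimate is the two observed survivors plus the lost case's conditional survival over the remaining five years.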

**Standard Error** for the Counting Method is based on the Poisson method (see Clegg et al., 2002).
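As a rough illustration of what a Poisson-based standard error looks like, the sketch below treats the prevalence count as a Poisson variable, so its variance equals its mean. This is a simplifying assumption for illustration; the estimator in Clegg et al. (2002) may include further terms, for example for the lost-to-follow-up adjustment. The function names are hypothetical.

```python
import math


def poisson_se_of_count(count: float) -> float:
    """Standard error of a count under a plain Poisson assumption (variance = mean)."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return math.sqrt(count)


def poisson_se_of_proportion(count: float, population: float) -> float:
    """Standard error of count / population, treating the population as fixed."""
    if population <= 0:
        raise ValueError("population must be positive")
    return poisson_se_of_count(count) / population
```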

**Completeness Index** is a statistical model which estimates complete prevalence from limited-duration prevalence (Capocaccia & De Angelis, 1997; Merrill et al., 2000). The completeness index method is implemented in the ComPrev software and is used to provide US cancer prevalence for the Cancer Statistics Review.
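The relationship between limited-duration and complete prevalence can be sketched as follows, assuming the completeness index is expressed as the fraction of complete prevalence that is captured by the limited-duration estimate. The function name is hypothetical and is not part of ComPrev; estimating the index itself requires the modeling described in the cited papers and is not shown.

```python
def complete_prevalence(limited_duration_prevalence: float,
                        completeness_index: float) -> float:
    """Scale limited-duration prevalence up to complete prevalence.

    Assumes completeness_index = limited-duration / complete prevalence,
    so it must lie in (0, 1].
    """
    if not 0 < completeness_index <= 1:
        raise ValueError("completeness_index must be in (0, 1]")
    return limited_duration_prevalence / completeness_index
```

For example, if 800 prevalent cases are observed and the index for that stratum is 0.8, the estimated complete prevalence is about 1,000 cases.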

**Estimating US Cancer Prevalence**. US cancer prevalence is estimated by projecting SEER prevalence proportions by age group, date of prevalence, sex and race to the respective US population. The ProjPrev software can be used to estimate US prevalence from US populations and SEER prevalence proportions.
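The projection step amounts to multiplying each stratum's SEER prevalence proportion by the matching US population and summing. The sketch below is a minimal illustration, not the ProjPrev implementation; the stratum keys and the function name are assumptions.

```python
from typing import Dict, Tuple

# A stratum is (age group, sex, race); the labels used here are hypothetical.
Stratum = Tuple[str, str, str]


def project_prevalence(seer_proportions: Dict[Stratum, float],
                       us_populations: Dict[Stratum, float]) -> float:
    """Apply stratum-specific prevalence proportions to US population counts."""
    total = 0.0
    for stratum, population in us_populations.items():
        # A KeyError here means a US stratum has no matching SEER proportion.
        total += seer_proportions[stratum] * population
    return total


# Illustrative inputs for a single prevalence date (made-up numbers).
example_proportions = {("60-69", "F", "white"): 0.02, ("60-69", "M", "white"): 0.01}
example_populations = {("60-69", "F", "white"): 1_000_000, ("60-69", "M", "white"): 500_000}
us_cases = project_prevalence(example_proportions, example_populations)
```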

**Back Calculation Methods** can be used to estimate cancer prevalence when cancer incidence data are not available and to project prevalence into the future. The MIAMOD/PIAMOD software implements back-calculation methods, similar to those used for AIDS, that can be applied to calculate the incidence and prevalence of cancer from mortality and survival.

The MIAMOD method (Verdecchia et al., 1989; De Angelis et al., 1994) estimates incidence and prevalence from mortality and survival. Since mortality data are available for the entire nation, and survival for known areas (e.g., SEER) can be extrapolated to other areas (Mariotto et al., 2002), the MIAMOD method can be used to calculate regional and national estimates of incidence and prevalence. The MIAMOD method has been used to estimate and project breast cancer prevalence at the state level (De Angelis et al., 2008). These estimates are also available on the State Cancer Profiles website.

The PIAMOD method (Verdecchia et al., 2002) estimates prevalence from incidence and survival by fitting a parametric incidence model to incidence data. This method is useful for projecting prevalence over time. It has been used to estimate and project US prevalence by phase of care for colorectal cancer and for all cancer sites combined.

The MIAMOD/PIAMOD software can be obtained from http://www.eurocare.it/MiamodPiamod/tabid/60/Default.aspx
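The link between incidence, survival, and prevalence that these methods rely on can be illustrated with a minimal sketch: prevalent cases at a given date are the past incident cases weighted by their probability of surviving to that date. This is not the MIAMOD or PIAMOD algorithm, which additionally fit parametric models by age, period, and cohort; the function name, annual time steps, and single survival curve are assumptions made for illustration.

```python
from typing import Callable, Dict


def prevalence_from_incidence(incidence_by_year: Dict[int, float],
                              survival: Callable[[int], float],
                              prevalence_year: int) -> float:
    """Sum incident cases from each past year times survival to the prevalence year."""
    total = 0.0
    for year, new_cases in incidence_by_year.items():
        elapsed = prevalence_year - year
        if elapsed >= 0:  # ignore incidence after the prevalence date
            total += new_cases * survival(elapsed)
    return total


# Illustrative inputs: 100 new cases per year and an assumed survival of 0.5 ** t.
example_incidence = {2000: 100.0, 2001: 100.0}
prevalent_2002 = prevalence_from_incidence(example_incidence, lambda t: 0.5 ** t, 2002)
```

Supplying modeled future incidence in `incidence_by_year` is how the same relationship can be used to project prevalence forward in time.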

**Transition Rate Method**. MIAMOD/PIAMOD are part of a broader class of transition rate methods that estimate prevalence using a three-state stochastic process:

- alive and cancer free
- alive with cancer
- dead

After the transition rates between these states are estimated, the stochastic model is run to simulate cancer prevalence under a set of specified conditions, e.g., constant transition rates over time (Gail et al., 1999).
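A minimal sketch of such a three-state model is shown below. It tracks expected counts rather than simulating individuals, uses constant per-step transition probabilities, and assumes no return from "alive with cancer" to "alive and cancer free". These simplifications, the function name, and the parameter names are assumptions for illustration and are not taken from Gail et al. (1999).

```python
from typing import List, Tuple


def simulate_three_state(cancer_free: float, with_cancer: float, dead: float,
                         p_incidence: float, p_death_cancer_free: float,
                         p_death_with_cancer: float,
                         steps: int) -> List[Tuple[float, float, float]]:
    """Propagate expected counts through the three states for a number of time steps."""
    if p_incidence + p_death_cancer_free > 1:
        raise ValueError("exit probabilities from the cancer-free state exceed 1")
    history = []
    for _ in range(steps):
        new_cases = cancer_free * p_incidence
        deaths_cancer_free = cancer_free * p_death_cancer_free
        deaths_with_cancer = with_cancer * p_death_with_cancer
        cancer_free = cancer_free - new_cases - deaths_cancer_free
        with_cancer = with_cancer + new_cases - deaths_with_cancer
        dead = dead + deaths_cancer_free + deaths_with_cancer
        history.append((cancer_free, with_cancer, dead))
    return history


def prevalence_proportion(state: Tuple[float, float, float]) -> float:
    """Share of living people who are alive with cancer."""
    cancer_free, with_cancer, _ = state
    alive = cancer_free + with_cancer
    return with_cancer / alive if alive > 0 else 0.0


# Illustrative run with made-up transition probabilities.
trajectory = simulate_three_state(1000.0, 0.0, 0.0, 0.01, 0.02, 0.10, steps=20)
```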

**Estimation of Childhood Cancer Prevalence**

Childhood cancer prevalence counts people who were diagnosed with cancer in childhood, defined as either ages 0-19 or ages 0-14. Since people diagnosed with childhood cancer can live for a very long time, limited-duration prevalence of childhood cancer can be zero by definition for some age groups. For example, when we consider childhood cancers (0-19 years), the 20-year prevalence includes only cases who are 39 years or younger at the prevalence date. Survivors older than 39 years at the prevalence date would have been diagnosed before the 20 years of incidence data used to estimate prevalence. Thus the 20-year prevalence is zero by definition for age groups 40-44, 45-49, and older. A method similar to the completeness index method has been developed to estimate long-term survivors of childhood cancers (see Simonetti et al., 2008; Mariotto et al., 2009). This method is implemented in the ComPrev software. Other approaches, such as the back-calculation or the transition rate methods, can also be used to estimate complete prevalence of childhood cancers.
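The age cutoff in the example above follows from simple arithmetic, sketched here with hypothetical function names: the oldest age that can appear in limited-duration prevalence is the maximum age at diagnosis plus the number of years of incidence data.

```python
def max_attained_age(max_age_at_diagnosis: int, duration_years: int) -> int:
    """Oldest age (in completed years) that can appear in limited-duration prevalence."""
    return max_age_at_diagnosis + duration_years


def is_zero_by_definition(age_group_lower_bound: int,
                          max_age_at_diagnosis: int,
                          duration_years: int) -> bool:
    """True if every age in the group exceeds the oldest attainable age."""
    return age_group_lower_bound > max_attained_age(max_age_at_diagnosis, duration_years)


# Childhood cancers diagnosed at ages 0-19 with 20 years of incidence data.
oldest = max_attained_age(19, 20)
group_40_44_is_zero = is_zero_by_definition(40, 19, 20)
group_35_39_is_zero = is_zero_by_definition(35, 19, 20)
```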

## Other Methods Using National Survey Data

**Cross-Sectional Population-Based Surveys** can be used to estimate prevalence from self-reported diagnoses; however, underreporting and misclassification of disease are concerns (see Byrne et al., 1992).