CanSurv: Frequently Asked Questions

Statistical procedures for survival data are available from SAS, Splus and other statistical software. What is special about CanSurv?
What survival models are available in CanSurv?
There are four components (c, μ, σ, δ) in the survival models. Into which component should I put the covariates?
How do I interpret the cure fraction estimates?
When I fit a mixture cure model with different parametric distributions G(t), the cure fraction estimates are quite different. Why?
Why are some options not available for some survival models?
Which survival model and which distribution should I use to fit the survival data?
How can I create my own graphs?
Is there a suggested citation for CanSurv?

1. Statistical procedures for survival data are available from SAS, Splus and other statistical packages. What is special about CanSurv?

Answer: Many statistical packages are available for the analysis of survival data, and most of them fit standard survival models for data from clinical studies, i.e., cause-specific survival data with continuous survival times. CanSurv is designed for population-based survival data, which are usually grouped into life tables and for which net survival is measured by relative survival. Besides the standard survival models, CanSurv can also fit mixture cure models (see Question 2 for more detail).

2. What survival models are available in CanSurv?

Answer: CanSurv can fit standard parametric survival models, the Cox PH model, and mixture cure models to population-based survival data. The standard survival models include the lognormal, loglogistic, Weibull and Gompertz models. CanSurv can also fit the mixture cure model (Gamel et al. 2000) and the mixture cure model with power function (Capocaccia and De Angelis, 1997). Covariates can be put into the components (c, μ, σ, δ) (for detail, see Question 3).

3. There are four components (c, μ, σ, δ) in the survival models. Into which component should I put the covariates?

Answer: Technically, you can put the covariates in any one of the four parameters (c, μ, σ, δ), or even in all four parameters simultaneously. However, doing so may cause collinearity and identifiability problems. For this reason, in the mixture cure model with power function, a covariate can be used in either (c, μ, σ) or δ, but not in both. Ultimately, the choice of where a covariate goes should be based on biological evidence.

Usually σ is not fitted as a function of covariates. For example, in the Weibull model, μ is the scale parameter and σ is the shape parameter. When the Weibull model was fitted to Hodgkin's disease data across diagnosis years, it seemed as though the scale on which the curve was operating was changing, but the shape of the distribution was not. Hence only μ was modeled as a function of covariates.
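As a rough illustration of a covariate entering only through μ, the sketch below uses the log-location-scale form of the Weibull survival function, S(t) = exp(-exp((ln t - μ)/σ)), with μ written as a linear function of diagnosis year. Whether CanSurv uses exactly this parameterization is an assumption, and the function names, the reference year and the coefficient values are hypothetical, chosen only for the example.

```python
import math

def weibull_survival(t, mu, sigma):
    """Weibull survival in log-location-scale form (assumed parameterization)."""
    return math.exp(-math.exp((math.log(t) - mu) / sigma))

def mu_for_year(year, beta0, beta1, ref_year=1980):
    """Hypothetical linear model: mu depends on diagnosis year, sigma does not."""
    return beta0 + beta1 * (year - ref_year)

# Hypothetical coefficients, not estimates from any data set.
beta0, beta1, sigma = 1.5, 0.03, 0.9

for year in (1980, 1990):
    mu = mu_for_year(year, beta0, beta1)
    curve = [round(weibull_survival(t, mu, sigma), 3) for t in (1, 2, 5, 10)]
    print(year, curve)
```

In this form, changing μ by an amount d only stretches the time axis by a factor exp(d), so the curves for different years have the same shape on a rescaled time axis, which is the behavior described above.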

However, if the main goal is a good fit, it may not be worth spending much time deciding which parameter a particular covariate goes into, because the fits of the competing models may be very close.

4. How do I interpret the cure fraction estimates?

Answer: The mixture cure model assumes that the survival function is S(t)=c+(1-c)G(t), and the cure fraction c is the value of the survival function as the time t goes to infinity. When relative survival is used as net survival, the cure fraction is the proportion of cancer patients whose survival experience is equivalent to that of the general cancer-free population.
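To make the definition concrete, here is a minimal sketch that evaluates S(t)=c+(1-c)G(t) with a lognormal distribution for G(t). The function names and the parameter values are illustrative assumptions, not CanSurv output.

```python
import math

def lognormal_survival(t, mu, sigma):
    """G(t) = P(T > t) for a lognormal latency distribution."""
    z = (math.log(t) - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def mixture_cure_survival(t, c, mu, sigma):
    """S(t) = c + (1 - c) * G(t), where c is the cure fraction."""
    return c + (1.0 - c) * lognormal_survival(t, mu, sigma)

# Illustrative values only (not estimates from any data set).
c, mu, sigma = 0.4, 1.0, 0.8

for t in (1, 5, 20, 100, 1000):
    print(t, round(mixture_cure_survival(t, c, mu, sigma), 4))
```

As t grows, G(t) goes to zero and the printed values level off at c, which is why the cure fraction is read as the long-run plateau of the survival curve.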

If a parametric distribution is used for G(t) in the mixture cure model, the survival curve extrapolates beyond the last event or the end of follow-up. For the mixture cure model with the Cox PH model as latency, the cure fraction is set at the level of the survival curve after the last event. Hence, it tends to overestimate the cure fraction. (See Yu B. et al. 2004 and Question 5 for more detail.)

5. When I fit a mixture cure model with different parametric distributions G(t), the cure fraction estimates are quite different. Why?

Answer: While all of the parametric models fit the data approximately the same in the observed range of data, they may produce different cure fraction estimates for some cancer sites. Relatively long follow-up is required to obtain an accurate cure fraction estimate. For more details, please see Yu B. et al. (2004).

6. Why are some options not available for some survival models?

Answer: Depending on the characteristics of the survival data and models, some options are not appropriate or have not been implemented in CanSurv, and are therefore not available. The restrictions are:

When the survival data is individually-listed, only Kaplan-Meier analysis is available; when the data is stratified with grouped survival time, only actuarial analysis is available.
When the survival data is individually-listed, only standard parametric and binary mixture cure survival models are available. The Cox PH model is not available. The graphical plots are not available for individually-listed data either.
When Cox PH model is used for grouped survival data, the option of trying different starting values is not available.

7. Which survival model and which distribution should I use to fit the survival data?

Answer: The analysis should be guided by the research interest, and exploratory analysis may help decide which survival model to use. The mixture cure models usually provide a better fit than the standard survival models if there is clear evidence that cure exists. However, we suggest using the mixture cure models only when the follow-up time is sufficient and most event times have been observed.

There are several criteria for selecting the "best" latency distribution. For example, the Akaike Information Criterion (AIC) is defined as AIC=-2*(Maximum log-likelihood value)+2*(Number of parameters), and the model with the minimum AIC value is selected as the best model.
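The AIC comparison can be sketched in a few lines. The log-likelihood values and parameter counts below are made up for illustration and do not come from any CanSurv run; in practice you would substitute the values reported for each fitted model.

```python
def aic(max_loglik, n_params):
    """AIC = -2 * (maximum log-likelihood) + 2 * (number of parameters)."""
    return -2.0 * max_loglik + 2.0 * n_params

# Hypothetical (maximum log-likelihood, number of parameters) per latency distribution.
fits = {
    "lognormal": (-1234.5, 3),
    "loglogistic": (-1233.9, 3),
    "Weibull": (-1236.2, 3),
}

aic_values = {name: aic(ll, k) for name, (ll, k) in fits.items()}
best = min(aic_values, key=aic_values.get)

for name, value in sorted(aic_values.items(), key=lambda item: item[1]):
    print(f"{name}: AIC = {value:.1f}")
print("Smallest AIC:", best)
```

Note that AIC only ranks the candidate models against each other; a small difference in AIC between two distributions does not by itself show that their cure fraction estimates agree.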

8. How can I create my own graphs?

Answer: You can export the output data file into a cross-tabulated (*.csv) file, which includes the actuarial and estimated survival curves. The data file can be opened by Excel, SAS or Splus to create the graphs you like.

9. Is there a suggested citation for CanSurv?

Answer: There are two suggested citations for CanSurv: one for the software and one for the methods. The citations are listed below and can also be accessed by selecting About from the CanSurv Help menu.

Software Citation: Cansurv, Version 1.3. February 2014; Statistical Research and Applications Branch, Data Modeling Branch, National Cancer Institute.
Methods Citation: Gamel JW, Weller EA, Wesley MN, Feuer EJ. Parametric cure models of relative and cause-specific survival for grouped survival times. Comput Methods Programs Biomed 2000;61:99-110.