Sample CanSurv Analysis

In this example, we analyze survival data by historic stage (localized, regional, and distant) for colorectal cancer patients diagnosed during 1973-2001 in the SEER nine cancer registries. The data were exported from SEER*Stat.

There are four steps involved in CanSurv analysis. As you review the description of each step in this exercise, you may view or download the files that were created in the process.

Step 1: Create an Input Data File for CanSurv

Step 2: Set Parameters in the CanSurv Program

Step 3: Execute the CanSurv Program

Step 4: View the CanSurv Results


Step 1: Create an Input Data File for CanSurv

Currently, CanSurv accepts only input data created by a SEER*Stat survival session. To create a new SEER*Stat survival session, select New from the File menu, and then choose Survival session. The default is grouped survival data. To create a case listing of survival data, select Case Listing from the Session menu. See the SEER*Stat help for information about survival sessions.

Because CanSurv models the net survival due to cancer, choose only relative survival or cause-specific survival on the Statistic tab in a SEER*Stat Survival session. The Selection tab is used to select the records to be included in the analysis, and the Table tab is used to determine which variables are used to stratify the results. The Parameters tab in a Survival session specifies the dates, intervals, and vital status involved in the calculation of survival time. The options to set on the Parameters tab differ depending on the version of SEER*Stat:

  • SEER*Stat Version 5.x.x or earlier - Use the default options on the Parameters tab.
  • SEER*Stat Version 6.x.x or later - The default options on the Parameters tab changed. The new defaults are Cumulative summary table, Number of Intervals=60 and Months Per Interval=1. To create the survival data for CanSurv, uncheck Cumulative summary and check Standard life table. Although CanSurv can analyze monthly data, the computation is slower than with yearly data. To speed up the calculation, set Months Per Interval=12 and change Number of Intervals accordingly.
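As a rough illustration of changing Number of Intervals "accordingly", the sketch below assumes that you want to keep the same total follow-up time when switching from monthly to yearly intervals. The function name is hypothetical and is not part of SEER*Stat or CanSurv.

```python
def number_of_intervals(total_followup_months, months_per_interval):
    """Return how many intervals are needed to cover the follow-up period.

    Assumption: the total follow-up time is held fixed and divides evenly
    into intervals of the requested length.
    """
    if months_per_interval <= 0:
        raise ValueError("months_per_interval must be positive")
    if total_followup_months % months_per_interval != 0:
        raise ValueError("follow-up time is not a whole number of intervals")
    return total_followup_months // months_per_interval


# SEER*Stat 6 defaults: 60 intervals of 1 month each, i.e. 60 months.
default_followup_months = 60 * 1

# Switching to yearly intervals (Months Per Interval=12) for the same span.
print(number_of_intervals(default_followup_months, 12))  # prints 5
```

If you need a longer follow-up period than the default 60 months, pass that total instead; the same division applies.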

Once a result matrix is created, it can be exported to a text (*.txt) data file. The associated information about the variable names, format, and the name and location of the data file is saved in the dictionary (*.dic) file. CanSurv uses the information in the dictionary file to read the data file. For more detail about creating input data for CanSurv, please see "export the survival matrix from SEER*Stat" in the CanSurv help system.

The input file for this example contains grouped relative survival data by historic stage for colorectal cancer patients diagnosed during 1973-2001. If you have the SEER*Stat software, you may open or download CanSurvSample.ssm, the SEER*Stat matrix file. The survival data, which include historic stage, life table, and expected survival, were exported to a text file (CanSurvSample.txt) using the SEER*Stat export feature. View the SEER*Stat export dictionary (CanSurvSample.dic) for more information regarding the contents of the sample input file.
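To make the idea of a grouped relative survival life table more concrete, the following sketch computes cumulative relative survival with the standard actuarial method. It is an illustration only: the tuple layout, the function name, and the numbers are made up, and the code neither reads CanSurvSample.txt nor claims to reproduce SEER*Stat's exact calculations.

```python
def cumulative_relative_survival(rows):
    """Compute cumulative relative survival for each interval.

    Each row is a hypothetical tuple:
        (alive_at_start, died, lost_to_followup, expected_interval_survival)

    Actuarial assumption: patients lost to follow-up are at risk for half
    of the interval, so the effective number at risk is n - w / 2.
    """
    cumulative_observed = 1.0
    cumulative_expected = 1.0
    result = []
    for alive, died, lost, expected in rows:
        effective_at_risk = alive - lost / 2.0
        if effective_at_risk <= 0 or expected <= 0:
            raise ValueError("each interval needs patients at risk and expected > 0")
        interval_observed = 1.0 - died / effective_at_risk
        cumulative_observed *= interval_observed
        cumulative_expected *= expected
        # Relative survival = observed survival / expected survival.
        result.append(cumulative_observed / cumulative_expected)
    return result


# Made-up yearly intervals, for illustration only.
example_rows = [
    (1000, 150, 20, 0.96),
    (830, 90, 30, 0.95),
    (710, 60, 40, 0.95),
]
for year, value in enumerate(cumulative_relative_survival(example_rows), start=1):
    print(f"Year {year}: cumulative relative survival = {value:.3f}")
```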

Step 2: Set Parameters in the CanSurv Program

The CanSurv program allows users to specify the model and output options using three tabs. The parameter settings can be saved in a CanSurv session (*.srs) file by choosing "save session" under the "File" menu.

Input File Tab

If the input file was extracted from SEER*Stat, a dictionary file (*.dic) and a data file (*.txt) were created. The SEER*Stat dictionary file and data file should be stored in the same directory. If the data file is not in the same directory as the dictionary file or has been moved, use the Browse button to choose the correct location and name of the data file. Note: Because the data file is linked to the dictionary file by its name, it cannot be opened if it has been renamed. Download the following files to use in this exercise:

  • CanSurvSample.dic - this is the SEER*Stat export dictionary to use in this exercise, and
  • CanSurvSample.txt - this is the data exported from SEER*Stat. This will be used as the Input Data File.
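Before opening the dictionary in CanSurv, it can help to confirm that both downloads sit side by side. The sketch below assumes, as in this exercise, that the data file has the same base name as the dictionary file with a .txt extension. The helper name is hypothetical and is not part of CanSurv.

```python
from pathlib import Path


def find_data_file(dic_path):
    """Return the .txt data file expected next to a SEER*Stat .dic file.

    Assumption: the data file shares the dictionary's base name and
    directory (e.g. CanSurvSample.dic and CanSurvSample.txt).
    """
    dic = Path(dic_path)
    if dic.suffix.lower() != ".dic":
        raise ValueError(f"not a dictionary file: {dic}")
    if not dic.is_file():
        raise FileNotFoundError(f"dictionary file not found: {dic}")
    data = dic.with_suffix(".txt")
    if not data.is_file():
        raise FileNotFoundError(f"data file not found next to dictionary: {data}")
    return data


if __name__ == "__main__":
    try:
        print("Data file found:", find_data_file("CanSurvSample.dic"))
    except (FileNotFoundError, ValueError) as err:
        print("Problem:", err)
```

If the check fails, move the files into the same folder rather than renaming either one.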

From the CanSurv File menu, select New Session, then Use SEER*Stat export data. Use the file dialog window to open the copy of CanSurvSample.dic that you downloaded. CanSurv will use the information in the dictionary file to identify whether the survival data are grouped or individually listed and whether the survival type is relative or cause-specific. The appropriate analysis method is chosen automatically depending on the survival data type. Be sure that the Input Data File is set properly. You may need to browse to the folder to which you downloaded CanSurvSample.txt.

Model Specifications Tab

The CanSurv program can fit standard parametric survival models, Cox proportional hazards (PH) models, mixture cure models, and mixture cure models with a power function. The default for standard survival models is the Cox proportional hazards model, and the default latency distribution G(t) for the cure models is the lognormal distribution. The computation specifications control the parameters of the Newton-Raphson algorithm. The matrix at the bottom specifies how the covariates are used in the analysis.

In this exercise, the default computation specifications were used. Both the standard survival model and the mixture cure model were fitted with the lognormal distribution. Historic stage was treated as a categorical variable; it was used in µ for the standard survival model and in (c, µ) for the mixture cure model. Please refer to the CanSurv program's help and FAQs 5 and 6 for more information regarding the options on this tab.
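For readers unfamiliar with cure models, the sketch below evaluates a mixture cure survival function with a lognormal latency distribution. It assumes the commonly used form S(t) = c + (1 - c) G(t), where c is the cure fraction and G(t) is the survival function of the uncured patients. The parameter values are invented for illustration and are not estimates from this exercise; CanSurv's own parameterization and fitting procedure are described in its help system.

```python
import math


def lognormal_survival(t, mu, sigma):
    """G(t) = 1 - Phi((ln t - mu) / sigma), the lognormal survival function."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if t <= 0:
        return 1.0  # everyone is alive at time zero
    z = (math.log(t) - mu) / sigma
    # 1 - Phi(z) written with the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def mixture_cure_survival(t, c, mu, sigma):
    """S(t) = c + (1 - c) * G(t), with cure fraction c in [0, 1]."""
    if not 0.0 <= c <= 1.0:
        raise ValueError("cure fraction c must be between 0 and 1")
    return c + (1.0 - c) * lognormal_survival(t, mu, sigma)


# Hypothetical parameters for one stage (assumed values, not fitted ones).
c, mu, sigma = 0.40, 1.0, 0.8
for years in (1, 5, 10, 25):
    print(f"t = {years:>2} years: S(t) = {mixture_cure_survival(years, c, mu, sigma):.3f}")
```

As t grows, G(t) goes to 0 and S(t) levels off at the cure fraction c, which is the plateau a cure model is meant to capture. Using stage in (c, µ) corresponds to letting both the plateau and the location of the latency distribution differ by stage.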

Output Specifications Tab

Enter the names of the output report files by clicking the Browse button. The output files for the exercise are SampleStandardLognormalOutput.txt and SampleLognormalCureOutput.txt for the two models. The graph options and report options determine which graphs and reports are generated. For this exercise, the plot of actuarial and estimated survival curves and the plot of deviance residuals were checked. If the deviance residuals scatter around 0 with no systematic pattern, the fit is usually acceptable. See the online help system for more information regarding the output options.

Step 3: Execute the CanSurv Program

For this exercise, both the standard survival model and the mixture cure model were fitted with the lognormal distribution. You may download the session files SampleStandardLognormal.srs and SampleLognormalCure.srs for the two models. To run a model, click the lightning bolt on the CanSurv toolbar or select Run on the menu.

Step 4: View the CanSurv Results

A new window with four tabs opens after the calculation finishes. The graphs under the Actuarial and Observed Survival Curves tab and the Deviance Residuals tab can be saved as bitmap (*.bmp) files by choosing Save Graph under the Output menu. The data tab can be exported to a cross-tabulated (*.csv) file by choosing Export Data under the Output menu.

The following graphs were created by CanSurv when the mixture cure model with lognormal latency distribution (SampleLognormalCure.srs) was executed. These are scatter plots of the actuarial and relative survival curves for localized, regional and distant colorectal cancer.

Localized Colorectal Cancer:

Regional Colorectal Cancer:

Distant Colorectal Cancer:

Last Updated: 02 Jan, 2020