SEER*Stat/DevCan Importing Tutorials
The following information and exercises will teach you how to:
- prepare cancer incidence and mortality data using SEER*Stat, and
- import it into DevCan for analysis.
These instructions assume that you have SEER*Stat installed and are familiar with how to use it. If you do not know how to perform SEER*Stat tasks such as building selection statements or creating user-defined variables, work through the SEER*Stat tutorials before starting these exercises.
This page provides general instructions for the process of preparing data in SEER*Stat and importing it into DevCan. To work through specific examples, see the exercises.
How to Import SEER*Stat Incidence and Mortality Data into DevCan
- Open two new Rate sessions in SEER*Stat.
- Select an Incidence database in one session, and a Mortality database in the other. Choose databases that share the same submission year and that both cover the time period for which you want statistics.
- In both sessions, select Rates (Crude) for your type of statistic.
- Define selection criteria in each session according to your interest. Because the SEER incidence databases only contain data for the geographical areas covered by the SEER registries, while the US Mortality databases contain data for the entire United States, it is recommended that you define appropriate geographical selection criteria if you are using the SEER incidence databases.
- When you edit the Incidence session, it is important that you construct your selection statement so that it will find only one tumor of each kind per person. You are interested in what percentage of the population has had a particular type of cancer, not in how many tumors anyone has had. (This is not an issue when you are working with a mortality database, because a mortality database can only have one record per person.) If you are working with only one cancer site, do this by marking the Select Only the First Matching Record for Each Person box on the Selection tab. Multiple cancer sites are more complex; see Exercise 2 for details.
- On each session's Table tab, choose the variables by which you would like your matrix to be organized.
  - You must choose corresponding variables (for example, "Age at Diagnosis" and "Age at Death", or "Site Recode" and "Cause of Death") and put them in the same places (i.e., in the same order from top to bottom on the Table tab) in both sessions.
  - To provide DevCan with the necessary data in the correct format, you must make the population-defining age variable the last variable listed (usually, the last Column variable).
  - In addition, the first age in the data must be 0, and the first characters of the name of each age group must be the numeric representation of that group's starting age (e.g., a grouping named "00-04" is acceptable, but "Ages 00-04" is not). DevCan will not allow the data to be imported if these conditions are not met. Please note: if you are using the "Age Recode with <1 year olds" variable in the SEER databases, you will need to create a user-defined copy of this variable without the "Unknown" age group.
  - Do not choose any variables that define the tumor itself. You have already ensured that your results will include only one tumor per person, and you do not want to risk excluding any of those tumors from the table. However, it is safe to choose variables that define the person, such as those in the first two categories ("Age at Diagnosis" and "Race, Sex, Year Dx" for Incidence databases, and "Age at Death" and "Race, Sex, Year Dth, State, Cnty, Reg" for Mortality databases).
- On each session's Output tab, set Display Rates as Cases Per to "100,000". Title your matrix and adjust the other settings according to your preference.
- Execute both sessions and save the matrices. These are the Incidence of Cancer and Cancer Mortality matrices.
- In the Mortality session, go to the Selection and Table tabs and remove all selection criteria and variables that specify the particular cancer site(s). Execute the session again and save the matrix under a different name. This is the All Causes of Mortality matrix.
- Export the matrices with the following settings:
  - Output Variables as: Numeric Representation
  - Line Delimiter: DOS/Windows (CR/LF)
  - Missing Character: Space
  - Field Delimiter: Tab
  - Check the boxes to Remove All Thousands Separators (Commas) and Remove Flags (Footnote), Prefix and Suffix Characters. Leave the other checkboxes unmarked.
- In DevCan, open the Database menu and select Import SEER*Stat Data.
- Use the Browse buttons next to each field to locate the appropriate exported SEER*Stat ".dic" files, then Execute the task.
- When prompted, enter a name for the new database in which this data will be saved. You can use this name later to retrieve the data you are importing.
- You may receive warning messages at this point, particularly if you are importing data on multiple cancer sites. Check that the warning messages do not say anything unexpected before you proceed.
- Select your desired values for the listed variables, and use the drop-down list on the toolbar to select how the statistics should be displayed.
- Execute the session. The results are displayed in the area at the bottom of the screen.
- You can Save and/or Print these reports as desired.
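The age-group naming rule in the steps above can be checked by hand before you export, or with a short script. The following is a minimal Python sketch and is not part of SEER*Stat or DevCan: the function name check_age_labels and the sample labels are hypothetical, and DevCan's own validation may differ in detail. It tests only the two conditions stated above: each age group name must begin with digits giving its starting age, and the first group must start at age 0.

```python
import re

def check_age_labels(labels):
    """Return a list of problems with age-group names (empty list if none).

    Hypothetical helper: it only mirrors the two naming conditions described
    in the steps above, not DevCan's actual import checks.
    """
    problems = []
    for label in labels:
        # Each name must begin with the numeric starting age, e.g. "00-04".
        if re.match(r"[0-9]+", label) is None:
            problems.append(f"{label!r} does not begin with its starting age")

    # The first age group in the data must start at age 0.
    first = re.match(r"[0-9]+", labels[0]) if labels else None
    if first is None or int(first.group()) != 0:
        problems.append("first age group does not start at age 0")
    return problems


if __name__ == "__main__":
    # Illustrative labels only; use the labels of your own age variable.
    for problem in check_age_labels(["00", "01-04", "05-09", "Unknown"]):
        print(problem)
```

A group named "Unknown" is reported because it does not begin with a number, which is consistent with the instruction to create a user-defined copy of the age variable without that group.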
Mortality Data Only
If you want to create a database with only mortality information, you can omit the incidence data. In that case, ignore the instructions that pertain to the Incidence of Cancer matrix, but set up the Cancer Mortality and All Causes of Mortality matrices as directed. When importing the data into DevCan, you need not enter anything in the Incidence Dictionary File field.
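Whether you export all three matrices or only the two mortality matrices, the exported data should follow the export settings listed in the steps above: CR/LF line delimiters, tab field delimiters, and no thousands separators. The following Python sketch shows one way to sanity-check an exported data file before importing it. It is an illustration under stated assumptions: the function name check_export_bytes is hypothetical, the data is assumed to be plain text read as raw bytes, and nothing is assumed about the meaning or number of the columns.

```python
def check_export_bytes(data):
    """Return a list of problems found in exported matrix data (empty if none).

    Hypothetical helper: it checks only the delimiter and separator settings
    named in the export step, not the file's column layout.
    """
    problems = []
    text = data.decode("ascii", errors="replace")

    # After removing every CR/LF pair, any leftover CR or LF means some
    # lines use a different line delimiter.
    leftover = text.replace("\r\n", "")
    if "\n" in leftover or "\r" in leftover:
        problems.append("line delimiter is not CR/LF everywhere")

    # Split into non-empty rows regardless of the line delimiter used.
    rows = [row for row in text.replace("\r\n", "\n").split("\n") if row]
    if not rows:
        problems.append("no data rows")
        return problems

    if any("\t" not in row for row in rows):
        problems.append("some rows have no tab field delimiter")
    if any("," in row for row in rows):
        problems.append("commas present (thousands separators not removed?)")
    if len({row.count("\t") for row in rows}) > 1:
        problems.append("rows have differing numbers of fields")
    return problems


if __name__ == "__main__":
    # Made-up sample rows for illustration; read your own file with
    # open(path, "rb").read() instead.
    sample = b"0\t0\t12.5\t1000\r\n0\t1\t 3.1\t2000\r\n"
    print(check_export_bytes(sample) or "no problems found")
```

The check works on bytes rather than on text opened in the default mode, because Python's text mode can translate line endings on read and would hide a wrong Line Delimiter setting.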