Data Source Files

A description of the files used as a data source for the ComPrev application.

A Data Source is composed of two files: the dictionary file (a file with a .INI extension) and the data file (a file with a .TXT extension).

The dictionary file will look something like this:

[Figure: example dictionary file]

The data file will look something like this:

[Figure: example data file]

Dictionary File

The dictionary file is in a format similar to that exported from SEER*Stat. It defines which variables will be used in ComPrev and, for each variable, the possible values and the label to show for each value. These are the labels and values shown in the Selection Tab of the Session Setup Window and in the Cohort Tree on the Main Window.

Format:

Field                    Description
[ByVars]                 There is only a single Variables section. It contains a list of all of the variables in the dictionary.
Var1 = <Variable Name>   The first variable is set to this name. There must be a corresponding [<Variable Name>] section lower in the dictionary defining the values for this variable.
Var2 = <Variable Name>   The second variable is set to this name.
...                      There can be as many variables as you like (but the practical limit is 10).

[<Variable Name>]        For each variable, there must be a section defining its values and their labels.
<Label> = <Value>        Each value that can be found in the data file must have a label.
...                      There can be as many labels and values as you like.
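
As a rough illustration of the layout described above, the following Python sketch reads a dictionary with the standard configparser module. It is not how ComPrev itself reads the file. The sample contents (Sex, Race, and their labels and values) and the function name load_dictionary are made up for this example, and the sketch assumes entries are written as <Label> = <Value>, as in the format description.

```python
import configparser

# Hypothetical dictionary contents: the variable names, labels, and values
# are invented for illustration and do not come from a real ComPrev file.
SAMPLE_DICTIONARY = """\
[ByVars]
Var1 = Sex
Var2 = Race
; Var3 = Region   (commented out, so it is ignored)

[Sex]
Male = 1
Female = 2

[Race]
White = 1
Black = 2
"""


def load_dictionary(text):
    """Return {variable name: {value: label}} in Var1, Var2, ... order."""
    parser = configparser.ConfigParser(
        delimiters=("=",),         # only '=' separates a key from its value
        comment_prefixes=(";",),   # lines starting with ';' are comments
        interpolation=None,        # treat '%' in labels as plain text
    )
    parser.optionxform = str       # keep the original case of keys and labels
    parser.read_string(text)

    variables = {}
    index = 1
    while parser.has_option("ByVars", f"Var{index}"):
        name = parser.get("ByVars", f"Var{index}")
        if not parser.has_section(name):
            raise ValueError(f"no [{name}] section for Var{index}")
        # Entries are <Label> = <Value>; invert them so a data value finds its label.
        variables[name] = {value: label for label, value in parser.items(name)}
        index += 1
    return variables


dictionary = load_dictionary(SAMPLE_DICTIONARY)
print(dictionary)
```

The result maps each variable to a lookup from data value to display label, which is the direction needed when reading the data file.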

 

Notes:

  • Dictionary File fields can be commented out by placing a semicolon at the beginning of the line. Commented-out lines are ignored by the ComPrev application.
  • To change a variable name, you need to change it in two places: first in the [ByVars] section, then in the [<Variable Name>] section. For example, suppose you want to change the name of the variable "Race" to "Population". In the Data.Ini file, you would first change the entry "Var2=Race" to "Var2=Population" in the [ByVars] section. Next, you would go down the file to the section named "[Race]" and change it to "[Population]".
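
The two-place rename described in the notes can be sketched in Python as follows. This is only an illustration working on the text of a dictionary file. The function name rename_variable and the sample contents are hypothetical, and editing the file by hand works just as well.

```python
import re

# Hypothetical dictionary contents, used only to demonstrate the rename.
SAMPLE = """\
[ByVars]
Var1=Sex
Var2=Race

[Sex]
Male=1
Female=2

[Race]
White=1
Black=2
"""


def rename_variable(text, old, new):
    """Rename a variable in its [ByVars] entry and in its own section header."""
    out = []
    section = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith(";"):
            out.append(line)                 # commented-out line: leave untouched
            continue
        header = re.fullmatch(r"\[(.+)\]", stripped)
        if header:
            section = header.group(1)
            if section == old:
                line = f"[{new}]"            # second place: the section header
        elif section == "ByVars" and "=" in stripped:
            key, value = (part.strip() for part in stripped.split("=", 1))
            if value == old:
                line = f"{key}={new}"        # first place: the VarN entry
        out.append(line)
    return "\n".join(out) + "\n"


renamed = rename_variable(SAMPLE, "Race", "Population")
print(renamed)
```

Labels and values inside the renamed section are left unchanged, since only the variable name differs.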

 

Data File

The data file is a text file in which each line is a record and the values within a record are separated by commas. ComPrev reads this file based on the variables defined in the Dictionary File. The first record of the file should be a header row listing the names of the fields being used. The remaining records contain values for all of the required fields. Each record represents a range of years over which the specific parameters are valid for a given cohort.
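
A minimal Python sketch of reading such a file with the standard csv module is shown here. The sample rows are invented and trimmed to a few columns for brevity. The sketch also assumes that the header names of the variable columns match the variable names in the dictionary, which this page does not state explicitly. The function name read_records is hypothetical.

```python
import csv
import io

# Hypothetical, shortened data file: a real file carries many more columns.
SAMPLE_DATA = """\
Sex,Race,Model,PropNotCured,Scale,Shape
1,1,p,0.41,2.5,1.1
2,1,e,0.38,2.7,1.0
"""

# Variable names as they would appear in the [ByVars] section (assumed here).
DICTIONARY_VARIABLES = ["Sex", "Race"]


def read_records(text, variables):
    """Parse comma-delimited text whose first record is the header row."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [name for name in variables if name not in header]
    if missing:
        raise ValueError(f"header is missing dictionary variables: {missing}")
    return list(reader)  # one dict per record, keyed by header name


records = read_records(SAMPLE_DATA, DICTIONARY_VARIABLES)
for record in records:
    print(record["Sex"], record["Race"], record["Model"])
```

All values are read as strings, so numeric parameters need to be converted before use.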

Format:

Field                      Required  Description
<Variable 1 Value>         Yes       A value from the dictionary file for the first variable.
<Variable 2 Value>         Yes       A value from the dictionary file for the second variable.
...                        Yes       There needs to be a value for each of the variables defined in the dictionary.
Model                      Yes       The incidence model used for these values: "p" for Polynomial Model or "e" for Exponential Model.
Validation                 No        Whether this entry is validated: "0" for not validated or "1" for validated.
StartAge                   No        The starting age of this record.
EndAge                     No        The ending age of this record.
StartYear                  No        The starting year of this record.
EndYear                    No        The ending year of this record.
CTTrend                    No
TrendBest                  No
PropNotCured               Yes       Parameter for the Survival Mixture Cure model formula.
Scale                      Yes       Parameter for the Survival Mixture Cure model formula.
Shape                      Yes       Parameter for the Survival Mixture Cure model formula.
PeriodCoef                 Yes       Parameter for the Survival Mixture Cure model formula.
AgeCoef                    Yes       Parameter for the Survival Mixture Cure model formula.
RaceCoef                   Yes       Parameter for the Survival Mixture Cure model formula.
AgeRef                     Yes       Parameter for the Survival Mixture Cure model formula.
PeriodRef                  Yes       Parameter for the Survival Mixture Cure model formula.
Inc1                       Yes       Parameter for the Incidence Polynomial model formula.
Inc2                       Yes       Parameter for the Incidence Polynomial model formula.
Inc3                       Yes       Parameter for the Incidence Polynomial model formula.
Inc4                       Yes       Parameter for the Incidence Polynomial model formula.
Inc5                       Yes       Parameter for the Incidence Polynomial model formula.
Inc6                       Yes       Parameter for the Incidence Polynomial model formula.
Inc7                       Yes       Parameter for the Incidence Polynomial model formula.
ShiftConst                 Yes       The reference age for the logistic function.
ScaleConst                 Yes       An arbitrary scale factor used to avoid the very large numbers and numerical instability that arise when taking powers.
SurvCov1_1 - SurvCov1_6    No        Survival covariance matrix entries used for calculating Analytical Variance.
SurvCov2_2 - SurvCov2_6    No        Survival covariance matrix entries used for calculating Analytical Variance.
SurvCov3_3 - SurvCov3_6    No        Survival covariance matrix entries used for calculating Analytical Variance.
SurvCov4_4 - SurvCov4_6    No        Survival covariance matrix entries used for calculating Analytical Variance.
SurvCov5_5 - SurvCov5_6    No        Survival covariance matrix entries used for calculating Analytical Variance.
SurvCov6_6                 No        Survival covariance matrix entry used for calculating Analytical Variance.
IncCov1_1 - IncCov1_7      No        Incidence covariance matrix entries used for calculating Analytical Variance.
IncCov2_2 - IncCov2_7      No        Incidence covariance matrix entries used for calculating Analytical Variance.
IncCov3_3 - IncCov3_7      No        Incidence covariance matrix entries used for calculating Analytical Variance.
IncCov4_4 - IncCov4_7      No        Incidence covariance matrix entries used for calculating Analytical Variance.
IncCov5_5 - IncCov5_7      No        Incidence covariance matrix entries used for calculating Analytical Variance.
IncCov6_6 - IncCov6_7      No        Incidence covariance matrix entries used for calculating Analytical Variance.
IncCov7_7                  No        Incidence covariance matrix entry used for calculating Analytical Variance.
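
The SurvCov and IncCov field ranges follow an index pattern (row i, columns i through n) that looks like the upper triangle of a symmetric matrix. The Python sketch below expands those ranges into individual field names and rebuilds a full matrix on that assumption. The symmetric interpretation is inferred from the field names rather than stated on this page, and the function names and the sample record are hypothetical.

```python
def cov_field_names(prefix, size):
    """List the field names for the upper triangle, e.g. SurvCov1_1 ... SurvCov6_6."""
    return [
        f"{prefix}{row}_{col}"
        for row in range(1, size + 1)
        for col in range(row, size + 1)
    ]


def matrix_from_record(record, prefix, size):
    """Build a full size x size matrix, mirroring the upper triangle (assumed symmetric)."""
    matrix = [[0.0] * size for _ in range(size)]
    for row in range(1, size + 1):
        for col in range(row, size + 1):
            value = float(record[f"{prefix}{row}_{col}"])
            matrix[row - 1][col - 1] = value
            matrix[col - 1][row - 1] = value   # mirrored entry below the diagonal
    return matrix


# Hypothetical record: each value encodes its own indices (row * 10 + col)
# so the placement is easy to check by eye.
survival_fields = cov_field_names("SurvCov", 6)
record = {name: str(int(name[7]) * 10 + int(name[9])) for name in survival_fields}

survival_matrix = matrix_from_record(record, "SurvCov", 6)
for matrix_row in survival_matrix:
    print(matrix_row)
```

With these sizes, the survival matrix uses 21 fields (6 x 6) and the incidence matrix uses 28 fields (7 x 7, with the prefix "IncCov").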