How Joinpoint Selects the Final Model

Selecting the model, that is, the number of joinpoints, can be done by using one of the following options in Joinpoint:

  • Permutation Test

Traditional BIC Methods

  • Bayesian Information Criterion (BIC)
  • BIC3
  • Modified BIC

Data Driven BIC Methods

  • Weighted BIC (WBIC)
  • Weighted BIC Alternative (WBIC-Alt)
  • Data Dependent Selection (DDS)

 

The permutation test, which was proposed in Kim et al. (2000) and implemented in the original version of Joinpoint, is computationally intensive. In addition to the permutation test, there are six methods based on information criteria, all of which are much more computationally efficient. These information criterion methods fall into two categories: Traditional BIC methods and Data Driven BIC methods. The Traditional BIC methods select the model that minimizes an objective function, which is either the sum of the model fit error and a penalty term (BIC and BIC3) or an asymptotic approximation of the Bayes factor (Modified BIC). The Data Driven BIC methods include the DDS method, introduced in Joinpoint v4.6.0.0, and two newer methods, Weighted BIC and Weighted BIC-Alt, introduced in v4.7.0.0.

The motivation for the Data Driven BIC methods is to determine the model selection method, BIC or BIC3, internally from the characteristics of the data. The basic idea is to use BIC if change sizes are relatively small, and to use BIC3 otherwise. In simulations, the permutation test has been shown to produce reasonable results (i.e., to select the correct number of joinpoints) when the effect size, a function of the size of the slope changes adjusted for the variability in the data, is large. The BIC3 method has been shown to produce results similar to the permutation test in situations where the permutation test performs well relative to the traditional BIC method. For smaller effect sizes, BIC performs better. In practice, however, the analyst does not know the true effect sizes, so the data driven methods were developed to approximate the effect sizes from the data. These methods either choose between BIC and BIC3 (Data Dependent Selection, DDS) or use a weighted average of BIC and BIC3 (Weighted BIC or Weighted BIC-Alt). Simulations have shown that, among the Data Driven BIC methods, DDS and Weighted BIC perform similarly; Weighted BIC is preferred because it has a stronger conceptual justification. Weighted BIC-Alt performs worst of the Data Driven BIC methods.

Therefore our overall recommendations are:

  • Use the permutation test if the user prefers the method that has the longest track record and generally produces parsimonious results (i.e. will detect fewer joinpoints than other approaches, especially if the slope changes are small).

  • Use the BIC3 method if the user would like to produce results similar to the permutation test, but computation time is an issue.

  • Use the Weighted BIC method if the user prefers a method that, on average, performs best across a wide range of situations. While the permutation test, BIC, and BIC3 might perform better in specific situations, the Weighted BIC is the most flexible in adapting to different situations. The Weighted BIC is being considered as the default for the Joinpoint software once more experience has been gained with it. The DDS method is a worthy competitor to the Weighted BIC; the Weighted BIC was selected as the preferred method for conceptual reasons rather than on the results of our simulations.

  • The Weighted BIC-Alt and Modified BIC are generally not recommended for most users, and are included for very specialized purposes.

 

Details of each method are given below:

 

Permutation Test

This method uses a sequence of permutation tests to ensure that the approximate probability of overall Type I error is less than the specified significance level (also called the alpha level; default = 0.05). Assuming the minimum number of joinpoints is left at its default value of 0, the "overall Type I error" is the probability of incorrectly concluding that the underlying model has one or more joinpoints when, in fact, the true underlying model has no joinpoints.
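As a rough illustration of how a sequence of tests can arrive at a number of joinpoints, the sketch below walks a null and an alternative model toward each other until they meet. It is a minimal sketch under stated assumptions, not Joinpoint's implementation: the function `p_value(k_null, k_alt)` is a hypothetical callable standing in for a single permutation test, and the per-test level `alpha / (max_jp - k_null)` is an assumed Bonferroni-style adjustment used here only to keep the overall error rate in check.

```python
def select_by_sequential_tests(p_value, min_jp, max_jp, alpha=0.05):
    """Pick a number of joinpoints with a sequence of pairwise tests.

    p_value(k_null, k_alt) is a hypothetical callable returning the
    p-value for testing k_null joinpoints against k_alt joinpoints.
    """
    k_null, k_alt = min_jp, max_jp
    while k_null < k_alt:
        # Assumed adjustment so the overall Type I error stays near alpha.
        level = alpha / (max_jp - k_null)
        if p_value(k_null, k_alt) < level:
            k_null += 1   # reject the simpler model, move it up
        else:
            k_alt -= 1    # keep the simpler model, shrink the alternative
    return k_null


# Made-up p-values for illustration only.
fake_p = {(0, 3): 0.001, (1, 3): 0.20, (1, 2): 0.30}
chosen = select_by_sequential_tests(lambda a, b: fake_p[(a, b)], 0, 3)
```

With these made-up p-values the first test rejects the model with no joinpoints, the next two tests do not reject the one-joinpoint model, and the procedure stops at one joinpoint.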

 

Bayesian Information Criterion (BIC)

The BIC value is a loglikelihood-based measure of model fit plus a penalty for the cost of extra parameters. The model with the minimum BIC value is selected as the optimal model.

 

BIC3

A modification of traditional BIC with a harsher penalty.
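To show how a harsher penalty can change which model attains the minimum, the sketch below scores candidate models with a fit term plus a penalty that grows with the number of joinpoints. The functional form, the names `information_criterion` and `select_model`, the penalty weights 2 and 3, and the error sums of squares are all assumptions made for illustration; the exact penalty terms used by Joinpoint may differ.

```python
import math


def information_criterion(sse, n_obs, n_joinpoints, penalty_per_joinpoint):
    """Fit term plus a penalty that grows with the number of joinpoints.

    Assumed illustrative form, not taken from the Joinpoint documentation.
    """
    fit = math.log(sse / n_obs)
    penalty = penalty_per_joinpoint * n_joinpoints * math.log(n_obs) / n_obs
    return fit + penalty


def select_model(sse_by_k, n_obs, penalty_per_joinpoint):
    """Return the number of joinpoints whose criterion value is smallest."""
    scores = {
        k: information_criterion(sse, n_obs, k, penalty_per_joinpoint)
        for k, sse in sse_by_k.items()
    }
    return min(scores, key=scores.get)


# Made-up error sums of squares for models with 0 to 3 joinpoints.
sse_by_k = {0: 4.0, 1: 2.0, 2: 1.5, 3: 1.45}
n_obs = 30

lighter = select_model(sse_by_k, n_obs, penalty_per_joinpoint=2.0)
harsher = select_model(sse_by_k, n_obs, penalty_per_joinpoint=3.0)
```

On this made-up data the lighter penalty selects two joinpoints and the harsher penalty selects one, which mirrors the more parsimonious behavior described for BIC3.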

 

Modified BIC

A modification of traditional BIC proposed to improve its performance.

  

Weighted BIC (WBIC)   [Beta Version]

Whereas Data Dependent Selection (DDS) internally uses either BIC or BIC3, depending on empirically determined cut-off values for its selection statistics, the Weighted BIC combines BIC and BIC3 through a weighted penalty term based on the characteristics of the data.
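The contrast between a hard switch and a weighted combination can be sketched as follows. This is only an illustration under assumptions: the per-joinpoint penalty weights 2 and 3, the names `dds_penalty` and `weighted_penalty`, and the `selection_stat`, `cutoff`, and `weight` inputs are hypothetical, and how Joinpoint actually derives them from the data is not shown here.

```python
BIC_PENALTY = 2.0   # assumed illustrative penalty weight standing in for BIC
BIC3_PENALTY = 3.0  # assumed illustrative penalty weight standing in for BIC3


def dds_penalty(selection_stat, cutoff):
    """Hard switch: BIC-like penalty below the cutoff, BIC3-like otherwise."""
    return BIC_PENALTY if selection_stat < cutoff else BIC3_PENALTY


def weighted_penalty(weight):
    """Blend of the two penalties; weight 0 gives BIC, weight 1 gives BIC3."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (1.0 - weight) * BIC_PENALTY + weight * BIC3_PENALTY


small_change = dds_penalty(selection_stat=0.4, cutoff=1.0)
large_change = dds_penalty(selection_stat=1.6, cutoff=1.0)
blended = weighted_penalty(0.25)
```

The switch can only return one of the two penalties, while the blend moves smoothly between them as the weight changes.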

 

Weighted BIC Alternative (WBIC-Alt)    [Beta Version]

A modification of BIC3 that is less conservative than WBIC.

 

Data Dependent Selection (DDS)   [Beta Version]

This procedure internally determines the model selection method, BIC or BIC3, based on the characteristics of the data. Its basic idea is to use BIC if change sizes are relatively small and BIC3 otherwise.