Data Dependent Choice of Model Selection Methods
The Permutation test is computationally intensive, whereas the information-based criteria (BIC, BIC3, and Modified BIC) are much more computationally efficient. Simulation studies of their performance indicated that (i) BIC detects a change with a small effect size well but tends to overestimate the number of joinpoints, (ii) Modified BIC is the most conservative of these selection methods and detects a change with a large effect size well, and (iii) BIC3 performs comparably to the Permutation test.
Consider the segmented line regression model \(y = \beta_{0} + \beta_{1}x + \delta_{1} ( x - \tau_{1} )^+ + \cdots + \delta_{\kappa} ( x - \tau_{\kappa} )^+ + \epsilon ,\) where \(\kappa\) is the unknown number of joinpoints, and \(a^+ = a\) if \(a > 0\), and 0 otherwise. Suppose that, with a pre-specified \(k_{max}\), a model with k joinpoints is selected by BIC or BIC3 \((0 \leq k \leq k_{max})\), and that its parameters are estimated as \(\hat{\tau_1}, \ldots, \hat{\tau_k} , \hat{\beta_0}, \hat{\beta_1}, \hat{\delta_1}, \ldots, \hat{\delta_k}\). For each i = 1, 2, ..., k, take the observations in the estimated ith and (i+1)st segments, that is, the observations whose x-values are in \(( \hat{\tau_{i-1}}, \hat{\tau_{i+1}} ]\), where \(\hat{\tau_0}\) is the smallest observed x-value minus 1 and \(\hat{\tau_{k+1}}\) is the largest observed x-value. Denote their x-values in ascending order by \(x_{j_1 + 1}, \ldots, x_{j_2}\), and let
\(z_i \ = \ \left( (x_{j_1 + 1} \ - \ \hat{\tau_i})^+, \ \ldots, \ (x_{j_2} - \hat{\tau_i})^+ \right)^T\) and
\(X_0 \ = \ \begin{pmatrix} 1 & x_{j_1+1} \\ \vdots & \vdots \\ 1 & x_{j_2} \end{pmatrix}\). Also let
\(\Delta_{i,i+1} \ = \ \hat{\delta_i}^2 z_i^T(I \ - \ H_0) z_i / \hat{\sigma}^2\), where \(H_0 \ = \ X_0(X_0^TX_0)^{-1}X_0^T\) and \(\hat{\sigma}^2\) is the mean squared error of the model with the maximum number of joinpoints, \(k_{max}\), and define
\(\Delta(k) \ = \ \min_{i=1, \ldots, k} \Delta_{i, i+1}\).
Note that the measure \(\Delta_{i, i+1}\) is motivated by a quantity related to the power of a test to detect a slope change of \(\delta\) in a simple linear regression model.
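The definitions of \(\Delta_{i, i+1}\) and \(\Delta(k)\) above can be sketched numerically as follows. This is only an illustration under stated assumptions, not the Joinpoint software's implementation: the function names `delta_pair` and `delta_k` and their arguments are hypothetical, the fitted joinpoints, slope changes, and \(\hat{\sigma}^2\) are assumed to be supplied by a separate fitting step, and the sketch uses the fact that \(z_i^T(I - H_0)z_i\) equals the residual sum of squares from regressing \(z_i\) on the columns of \(X_0\). The text does not define \(\Delta(k)\) for k = 0, so the sketch raises an error in that case rather than guessing a value.

```python
import numpy as np

def delta_pair(x, tau_prev, tau_i, tau_next, delta_hat, sigma2_hat):
    """Delta_{i,i+1} for the two segments on either side of joinpoint tau_i."""
    x = np.sort(np.asarray(x, dtype=float))
    # Observations whose x-values fall in (tau_{i-1}, tau_{i+1}].
    seg = x[(x > tau_prev) & (x <= tau_next)]
    # z_i = ((x - tau_i)^+) over those observations.
    z = np.clip(seg - tau_i, 0.0, None)
    # X_0 has a column of ones and a column of x-values.
    X0 = np.column_stack([np.ones_like(seg), seg])
    # z^T (I - H_0) z is the residual sum of squares of z regressed on X_0.
    coef, *_ = np.linalg.lstsq(X0, z, rcond=None)
    resid = z - X0 @ coef
    return delta_hat ** 2 * float(resid @ resid) / sigma2_hat

def delta_k(x, taus, deltas, sigma2_hat):
    """Delta(k): the minimum of Delta_{i,i+1} over i = 1, ..., k."""
    x = np.asarray(x, dtype=float)
    k = len(taus)
    if k == 0:
        # Assumption of this sketch: the k = 0 case is left undefined.
        raise ValueError("Delta(k) is not defined here for k = 0")
    # tau_0 = min(x) - 1 and tau_{k+1} = max(x), as in the text.
    bounds = [x.min() - 1.0, *taus, x.max()]
    return min(
        delta_pair(x, bounds[i - 1], bounds[i], bounds[i + 1],
                   deltas[i - 1], sigma2_hat)
        for i in range(1, k + 1)
    )
```

For example, `delta_k(range(1, 11), [5.0], [2.0], 1.0)` evaluates the measure for a single fitted joinpoint at x = 5 with an estimated slope change of 2 and \(\hat{\sigma}^2 = 1\); these inputs are made up purely for illustration.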
Given two pre-specified values, c and d, as cutoff values, the number of joinpoints is estimated as \(\hat{\kappa}\) according to the following steps:
Step 1: Estimate the number of joinpoints using both BIC and BIC3 and call them \(\hat{\kappa_{BIC}}\) and \(\hat{\kappa_{BIC3}}\), respectively.
Step 2: If \(\hat{\kappa_{BIC}}\) = \(\hat{\kappa_{BIC3}}\), then report it as \(\hat{\kappa}\).
Step 3: If \(\hat{\kappa_{BIC}} \ \neq \ \hat{\kappa_{BIC3}}\), compute \(\Delta_{max} \ = \ \max (\Delta(\hat{\kappa_{BIC}}), \ \Delta(\hat{\kappa_{BIC3}}))\) and \(\Delta_{min} \ = \ \min (\Delta(\hat{\kappa_{BIC}}), \ \Delta(\hat{\kappa_{BIC3}}))\).
Step 4: Use BIC if \(\Delta_{max} \ \leq \ c\) or \(\Delta_{diff} \ = \ \Delta_{max} - \ \Delta_{min} \ > \ d\) (that is, \(\hat{\kappa} = \hat{\kappa_{BIC}}\)), and use BIC3 otherwise (that is, \(\hat{\kappa} = \hat{\kappa_{BIC3}}\)).
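Steps 2 through 4 amount to a small decision rule, sketched below. The function name `choose_joinpoints` and its parameters are hypothetical; the sketch assumes that the two estimates from Step 1 and the two values \(\Delta(\hat{\kappa_{BIC}})\) and \(\Delta(\hat{\kappa_{BIC3}})\) have already been computed elsewhere, and it does not perform any model fitting itself.

```python
def choose_joinpoints(k_bic, k_bic3, delta_bic=None, delta_bic3=None,
                      c=10.0, d=200.0):
    """Return the estimated number of joinpoints from the BIC and BIC3 choices.

    k_bic, k_bic3         : numbers of joinpoints selected by BIC and BIC3
    delta_bic, delta_bic3 : Delta(k_bic) and Delta(k_bic3); only needed
                            when the two selections disagree
    c, d                  : pre-specified cutoff values
    """
    # Step 2: if both criteria agree, report the common value.
    if k_bic == k_bic3:
        return k_bic
    # Step 3: largest and smallest of the two Delta values.
    delta_max = max(delta_bic, delta_bic3)
    delta_min = min(delta_bic, delta_bic3)
    # Step 4: use BIC if Delta_max <= c or Delta_max - Delta_min > d,
    # and use BIC3 otherwise.
    if delta_max <= c or (delta_max - delta_min) > d:
        return k_bic
    return k_bic3
```

With the default cutoffs, a call such as `choose_joinpoints(3, 1, delta_bic=50.0, delta_bic3=80.0)` falls through to BIC3, because the larger value exceeds c and the difference of 30 does not exceed d. The input values here are invented for illustration only.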
Based on a simulation study that examined the performance of the new model selection procedure with various choices of c and d, we recommend using c=10 and d=200. Among the values of c and d considered, these choices were observed to perform best with respect to the goal of being at least as good as BIC3 and improving on BIC3 when BIC performs better than BIC3. Further details can be found in a technical report that is available upon request.
References
- Consistent Model Selection in Segmented Line Regression
- A Modified Bayes Information Criterion with Applications to the Analysis of Comparative Genomic Hybridization Data