Random Number Generation

The program uses permutation tests to select the number of joinpoints. Because fitting every possible permutation of the data would take too long (the number of permutations grows factorially with the number of data points), the program instead fits a Monte Carlo sample of the permuted data sets, drawn with a random number generator, and uses that sample to calculate the p-value for each test in the series.

This section discusses how the choice of the number of permuted data sets, N, affects the results. Smaller values of N make the program run faster; larger values of N give a more precise p-value. A larger N also reduces the chance that a second analysis of the same data, run with different random number generator seeds, reaches a different answer.
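
The following Python sketch illustrates the general idea of a Monte Carlo permutation p-value. It is not the Joinpoint test: the statistic (an absolute difference in group means), the function name monte_carlo_p_value, and the (count + 1) / (N + 1) form of the p-value are all assumptions chosen for illustration.

```python
import random

def monte_carlo_p_value(x, y, n_perm=999, seed=12345):
    """Estimate a permutation p-value from n_perm random shuffles.

    Hypothetical helper for illustration; the statistic below is a
    stand-in, not the statistic used by the Joinpoint program.
    """
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n_x = len(x)

    def stat(values):
        # Illustrative statistic: absolute difference in group means.
        a, b = values[:n_x], values[n_x:]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = stat(pooled)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # one Monte Carlo permutation of the data
        if stat(pooled) >= observed:
            extreme += 1
    # Assumed convention: count the observed data as one more arrangement.
    return (extreme + 1) / (n_perm + 1)

p = monte_carlo_p_value([1, 2, 3, 4, 5], [11, 12, 13, 14, 15], n_perm=99)
print(p)
```

Under this convention the smallest reportable p-value is 1 / (N + 1), which is one reason a larger N allows finer resolution at the cost of more shuffles.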

Computer programs produce pseudo-random numbers with algorithms that mimic randomness; this program uses them to shuffle (permute) the errors. Each algorithm starts from one or more seeds, and the same seeds always produce the same sequence of pseudo-random numbers. The program addresses the problem of two analyses reaching different answers from the same data by specifying default random number generator seeds. As long as no parameters change (including the seeds and N), repeating the analysis produces the same results. Two runs of the same analysis with different seeds, however, could produce different answers.
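
As a small illustration of seeded repeatability, the Python sketch below shuffles a list of errors using the standard library's random.Random. The function name shuffled_errors, the error values, and the seed values are hypothetical; they are not the defaults or the generator used by the program.

```python
import random

def shuffled_errors(errors, seed):
    """Return a permuted copy of errors using a generator started from seed."""
    rng = random.Random(seed)  # the seed fixes the whole pseudo-random sequence
    permuted = list(errors)    # copy so the caller's list is left unchanged
    rng.shuffle(permuted)
    return permuted

errors = [0.3, -1.2, 0.8, 0.1, -0.5, 1.4]  # made-up values for illustration
run_a = shuffled_errors(errors, seed=7160)
run_b = shuffled_errors(errors, seed=7160)
run_c = shuffled_errors(errors, seed=2024)

print(run_a == run_b)  # same seed: identical permutation
print(run_c)           # different seed: generally a different permutation
```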

To show how much results could change for someone using different random number generator seeds, the tables below list 99% confidence intervals for selected p-values. For example, if the program returns a p-value of 0.04 with N = 999 Monte Carlo samples, there is approximately a 99% chance that another researcher repeating the analysis with a very large N (the ideal case, as N approaches infinity) would obtain a p-value between 0.0250 and 0.0577.

N=99
Lower 99% CI	Estimate	Upper 99% CI
0.0000	0.01	0.0521
0.0034	0.04	0.1065
0.0069	0.05	0.1218
0.0111	0.06	0.1364
0.0325	0.10	0.1910
0.2702	0.40	0.5281

N=999
Lower 99% CI	Estimate	Upper 99% CI
0.0031	0.01	0.0199
0.0250	0.04	0.0577
0.0331	0.05	0.0694
0.0415	0.06	0.0810
0.0762	0.10	0.1259
0.3595	0.40	0.4402

N=9999
Lower 99% CI	Estimate	Upper 99% CI
0.0075	0.01	0.0127
0.0350	0.04	0.0452
0.0445	0.05	0.0558
0.0540	0.06	0.0663
0.0923	0.10	0.1079
0.3873	0.40	0.4127

N=99999
Lower 99% CI	Estimate	Upper 99% CI
0.0092	0.01	0.0108
0.0384	0.04	0.0416
0.0482	0.05	0.0518
0.0581	0.06	0.0620
0.0978	0.10	0.1025
0.3960	0.40	0.4040
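
To see roughly how interval width shrinks as N grows, the Python sketch below computes a normal-approximation (Wald) interval for a Monte Carlo p-value. This is an assumption made for illustration: the method used to produce the tables above is not stated here, and the Wald interval will not reproduce the tabulated values exactly, especially for small N or p-values near 0. The function name wald_interval is hypothetical.

```python
import math
from statistics import NormalDist

def wald_interval(p_hat, n_perm, level=0.99):
    """Normal-approximation interval for a p-value estimated from n_perm samples.

    Illustrative only; not necessarily the method behind the tables.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 2.576 for level=0.99
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n_perm)
    # Clip to [0, 1], since a p-value cannot fall outside that range.
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)

for n_perm in (99, 999, 9999, 99999):
    lower, upper = wald_interval(0.04, n_perm)
    print(f"N={n_perm}: {lower:.4f} to {upper:.4f}")
```

Because the half-width is proportional to 1 / sqrt(N), multiplying N by 10 narrows the interval by a factor of about 3.2, which matches the general pattern in the tables.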

In most cases, let the program use the default seeds for the random number generator. With the default seeds, you can duplicate the results of a previous Joinpoint session even though the program uses Monte Carlo sampling. Although changing the seed is not generally recommended, the Joinpoint Regression Program lets you do so by selecting Session > Preferences from the File menu.