Errors correlated following a non-specific covariance structure
In practice, especially for data from complex survey samples, observations may be correlated following a more general covariance structure than the heteroscedastic and autocorrelated error cases allow. Starting with Version 4.9, Joinpoint can read in a user-supplied general variance-covariance matrix for a sequence of aggregate-level observations, given on the original scale (that is, before any logarithmic transformation). With the variance-covariance matrix read in, Joinpoint calculates the weight matrix as described below and carries out weighted least squares fitting for the joinpoint linear model y = xb and the log-linear model ln(y) = xb.
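Suppose that \(V = (v_{ij})\) is the estimated covariance matrix of \((y_1, \ldots, y_n)\), where
\[ v_{ii} = \widehat{\mathrm{Var}}(y_i), \quad i = 1, \ldots, n, \]
and
\[ v_{ij} = \widehat{\mathrm{Cov}}(y_i, y_j), \quad i \neq j. \]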
Then the weight matrix for the linear model y=xb is defined as the inverse of V.
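The weighting step can be sketched in a few lines of Python with NumPy. This is a minimal illustration of weighted (generalized) least squares with weight matrix equal to the inverse of V, not Joinpoint's actual implementation: the function name `gls_fit`, the example data, and the example covariance matrix are all assumptions made for illustration, and the design matrix is taken as fixed (that is, any joinpoint locations are assumed to be already known).

```python
import numpy as np

def gls_fit(X, y, V):
    """Weighted least squares with weight matrix W = V^{-1} (hypothetical helper).

    Returns the coefficient estimate b = (X' W X)^{-1} X' W y and the
    matrix (X' W X)^{-1}, the usual covariance estimate of b when V is
    treated as known.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    V = np.asarray(V, dtype=float)
    # Solve linear systems with V rather than forming V^{-1} explicitly.
    WX = np.linalg.solve(V, X)   # V^{-1} X
    Wy = np.linalg.solve(V, y)   # V^{-1} y
    XtWX = X.T @ WX
    b = np.linalg.solve(XtWX, X.T @ Wy)
    cov_b = np.linalg.inv(XtWX)
    return b, cov_b

# Illustrative (made-up) data: five time points, straight-line model.
t = np.arange(1.0, 6.0)
X = np.column_stack([np.ones_like(t), t])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])
# Made-up covariance matrix with correlation decaying in |t_i - t_j|.
V = 0.04 * (0.5 ** np.abs(np.subtract.outer(t, t)))

b, cov_b = gls_fit(X, y, V)
print("coefficients:", b)
print("standard errors:", np.sqrt(np.diag(cov_b)))
```

When V is the identity matrix, this reduces to ordinary least squares; off-diagonal entries of V change both the estimates and their standard errors.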
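For the log-linear model where ln(y)=xb, the variance-covariance matrix of ln(y) is approximated using the Taylor series linearization method as
\[ \tilde{V} = (\tilde{v}_{ij}), \]
where
\[ \tilde{v}_{ii} = \frac{v_{ii}}{y_i^2}, \quad i = 1, \ldots, n, \]
and
\[ \tilde{v}_{ij} = \frac{v_{ij}}{y_i \, y_j}, \quad i \neq j. \]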
Then, the weight matrix for the log-linear model is defined as the inverse of \(\tilde{V}\).
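As a small worked illustration, the first-order Taylor (delta method) approximation divides each covariance entry v_ij by the product y_i y_j. The Python sketch below assumes NumPy; the function name `log_scale_covariance` and the numbers are hypothetical and are not taken from Joinpoint.

```python
import numpy as np

def log_scale_covariance(y, V):
    """Approximate Cov(ln y) from Cov(y) by first-order Taylor linearization.

    Entry (i, j) of the result is V[i, j] / (y[i] * y[j]).
    """
    y = np.asarray(y, dtype=float)
    V = np.asarray(V, dtype=float)
    if np.any(y <= 0):
        raise ValueError("all observations must be positive to take logarithms")
    return V / np.outer(y, y)

# Made-up example with two observations.
y = np.array([2.0, 4.0])
V = np.array([[0.04, 0.01],
              [0.01, 0.16]])

V_tilde = log_scale_covariance(y, V)
W = np.linalg.inv(V_tilde)  # weight matrix for the log-linear fit
print(V_tilde)
```

Here the diagonal entries become 0.04 / 2² = 0.01 and 0.16 / 4² = 0.01, and the off-diagonal entry becomes 0.01 / (2 × 4) = 0.00125.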
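A similar approach can be used for the logit model logit(y) = xb when the input data \(y_i\), \(i = 1, \ldots, n\), lie strictly between 0 and 1. The variance-covariance matrix of logit(y) can be estimated as
\[ \tilde{V}^{*} = (\tilde{v}^{*}_{ij}), \]
where
\[ \tilde{v}^{*}_{ii} = \frac{v_{ii}}{\left[ y_i (1 - y_i) \right]^2}, \quad i = 1, \ldots, n, \]
and
\[ \tilde{v}^{*}_{ij} = \frac{v_{ij}}{y_i (1 - y_i) \, y_j (1 - y_j)}, \quad i \neq j. \]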
Please note that the Joinpoint software currently doesn’t have an option for fitting the logit regression model. Users may derive their own algorithms if the logit model is preferred.
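For users who want to experiment with the logit scale on their own, the same linearization can be sketched as follows. Since the derivative of logit(y) is 1 / (y(1 - y)), each covariance entry v_ij is divided by y_i(1 - y_i) y_j(1 - y_j). This is only an illustrative sketch assuming NumPy; the function name `logit_scale_covariance` and the example values are hypothetical, and nothing here is part of Joinpoint.

```python
import numpy as np

def logit_scale_covariance(y, V):
    """Approximate Cov(logit y) from Cov(y) by first-order Taylor linearization.

    Entry (i, j) of the result is V[i, j] / (d[i] * d[j]),
    where d[i] = y[i] * (1 - y[i]).
    """
    y = np.asarray(y, dtype=float)
    V = np.asarray(V, dtype=float)
    if np.any((y <= 0) | (y >= 1)):
        raise ValueError("all observations must lie strictly between 0 and 1")
    d = y * (1.0 - y)
    return V / np.outer(d, d)

# Made-up example with two proportions.
y = np.array([0.5, 0.2])
V = np.array([[0.0025, 0.0004],
              [0.0004, 0.0016]])

V_logit = logit_scale_covariance(y, V)
W_logit = np.linalg.inv(V_logit)  # weight matrix for a logit-scale fit
print(V_logit)
```

With these numbers, d = (0.25, 0.16), so the transformed matrix has diagonal entries 0.0025 / 0.0625 = 0.04 and 0.0016 / 0.0256 = 0.0625, and off-diagonal entry 0.0004 / 0.04 = 0.01.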
An important application of a general variance-covariance matrix is the aggregate-level trend analysis of complex survey data. Suppose that \(y_t\) denotes the summary of survey responses (e.g., proportions from dummy responses or means from continuous responses) based on \(n_t\) subjects at time t, for t = 1, ..., T, and that we fit a joinpoint regression model to (y1,...,yT) or to the logarithms of the summary responses. For this type of trend analysis with aggregate data, the T by T variance-covariance matrix of (y1,...,yT) can be obtained from standard survey software such as SAS or SUDAAN. For example, it can be computed from the individual-level survey data using the SAS SURVEYMEANS procedure with the COV option of the DOMAIN statement, where Time (or survey year) is treated as the domain variable (SAS/STAT 15.1 User’s Guide). The COV option displays the estimated covariance matrix of the domain means. It can also be computed using the SUDAAN DESCRIPT procedure by specifying the keyword COVMEAN in the OUTPUT statement (SUDAAN 11, RTI). For further information, please refer to Liu et al. (2022).
References
SAS Institute Inc. (2018). SAS/STAT 15.1 User’s Guide. The SURVEYMEANS procedure, https://documentation.sas.com/?docsetId=statug&docsetTarget=statug_surveymeans_syntax05.htm&docsetVersion=15.1&locale=en. Cary, NC: SAS Institute Inc.
Research Triangle Institute (2012). SUDAAN Language Manual, Volumes 1 and 2, Release 11. Research Triangle Park, NC: Research Triangle Institute.
Liu B, Kim H-J, Feuer EJ, Graubard BI (2022). Joinpoint regression methods of aggregate outcomes for complex survey data. Journal of Survey Statistics and Methodology, 00, 1-23. https://doi.org/10.1093/jssam/smac014.