Implements a survey-weighted mixed-effects model using the provided formula.
mix(formula, data, weights, cWeights = FALSE, center_group = NULL,
    center_grand = NULL, max_iteration = 10, nQuad = 13L, run = TRUE,
    verbose = FALSE, acc0 = 120, keepAdapting = FALSE, start = NULL,
    fast = FALSE, family = NULL)
formula  a formula object in the style of lme4 that creates the model. 

data  a data frame containing the raw data for the model. 
weights  a character vector of names of weight variables found in the data frame, starting with units (level 1) and increasing to larger groups. 
cWeights  logical, set to TRUE to use conditional weights; otherwise unconditional weights are expected. Defaults to FALSE. 
center_group  a list where the name of each element is the name of the aggregation level, and the element is a formula of variable names to be group mean centered; for example, to group mean center gender and age within the group student: center_group = list("student" = ~gender+age) 
center_grand  a formula of variable names to be grand mean centered; for example, to center the variable education by the overall mean of education: center_grand = ~education 
max_iteration  an optional integer, for nonlinear models fit by adaptive quadrature, which limits the number of iterations allowed before quitting. Defaults to 10. This is used because if the likelihood surface is flat, models may run for a very long time without converging. 
nQuad  an optional integer number of quadrature points to evaluate models solved by adaptive quadrature. Only nonlinear models are evaluated with adaptive quadrature. See notes for additional guidelines. 
run  logical; TRUE runs the model, while FALSE provides partial output for debugging or testing. 
verbose  logical, default FALSE; set to TRUE to print the results of intermediate steps. 
acc0  integer, the precision of mpfr computations. Defaults to 120. Only used for nonlinear models. 
keepAdapting  logical, set to TRUE when the adaptive quadrature should adapt after every Newton step. Defaults to FALSE, which is faster but less accurate. 
start  optional numeric vector representing the point at which the model should start optimization; takes the shape of c(coef, vars) from results (see help). 
fast  logical; deprecated 
family  the family; optionally used to specify generalized linear mixed models. Currently only binomial(link="logit") is supported. 
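Taken together, a minimal call might look like the following sketch (this assumes the sleepstudy data from lme4; the weight columns W1 and W2 are hypothetical, constructed only for illustration):

```r
library(lme4)   # provides the sleepstudy example data
library(WeMix)

data(sleepstudy)
ss <- sleepstudy
ss$W1 <- 1  # hypothetical level-1 (unit) weights
ss$W2 <- 1  # hypothetical level-2 (group) weights

# Random-intercept model; weights are named from level 1 upward,
# and Days is grand mean centered
m <- mix(Reaction ~ Days + (1 | Subject), data = ss,
         weights = c("W1", "W2"), center_grand = ~Days)
```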
An object of class WeMixResults.
This is a list with elements:
function, the likelihood function.
numeric, the log-likelihood of the model.
numeric vector, the estimated coefficients of the model.
the group-level random effects.
the standard errors of the fixed effects, robust if robustSE was set to TRUE.
numeric vector, the random effect variances.
the theta vector.
the original call used.
integer, the number of levels in the model.
numeric, the intraclass correlation coefficient.
the conditional mean of the random effects.
inverse of the second derivative of the likelihood function.
the interclass correlation.
logical, indicates if adaptive quadrature was used for estimation.
the sigma value.
the number of observations in each group.
the variance data frame in the format of the variance data frame returned by lme4.
the variance-covariance matrix of the random effects.
the variance-covariance matrix of the fixed effects.
the variance-covariance matrix of the theta terms.
statistics regarding weights, by level.
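A fitted WeMixResults object can be inspected with the usual accessors; a sketch, assuming a model m fit as in the examples below:

```r
# m is a fitted WeMixResults object, e.g. from the sleepstudy example
summary(m)  # fixed effects, standard errors, and variance components
coef(m)     # the estimated fixed-effect coefficients
```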
Linear models are solved using a modification of the analytic solution developed by Bates and Pinheiro (1998). Nonlinear models are solved using adaptive quadrature following the method in Stata's GLLAMM (Rabe-Hesketh & Skrondal, 2006). For additional details, see the vignettes "Weighted Mixed Models: Adaptive Quadrature" and "Weighted Mixed Models: Analytical Solution", which provide extensive examples as well as a description of the mathematical basis of the estimation procedure and comparisons to model specifications in other common software.
Notes:
Standard errors of random effect variances are robust; see vignette for details.
To see the function that is maximized in the estimation of this model, see the section on "Model Fitting" in the Introduction to Mixed Effect Models With WeMix vignette.
When all weights above the individual level are 1, this is similar to a lmer model, and you should use lme4 because it is much faster.
If starting coefficients are not provided, they are estimated using lme4.
For nonlinear models, when the variance of a random effect is very low (< 0.1), WeMix does not estimate it, because very low variances create problems with numerical evaluation. In these cases, consider estimating without that random effect.
The model is estimated by maximum likelihood estimation.
To choose the number of quadrature points for nonlinear model evaluation, a balance is needed between accuracy and speed; estimation time increases quadratically with the number of points chosen. In addition, an odd number of points is traditionally used. We recommend starting at 13 and increasing or decreasing as needed.
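One way to apply this guideline is to refit a nonlinear model at adjacent odd values of nQuad and check that the log-likelihood has stabilized; a sketch, where the data frame df and the variables y, x, g, w1, and w2 are hypothetical stand-ins for your own data:

```r
# Refit a hypothetical weighted logistic mixed model at 13, 15, and 17
# quadrature points; if the reported log-likelihoods agree, the smaller
# number of points is sufficient.
for (q in c(13L, 15L, 17L)) {
  mq <- mix(y ~ x + (1 | g), data = df, weights = c("w1", "w2"),
            family = binomial(link = "logit"), nQuad = q)
  print(summary(mq))
}
```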
Paul Bailey, Claire Kelley, and Trang Nguyen
if (FALSE) {
library(lme4)
data(sleepstudy)
ss1 <- sleepstudy

# Create weights
ss1$W1 <- ifelse(ss1$Subject %in% c(308, 309, 310), 2, 1)
ss1$W2 <- 1

# Run random-intercept 2-level model
two_level <- mix(Reaction ~ Days + (1|Subject), data=ss1,
                 weights=c("W1", "W2"))

# Run random-intercept 2-level model with group-mean centering
grp_centered <- mix(Reaction ~ Days + (1|Subject), data=ss1,
                    weights=c("W1", "W2"),
                    center_group=list("Subject" = ~Days))

# Run three-level model with random slope and intercept.
# Add group variables for 3-level model
ss1$Group <- 3
ss1$Group <- ifelse(as.numeric(ss1$Subject) %% 10 < 7, 2, ss1$Group)
ss1$Group <- ifelse(as.numeric(ss1$Subject) %% 10 < 4, 1, ss1$Group)
# level-3 weights
ss1$W3 <- ifelse(ss1$Group == 2, 2, 1)
three_level <- mix(Reaction ~ Days + (1|Subject) + (1+Days|Group), data=ss1,
                   weights=c("W1", "W2", "W3"))

# Conditional Weights
# use vignette example
library(EdSurvey)

# read in data
downloadPISA("~/", year=2012)
cntl <- readPISA("~/PISA/2012", countries="USA")
data <- getData(cntl, c("schoolid","pv1math","st29q03","sc14q02","st04q01",
                        "escs","w_fschwt","w_fstuwt"),
                omittedLevels=FALSE, addAttributes=FALSE)

# Remove NA and omitted levels
om <- c("Invalid", "N/A", "Missing", "Miss", NA, "(Missing)")
for (i in 1:ncol(data)) {
  data <- data[!data[,i] %in% om,]
}

# relevel factors for model
data$st29q03 <- relevel(data$st29q03, ref="Strongly agree")
data$sc14q02 <- relevel(data$sc14q02, ref="Not at all")

# run with unconditional weights
m1u <- mix(pv1math ~ st29q03 + sc14q02 + st04q01 + escs + (1|schoolid),
           data=data, weights=c("w_fstuwt", "w_fschwt"))
summary(m1u)

# conditional weights
data$pwt2 <- data$w_fschwt
data$pwt1 <- data$w_fstuwt / data$w_fschwt

# run with conditional weights
m1c <- mix(pv1math ~ st29q03 + sc14q02 + st04q01 + escs + (1|schoolid),
           data=data, weights=c("pwt1", "pwt2"), cWeights=TRUE)
summary(m1c)
# the results are, up to rounding, the same in m1u and m1c,
# only the calls are different
}