Calculates the degrees of freedom for a statistic (or for a contrast between two statistics) based on the jackknife and imputation variance estimates.
Usage
DoFCorrection(
  varEstA,
  varEstB = varEstA,
  varA,
  varB = varA,
  method = c("WS", "JR")
)
Arguments
- varEstA
the varEstInput object returned from certain functions (such as lm.sdf when returnVarEstInputs=TRUE). The statistic named in varA must be present in this object. See Examples.
- varEstB
similar to the varEstA argument. If left blank, both are assumed to come from varEstA. When set, the degrees of freedom are for a contrast between varA and varB, and the varB values are taken from varEstB.
- varA
a character that names the statistic in the varEstA argument for which the degrees of freedom calculation is required.
- varB
a character that names the statistic in the varEstB argument for which a covariance is required. When varB is specified, the function returns the degrees of freedom for the contrast between varA and varB.
- method
a character that is either WS for the Welch-Satterthwaite formula or JR for the Johnson-Rust correction to the Welch-Satterthwaite formula
Details
The calculation assumes that statistics have little variance within strata and that some strata will contribute less than a full degree of freedom.
The function is not vectorized, so both varA and varB must contain exactly one variable name.
The method used to compute the degrees of freedom is described in the vignette titled Statistical Methods Used in EdSurvey, in the section “Estimation of Degrees of Freedom.”
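As a rough illustration of why unequal contributions reduce the degrees of freedom, the sketch below applies the generic Welch-Satterthwaite ratio to a vector of per-replicate deviations (replicate estimate minus full-sample estimate). The helper names wsDoF and jrDoF are hypothetical and are not EdSurvey functions. The sketch ignores imputation variance, and the multiplier 3.16 - 2.77/sqrt(J) is the Johnson-Rust factor as it is commonly stated for J jackknife replicates, assumed here rather than taken from the package source. The vignette remains the authoritative description.

```r
# Hypothetical helper, not part of EdSurvey: generic Welch-Satterthwaite
# degrees of freedom from per-replicate deviations.
# dev: replicate estimate minus full-sample estimate, one value per replicate
wsDoF <- function(dev) {
  sum(dev^2)^2 / sum(dev^4)
}

# Hypothetical helper: Johnson-Rust style adjustment of the value above.
# The factor (3.16 - 2.77 / sqrt(J)) is an assumption of this sketch.
jrDoF <- function(dev) {
  J <- length(dev)
  (3.16 - 2.77 / sqrt(J)) * wsDoF(dev)
}

# Ten replicates that contribute equally: the ratio equals the replicate count
equalDev <- rep(1, 10)
wsDoF(equalDev)

# One replicate dominates the variance: the ratio falls to just above 1
lopsidedDev <- c(10, rep(0.1, 9))
wsDoF(lopsidedDev)
```

With equal contributions every replicate supplies a full degree of freedom; when one replicate carries nearly all of the variance, the remaining replicates add almost nothing.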
References
Johnson, E. G., & Rust, K. F. (1992). Population inferences and variance estimation for NAEP data. Journal of Educational Statistics, 17, 175–190.
Examples
if (FALSE) { # \dontrun{
sdf <- readNAEP(path=system.file("extdata/data", "M36NT2PM.dat", package="NAEPprimer"))
lm1 <- lm.sdf(formula=composite ~ dsex + b017451, data=sdf, returnVarEstInputs=TRUE)
summary(lm1)
# this output agrees with summary of lm1 coefficient for dsex
DoFCorrection(lm1$varEstInputs,
              varA="dsexFemale",
              method="JR")
# second example, a covariance term requires more work
# first, estimate the covariance between two regression coefficients
# note that the variable names are parallel to what they are called in lm1 output
covFEveryDay <- varEstToCov(lm1$varEstInputs,
                            varA="dsexFemale",
                            varB="b017451Every day",
                            jkSumMultiplier=
                              EdSurvey:::getAttributes(data=sdf, attribute="jkSumMultiplier"))
# second, find the SE of the difference: sqrt(Var(A) + Var(B) - 2*Cov(A, B))
se <- sqrt(lm1$coefmat["dsexFemale","se"]^2 + lm1$coefmat["b017451Every day","se"]^2 -
           2*covFEveryDay)
# third, calculate the t-statistic
tv <- (coef(lm1)["dsexFemale"] - coef(lm1)["b017451Every day"])/se
# fourth, calculate the p-value, which requires the estimated degrees of freedom
dofFEveryDay <- DoFCorrection(lm1$varEstInputs,
                              varA="dsexFemale",
                              varB="b017451Every day",
                              method="JR")
# finally, the p-value
2*(1-pt(abs(tv), df=dofFEveryDay))
} # }
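The contrast steps in the example above (standard error of a difference, t-statistic, two-sided p-value) can be sketched with base R alone. Every number below is made up for illustration and is not a NAEP result; the variable names are hypothetical. In real use the estimates, standard errors, covariance, and degrees of freedom would come from lm.sdf, varEstToCov, and DoFCorrection.

```r
# Illustrative inputs only (not NAEP results)
estA  <- 2.5   # first coefficient
estB  <- 0.4   # second coefficient
seA   <- 0.8   # standard error of the first coefficient
seB   <- 1.1   # standard error of the second coefficient
covAB <- 0.2   # covariance between the two coefficients
dof   <- 40    # degrees of freedom for the contrast

# Var(A - B) = Var(A) + Var(B) - 2 * Cov(A, B); the SE is its square root
seDiff <- sqrt(seA^2 + seB^2 - 2 * covAB)

# t-statistic for the difference
tv <- (estA - estB) / seDiff

# two-sided p-value from the t distribution with the estimated dof
pValue <- 2 * pt(abs(tv), df = dof, lower.tail = FALSE)
pValue
```

Using lower.tail = FALSE instead of 1 - pt(...) avoids loss of precision when the p-value is very small.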