When the variance of a derived statistic (e.g., a difference) is
required, the covariance between the two statistics must be
calculated. This function uses the results returned by various
functions (e.g., lm.sdf) to find the covariance
between two statistics.
Usage
varEstToCov(
varEstA,
varEstB = varEstA,
varA,
varB = varA,
jkSumMultiplier,
returnComponents = FALSE
)
Arguments
- varEstA
a list of two data.frames returned by a function after the returnVarEstInputs argument was turned on. The statistic named in the varA argument must be present in each data.frame.
- varEstB
a list of two data.frames returned by a function after the returnVarEstInputs argument was turned on. The statistic named in the varB argument must be present in each data.frame. When the same as varEstA, the covariance is within one result.
- varA
a character string that names the statistic in the varEstA argument for which a covariance is required
- varB
a character string that names the statistic in the varEstB argument for which a covariance is required
- jkSumMultiplier
when the jackknife variance estimation method (or the balanced repeated replication (BRR) method) multiplies the final jackknife variance estimate by a value, set jkSumMultiplier to that value. For an edsurvey.data.frame or a light.edsurvey.data.frame, the recommended value can be recovered with EdSurvey::getAttributes(myData, "jkSumMultiplier").
- returnComponents
set to TRUE to return the imputation variance separate from the sampling variance
Value
a numeric value; the jackknife covariance estimate. If returnComponents is TRUE, returns a vector of
length three: V is the variance estimate, Vsamp is the sampling component of the variance, and Vimp is the imputation component of the variance.
Details
This function is not vectorized, so varA and
varB must each contain exactly one statistic name.
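Because each call handles a single pair of names, covariances for several statistics have to be collected one pair at a time. The following is a minimal sketch of that pattern, not part of EdSurvey: pairwiseCov and its covFun argument are hypothetical names, and covFun stands for any function that takes two statistic names and returns one number. The sketch assumes the covariance is symmetric, so each pair is computed once.

```r
# Hypothetical helper (not part of EdSurvey): build a covariance matrix by
# calling a scalar, non-vectorized covariance function once per pair of names.
pairwiseCov <- function(statNames, covFun) {
  n <- length(statNames)
  out <- matrix(NA_real_, nrow = n, ncol = n,
                dimnames = list(statNames, statNames))
  for (i in seq_len(n)) {
    for (j in seq_len(i)) {
      # exactly one name per argument on every call
      value <- covFun(statNames[i], statNames[j])
      out[i, j] <- value
      out[j, i] <- value  # assumes cov(A, B) == cov(B, A)
    }
  }
  out
}
```

With EdSurvey, covFun could be a small wrapper such as function(a, b) varEstToCov(varEstA=lm1$varEstInputs, varA=a, varB=b, jkSumMultiplier=jkSumMultiplier), where lm1 and jkSumMultiplier are assumed to have been created as in the Examples section.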
The method used to compute the covariance is in the vignette titled Statistical Methods Used in EdSurvey.
The method used to compute the degrees of freedom is in the vignette titled Statistical Methods Used in EdSurvey in the section “Estimation of Degrees of Freedom.”
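The vignette remains the authoritative description of the method. As a rough illustration only, a replicate-based (jackknife) sampling covariance is commonly formed by summing, over replicates, the product of each statistic's deviation from its full-sample estimate and then applying a multiplier. The sketch below assumes that general form; jkCovSketch is a hypothetical name, the numbers are made up, and any imputation component is left out.

```r
# Hypothetical helper (not EdSurvey's implementation): sampling covariance
# from replicate estimates, assuming the common jackknife form
#   multiplier * sum((A_j - A) * (B_j - B))
jkCovSketch <- function(repA, repB, estA, estB, multiplier = 1) {
  stopifnot(length(repA) == length(repB))
  multiplier * sum((repA - estA) * (repB - estB))
}

# Made-up replicate estimates, for illustration only
repA <- c(1.1, 0.9, 1.2, 0.8)
repB <- c(2.1, 1.9, 2.2, 1.8)

# covariance between the two statistics
covAB <- jkCovSketch(repA, repB, estA = 1, estB = 2)
# passing the same statistic twice gives its sampling variance
varA <- jkCovSketch(repA, repA, estA = 1, estB = 1)
```

In this form the multiplier plays the role that the jkSumMultiplier argument is described as playing above.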
Examples
if (FALSE) { # \dontrun{
# read in the example data (generated, not real student data)
sdf <- readNAEP(path=system.file("extdata/data", "M36NT2PM.dat", package = "NAEPprimer"))
# estimate a regression
lm1 <- lm.sdf(formula=composite ~ dsex + b017451, data=sdf, returnVarEstInputs=TRUE)
summary(lm1)
# estimate the covariance between two regression coefficients
# note that the variable names are parallel to what they are called in lm1 output
jkSumMultiplier <- EdSurvey::getAttributes(data=sdf, attribute="jkSumMultiplier")
covFEveryDay <- varEstToCov(varEstA=lm1$varEstInputs,
                            varA="dsexFemale",
                            varB="b017451Every day",
                            jkSumMultiplier=jkSumMultiplier)
# the estimated difference between the two coefficients
# note: unname prevents output from being named after the first coefficient
unname(coef(lm1)["dsexFemale"] - coef(lm1)["b017451Every day"])
# the standard error of the difference
# uses the formula SE(A-B) = sqrt(var(A) + var(B) - 2*cov(A,B))
sqrt(lm1$coefmat["dsexFemale", "se"]^2
     + lm1$coefmat["b017451Every day", "se"]^2
     - 2 * covFEveryDay)
} # }