When the variance of a derived statistic (e.g., a difference) is
required, the covariance between the two statistics must be
calculated. This function uses results generated by various
functions (e.g., lm.sdf) to find the covariance
between two statistics.
Usage
varEstToCov(
  varEstA,
  varEstB = varEstA,
  varA,
  varB = varA,
  jkSumMultiplier,
  returnComponents = FALSE
)
Arguments
- varEstA
a list of two data.frames returned by a function after the returnVarEstInputs argument was turned on. The statistic named in the varA argument must be present in each data.frame.
- varEstB
a list of two data.frames returned by a function after the returnVarEstInputs argument was turned on. The statistic named in the varB argument must be present in each data.frame. When the same as varEstA, the covariance is within one result.
- varA
a character that names the statistic in the varEstA argument for which a covariance is required
- varB
a character that names the statistic in the varEstB argument for which a covariance is required
- jkSumMultiplier
when the jackknife variance estimation method, or the balanced repeated replication (BRR) method, multiplies the final jackknife variance estimate by a value, set jkSumMultiplier to that value. For an edsurvey.data.frame or a light.edsurvey.data.frame, the recommended value can be recovered with EdSurvey::getAttributes(myData, "jkSumMultiplier").
- returnComponents
set to TRUE to return the imputation variance separate from the sampling variance
Value
a numeric value; the jackknife covariance estimate. If returnComponents is TRUE, returns a vector of
length three: V is the variance estimate, Vsamp is the sampling component of the variance, and Vimp
is the imputation component of the variance.
Details
This function is not vectorized, so varA and varB must each contain exactly one variable name.
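Because each call handles a single pair of names, covariances for several statistics have to be collected one pair at a time. The following is a minimal sketch of that pattern in base R. The helper pairwiseCov and the stand-in toyCovFun are hypothetical names used only for illustration and are not part of EdSurvey; with real results, covFun would be a small wrapper around varEstToCov.

```r
# Hypothetical helper (not part of EdSurvey): builds a symmetric covariance
# matrix by calling a one-pair-at-a-time function once per pair of names.
pairwiseCov <- function(statNames, covFun) {
  k <- length(statNames)
  out <- matrix(NA_real_, k, k, dimnames = list(statNames, statNames))
  for (i in seq_len(k)) {
    for (j in seq_len(i)) {
      # covariance is symmetric, so fill both triangles from one call
      out[i, j] <- out[j, i] <- covFun(statNames[i], statNames[j])
    }
  }
  out
}

# Stand-in covariance function with made-up values, so the sketch runs
# without EdSurvey. With a fitted model it could instead be, for example:
#   function(a, b) varEstToCov(varEstA = lm1$varEstInputs, varA = a,
#                              varB = b, jkSumMultiplier = jkSumMultiplier)
toyCov <- matrix(c(4, 1, 1, 9), 2, 2,
                 dimnames = list(c("x", "y"), c("x", "y")))
toyCovFun <- function(a, b) toyCov[a, b]

covMat <- pairwiseCov(c("x", "y"), toyCovFun)
```

Filling only the lower triangle and mirroring it roughly halves the number of calls, which matters when each call processes many replicate estimates.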
The method used to compute the covariance is described in the vignette titled Statistical Methods Used in EdSurvey.
The method used to compute the degrees of freedom is described in the vignette titled Statistical Methods Used in EdSurvey in the section “Estimation of Degrees of Freedom.”
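For intuition only, a generic jackknife-style covariance can be sketched in base R. This is a textbook form under stated assumptions, not EdSurvey's implementation: it ignores the imputation component, and the vignette remains the authoritative description. All names and values below (jkCov, repA, repB, the simulated replicate estimates) are hypothetical.

```r
# Simulated replicate estimates for two statistics (made-up values).
set.seed(1)
nRep  <- 62
fullA <- 1.5                      # full-sample estimate of statistic A
fullB <- -0.7                     # full-sample estimate of statistic B
shared <- rnorm(nRep, sd = 0.05)  # common noise that induces covariance
repA <- fullA + shared + rnorm(nRep, sd = 0.02)
repB <- fullB + shared + rnorm(nRep, sd = 0.02)
multiplier <- 1                   # plays the role of jkSumMultiplier

# Generic jackknife covariance: scaled sum of cross-products of the
# replicate deviations from the full-sample estimates.
jkCov <- function(repX, fullX, repY, fullY, multiplier) {
  multiplier * sum((repX - fullX) * (repY - fullY))
}

covAB  <- jkCov(repA, fullA, repB, fullB, multiplier)
jkVarA <- jkCov(repA, fullA, repA, fullA, multiplier)
jkVarB <- jkCov(repB, fullB, repB, fullB, multiplier)

# Variance of the difference A - B built from the three pieces ...
varDiff <- jkVarA + jkVarB - 2 * covAB
# ... agrees with jackknifing the difference directly.
directVarDiff <- multiplier * sum(((repA - repB) - (fullA - fullB))^2)
```

The agreement between varDiff and directVarDiff is an algebraic identity, which is why a covariance term is enough to obtain the variance of a difference without re-estimating the difference on every replicate.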
Examples
if (FALSE) { # \dontrun{
# read in the example data (generated, not real student data)
sdf <- readNAEP(path=system.file("extdata/data", "M36NT2PM.dat", package = "NAEPprimer"))
# estimate a regression
lm1 <- lm.sdf(formula=composite ~ dsex + b017451, data=sdf, returnVarEstInputs=TRUE)
summary(lm1)
# estimate the covariance between two regression coefficients
# note that the variable names are parallel to what they are called in lm1 output
jkSumMultiplier <- EdSurvey::getAttributes(data=sdf, attribute="jkSumMultiplier")
covFEveryDay <- varEstToCov(varEstA=lm1$varEstInputs,
                            varA="dsexFemale",
                            varB="b017451Every day",
                            jkSumMultiplier=jkSumMultiplier)
# the estimated difference between the two coefficients
# note: unname prevents output from being named after the first coefficient
unname(coef(lm1)["dsexFemale"] - coef(lm1)["b017451Every day"])
# the standard error of the difference
# uses the formula SE(A-B) = sqrt(var(A) + var(B) - 2*cov(A,B))
sqrt(lm1$coefmat["dsexFemale", "se"]^2
     + lm1$coefmat["b017451Every day", "se"]^2
     - 2 * covFEveryDay)
} # }