Calculate the standard deviation of a numeric variable in an edsurvey.data.frame.
Usage
SD(
data,
variable,
weightVar = NULL,
jrrIMax = 1,
varMethod = "jackknife",
dropOmittedLevels = TRUE,
defaultConditions = TRUE,
recode = NULL,
targetLevel = NULL,
jkSumMultiplier = getAttributes(data, "jkSumMultiplier"),
returnVarEstInputs = FALSE,
omittedLevels = deprecated()
)
Arguments
- data
an edsurvey.data.frame, an edsurvey.data.frame.list, or a light.edsurvey.data.frame
- variable
character vector of variable names
- weightVar
character weight variable name. Default is the default weight of data if it exists. If the given survey data do not have a default weight, the function will produce unweighted statistics instead. Can be set to NULL to return unweighted statistics.
- jrrIMax
a numeric value; when using the jackknife variance estimation method, the default estimation option, jrrIMax=1, uses the sampling variance from the first plausible value as the component for sampling variance estimation. The Vjrr term (see Statistical Methods Used in EdSurvey) can be estimated with any number of plausible values, and values larger than the number of plausible values on the survey (including Inf) will result in all plausible values being used. Higher values of jrrIMax lead to longer computing times and more accurate variance estimates.
- varMethod
deprecated parameter; SD always uses jackknife variance estimation
- dropOmittedLevels
a logical value. When set to the default value of TRUE, drops the omitted levels of the specified variable. Use print on an edsurvey.data.frame to see the omitted levels.
- defaultConditions
a logical value. When set to the default value of TRUE, uses the default conditions stored in an edsurvey.data.frame to subset the data. Use print on an edsurvey.data.frame to see the default conditions.
- recode
a list of lists to recode variables. Defaults to NULL. Can be set as recode = list(var1 = list(from = c("a", "b", "c"), to = "d")).
- targetLevel
a character string. When specified, calculates the gap in the percentage of students at targetLevel in the variable argument, which is useful for comparing the gap in the percentage of students at a survey response level.
- jkSumMultiplier
when the jackknife variance estimation method, or the balanced repeated replication (BRR) method, multiplies the final jackknife variance estimate by a value, set jkSumMultiplier to that value. For an edsurvey.data.frame or a light.edsurvey.data.frame, the recommended value can be recovered with EdSurvey::getAttributes(myData, "jkSumMultiplier").
- returnVarEstInputs
a logical value; set to TRUE to return the inputs to the jackknife and imputation variance estimates, which allows for the computation of covariances between estimates.
- omittedLevels
this argument is deprecated. Use dropOmittedLevels instead.
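To make the roles of jrrIMax and jkSumMultiplier more concrete, here is a minimal base R sketch of a jackknife sampling variance for a weighted standard deviation. It runs on simulated data and does not call EdSurvey. The function names (weightedSD, samplingVar), the simulated replicate weights, the multiplier value of 1, and the averaging of the per-plausible-value terms are all assumptions made for illustration; the vignette Statistical Methods Used in EdSurvey remains the authority on what the package actually computes.

```r
# Illustrative sketch only: simulated data, hypothetical helper names.
set.seed(42)
n    <- 200  # respondents
nPV  <- 5    # plausible values
nRep <- 20   # jackknife replicates

# Simulated plausible values (n x nPV) and a full-sample weight
pv <- matrix(rnorm(n * nPV, mean = 250, sd = 35), nrow = n)
w0 <- runif(n, 0.5, 2)

# Simplified replicate weights: replicate j drops group j and rescales the rest
grp  <- rep(seq_len(nRep), length.out = n)
repW <- sapply(seq_len(nRep), function(j) {
  ifelse(grp == j, 0, w0 * nRep / (nRep - 1))
})

# One simple definition of a weighted standard deviation (an assumption)
weightedSD <- function(x, w) {
  m <- sum(w * x) / sum(w)
  sqrt(sum(w * (x - m)^2) / sum(w))
}

# Jackknife sampling variance, using the first min(jrrIMax, nPV) plausible values
samplingVar <- function(pv, w0, repW, jrrIMax = 1, jkSumMultiplier = 1) {
  nUse <- min(jrrIMax, ncol(pv))  # Inf or large values use every plausible value
  perPV <- sapply(seq_len(nUse), function(p) {
    full <- weightedSD(pv[, p], w0)
    reps <- apply(repW, 2, function(w) weightedSD(pv[, p], w))
    jkSumMultiplier * sum((reps - full)^2)
  })
  mean(perPV)
}

v1   <- samplingVar(pv, w0, repW, jrrIMax = 1)    # first plausible value only
vAll <- samplingVar(pv, w0, repW, jrrIMax = Inf)  # all plausible values

# Imputation variance across plausible values (Rubin-style combination)
est  <- apply(pv, 2, weightedSD, w = w0)
vImp <- (1 + 1 / nPV) * var(est)

seTotal <- sqrt(vAll + vImp)
print(c(v1 = v1, vAll = vAll, vImp = vImp, seTotal = seTotal))
```

With jrrIMax = 1 the inner loop runs once, while jrrIMax = Inf repeats it for every plausible value, which is the source of the longer computing time described for jrrIMax.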
Value
a list object with elements:
- mean
the mean assessment score for variable, calculated according to the vignette titled Statistical Methods Used in EdSurvey
- std
the standard deviation of variable
- stdSE
the standard error of std
- df
the degrees of freedom of std
- varEstInputs
the variance estimate inputs used for calculating covariances with varEstToCov. Only returned when returnVarEstInputs is TRUE.
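As one possible way to combine the returned elements, the sketch below builds an approximate t-based confidence interval for the standard deviation from std, stdSE, and df. The helper sdConfInt is hypothetical and not part of EdSurvey, and the mock list uses made-up numbers that only mimic the documented element names. A t-interval is a common choice but is an assumption here, not something this page prescribes.

```r
# Hypothetical helper (not part of EdSurvey): t-based confidence interval
# for a standard deviation, built from the std, stdSE, and df elements.
sdConfInt <- function(res, level = 0.95) {
  stopifnot(all(c("std", "stdSE", "df") %in% names(res)))
  tCrit <- qt(1 - (1 - level) / 2, df = res$df)
  c(lower = res$std - tCrit * res$stdSE,
    upper = res$std + tCrit * res$stdSE)
}

# Mock result with made-up numbers, shaped like the documented return value
mockRes <- list(mean = 275, std = 36, stdSE = 0.5, df = 40)
ci <- sdConfInt(mockRes)
print(ci)
```

The same helper could be applied to a list returned by SD for a single edsurvey.data.frame, assuming the elements are named as listed above.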
Examples
if (FALSE) { # \dontrun{
# read in the example data (generated, not real student data)
sdf <- readNAEP(path=system.file("extdata/data", "M36NT2PM.dat", package="NAEPprimer"))
# get the standard deviation of the composite score for male students
SD(data = subset(sdf, dsex == "Male"), variable = "composite")
# get several standard deviations
# build an edsurvey.data.frame.list
sdfA <- subset(sdf, scrpsu %in% c(5,45,56))
sdfB <- subset(sdf, scrpsu %in% c(75,76,78))
sdfC <- subset(sdf, scrpsu %in% 100:200)
sdfD <- subset(sdf, scrpsu %in% 201:300)
sdfl <- edsurvey.data.frame.list(datalist=list(sdfA, sdfB, sdfC, sdfD),
                                 labels=c("A locations",
                                          "B locations",
                                          "C locations",
                                          "D locations"))
# this shows how these datasets will be described:
sdfl$covs
# SD results for each survey
SD(data = sdfl, variable = "composite")
# SD results more compactly and with comparisons
gap(variable="composite", data=sdfl, stDev=TRUE, returnSimpleDoF=TRUE)
} # }