Returns a summary table (as a data.frame)
that shows the number of students, the percentage of students, and the mean
value of the outcome (or left-hand side) variable by the
predictor (or right-hand side) variable(s).
Usage
edsurveyTable(
formula,
data,
weightVar = NULL,
jrrIMax = 1,
pctAggregationLevel = NULL,
returnMeans = TRUE,
returnSepct = TRUE,
varMethod = c("jackknife", "Taylor"),
drop = FALSE,
dropOmittedLevels = TRUE,
defaultConditions = TRUE,
recode = NULL,
returnVarEstInputs = FALSE,
omittedLevels = deprecated()
)
Arguments
- formula
object of class formula, potentially with a subject scale or subscale on the left-hand side and variables to tabulate on the right-hand side. When the left-hand side of the formula is omitted and returnMeans is TRUE, then the default subject scale or subscale is used. You can find the default composite scale and all subscales using the function showPlausibleValues. Note that the order of the right-hand side variables affects the output.
- data
object of class edsurvey.data.frame. See readNAEP for how to generate an edsurvey.data.frame.
- weightVar
character string indicating the weight variable to use. Note that only the name of the weight variable needs to be included here, and any replicate weights will be automatically included. When this argument is NULL, the function uses the default. Use showWeights to find the default.
- jrrIMax
a numeric value; when using the jackknife variance estimation method, the default estimation option, jrrIMax=1, uses the sampling variance from the first plausible value as the component for sampling variance estimation. The \(V_{jrr}\) term (see the Details section of lm.sdf for the definition of \(V_{jrr}\)) can be estimated with any number of plausible values, and values larger than the number of plausible values on the survey (including Inf) will result in all of the plausible values being used. Higher values of jrrIMax lead to longer computing times and more accurate variance estimates.
- pctAggregationLevel
the percentage variable sums up to 100 within each combination of levels of the first pctAggregationLevel right-hand side variables. So, when set to 0, the PCT column adds up to 100 across the entire sample. When set to 1, the PCT column adds up to 100 within each level of the first variable on the right-hand side of the formula; when set to 2, the PCT column adds up to 100 within the interaction of the first and second variables, and so on. Default is NULL, which will result in the lowest feasible aggregation level. See Examples section.
- returnMeans
a logical value; set to TRUE (the default) to get the MEAN and SE(MEAN) columns in the returned table described in the Value section.
- returnSepct
a logical value; set to TRUE (the default) to get the SE(PCT) column in the returned table described in the Value section.
- varMethod
a character set to jackknife or Taylor that indicates the variance estimation method to be used.
- drop
a logical value. When set to the default value of FALSE, a result with a single column is still represented as a data.frame and is not converted to a vector.
- dropOmittedLevels
a logical value. When set to the default value of TRUE, drops those levels of all factor variables that are specified in an edsurvey.data.frame. Use print on an edsurvey.data.frame to see the omitted levels.
- defaultConditions
a logical value. When set to the default value of TRUE, uses the default conditions stored in an edsurvey.data.frame to subset the data. Use print on an edsurvey.data.frame to see the default conditions.
- recode
a list of lists to recode variables. Defaults to NULL. Can be set as recode=list(var1=list(from=c("a", "b", "c"), to="c")).
- returnVarEstInputs
a logical value set to TRUE to return the inputs to the jackknife and imputation variance estimates, which allows for the computation of covariances between estimates.
- omittedLevels
this argument is deprecated. Use dropOmittedLevels.
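The effect of pctAggregationLevel can be illustrated without any survey data. The following is a minimal base R sketch, not the package's implementation: the data frame toy, its columns (dsex, lunch, w), and the output columns PCT0 and PCT1 are hypothetical names chosen for illustration. It computes weighted percentages that sum to 100 over the whole table (as with pctAggregationLevel=0) and within each level of the first variable (as with pctAggregationLevel=1).

```r
# Toy data: hypothetical column names and weights, not real survey data.
toy <- data.frame(
  dsex  = c("Male", "Male", "Male", "Female", "Female", "Female"),
  lunch = c("Yes", "No", "No", "Yes", "Yes", "No"),
  w     = c(1, 2, 1, 2, 1, 1)
)

# Weighted count for each cell of the dsex x lunch cross-tabulation.
cells <- aggregate(w ~ dsex + lunch, data = toy, FUN = sum)
names(cells)[names(cells) == "w"] <- "WTD_N"

# Aggregation level 0: percentages sum to 100 over the whole table.
cells$PCT0 <- 100 * cells$WTD_N / sum(cells$WTD_N)

# Aggregation level 1: percentages sum to 100 within each level of the
# first right-hand side variable (dsex in this toy example).
cells$PCT1 <- 100 * cells$WTD_N / ave(cells$WTD_N, cells$dsex, FUN = sum)

print(cells)
```

Summing PCT0 over all rows, or PCT1 within each level of dsex, gives 100 in each case.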
Value
A table with the following columns:
- RHS levels
one column for each right-hand side variable. Each row describes the students who are at the levels shown in that row.
- N
count of the number of students in the survey in the RHS levels
- WTD_N
the weighted N count of students in the survey in RHS levels
- PCT
the percentage of students at the aggregation level specified by pctAggregationLevel (see Arguments). See the vignette titled Statistical Methods Used in EdSurvey in the section “Estimation of Weighted Percentages” and its first subsection “Estimation of Weighted Percentages When Plausible Values Are Not Present.”
- SE(PCT)
the standard error of the percentage, accounting for the survey sampling methodology. When varMethod is jackknife, the calculation of this column is described in the vignette titled Statistical Methods Used in EdSurvey in the section “Estimation of the Standard Error of Weighted Percentages When Plausible Values Are Not Present, Using the Jackknife Method.” When varMethod is set to Taylor, the calculation of this column is described in “Estimation of the Standard Error of Weighted Percentages When Plausible Values Are Not Present, Using the Taylor Series Method.”
- MEAN
the mean assessment score for units in the RHS levels, calculated according to the vignette titled Statistical Methods Used in EdSurvey in the section “Estimation of Weighted Means When Plausible Values Are Present.”
- SE(MEAN)
the standard error of the MEAN column (the mean assessment score for units in the RHS levels), calculated according to the vignette titled Statistical Methods Used in EdSurvey in the sections “Estimation of Standard Errors of Weighted Means When Plausible Values Are Present, Using the Jackknife Method” or “Estimation of Standard Errors of Weighted Means When Plausible Values Are Present, Using the Taylor Series Method,” depending on the value of varMethod.
When returnVarEstInputs is TRUE, two additional elements are
returned: meanVarEstInputs and pctVarEstInputs, which
correspond to the MEAN and PCT columns, respectively. These two
objects can be used for calculating covariances with
varEstToCov.
Details
This method can be used to generate a simple one-way, two-way, or n-way table with unweighted and weighted n values and percentages. It also can calculate the average of the subject scale or subscale for students at each level of the cross-tabulation table.
A detailed description of all statistics is given in the vignette titled Statistical Methods Used in EdSurvey.
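To make the N, WTD_N, PCT, and MEAN columns concrete, here is a rough base R sketch for a one-way table. It is an illustration under stated assumptions, not the package's code: the data frame toy, the plausible value columns pv1 and pv2, the weight column w, and the helper one_level are all hypothetical. The mean is taken as the average over plausible values of the weighted mean, and no standard errors are computed.

```r
# Toy data: two hypothetical plausible values (pv1, pv2) and a weight (w).
toy <- data.frame(
  dsex = c("Male", "Male", "Female", "Female", "Female"),
  pv1  = c(280, 300, 290, 310, 270),
  pv2  = c(284, 296, 294, 306, 274),
  w    = c(1, 3, 2, 1, 1)
)

# Summaries for the rows at one level of the right-hand side variable.
one_level <- function(d) {
  # Weighted mean of each plausible value, then averaged across them.
  pv_means <- c(weighted.mean(d$pv1, d$w), weighted.mean(d$pv2, d$w))
  data.frame(N = nrow(d), WTD_N = sum(d$w), MEAN = mean(pv_means))
}

by_level <- lapply(split(toy, toy$dsex), one_level)
tab <- cbind(dsex = names(by_level), do.call(rbind, by_level))
rownames(tab) <- NULL

# Weighted percentage of each level over the whole (toy) sample.
tab$PCT <- 100 * tab$WTD_N / sum(tab$WTD_N)

print(tab)
```

The standard errors reported by edsurveyTable additionally depend on the replicate weights or Taylor series inputs, which this sketch leaves out.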
References
Binder, D. A. (1983). On the variances of asymptotically normal estimators from complex surveys. International Statistical Review, 51(3), 279–292.
Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York, NY: Wiley.
Examples
if (FALSE) { # \dontrun{
# read in the example data (generated, not real student data)
sdf <- readNAEP(path=system.file("extdata/data", "M36NT2PM.dat", package = "NAEPprimer"))
# create a table that shows only the breakdown of dsex
edsurveyTable(formula=composite ~ dsex, data=sdf, returnMeans=FALSE, returnSepct=FALSE)
# create a table with composite scores by dsex
edsurveyTable(formula=composite ~ dsex, data=sdf)
# add a second variable
edsurveyTable(formula=composite ~ dsex + b017451, data=sdf)
# add a third variable, do not omit any levels
edsurveyTable(formula=composite ~ dsex + b017451 + b003501, data=sdf, dropOmittedLevels=FALSE)
# three variables, do not omit any levels, change aggregation level
edsurveyTable(formula=composite ~ dsex + b017451 + b003501, data=sdf, dropOmittedLevels=FALSE,
              pctAggregationLevel=0)
edsurveyTable(formula=composite ~ dsex + b017451 + b003501, data=sdf, dropOmittedLevels=FALSE,
              pctAggregationLevel=1)
edsurveyTable(formula=composite ~ dsex + b017451 + b003501, data=sdf, dropOmittedLevels=FALSE,
              pctAggregationLevel=2)
# variance estimation using the Taylor series
edsurveyTable(formula=composite ~ dsex + b017451 + b003501, data=sdf, varMethod="Taylor")
} # }