
Returns achievement levels using weights and variance estimates appropriate for the edsurvey.data.frame.

Usage

achievementLevels(
  achievementVars = NULL,
  aggregateBy = NULL,
  data,
  cutpoints = NULL,
  returnDiscrete = TRUE,
  returnCumulative = FALSE,
  weightVar = NULL,
  jrrIMax = 1,
  dropOmittedLevels = TRUE,
  defaultConditions = TRUE,
  recode = NULL,
  returnNumberOfPSU = FALSE,
  returnVarEstInputs = FALSE,
  omittedLevels = deprecated()
)

Arguments

achievementVars

character vector indicating variables to be included in the achievement levels table, potentially with a subject scale or subscale. When the subject scale or subscale is omitted, the default subject scale or subscale is used. You can find the default composite scale and all subscales using the function showPlausibleValues.

aggregateBy

character vector specifying variables by which to aggregate achievement levels. The percentage column sums up to 100 for all levels of all variables specified here. When set to the default of NULL, the percentage column sums up to 100 for all levels of all variables specified in achievementVars.

data

an edsurvey.data.frame

cutpoints

numeric vector indicating cutpoints. Set to standard NAEP cutpoints for Basic, Proficient, and Advanced by default.

returnDiscrete

logical indicating if discrete achievement levels should be returned. Defaults to TRUE.

returnCumulative

logical indicating if cumulative achievement levels should be returned. Defaults to FALSE. The first and last categories are the same as defined for discrete levels.

weightVar

character string indicating the weight variable to use. Only the name of the weight variable needs to be included here, and any replicate weights will be automatically included. When this argument is NULL, the function uses the default. Use showWeights to find the default.

jrrIMax

a numeric value. When using the jackknife variance estimation method, the default estimation option, jrrIMax=1, uses the sampling variance from the first plausible value as the component for sampling variance estimation. The \(V_{jrr}\) term (see Statistical Methods Used in EdSurvey for the definition of \(V_{jrr}\)) can be estimated with any number of plausible values, and values larger than the number of plausible values on the survey (including Inf) will result in all plausible values being used. Higher values of jrrIMax lead to longer computing times and more accurate variance estimates.

dropOmittedLevels

a logical value. When set to the default value (TRUE), drops the omitted levels of all factor variables specified in achievementVars and aggregateBy. Use print on an edsurvey.data.frame to see the omitted levels.

defaultConditions

a logical value. When set to the default value of TRUE, uses the default conditions stored in an edsurvey.data.frame to subset the data. Use print on an edsurvey.data.frame to see the default conditions.

recode

a list of lists to recode variables. Defaults to NULL. Can be set as recode = list(var1= list(from=c("a", "b", "c"), to ="d")). See Examples.

returnNumberOfPSU

a logical value set to TRUE to return the number of primary sampling units (PSUs). Defaults to FALSE.

returnVarEstInputs

a logical value set to TRUE to return the inputs to the jackknife and imputation variance estimates, which allows for the computation of covariances between estimates.

omittedLevels

this argument is deprecated. Use dropOmittedLevels.

Value

A list containing up to two data frames, one for discrete achievement levels (when returnDiscrete is TRUE) and one for cumulative achievement levels (when returnCumulative is TRUE). Each data frame contains the following columns:

Level

one row for each level of the specified achievement cutpoints

Variables in achievementVars

one column for each variable in achievementVars and one row for each level of each variable in achievementVars

Percent

the percentage of students at each achievement level (discrete) or at or above each achievement level (cumulative), aggregated as specified by aggregateBy

StandardError

the standard error of the percentage, accounting for the survey sampling methodology. See the vignette titled Statistical Methods Used in EdSurvey.

N

the number of observations in the incoming data (the number of rows when dropOmittedLevels and defaultConditions are set to FALSE)

wtdN

the weighted number of observations in the data

nPSU

the number of PSUs at or above each achievement level aggregated as specified by aggregateBy. Only returned with returnNumberOfPSU=TRUE.

Details

The achievementLevels function applies appropriate weights and the variance estimation method for each edsurvey.data.frame, with several arguments for customizing the aggregation and output of the analysis results. Namely, by using these optional arguments, users can choose to generate the percentage of students performing at each achievement level (discrete), generate the percentage of students performing at or above each achievement level (cumulative), calculate the percentage distribution of students by achievement level (discrete or cumulative) and selected characteristics (specified in aggregateBy), and compute the percentage distribution of students by selected characteristics within a specific achievement level.

Calculation of percentages

The details of the methods are shown in the vignette titled Statistical Methods Used in EdSurvey in “Estimation of Weighted Percentages When Plausible Values Are Present” and are used to calculate all cumulative and discrete probabilities.

When the requested achievement levels are discrete (returnDiscrete = TRUE), the percentage \(\mathcal{A}\) is the percentage of students (within the categories specified in aggregateBy) whose scores lie in the range \([cutPoints_i, cutPoints_{i+1}), i = 0,1,...,n\). Here cutPoints are the score thresholds provided by the user in the cutpoints argument, with \(cutPoints_0\) taken to be 0 and the last interval (\(i = n\)) having no upper bound. cutPoints are set to NAEP standard cutpoints for achievement levels by default. To aggregate by a specific variable, for example, dsex, specify dsex in aggregateBy and all other variables in achievementVars. To aggregate by subscale, specify the name of the subscale (e.g., num_oper) in aggregateBy and all other variables in achievementVars.

When the requested achievement levels are cumulative (returnCumulative = TRUE), the percentage \(\mathcal{A}\) is the percentage of students (within the categories specified in aggregateBy) whose scores lie in the range [\(cutPoints_i\), \(\infty\)), \(i = 1, 2, ..., n-1\). The first and last categories are the same as defined for discrete levels.
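The interval definitions above can be illustrated with a small base R sketch. It does not call EdSurvey and is not the package's implementation: it uses a single made-up score per student rather than plausible values, and the scores, weights, and cutpoint values are assumptions chosen only to make the arithmetic easy to follow.

```r
# Made-up stand-in data: one score and one weight per student (assumed values).
scores  <- c(150, 220, 262, 280, 299, 310, 333, 340)
weights <- c(1, 2, 1, 1, 2, 1, 1, 1)

# Illustrative cutpoints (assumed; not necessarily the defaults for any assessment).
cutpoints <- c(Basic = 262, Proficient = 299, Advanced = 333)

# Discrete levels: findInterval() counts how many cutpoints are <= each score,
# which gives left-closed, right-open intervals [cutPoints_i, cutPoints_{i+1}).
level_labels <- c("Below Basic", paste("At", names(cutpoints)))
level <- factor(level_labels[findInterval(scores, cutpoints) + 1],
                levels = level_labels)

# Weighted percentage of students in each discrete level.
discrete_pct <- 100 * tapply(weights, level, sum) / sum(weights)

# Cumulative levels: weighted percentage of students scoring at or above each cutpoint.
at_or_above <- sapply(cutpoints, function(cp) {
  100 * sum(weights[scores >= cp]) / sum(weights)
})

# The first and last cumulative categories match the discrete ones.
cumulative_pct <- c(discrete_pct[1], at_or_above[1:2], discrete_pct[4])
names(cumulative_pct) <- c("Below Basic", "At or Above Basic",
                           "At or Above Proficient", "At Advanced")

print(discrete_pct)
print(cumulative_pct)
```

In this sketch the discrete percentages sum to 100, while the cumulative "At or Above" rows overlap by construction and so do not.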

Calculation of standard error of percentages

The method used to calculate the standard error of the percentages is described in the vignette titled Statistical Methods Used in EdSurvey in the sections “Estimation of the Standard Error of Weighted Percentages When Plausible Values Are Present, Using the Jackknife Method” and “Estimation of the Standard Error of Weighted Percentages When Plausible Values Are Not Present, Using the Taylor Series Method.” For “Estimation of the Standard Error of Weighted Percentages When Plausible Values Are Present, Using the Jackknife Method,” the value of jrrIMax sets the value of \(m^*\).
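As a rough illustration of how jrrIMax enters the jackknife case, the following base R sketch combines a sampling variance and an imputation variance in the style of Rubin (1987). It is a simplified sketch under stated assumptions, not the EdSurvey implementation: the function name combine_variance, the input layout, and the jackknife multiplier of 1 are all hypothetical, and the exact estimator is the one given in the vignette.

```r
# Hypothetical inputs: percentage estimates from M = 3 plausible values, and
# replicate-weight estimates (rows = jackknife replicates, columns = plausible values).
pv_estimates  <- c(30, 32, 31)
rep_estimates <- rbind(c(30.5, 32.5, 31.5),
                       c(29.5, 31.5, 30.5),
                       c(31.0, 33.0, 32.0),
                       c(29.0, 31.0, 30.0))

# combine_variance is a hypothetical helper written for this sketch only.
combine_variance <- function(pv_estimates, rep_estimates, jrrIMax = 1) {
  M <- length(pv_estimates)
  # Values of jrrIMax above M (including Inf) use all plausible values.
  m_star <- min(jrrIMax, M)
  # Sampling variance: for each of the first m_star plausible values, sum the
  # squared deviations of the replicate estimates (multiplier assumed to be 1),
  # then average across those plausible values.
  v_jrr <- mean(sapply(seq_len(m_star), function(p) {
    sum((rep_estimates[, p] - pv_estimates[p])^2)
  }))
  # Imputation variance: between-plausible-value variance scaled by (1 + 1/M).
  v_imp <- (1 + 1 / M) * var(pv_estimates)
  list(estimate = mean(pv_estimates),
       v_jrr = v_jrr,
       v_imp = v_imp,
       se = sqrt(v_jrr + v_imp))
}

res_first <- combine_variance(pv_estimates, rep_estimates, jrrIMax = 1)
res_all   <- combine_variance(pv_estimates, rep_estimates, jrrIMax = Inf)
print(res_first$se)
print(res_all$se)
```

With these made-up inputs both settings give the same result because every plausible value has the same replicate spread; with real data, larger values of jrrIMax average the sampling variance over more plausible values at the cost of more computation.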

References

Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York, NY: Wiley.

Author

Huade Huo, Ahmad Emad, and Trang Nguyen

Examples

if (FALSE) { # \dontrun{
# read in the example data (generated, not real student data)
sdf <- readNAEP(path=system.file("extdata/data", "M36NT2PM.dat", package="NAEPprimer"))

# discrete achievement levels
achievementLevels(achievementVars=c("composite"), aggregateBy=NULL, data=sdf)

# discrete achievement levels with a different subscale
achievementLevels(achievementVars=c("num_oper"), aggregateBy=NULL, data=sdf)

# cumulative achievement levels
achievementLevels(achievementVars=c("composite"), aggregateBy=NULL, data=sdf, 
                  returnCumulative=TRUE) 

# cumulative achievement levels with a different subscale
achievementLevels(achievementVars=c("num_oper"), aggregateBy=NULL, data=sdf, 
                  returnCumulative=TRUE) 

# achievement levels as independent variables, by sex aggregated by composite
achievementLevels(achievementVars=c("composite", "dsex"), aggregateBy="composite",
                  data=sdf, returnCumulative=TRUE) 

# achievement levels as independent variables, by sex aggregated by sex
achievementLevels(achievementVars=c("composite", "dsex"), aggregateBy="dsex", 
                  data=sdf, returnCumulative=TRUE) 

# achievement levels as independent variables, by race aggregated by race
achievementLevels(achievementVars=c("composite", "sdracem"),
                  aggregateBy="sdracem", data=sdf, returnCumulative=TRUE) 

# use customized cutpoints
achievementLevels(achievementVars=c("composite"), aggregateBy=NULL, data=sdf, 
                  cutpoints = c("Customized Basic" = 200, 
                                "Customized Proficient" = 300, 
                                "Customized Advanced" = 400))

# use recode to change values for specified variables:
achievementLevels(achievementVars=c("composite", "dsex", "b017451"),
                  aggregateBy = "dsex", data=sdf,
                  recode=list(b017451=list(from=c("Never or hardly ever",
                                                  "Once every few weeks",
                                                  "About once a week"),
                                           to="Infrequently"),
                              b017451=list(from=c("2 or 3 times a week",
                                                  "Every day"),
                                           to="Frequently")))
} # }