Prepares IRT parameters, scores items, and then estimates a linear model using direct estimation.

Usage

mml.sdf(
  formula,
  data,
  weightVar = NULL,
  dropOmittedLevels = TRUE,
  composite = TRUE,
  verbose = 0,
  multiCore = FALSE,
  numberOfCores = NULL,
  minNode = -4,
  maxNode = 4,
  Q = 34,
  idVar = NULL,
  returnMmlCall = FALSE,
  omittedLevels = deprecated()
)

Arguments

formula

a formula for the model.

data

an edsurvey.data.frame for the National Assessment of Educational Progress (NAEP) or the Trends in International Mathematics and Science Study (TIMSS). The attributes dichotParamTab, polyParamTab, testData, scoreCard (for NAEP), and scoreDict (for TIMSS) must not be NULL. Use the function setNAEPScoreCard or setAttributes to set these attributes.

weightVar

a character indicating the weight variable to use. The weightVar must be one of the weights for the edsurvey.data.frame. If NULL, it uses the default for the edsurvey.data.frame.

dropOmittedLevels

a logical value. When set to TRUE, drops the omitted levels of all factor variables that are specified in an edsurvey.data.frame. Use print on an edsurvey.data.frame to see the omitted levels. To draw plausible values for the full dataset, the user must set this to FALSE.

composite

logical; for a NAEP composite, setting to FALSE fits the model to all items at once, in a single construct, whereas setting to TRUE fits the model as a NAEP composite (i.e., a weighted average of the subscales). This argument is not applicable to TIMSS, which is always fit as an overall (non-composite) construct.

verbose

logical; indicates whether a detailed printout should display during execution, only for NAEP data.

multiCore

logical; when TRUE, allows the foreach package to be used for parallel processing. This function will set up and take down the cluster.

numberOfCores

the number of cores to use when multiCore is TRUE. Defaults to 75% of available cores. Users can check available cores with parallel::detectCores().

minNode

numeric; minimum integration point in direct estimation; see mml.

maxNode

numeric; maximum integration point in direct estimation; see mml.

Q

integer; number of integration points per student used when integrating over the levels of the latent outcome construct.

idVar

a variable that explicitly defines the name of the student identifier variable to be used from data. Defaults to NULL, in which case sid is used as the student identifier.

returnMmlCall

logical; when TRUE, does not process the mml call but instead returns it for the user to edit before calling.

omittedLevels

this argument is deprecated. Use dropOmittedLevels instead.
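
As a rough illustration of the numberOfCores default described above, the following sketch computes 75% of the available cores. The rounding rule (floor) and the fallback to one core are assumptions made for this example, not the package's documented behavior.

```r
library(parallel)

# Number of cores reported by the system; detectCores() can return NA
available <- detectCores()
if (is.na(available)) available <- 1

# Assumed rounding: take the floor of 75% and never go below one core
nCores <- max(1, floor(0.75 * available))
nCores
```

A value computed this way could be passed as numberOfCores together with multiCore = TRUE.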

Value

An mml.sdf object, which is the outcome from mml.sdf, with the following elements:

mml

an object containing information from the mml procedure. ?mml can be used for further information.

scoreDict

the scoring used in the mml procedure.

itemMapping

the item mapping used in the mml procedure.

Details

Typically, models are fit with NAEP data using plausible values to integrate out the uncertainty in the measurement of individual student outcomes. When direct estimation is used, the measurement error is integrated out explicitly using Q quadrature points. See documentation for mml in the Dire package.
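
To make the role of minNode, maxNode, and Q concrete, here is a minimal base-R sketch of integrating out one student's latent trait on a grid of Q points. It is an illustration only: the equally spaced grid, the rectangle rule, the 2PL items, and all parameter values are assumptions chosen for the example and are not taken from Dire.

```r
# Defaults from the mml.sdf usage
minNode <- -4
maxNode <- 4
Q <- 34

# Assumption: Q equally spaced quadrature nodes on [minNode, maxNode]
nodes <- seq(minNode, maxNode, length.out = Q)
width <- nodes[2] - nodes[1]

# Hypothetical student: the regression implies a normal distribution
# for the latent trait with mean mu and standard deviation sigma
mu <- 0.2
sigma <- 1

# Hypothetical 2PL items answered by this student
a <- c(1.0, 1.3, 0.8)   # discrimination
b <- c(-0.5, 0.0, 0.7)  # difficulty
x <- c(1, 1, 0)         # scored responses (1 = correct, 0 = incorrect)

# Likelihood of the response pattern at latent trait value theta
likelihood <- function(theta) {
  p <- plogis(a * (theta - b))
  prod(p^x * (1 - p)^(1 - x))
}

# Integrand at each node: response likelihood times the normal density
integrand <- vapply(nodes, likelihood, numeric(1)) *
  dnorm(nodes, mean = mu, sd = sigma)

# Rectangle-rule approximation to the student's marginal likelihood
marginalQuad <- sum(integrand) * width

# Adaptive integration over the whole real line, for comparison
marginalRef <- integrate(
  function(t) vapply(t, likelihood, numeric(1)) * dnorm(t, mu, sigma),
  lower = -Inf, upper = Inf
)$value

c(quadrature = marginalQuad, reference = marginalRef)
```

In a full model, a sum like this would be formed for every student and the regression coefficients chosen to maximize the weighted log of these marginal likelihoods; see the mml documentation for the actual procedure.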

The scoreDict maps response categories that are not simple item responses, such as Not Reached and Multiple, to values that can be passed as inputs to the mml function in Dire. How mml treats these values depends on the test. For NAEP, for a dichotomous item, 8 is scored as the same proportion correct as the guessing parameter for that item, 0 is an incorrect response, 1 is a correct response, and an NA does not change the student's score. TIMSS does not require a scoreDict.
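
The NAEP scoring rules above can be sketched for a single dichotomous item as follows. This is a hedged illustration, not Dire's implementation: the 3PL parameter values are hypothetical, and treating a score of 8 as a fractional response equal to the guessing parameter is one plausible reading of "the same proportion correct as the guessing parameter".

```r
# Hypothetical 3PL parameters for one dichotomous item
a <- 1.1  # discrimination
b <- 0.3  # difficulty
g <- 0.2  # guessing

# Probability of a correct response at latent trait value theta
p3pl <- function(theta) g + (1 - g) * plogis(a * (theta - b))

# Likelihood contribution of one scored response at theta
itemLikelihood <- function(score, theta) {
  # NA (e.g., Not Reached, Missing): the item does not change the score
  if (is.na(score)) return(1)
  p <- p3pl(theta)
  # Assumption: 8 (e.g., Multiple, Omitted) counts as a fraction g correct
  frac <- if (score == 8) g else score
  p^frac * (1 - p)^(1 - frac)
}

# Contributions of each coded value for a student at theta = 0
sapply(c(correct = 1, incorrect = 0, coded8 = 8, notReached = NA),
       itemLikelihood, theta = 0)
```

Under this reading, a response coded 8 always contributes a value between those of an incorrect and a correct response.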

References

Cohen, J., & Jiang, T. (1999). Comparison of partially measured latent traits across nominal subgroups. Journal of the American Statistical Association, 94(448), 1035–1044. https://doi.org/10.2307/2669917

Examples

if (FALSE) { # \dontrun{
## Direct Estimation with NAEP 
# Load data 
sdfNAEP <- readNAEP(path=system.file("extdata/data", "M36NT2PM.dat", package = "NAEPprimer"))

# Inspect scoring guidelines
defaultNAEPScoreCard()

# example output: 
#          resCat pointMult pointConst
# 1     Multiple         8          0
# 2  Not Reached        NA         NA
# 3      Missing        NA         NA
# 4      Omitted         8          0
# 5    Illegible         0          0
# 6 Non-Rateable         0          0
# 7     Off Task         0          0

# Run NAEP model; any warnings are about item codings
mmlNAEP <- mml.sdf(formula=algebra ~ dsex + b013801, data=sdfNAEP, weightVar='origwt')

# Summarize using Taylor series variance estimation
summary(mmlNAEP, varType="Taylor", strataVar="repgrp1", PSUVar="jkunit")

## Direct Estimation with TIMSS 
# Load data 
downloadTIMSS("~/", year=2015)
sdfTIMSS <- readTIMSS(path="~/TIMSS/2015", countries="usa", grade = "4")

# Run TIMSS model; any warnings are about item codings
mmlTIMSS <- mml.sdf(formula=mmat ~ itsex + asbg04, data=sdfTIMSS, weightVar='totwgt')

# Summarize using Taylor series variance estimation
summary(mmlTIMSS, varType="Taylor", strataVar="jkzone", PSUVar="jkrep")
} # }