Fits a quantile regression model that uses weights and variance estimates appropriate for the data.
Usage
rq.sdf(
  formula,
  data,
  tau = 0.5,
  weightVar = NULL,
  relevels = list(),
  jrrIMax = 1,
  dropOmittedLevels = TRUE,
  defaultConditions = TRUE,
  recode = NULL,
  returnNumberOfPSU = FALSE,
  omittedLevels = deprecated(),
  ...
)
Arguments
- formula
a formula for the quantile regression model. See rq. If y is left blank, the default subject scale or subscale variable will be used. (You can find the default using showPlausibleValues.) If y is a variable for a subject scale or subscale (one of the names shown by showPlausibleValues), then that subject scale or subscale is used.
- data
an edsurvey.data.frame, a light.edsurvey.data.frame, or an edsurvey.data.frame.list
- tau
the quantile to be estimated, a value between 0 and 1. Defaults to 0.5.
- weightVar
a character indicating the weight variable to use. The weightVar must be one of the weights for the edsurvey.data.frame. If NULL, it uses the default for the edsurvey.data.frame.
- relevels
a list. Used to change the contrasts from the default treatment contrasts to the treatment contrasts with a chosen omitted group (the reference group). The name of each element should be the variable name, and the value should be the group to be omitted (the reference group).
- jrrIMax
when using the jackknife variance estimation method, the default estimation option, jrrIMax=1, uses the sampling variance from the first plausible value as the component for sampling variance estimation. The \(V_{jrr}\) term can be estimated with any number of plausible values, and values larger than the number of plausible values on the survey (including Inf) will result in all plausible values being used. Higher values of jrrIMax lead to longer computing times and more accurate variance estimates.
- dropOmittedLevels
a logical value. When set to the default value of TRUE, drops those levels of all factor variables that are specified in an edsurvey.data.frame. Use print on an edsurvey.data.frame to see the omitted levels.
- defaultConditions
a logical value. When set to the default value of TRUE, uses the default conditions stored in an edsurvey.data.frame to subset the data. Use print on an edsurvey.data.frame to see the default conditions.
- recode
a list of lists to recode variables. Defaults to NULL. Can be set as recode=list(var1=list(from=c("a", "b", "c"), to="d")).
- returnNumberOfPSU
a logical value set to TRUE to return the number of primary sampling units (PSUs).
- omittedLevels
this argument is deprecated. Use dropOmittedLevels instead.
- ...
additional parameters passed to rq
Value
An edsurvey.rq with the following elements:
- call
the function call
- formula
the formula used to fit the model
- tau
the quantile to be estimated
- coef
the estimates of the coefficients
- se
the standard error estimates of the coefficients
- Vimp
the estimated variance from uncertainty in the scores (plausible value variables)
- Vjrr
the estimated variance from sampling
- M
the number of plausible values
- varm
the variance estimates under the various plausible values
- coefm
the values of the coefficients under the various plausible values
- coefmat
the coefficient matrix (typically produced by the summary of a model)
- weight
the name of the weight variable
- npv
the number of plausible values
- njk
the number of jackknife replicates used; set to NA when Taylor series variance estimates are used
- rho
the mean value of the objective function across the plausible values
Details
The function computes an estimate of the tau-th conditional quantile function of the response, given the covariates, as specified by the formula argument. Like lm.sdf(), the function presumes a linear specification for the quantile regression model (i.e., that the formula defines a model that is linear in parameters). Unlike lm.sdf(), the jackknife is the only applicable variance estimation method used by the function.
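To make the "tau-th conditional quantile" concrete: in the standard formulation of Koenker and Bassett (1978), the coefficients minimize a sum of check losses \(\rho_\tau(u) = u\,(\tau - I(u < 0))\) applied to the residuals. The base R sketch below illustrates this for an intercept-only model. It is an illustration only, not the routine rq.sdf or quantreg uses internally; the function names are hypothetical, and multiplying each loss by the survey weight is an assumption about how weights enter the objective.

```r
# Check (pinball) loss: rho_tau(u) = u * (tau - 1{u < 0})
check_loss <- function(u, tau) u * (tau - (u < 0))

# Assumed weighted objective for an intercept-only model:
# sum_i w_i * rho_tau(y_i - q)
weighted_objective <- function(q, y, w, tau) sum(w * check_loss(y - q, tau))

# A minimizer of this objective always lies on a data point,
# so it is enough to evaluate the objective at each observed value.
argmin_over_data <- function(y, w, tau) {
  obj <- vapply(y, function(q) weighted_objective(q, y, w, tau), numeric(1))
  y[which.min(obj)]
}

y <- c(1, 2, 3, 4, 10)

# Equal weights, tau = 0.5: the minimizer is the ordinary median.
q_unweighted <- argmin_over_data(y, w = rep(1, 5), tau = 0.5)

# Up-weighting the first observation pulls the weighted median down.
q_weighted <- argmin_over_data(y, w = c(3, 1, 1, 1, 1), tau = 0.5)

print(c(unweighted = q_unweighted, weighted = q_weighted))
```

With covariates, the scalar q is replaced by the linear predictor and the same loss is minimized over the coefficients.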
For further details on quantile regression models and how they are implemented in R, see Koenker and Bassett (1978), Koenker (2005), and the vignette from the quantreg package, on which this function is built. The vignette is accessible by vignette("rq", package="quantreg").
For further details on how left-hand side variables, survey sampling weights, and estimated variances are correctly handled, see lm.sdf or the vignette titled Statistical Methods Used in EdSurvey.
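As a rough sketch of how the Vjrr and Vimp elements relate to the reported standard errors, assume that rq.sdf combines them by the multiple-imputation rule of Rubin (1987), as is usual for plausible values. This page does not state the formula, so treat it as an assumption. With \(\hat{\beta}_p\) the coefficient estimate from the p-th of M plausible values:

\[
\bar{\beta} = \frac{1}{M}\sum_{p=1}^{M}\hat{\beta}_p, \qquad
V_{imp} = \frac{M+1}{M(M-1)}\sum_{p=1}^{M}\left(\hat{\beta}_p - \bar{\beta}\right)^2, \qquad
V = V_{jrr} + V_{imp}
\]

Under that assumption, the standard error of a coefficient would be \(\sqrt{V}\), and jrrIMax controls how many plausible values contribute to \(V_{jrr}\).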
References
Binder, D. A. (1983). On the variances of asymptotically normal estimators from complex surveys. International Statistical Review, 51(3), 279–292.
Johnson, E. G., & Rust, K. F. (1992). Population inferences and variance estimation for NAEP data. Journal of Educational Statistics, 17(2), 175–190.
Koenker, R. W., & Bassett, G. W. (1978). Regression quantiles. Econometrica, 46, 33–50.
Koenker, R. W. (2005). Quantile regression. Cambridge, UK: Cambridge University Press.
Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York, NY: Wiley.
Examples
if (FALSE) { # \dontrun{
  # read in the example data (generated, not real student data)
  sdf <- readNAEP(path = system.file("extdata/data", "M36NT2PM.dat", package = "NAEPprimer"))

  # conduct quantile regression at a given tau value (by default, tau is set to be 0.5)
  rq1 <- rq.sdf(formula = composite ~ dsex + b017451, data = sdf, tau = 0.8)
  summary(rq1)
} # }
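The following sketch exercises a few of the other arguments documented above (weightVar, relevels, jrrIMax) and reads elements from the returned object. It assumes the EdSurvey and NAEPprimer packages are installed, that "origwt" is a weight on this example file, and that "Female" is a level of dsex. Those two names are assumptions about the example data, not something this page states.

```r
# Sketch only: the body runs only when both packages are available.
if (requireNamespace("EdSurvey", quietly = TRUE) &&
    requireNamespace("NAEPprimer", quietly = TRUE)) {
  library(EdSurvey)

  # read in the example data (generated, not real student data)
  sdf <- readNAEP(path = system.file("extdata/data", "M36NT2PM.dat", package = "NAEPprimer"))

  rq2 <- rq.sdf(
    formula = composite ~ dsex + b017451,
    data = sdf,
    tau = 0.25,                        # lower quartile instead of the median
    weightVar = "origwt",              # assumed weight name on this file
    relevels = list(dsex = "Female"),  # assumed level; sets the reference group
    jrrIMax = Inf                      # use all plausible values for the Vjrr term
  )
  summary(rq2)

  # elements listed under Value
  rq2$coef  # coefficient estimates
  rq2$se    # standard error estimates
}
```

Setting jrrIMax = Inf takes longer than the default of 1, in exchange for the more accurate variance estimate described under Arguments.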