Opens a connection to a PISA data file and returns an `edsurvey.data.frame` with information about the file and data.
Arguments
- path
a character vector of the full directory path(s) to the PISA-extracted fixed-width files and SPSS control files (.txt).
- database
a character to indicate the selected database. Must be one of `INT` (the general database that most people use), `CBA` (the computer-based database, PISA 2012 only), or `FIN` (the financial literacy database in PISA 2012, 2018, and 2022). Note that `INT` must be used for PISA 2015 financial literacy data because it can be merged with the general database. Defaults to `INT`.
- countries
a character vector of the country/countries to include, using the three-digit ISO country code. A list of country codes can be found in the PISA codebook or at https://en.wikipedia.org/wiki/ISO_3166-1#Current_codes. If files are downloaded using `downloadPISA`, a country dictionary text file can be found in the file path.
- cognitive
one of `none`, `score`, or `response`. Default is `score`. The PISA database often has three student files: student questionnaire, cognitive item response, and scored cognitive item response. The first file is used as the main student file with student background information. Users can choose whether to merge `score` or `response` data into the main file, or neither (if `none`).
- forceReread
a logical value to force rereading of all processed data. Defaults to `FALSE`. Setting `forceReread` to `TRUE` causes the PISA data to be reread and increases processing time.
- verbose
a logical value; if `TRUE`, progress output is printed while the function is running. Defaults to `TRUE`.
Value
an `edsurvey.data.frame` for a single specified country or an `edsurvey.data.frame.list` if multiple countries are specified
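As a rough sketch of how the return type depends on `countries`, the snippet below requests two countries and checks the class of the result. The directory `~/PISA/2012` and the choice of `sgp` and `usa` are assumptions for illustration; the call is skipped unless EdSurvey is installed and that directory exists.

```r
# Hypothetical location of extracted PISA 2012 files (an assumption).
pisa_dir <- "~/PISA/2012"
country_codes <- c("sgp", "usa")

if (requireNamespace("EdSurvey", quietly = TRUE) && dir.exists(pisa_dir)) {
  # More than one country code, so an edsurvey.data.frame.list is expected.
  # cognitive = "response" merges raw item responses instead of scored items.
  pisa2012 <- EdSurvey::readPISA(
    path = pisa_dir,
    database = "INT",
    countries = country_codes,
    cognitive = "response",
    verbose = FALSE
  )
  print(inherits(pisa2012, "edsurvey.data.frame.list"))
}
```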
Details
Reads in the unzipped files downloaded from the PISA database using the OECD Repository (https://www.oecd.org/pisa.html). Users can use `downloadPISA` to download all required files.
Student questionnaire files (with weights and plausible values) are used as
main files, which are then
merged with cognitive, school, and parent files (if available).
Processing one year and one database for all countries takes 10–15 minutes on average the first time. If `forceReread` is `FALSE`, subsequent calls to this function reuse the processed data and take only 5–10 seconds.
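The difference between a first read and a later read can be checked with base R's `system.time`. This is only a sketch: the directory `~/PISA/2012` is an assumption, the timings depend on the machine and on whether the files were already processed, and the block does nothing unless EdSurvey is installed and the directory exists.

```r
# Hypothetical location of extracted PISA 2012 files (an assumption).
pisa_dir <- "~/PISA/2012"

# Helper for this sketch only: time one readPISA call for Singapore.
time_read <- function(path) {
  system.time(
    EdSurvey::readPISA(path = path, countries = "sgp",
                       forceReread = FALSE, verbose = FALSE)
  )[["elapsed"]]
}

if (requireNamespace("EdSurvey", quietly = TRUE) && dir.exists(pisa_dir)) {
  first_call <- time_read(pisa_dir)   # processes raw files if not done before
  second_call <- time_read(pisa_dir)  # reuses the processed data
  print(c(first = first_call, second = second_call))
}
```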
For the PISA 2000 study, please note that the study weights are subject specific. Each weight has different adjustment factors for reading, mathematics, and science based on its original subject source file. For example, the `w_fstuwt_read` weight is associated with the reading subject data file. Take special care to select the weight that matches your specific analysis. See the OECD documentation for further details. Use the `showWeights` function to see all three student-level subject weights:
w_fstuwt_read = Reading (default)
w_fstuwt_scie = Science
w_fstuwt_math = Mathematics
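A minimal sketch of choosing the subject-specific weight for a PISA 2000 analysis follows. The directory `~/PISA/2000`, the country `usa`, the scale name `math`, and the variable `st03q01` are assumptions for illustration; check the codebook for the names in your data. The block is skipped unless EdSurvey is installed and the directory exists.

```r
# Subject-to-weight lookup taken from the list above.
subject_weights <- c(
  read = "w_fstuwt_read",
  scie = "w_fstuwt_scie",
  math = "w_fstuwt_math"
)

# Hypothetical location of extracted PISA 2000 files (an assumption).
pisa2000_dir <- "~/PISA/2000"

if (requireNamespace("EdSurvey", quietly = TRUE) && dir.exists(pisa2000_dir)) {
  usa2000 <- EdSurvey::readPISA(path = pisa2000_dir, countries = "usa",
                                verbose = FALSE)
  # List the available weights; the reading weight is the default.
  EdSurvey::showWeights(usa2000)
  # A mathematics analysis should name the mathematics weight explicitly.
  # `math` and `st03q01` are assumed variable names.
  EdSurvey::edsurveyTable(formula = math ~ st03q01, data = usa2000,
                          weightVar = subject_weights[["math"]])
}
```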
References
Organisation for Economic Co-operation and Development. (2017). PISA 2015 technical report. Paris, France: OECD Publishing. Retrieved from https://www.oecd.org/pisa/data/2015-technical-report.html
See also
`getData` and `downloadPISA`
Examples
if (FALSE) { # \dontrun{
# the following call returns an edsurvey.data.frame for the
# PISA 2012 International Database for Singapore
sgp2012 <- readPISA(path = "~/PISA/2012", database = "INT", countries = "sgp")
# extract a data.frame with a few variables
gg <- getData(sgp2012, c("cnt", "read", "w_fstuwt"))
head(gg)
# conduct an analysis on the edsurvey.data.frame
edsurveyTable(formula = read ~ st04q01 + st20q01, data = sgp2012)
} # }