# (p.313) Appendix C Estimating Math Achievement Benchmarks for College Readiness

First, a backward mapping method was used to examine the desirable growth trajectory and to estimate age- and grade-specific levels of math achievement for college readiness, defined by two outcomes: (1) 2-year college completion and (2) 4-year college completion. Three national longitudinal datasets were used: the Early Childhood Longitudinal Study-Birth Cohort (ECLS-B), the Early Childhood Longitudinal Study-Kindergarten Cohort (ECLS-K), and the National Education Longitudinal Study (NELS). Second, these estimated national benchmarks of students’ math achievement levels for college readiness were compared with state, national (National Assessment of Educational Progress [NAEP]), and international (Trends in International Mathematics and Science Study [TIMSS]) math proficiency standards for the corresponding grades, respectively. The study used a linking (concordance) method to compare scores from different tests; the underlying assumption was that the scores are not interchangeable but are comparable, thanks to the use of common math curriculum frameworks and similar standards.

A logistic regression method was used to examine the desirable growth trajectory and to estimate age- and grade-specific levels of math achievement for college entrance and completion. Specifically, this book estimated benchmark scores based on the NELS eighth-, tenth-, and twelfth-grade math test scores that best differentiated between students who attended 2-year versus 4-year colleges as their first postsecondary institution and between students who completed 2-year versus 4-year colleges (associate’s degree vs. bachelor’s degree holders). For these analyses, the study used the NELS: 88/2000 Postsecondary Education Transcript Study (PETS) data, which collected transcripts from all postsecondary institutions attended after high school by the NELS: 88 fourth follow-up study’s respondent population. Specifically, this book examined four outcome variables for the NELS: 88 eighth-grade cohort with information on whether they attended or completed colleges and universities during the period 1992–2000. The four outcomes were (1) 2-year college attendance, (2) 2-year college completion, (3) 4-year college attendance, and (4) 4-year college completion. After checking model fit and (p.314) prediction accuracy, logistic regression results were used to estimate benchmark math achievement scores at the 50 percent chance level of outcome occurrence.
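The benchmark-at-50-percent step can be sketched as follows. This is a minimal illustration of the arithmetic, not the book's actual model: with a simple one-predictor logistic model, the score at which the predicted probability equals 0.5 is where the log-odds equal zero. The coefficients below are hypothetical, not estimates from the NELS data.

```python
import math

def predicted_probability(score, b0, b1):
    """Predicted probability of the college outcome at a given math score,
    under a simple logistic model: p = 1 / (1 + exp(-(b0 + b1 * score)))."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * score)))

def benchmark_score(b0, b1):
    """Score at which the predicted probability is exactly 0.5.

    At p = 0.5 the log-odds are zero, so b0 + b1 * x = 0,
    giving the benchmark x = -b0 / b1.
    """
    return -b0 / b1

# Hypothetical intercept and slope for illustration only
b0, b1 = -7.5, 0.15
x = benchmark_score(b0, b1)
assert abs(predicted_probability(x, b0, b1) - 0.5) < 1e-9
```

In a model with covariates, the same idea applies after fixing the covariates at chosen values; the benchmark is then the score that brings the linear predictor to zero.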

A linear linking (scale concordance) method was used to convert NELS scores into the NAEP scale and then convert the NAEP scores into the ECLS scale based on the assumptions of test and population comparability. For the NAEP-NELS linkage, equating samples were composed of the main NAEP 1992 national sample of twelfth-graders and NELS: 88 1992 twelfth-grade cohort members (excluding dropouts and non-seniors; see Scott, Ingels, & Owings, 2007). For the NAEP-ECLS linkage, equating samples were composed of the main NAEP 2007 national sample of eighth-graders and ECLS-K 2007 eighth-grade cohort members.
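The linear linking step matches z-scores across the two equating samples: a score's standardized position in the source distribution is mapped to the same position in the target distribution. A minimal sketch, with hypothetical means and standard deviations (not the actual NAEP, NELS, or ECLS moments):

```python
def linear_link(x, mean_from, sd_from, mean_to, sd_to):
    """Linear (mean/sigma) linking: map a score from one scale onto
    another by matching z-scores across the two equating samples,
    y = mean_to + sd_to * (x - mean_from) / sd_from."""
    return mean_to + sd_to * (x - mean_from) / sd_from

# Hypothetical example: a source-scale score of 60, one SD above a
# source mean of 50, maps one SD above the target mean of 300.
target_equiv = linear_link(60, mean_from=50, sd_from=10, mean_to=300, sd_to=35)
```

Chained conversions (e.g., NELS to NAEP, then NAEP to ECLS) are just successive applications of this function with the appropriate pairs of moments, which is why the population-comparability assumption matters at each link.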

After the scale conversions, ECLS-K eighth-graders were classified into two groups: those who were on track to college (i.e., at or above the corresponding NELS eighth-grade benchmark scores for college readiness) and those who were not on track to college. Using this dichotomous grouping as a dependent variable, the same logistic regression method was used to estimate benchmark math achievement scores for grades K, 1, 3, and 5. The ECLS-K fall kindergarten benchmark score was converted into the ECLS-B scale. This linking depends on the assumptions that the tests are comparable and that the population taking the fall kindergarten math assessment did not change between 1998 and 2006/07. Then, the same logistic regression method was used to estimate benchmark math achievement scores for ECLS-B pre-K (age 4).

This book also drew on performance standards from state assessments, NAEP, and TIMSS that specify desired math proficiency levels based on curriculum standards. The national average math achievement levels were compared with state, national, and international math assessment standards for corresponding grades based on 2007 state assessment, NAEP, and TIMSS test results, respectively. For the NAEP and TIMSS standards (i.e., the Proficient level in NAEP and the High benchmark in TIMSS), the given cut scores were converted into *z*-scores based on the US national mean and standard deviation of math achievement within grades (grades 4 and 8 in 2007 NAEP and TIMSS; grade 12 in 2000 NAEP).
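The NAEP/TIMSS conversion is a direct standardization of a known cut score against the national distribution. A one-line sketch, with hypothetical numbers rather than the actual published cut scores or moments:

```python
def cut_score_to_z(cut, national_mean, national_sd):
    """Express a proficiency cut score as a within-grade z-score
    relative to the US national distribution: z = (cut - mean) / sd."""
    return (cut - national_mean) / national_sd

# Hypothetical: a cut score of 299 on a scale with national mean 281, SD 36
z = cut_score_to_z(299, 281, 36)  # half a standard deviation above the mean
```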

For state assessment standards, neither test score distributions nor cut scores were available. Using states’ reported percentages (representing the area above *z*) of students in grades 3–11 whose math achievement was at or above the target standards for No Child Left Behind (NCLB) Act accountability purposes, the study estimated *z*-scores of individual states’ proficiency standard cut scores based on the normal distribution table and then averaged the *z*-scores across states. Once all the international, national, and state standards of math proficiency were identified by within-grade *z*-scores, the standards were also converted into cross-grade standard scores.
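The normal-table step above amounts to applying the inverse normal CDF: if a given percentage of students score at or above the cut, the cut sits at the complementary percentile of a standard normal distribution. A sketch using Python's standard library in place of a printed table; the state percentages are hypothetical, not actual NCLB reports:

```python
from statistics import NormalDist

def z_from_percent_at_or_above(pct):
    """Recover the cut score's z-score from the reported percentage of
    students at or above it: the cut lies at the (100 - pct)th
    percentile of a standard normal distribution."""
    return NormalDist().inv_cdf(1.0 - pct / 100.0)

# Hypothetical state reports of percent proficient or above
state_pcts = [68.0, 75.0, 55.0]
zs = [z_from_percent_at_or_above(p) for p in state_pcts]
avg_z = sum(zs) / len(zs)  # averaged across states, as described above
```

Note the normality assumption: the recovered *z* is only as good as the fit of the normal model to each state's actual score distribution.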

There are commonalities between NAEP and TIMSS that warrant linking for comparison of performance standards. First, both assessments were based on similar curricular frameworks, such as the National Council of Teachers of Mathematics (NCTM) standards in math; content analyses of both assessments suggest that (p.315) the assessments are sufficiently similar to warrant linkage for global comparisons (McLaughlin, Dossey, & Stancavage, 1997; NCES, 2006). Second, both assessments were administered to nationally representative samples of students in the same grade in the same year (grades 4 and 8 in 2007). A similar linkage was made between NAEP and state assessments because previous studies showed comparability of the two assessments in terms of content despite large discrepancies in the rigor of performance standards (NCES, 2007).