Study Island Benchmarks & Predictive Validity

Wednesday, October 14, 2015 -- Irene McAfee

Edmentum is excited to announce that, in a few weeks, we will be releasing our latest white paper on Study Island Benchmarks! This report will highlight the predictive validity of the benchmarks, looking specifically at data related to math and English language arts achievement for grades 3 through 8. The report has been carefully designed to reflect the highest standards for educational researchers: the Standards for Educational and Psychological Testing (2014).

Typically, Study Island Benchmark tests are administered online to students using the Study Island standards mastery program as part of the classroom curriculum. The benchmarks are a set of four tests per grade level, designed to be taken periodically throughout the school year and customized to mirror the structure and item formats found in the appropriate state assessments. The results of each benchmark test are expected to reflect how individual students would perform on a high-stakes assessment and to identify areas for instructional support. Educators can use data from the Study Island Benchmarks to make more efficient use of classroom time and resources.

So, what exactly is predictive validity, and why is it important? Predictive validity helps address two questions: “Does this benchmark test measure what it’s supposed to?” and “Can the results help me predict the state test score of each student in my classroom?” As the name implies, predictive validity addresses how well a specific test predicts future performance. In Edmentum’s new report, predictive validity is examined as the relationship between scores on the benchmark assessment and scores on the state assessment. The stronger the relationship between the two scores, the greater the chances of accurately predicting a state test score from a benchmark score. In addition, if the relationship strengthens across the school year, it could mean that an increase in standards learning has occurred.

Illustrative Interpretation

When we talk about predictive validity, and the relationship between benchmark and state assessments, we speak in terms of correlation coefficients. To explain, let’s look at an example, and discuss what a correlation coefficient of .83 would mean.

A correlation coefficient on its own does not have a simple interpretation, so researchers often square it to yield a more interpretable quantity. When we square .83, we arrive at roughly .69. This number indicates that the predictor (benchmark) and the criterion (end-of-year test) share about 69% of their variance.

The orange region of the Venn diagram in Figure 1 shows what 69% shared variance looks like. For purposes of comparison, Figure 2 shows what 20% shared variance looks like, which is the degree of shared variance implied by a correlation of .45. Comparing shared variance in this way is how we judge whether a correlation is “good enough.”

Figure 1: 69% Shared Variance   Figure 2: 20% Shared Variance
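For readers who want to check the arithmetic, here is a minimal Python sketch of the squaring step described above. The function name shared_variance is our own label for illustration and does not come from the report; the two correlation values are the ones used in Figures 1 and 2.

```python
def shared_variance(r: float) -> float:
    """Return the proportion of variance two measures share, given their correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    return r * r


# The two example correlations discussed in the text (.83 and .45).
for r in (0.83, 0.45):
    print(f"r = {r:.2f} -> shared variance = {shared_variance(r):.0%}")
```

Running it prints 69% for a correlation of .83 and 20% for a correlation of .45, matching the two figures.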

What does it mean to share 69% of variance? Both the benchmarks and the state tests show variability in student scores: there are individual differences, and we infer that an underlying ability makes those scores vary from child to child. In the Venn diagram in Figure 1, the orange region represents what the benchmark and the state test capture in common. We infer that this underlying trait is in fact a student’s true ability in the specified domain.

In Edmentum’s predictive validity report, the statistical correlations between benchmarks and state assessments for math range from .594 to .862. In general, the predictive validity of the Edmentum math benchmark tests increases from benchmark 1 to benchmark 4. For example, for grade 4, the predictive validity of benchmark 1, taken in the fall, is .764. Benchmark 2 is .766, benchmark 3 is .821, and benchmark 4 is .830. For English language arts, the relationships are similar to those for math and, on average, higher; the statistical correlations range from .688 to .836. Below is a visual of the increasing pattern in the relationship between the Study Island Benchmark scores and the state test score. In this particular sample of test scores, the 8th graders only took benchmarks 1, 2, and 3, while the 3rd graders took all four benchmarks.

ELA Predictive Validity   Math Predictive Validity
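To connect these numbers back to the shared-variance interpretation, the short Python sketch below squares the four grade 4 math correlations quoted above and checks that they rise from one benchmark to the next. The variable names are our own and are used only for illustration; only the four correlation values come from the report.

```python
# Grade 4 math correlations quoted in the text, keyed by benchmark number.
grade4_math = {1: 0.764, 2: 0.766, 3: 0.821, 4: 0.830}

# Squaring each correlation gives the proportion of shared variance.
shared = {bench: round(r ** 2, 3) for bench, r in grade4_math.items()}

# Check that predictive validity never decreases from one benchmark to the next.
values = [grade4_math[b] for b in sorted(grade4_math)]
never_decreases = all(a <= b for a, b in zip(values, values[1:]))

for bench in sorted(shared):
    print(f"Benchmark {bench}: r = {grade4_math[bench]:.3f}, shared variance = {shared[bench]:.1%}")
print("Correlation never decreases across benchmarks:", never_decreases)
```

In this example, shared variance grows from about 58% at benchmark 1 to about 69% at benchmark 4.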

Overall, the correlations between Study Island Benchmark test scores and state test scores are impressively high. The benchmarks provide a solid prediction of a student’s performance on the state test, as well as an instructional tool for the teacher. Edmentum is excited to share further details and the complete results of our research in our new white paper later this month. Stay tuned for its release! In the meantime, check out this brochure to learn more about Study Island Benchmark assessments, or request a demo to experience Study Island for yourself!