[Assessment Literacy Video Series] Breaking Down Standard Error of Measurement

Our assessment literacy video series aims to demystify, unpack, and connect assessment concepts and principles to help you make more sense out of your assessment data. Maybe you’re just learning the ropes of some of the more complicated metrics reported in educational assessments, or perhaps you’re hoping to see how an assessment concept applies to Edmentum’s suite of assessment programs. Either way, let our top-notch research team of former educators and subject-matter experts be your guide.

Have you ever measured something multiple times, like your height or the time it takes you to drive to work, and gotten different results each time? Measurements of your height might be a little different based on which shoes you’re wearing or your posture. Measurements of commute time might fluctuate based on your luck with red lights. Whenever we use instruments to measure things, whether it’s a yardstick to measure height, a clock to measure drive time, or a test to measure student ability, there is some amount of noise that can result in slightly different measurements each time. That noise is called measurement error. In educational testing, the typical size of this measurement error is reported as the standard error of measurement, abbreviated “SEM.”

What is standard error of measurement?

The SEM tells you how precise a student’s score is, and it helps to account for the variables that inevitably affect a student’s final score. By considering both a student’s scale score and the SEM, educators can see the range in which the student’s score would likely fall and use that information when interpreting test scores.

Although we expect student scores to increase when students learn more and grow in their ability, student scores can also fluctuate due to measurement error. Some reasons why a student’s test scores may vary include how rested the student is, distractions in the testing environment, lucky guesses or careless errors, and how motivated the student is on the particular day of the test. The SEM quantifies how much you would expect the student’s score to vary if they were to take the same test over and over again on the same day.

Check out this example. Let’s say we have a student with a scale score of 850 and a standard error of measurement of 30. This means the student’s score would likely fall within 30 scale score points below 850 and 30 points above 850, so between 820 and 880, if the student were to test again in the same way and on the same day. In fact, the student’s score would fall within this range about two-thirds of the time (roughly 68%). Let’s expand this range to two standard errors, or twice as much as 30, so 60 points below and above 850. That is an interval, or range, of 790 to 910. The student’s score would fall within that wider range about 95% of the time. These ranges are sometimes called confidence intervals because they quantify the probability that the student’s score is within the range.
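If you like to see the arithmetic spelled out, here is a minimal Python sketch of the example above. The function name `score_band` and its parameters are our own illustration and are not part of any Edmentum product or report.

```python
def score_band(scale_score, sem, n_sems=1):
    """Return the (low, high) range that lies n_sems standard errors around a score.

    Hypothetical helper for illustration only.
    """
    margin = n_sems * sem
    return scale_score - margin, scale_score + margin


# One SEM on either side: the score lands here about two-thirds of the time.
print(score_band(850, 30))       # (820, 880)

# Two SEMs on either side: the score lands here about 95% of the time.
print(score_band(850, 30, 2))    # (790, 910)
```

The same helper works for any scale score and SEM pair you find on a score report.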

When the standard error of measurement is smaller, it means the estimate of student ability has more precision (not so much noise or measurement error). Tests with more questions generally have smaller standard errors because administering more questions allows us to collect more information about a student’s ability and, therefore, have a more precise scale score.
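To see why more questions tend to mean less noise, here is a rough simulation sketch. It makes some loud simplifying assumptions: a fixed-length test scored as percent correct, and a student who has the same chance (`p_correct`, a made-up value) of answering every question correctly. It is not how an adaptive diagnostic computes scale scores or the SEM; it only illustrates the general idea that the spread of retest scores shrinks as the test gets longer.

```python
import random
import statistics


def spread_of_scores(n_items, p_correct=0.7, n_retests=2000, seed=0):
    """Simulate many retests and return the standard deviation of the scores.

    Assumes a simplified percent-correct test (illustration only).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    scores = []
    for _ in range(n_retests):
        # Count how many of the n_items the simulated student answers correctly.
        correct = sum(rng.random() < p_correct for _ in range(n_items))
        scores.append(100 * correct / n_items)
    return statistics.pstdev(scores)


# Longer tests should show a smaller spread across retests.
for n_items in (10, 40, 160):
    print(n_items, round(spread_of_scores(n_items), 1))
```

Under these assumptions, the spread drops by about half each time the number of questions is quadrupled.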

How is SEM reported in Edmentum Exact Path?

Standard error of measurement is reported in two ways within Exact Path. On the Exact Path Student Summary Report, the SEM is reported next to the student’s scale score. In this example, the student’s scale score [on Diagnostic 2] is 735, and the SEM is 30, which means that the student’s score would likely fall between 705 and 765 (30 points below and above 735) if they were to retest in the same testing window.

Also, on the Student Summary Report, there is a graph of each student’s item-by-item responses for a given test administration. This is called the “diagnostic experience visual.” After each item the student responds to, their overall score is re-estimated using the additional information from that response. For each new score estimate, there is a confidence interval based on the SEM, illustrated by bars extending in either direction from the circles. Notice that the bars tighten as the test continues to collect more information about the student’s strengths and weaknesses, arriving at a more precise estimate of the student’s ability by the time the assessment comes to an end.

Given all of this information, it’s probably not surprising to hear that there is always some noise in measurements, and that’s entirely normal. The SEM helps you quantify that noise and provides a level of confidence in the final score. Be sure to consider the SEM when comparing scores across students and when interpreting student growth from one testing window to the next. If a student’s score decreases by less than the SEM, then the change in score could just be due to measurement error as opposed to a true decline in the student’s ability. Now you know all about the standard error of measurement and can look for it on your own score reports!
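The rule of thumb above for interpreting score changes can be sketched in a few lines of Python. The function name `change_within_sem` and the follow-up scores of 720 and 690 are hypothetical values chosen for illustration. This is a simple check based on the guidance in this post, not a formal statistical test.

```python
def change_within_sem(previous_score, current_score, sem):
    """Return True if the change between two scores is no larger than the SEM.

    Hypothetical helper: a True result suggests the change could be due to
    measurement error alone.
    """
    return abs(current_score - previous_score) <= sem


# A 15-point drop with an SEM of 30 could simply be noise.
print(change_within_sem(735, 720, 30))   # True

# A 45-point drop is larger than the SEM and deserves a closer look.
print(change_within_sem(735, 690, 30))   # False
```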

Interested in more assessment literacy topics? Check out our Edmentum Assessment Literacy video series, and continue to follow along on the blog as we dig deeper, making you assessment experts along the way! Want to learn more about Exact Path? Get more information about our award-winning program on our website.