A mathematical explanation of how we arrive at a scaled score for a script

A frequent question we get from schools is how we arrive at a scaled score for a script. In this article we explain how a scaled score is calculated and, along the way, some other aspects of the statistical process involved in comparative judgement.

A good place to start is the raw data that you can download for the candidates once a task has been carried out (in the task, go into Check results and download the Candidates Results file).

You will see a file with these headings:

[Screenshots: the column headings of the Candidates Results file]

To explain where scaled scores come from, we will concentrate on the following headings, in order:

Local Comparisons

When we judge a set of scripts, each script is compared with other scripts a certain number of times; that count is the script's Local Comparisons. (If the task is moderated, as in national tasks, there will be Mod Comparisons as well.)
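To make the counting concrete, here is a minimal Python sketch. The judgement data and script IDs are invented for illustration; each judgement simply records which of two scripts was chosen as the better one. A script's Local Comparisons is the number of judgements it appeared in, whether it won or lost.

```python
from collections import Counter

# Each judgement records (winner, loser). These IDs and results are
# invented for illustration; real data comes from the Candidates
# Results download.
judgements = [
    ("A", "B"), ("C", "A"), ("B", "C"),
    ("A", "C"), ("A", "B"), ("C", "B"),
]

comparisons = Counter()  # Local Comparisons: judgements a script appears in
wins = Counter()         # raw score: times a script was chosen as better

for winner, loser in judgements:
    comparisons[winner] += 1
    comparisons[loser] += 1
    wins[winner] += 1

for script in sorted(comparisons):
    print(f"{script}: {comparisons[script]} comparisons, {wins[script]} wins")
```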

Score

In the given number of comparisons, a particular script will be chosen as best, or ‘win’, a certain number of times – this is a raw score. The raw scores for each script against other scripts are placed into a mathematical model, and the resulting ‘quality’ of each script is calculated. The mathematical model is a theoretical model of how the wins and losses of scripts should be distributed, given their varying quality. From the mathematical model, we obtain the theoretical number of wins that a script should have if the data were to fit the mathematical model perfectly – this is the Score.
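As an illustration of this step, here is a minimal Python sketch assuming the Bradley-Terry model, the model most commonly used to describe comparative judgement, in which the probability that one script beats another is p_i / (p_i + p_j) for positive ‘strength’ parameters. The judgement data, the script IDs and the simple fitting loop are invented for the example, not our production code; the sketch just shows how observed wins lead to a fitted quality and a theoretical (expected) number of wins.

```python
import math
from collections import Counter

# Hypothetical judgements, as in the previous sketch: (winner, loser).
judgements = [
    ("A", "B"), ("C", "A"), ("B", "C"),
    ("A", "C"), ("A", "B"), ("C", "B"),
]

scripts = sorted({s for pair in judgements for s in pair})
wins = Counter(winner for winner, _ in judgements)
# n[(i, j)]: number of times scripts i and j met (key stored sorted).
n = Counter(tuple(sorted(pair)) for pair in judgements)

def times_met(i, j):
    return n[tuple(sorted((i, j)))]

# Fit Bradley-Terry strengths p with the classic MM iteration:
# p_i <- wins_i / sum_j [ n_ij / (p_i + p_j) ], then rescale.
# (This assumes every script has at least one win and one loss.)
p = {s: 1.0 for s in scripts}
for _ in range(200):
    new_p = {}
    for i in scripts:
        denom = sum(times_met(i, j) / (p[i] + p[j])
                    for j in scripts if j != i and times_met(i, j))
        new_p[i] = wins[i] / denom if denom else p[i]
    scale = len(scripts) / sum(new_p.values())
    p = {s: v * scale for s, v in new_p.items()}

# 'Quality' is usually reported on a log scale; the expected number of
# wins under the fitted model is what this article calls the Score.
for i in scripts:
    expected_wins = sum(times_met(i, j) * p[i] / (p[i] + p[j])
                        for j in scripts if j != i)
    print(f"{i}: quality={math.log(p[i]):+.2f}, score={expected_wins:.2f}")
```

One caveat on this toy example: for an unconstrained Bradley-Terry fit, the expected wins at the maximum-likelihood solution reproduce the observed wins exactly. In a real task the Score can differ from the raw score because the estimation may include additional structure, for example moderation or anchor judgements.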