[Figure: two panels plotting density (left) and cumulative density (right) against total score. Legend (points for 1, 2, 3 or 4 correct answers per question): (0.00, 0.00, 0.00, 1.00), (0.00, 0.00, 0.50, 1.00), (0.00, 0.20, 0.60, 1.00), (0.25, 0.50, 0.75, 1.00)]

FIGURE 6.13 Distribution of scores:

The distribution (left) and cumulative distribution (right) of the points obtained by the participants are shown for different marking schemes. The schemes differ in the number of points awarded for 1, 2, 3 or 4 correct answers per question (see legend): the marking scheme actually used in the IBO 2013 is shown in purple, the one originally proposed by the organizers in blue, and the harshest and mildest possible marking schemes in red and green, respectively. The points were standardized such that 1 corresponds to the maximum number of points that could be obtained in the exam (92, regardless of marking scheme).

Total Scores and Difficulty

In line with the results of previous IBOs, the theoretical exam proved to be rather difficult, regardless of the marking scheme. Under the marking scheme used for the IBO 2013, the median relative score was 0.58 (standardized such that 1 corresponds to the maximum attainable number of points). While one student obtained a score (0.27) slightly below the relative score expected for students guessing every item (0.29), the highest score obtained was only 0.8152. For comparison, an ideal exam would show a median value of about 0.64 (midway between 0.29 and 1), with the individual scores distributed uniformly between 0.29 and 1.
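The guessing baseline of 0.29 and the ideal median of about 0.64 quoted above can be checked with a short calculation. The following is a minimal sketch, not the authors' own analysis. It assumes that each question consists of four items, each guessed correctly with probability 0.5 and independently of the others (the text only states that credit depends on 1, 2, 3 or 4 correct answers per question). The function name `expected_guess_score` and the parameter `p_correct` are hypothetical and introduced here for illustration.

```python
from math import comb


def expected_guess_score(scheme, p_correct=0.5):
    """Expected relative score per question under pure guessing.

    scheme[k - 1] is the credit for k correct answers out of len(scheme);
    zero correct answers earn no credit. Assumes independent items, each
    guessed correctly with probability p_correct (an assumption, see text).
    """
    n = len(scheme)
    return sum(
        comb(n, k) * p_correct**k * (1 - p_correct) ** (n - k) * scheme[k - 1]
        for k in range(1, n + 1)
    )


# Marking schemes as listed in the legend of FIGURE 6.13.
schemes = [
    (0.00, 0.00, 0.00, 1.00),
    (0.00, 0.00, 0.50, 1.00),
    (0.00, 0.20, 0.60, 1.00),
    (0.25, 0.50, 0.75, 1.00),
]

for scheme in schemes:
    print(scheme, round(expected_guess_score(scheme), 4))

# Midpoint between the guessing baseline and a perfect score, i.e. the
# median of a uniform distribution on [baseline, 1].
baseline = expected_guess_score((0.00, 0.20, 0.60, 1.00))
ideal_median = (baseline + 1.0) / 2
print("ideal median:", round(ideal_median, 4))
```

Under these assumptions, the scheme (0.00, 0.20, 0.60, 1.00) is the one in the legend whose guessing baseline rounds to the 0.29 reported above, and the corresponding midpoint rounds to 0.64.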

The overall difficulty of the exam is confirmed when analyzing the distribution of correct answers given by the participants for each question individually (FIGURE 6.14). While the questions showed varying degrees of difficulty, there is an apparent lack of easy questions: very few questions show an average above 0.8. This suggests that a more reliable ranking would be obtained by using questions that are slightly easier on average. On the bright side, even the most difficult questions were still solved better than would be expected if students were only guessing.

