Results by Test Type
Score % for each test type (Explain, Implement, Debug, Compare) across all 16 runs.
Score % by Test Type
| Test Type | Qs | R1 | R2 | R3 | R4 | R5 | R6 | R7 | R8 | R9 | R10 | R11 | R12 | R13 | R14 | R15 | R16 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Explain | 12 | 59.1% | 67.5% | 72.8% | 97.5% | 72.5% | 78.1% | 96.4% | 74.5% | 72.8% | 78.1% | 61.5% | 66.1% | 68.9% | 78.2% | 72.8% | 73.9% |
| Implement | 23 | 57.8% | 78.5% | 71.2% | 91.3% | 76.1% | 69.9% | 92.1% | 68.7% | 68.7% | 75.3% | 49.7% | 58.8% | 63.8% | 64.1% | 68.7% | 69.1% |
| Debug | 5 | 58.0% | 76.6% | 66.7% | 89.3% | 74.0% | 53.4% | 82.1% | 79.3% | 65.3% | 73.3% | 68.0% | 58.4% | 65.0% | 57.0% | 67.7% | 67.7% |
| Compare | 10 | 71.7% | 57.7% | 76.4% | 97.3% | 58.3% | 74.7% | 97.7% | 78.0% | 74.0% | 76.3% | 62.5% | 66.0% | 66.7% | 67.0% | 75.0% | 74.3% |
Test Type Improvement: Baseline (Run 1) vs Best (Run 4)
| Test Type | Qs | Baseline (R1) | Best (R4) | Delta |
| --- | --- | --- | --- | --- |
| Explain | 12 | 59.1% | 97.5% | +38.4pp |
| Implement | 23 | 57.8% | 91.3% | +33.5pp |
| Debug | 5 | 58.0% | 89.3% | +31.3pp |
| Compare | 10 | 71.7% | 97.3% | +25.6pp |
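The Delta column is a plain subtraction of two percentages, reported in percentage points (pp), not a relative change. A minimal sketch of that arithmetic follows, using the Run 1 and Run 4 scores from the table above; the names `scores` and `delta_pp` are illustrative and do not come from the original report.

```python
# Scores (%) copied from the table above: (baseline Run 1, Run 4).
scores = {
    "Explain": (59.1, 97.5),
    "Implement": (57.8, 91.3),
    "Debug": (58.0, 89.3),
    "Compare": (71.7, 97.3),
}


def delta_pp(baseline: float, best: float) -> float:
    """Difference between two percentages, in percentage points.

    Rounded to one decimal place to match the table's precision and to
    hide floating-point noise from the subtraction.
    """
    return round(best - baseline, 1)


# Hypothetical helper structure: test type -> delta in percentage points.
deltas = {name: delta_pp(r1, r4) for name, (r1, r4) in scores.items()}

for name, d in deltas.items():
    print(f"{name}: {d:+.1f}pp")
```

Rounding to one decimal keeps the computed values aligned with the figures shown in the Delta column.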
Must-Have Pass % by Test Type
| Test Type | Qs | R1 | R2 | R3 | R4 | R5 | R6 | R7 | R8 | R9 | R10 | R11 | R12 | R13 | R14 | R15 | R16 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Explain | 12 | 68.8% | 60.4% | 93.8% | 95.8% | 54.2% | 58.3% | 97.9% | 93.1% | 91.0% | 93.8% | 71.5% | 79.9% | 71.5% | 79.9% | 91.7% | 91.6% |
| Implement | 23 | 66.3% | 75.0% | 95.7% | 92.4% | 57.6% | 44.6% | 93.5% | 88.0% | 92.0% | 89.5% | 60.1% | 64.2% | 52.9% | 62.7% | 89.5% | 89.1% |
| Debug | 5 | 50.0% | 60.0% | 75.0% | 70.0% | 55.0% | 25.0% | 75.0% | 90.0% | 71.7% | 76.6% | 68.3% | 46.7% | 43.4% | 53.3% | 61.7% | 71.7% |
| Compare | 10 | 82.5% | 42.5% | 97.5% | 95.0% | 47.5% | 57.5% | 97.5% | 96.6% | 92.5% | 95.8% | 74.1% | 78.3% | 74.1% | 80.0% | 94.1% | 92.5% |