Formula scoring (FS) combines a don't know option (DKO) with a deduction of points for wrong answers. Its effect on the construct validity and reliability of progress test scores is a subject of discussion. Choosing the DKO may be affected not only by knowledge level but also by risk-taking tendency, and may thus introduce construct-irrelevant variance into the knowledge measurement. On the other hand, FS may result in more reliable test scores. To evaluate the impact of FS on the construct validity and reliability of progress test scores, a progress test for radiology residents was divided into two tests of 100 parallel items (A and B). Each test had an FS and a number-right (NR) version: A-FS, B-FS, A-NR, and B-NR. Participants (n = 337) were randomly divided into two groups. One group took test A-FS followed by B-NR, and the second group took test B-FS followed by A-NR. Evidence for impaired construct validity was sought in a hierarchical regression analysis that examined how much of the variance in participants' FS-scores was explained by the DKO-score, compared with the contribution of knowledge level (NR-score), while controlling for Group, Gender, and Training length. Cronbach's alpha was used to estimate NR-score and FS-score reliability per year group. The NR-score was found to explain 27 % of the variance of the FS-score [F(1,332) = 219.2, p < 0.0005]; the DKO-score and the interaction of DKO and Gender explained 8 % [F(2,330) = 41.5, p < 0.0005]; and the interaction of DKO and NR explained 1.6 % [F(1,329) = 16.6, p < 0.0005], supporting our hypothesis that FS introduces construct-irrelevant variance into the knowledge measurement. However, NR-scores showed considerably lower reliabilities than FS-scores (mean year-test group Cronbach's alphas were 0.62 and 0.74, respectively). Decisions about FS with progress tests should be a careful trade-off between systematic and random measurement error.
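
As a rough illustration of the two scoring rules and the reliability estimate named above, the following Python sketch computes an NR-score, an FS-score, a DKO-score, and Cronbach's alpha. It is not the authors' analysis code. The one-point deduction per wrong answer, the use of `None` to encode a DKO response, and all function names are assumptions made for illustration; the abstract does not state the size of the deduction.

```python
from statistics import variance

DKO = None  # assumed encoding of a "don't know" response


def number_right_score(responses, key):
    """NR-score: count of correct answers; nothing is subtracted."""
    return sum(1 for r, k in zip(responses, key) if r == k)


def formula_score(responses, key, penalty=1.0):
    """FS-score: +1 per correct answer, -penalty per wrong answer, 0 for DKO.

    penalty=1.0 is an assumed value, not one taken from the study.
    """
    score = 0.0
    for r, k in zip(responses, key):
        if r is DKO:
            continue  # a DKO response neither earns nor costs points
        score += 1.0 if r == k else -penalty
    return score


def dko_score(responses):
    """DKO-score: number of items answered with the don't know option."""
    return sum(1 for r in responses if r is DKO)


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of per-examinee lists of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    n_items = len(item_scores[0])
    item_vars = [variance([row[i] for row in item_scores])
                 for i in range(n_items)]
    total_var = variance([sum(row) for row in item_scores])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    key = ["a", "b", "c", "d"]
    responses = ["a", DKO, "d", "d"]  # one DKO, one wrong, two correct
    print(number_right_score(responses, key))
    print(formula_score(responses, key))
    print(dko_score(responses))
```

In this sketch an examinee who avoids guessing keeps a higher FS-score than one who guesses wrongly, which is how willingness to take risks, and not only knowledge, can enter the FS-score.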