TY  - JOUR
U1  - Journal article, scholarly - peer-reviewed (reviewed)
A1  - Santos, Adrian
A1  - Vegas, Sira
A1  - Dieste, Oscar
A1  - Uyaguari, Fernando
A1  - Tosun, Ayse
A1  - Fucci, Davide
A1  - Turhan, Burak
A1  - Scanniello, Giuseppe
A1  - Romano, Simone
A1  - Karac, Itir
A1  - Kuhrmann, Marco
A1  - Mandic, Vladimir
A1  - Ramac, Robert
A1  - Pfahl, Dietmar
A1  - Engblom, Christian
A1  - Kyykka, Jarno
A1  - Rungi, Kerli
A1  - Palomeque, Carolina
A1  - Spisak, Jaroslav
A1  - Oivo, Markku
A1  - Juristo, Natalia
T1  - A family of experiments on test-driven development
JF  - Empirical software engineering : an international journal
N2  - Context:

Test-driven development (TDD) is an agile software development approach that has been widely claimed to improve software quality. However, the extent to which TDD improves quality appears to be largely dependent upon the characteristics of the study in which it is evaluated (e.g., the research method, participant type, or programming environment). The particularities of each study make the aggregation of results untenable.

Objectives:

The goals of this paper are to increase the accuracy and generalizability of the results achieved in isolated experiments on TDD, provide joint conclusions on the performance of TDD across different industrial and academic settings, and assess the extent to which the characteristics of the experiments affect the quality-related performance of TDD.

Method:

We conduct a family of 12 experiments on TDD in academia and industry. We aggregate their results by means of meta-analysis. We perform exploratory analyses to identify variables impacting the quality-related performance of TDD.

Results:

TDD novices achieve a slightly higher code quality with iterative test-last development (i.e., ITL, the reverse approach of TDD) than with TDD. The task being developed largely determines quality. The programming environment, the order in which TDD and ITL are applied, or the learning effects from one development approach to another do not appear to affect quality. The quality-related performance of professionals using TDD drops more than that of students. We hypothesize that this may be due to their being more resistant to change and potentially less motivated than students.

Conclusion:

Previous studies seem to provide conflicting results on TDD performance (i.e., positive vs. negative). We hypothesize that these conflicting results may be due to different study durations, experiment participants being unfamiliar with the TDD process, or case studies comparing the performance achieved by TDD vs. the control approach (e.g., the waterfall model), each applied to develop a different system. Further experiments with TDD experts are needed to validate these hypotheses.
KW  - family of experiments
KW  - test-driven development
KW  - industry
KW  - academia
KW  - quality
Y1  - 2021
SN  - 1382-3256
SS  - 1382-3256
U6  - https://doi.org/10.1007/s10664-020-09895-8
DO  - https://doi.org/10.1007/s10664-020-09895-8
VL  - 26
IS  - 3
SP  - 53
S1  - 53
PB  - Springer
CY  - Dordrecht
ER  -