Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/13448
Full metadata record
DC Field | Value | Language
dc.contributor.author | Fucci, D | -
dc.contributor.author | Scanniello, G | -
dc.contributor.author | Romano, S | -
dc.contributor.author | Shepperd, M | -
dc.contributor.author | Sigweni, B | -
dc.contributor.author | Uyaguari, F | -
dc.contributor.author | Turhan, B | -
dc.contributor.author | Juristo, N | -
dc.contributor.author | Oivo, M | -
dc.date.accessioned | 2016-11-04T11:07:32Z | -
dc.date.available | 2016-09-08 | -
dc.date.available | 2016-11-04T11:07:32Z | -
dc.date.issued | 2016 | -
dc.identifier.citation | International Symposium on Empirical Software Engineering and Measurement (ESEM), 2016, 08-09-September-2016 | en_US
dc.identifier.isbn | 9781450344272 | -
dc.identifier.issn | 1949-3770 | -
dc.identifier.uri | http://bura.brunel.ac.uk/handle/2438/13448 | -
dc.description.abstract | Context: Test-driven development (TDD) is an agile practice claimed to improve the quality of a software product, as well as the productivity of its developers. A previous study (i.e., the baseline experiment) at the University of Oulu (Finland) compared TDD to a test-last development (TLD) approach through a randomized controlled trial. The results failed to support the claims. Goal: We want to validate the results of the original study by replicating it at the University of Basilicata (Italy), using a different design. Method: We replicated the baseline experiment using a crossover design with 21 graduate students. We kept the settings and context as close as possible to the baseline experiment. To limit researcher bias, we involved two other sites (UPM, Spain, and Brunel, UK) to conduct a blind analysis of the data. Results: The Kruskal-Wallis tests did not show any significant difference between TDD and TLD in terms of testing effort (p-value = .27), external code quality (p-value = .82), or developers' productivity (p-value = .83). Nevertheless, our data revealed a difference based on the order in which TDD and TLD were applied, though no carry-over effect. Conclusions: We confirm the results of the baseline study, yet our results raise concerns regarding the selection of experimental objects, particularly with respect to their interaction with the order in which treatments are applied. We recommend that future studies survey the tasks used in experiments evaluating TDD. Finally, to lower the cost of replication studies and reduce researcher bias, we encourage other research groups to adopt the multi-site blind analysis approach described in this paper. | en_US
dc.description.sponsorship | This research is supported in part by the Academy of Finland Project 278354. | en_US
dc.language.iso | en | en_US
dc.publisher | ACM | en_US
dc.subject | Test-driven development | en_US
dc.subject | External experiment replication | en_US
dc.subject | blind analysis | en_US
dc.title | An external replication on the effects of test-driven development using a multi-site blind analysis approach | en_US
dc.type | Conference Paper | en_US
dc.identifier.doi | http://dx.doi.org/10.1145/2961111.2962592 | -
dc.relation.isPartOf | International Symposium on Empirical Software Engineering and Measurement | -
pubs.publication-status | Published | -
pubs.volume | 08-09-September-2016 | -
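
The abstract above reports per-outcome Kruskal-Wallis comparisons between the TDD and TLD groups. The snippet below is a minimal sketch of that kind of test using SciPy; the scores are hypothetical placeholders, and only the statistical test itself corresponds to the analysis described in the paper.

```python
# Minimal sketch of a Kruskal-Wallis comparison between two treatment
# groups, as reported in the abstract. All data here are hypothetical.
from scipy.stats import kruskal

# Hypothetical per-participant scores for one outcome measure
# (e.g., external code quality), split by treatment.
tdd = [0.62, 0.48, 0.71, 0.55, 0.60, 0.67, 0.52]
tld = [0.58, 0.66, 0.49, 0.63, 0.57, 0.61, 0.54]

h_stat, p_value = kruskal(tdd, tld)
print(f"H = {h_stat:.3f}, p = {p_value:.3f}")

# For comparison, the paper reports p = .27 (testing effort),
# p = .82 (external code quality), and p = .83 (productivity):
# no significant difference at alpha = .05 for any outcome.
```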
Appears in Collections: Dept of Computer Science Research Papers

Files in This Item:
File | Description | Size | Format
FullText.pdf |  | 432.5 kB | Adobe PDF


Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.