BugSwarm: Mining and Continuously Growing a Dataset of Reproducible Failures and Fixes
David A. Tomassi, Naji Dmeiri, Yichen Wang, Antara Bhowmick, Yen-Chuan Liu, Premkumar Devanbu, Bogdan Vasilescu, Cindy Rubio-González
In Proceedings of the International Conference on Software Engineering (ICSE'19), Montreal, Canada, May 2019.
Abstract: Fault-detection, localization, and repair methods are vital to software quality; but it is difficult to evaluate their generality, applicability, and current effectiveness. Large, diverse, realistic datasets of durably-reproducible faults and fixes are vital to good experimental evaluation of approaches to software quality, but they are difficult and expensive to assemble and keep current. Modern continuous-integration (CI) approaches, like Travis-CI, which are widely used, fully configurable, and executed within custom-built containers, promise a path toward much larger defect datasets. If we can identify and archive failing and subsequent passing runs, the containers will provide a substantial assurance of durable future reproducibility of build and test. Several obstacles, however, must be overcome to make this a practical reality. We describe BugSwarm, a toolset that navigates these obstacles to enable the creation of a scalable, diverse, realistic, continuously growing set of durably reproducible failing and passing versions of real-world, open-source systems. The BugSwarm toolkit has already gathered 3,091 fail-pass pairs, in Java and Python, all packaged within fully reproducible containers. Furthermore, the toolkit can be run periodically to detect fail-pass activities, thus growing the dataset continually.
Bugs in the Wild: Examining the Effectiveness of Static Analyzers at Finding Real-World Bugs
David A. Tomassi
In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering.
Available as PDF and BibTeX.
Abstract: Static analysis is a powerful technique to find software bugs. In past years, a few static analysis tools have become available for developers to find certain kinds of bugs in their programs. However, there is no evidence on how effective the tools are in finding bugs in real-world software. In this paper, we present a preliminary study on the popular static analyzers ErrorProne and SpotBugs. Specifically, we consider 320 real Java bugs from the BugSwarm dataset, and determine which of these bugs can potentially be found by the analyzers, and how many are indeed detected. We find that 30.3% and 40.3% of the bugs are candidates for detection by ErrorProne and SpotBugs, respectively. Our evaluation shows that the analyzers are relatively easy to incorporate into the tool chain of diverse projects that use the Maven build system. However, the analyzers are not as effective in detecting the bugs under study, with only one bug successfully detected by SpotBugs.
A Note About: Critical Review of BugSwarm for Fault Localization and Program Repair
David A. Tomassi, Cindy Rubio-González
Available as PDF and BibTeX.
Abstract: Datasets play an important role in the advancement of software tools and facilitate their evaluation. BugSwarm is an infrastructure to automatically create a large dataset of real-world reproducible failures and fixes. In this paper, we respond to Durieux and Abreu's critical review of the BugSwarm dataset, referred to in this paper as CriticalReview. We replicate CriticalReview's study and find several incorrect claims and assumptions about the BugSwarm dataset. We discuss these incorrect claims and other contributions listed by CriticalReview. Finally, we discuss general misconceptions about BugSwarm, and our vision for the use of the infrastructure and dataset.