Setting up an experiment


Goals

In this tutorial, we will

  • select an artifact with the Dataset Browser
  • install the BugSwarm client
  • download the artifact with the client
  • run a code coverage tool inside the artifact.

Requirements

This tutorial requires no prior knowledge of BugSwarm. You will need Docker installed on your machine.

Selecting an artifact

Navigate to the Dataset Browser to view metadata for every existing artifact. Click any column header to sort by that column. Type in the "Filter column" text fields to filter the displayed artifacts by project name, language, build system, test framework, number of tests run or failed, and several other attributes.

In this tutorial, we'll use an artifact from the Nutz project (nutzam/nutz).

Type the image tag nutzam-nutz-140438299 into the Image Tag column filter to display metadata for the specific artifact identified by that image tag.

An image tag uniquely identifies a BugSwarm artifact.

The Dataset Browser is useful for selecting a small number of artifacts. To facilitate artifact selection at scale, we expose a (soon to be public) REST API for querying the artifact database based on any of the metadata attributes.

Installing the BugSwarm client

Now that we've selected an image tag and viewed its metadata, we can use the client to download a local copy of the corresponding artifact Docker image. There are two ways to install the client.

From PyPI:

$ pip3 install bugswarm-client

From source:

$ git clone https://github.com/BugSwarm/client.git
$ cd client
$ sudo python3 setup.py install

Downloading an artifact

To download the artifact we chose earlier, you'll need its image tag (nutzam-nutz-140438299). Make sure the Docker daemon is running, and then run the following command. Depending on how Docker is configured on your machine, you may need to enter an administrator password.

$ bugswarm run --image-tag nutzam-nutz-140438299 --use-sandbox

See the BugSwarm CLI documentation for a longer description of the client API.

The above command will download all the layers of the Docker image, initialize an artifact Docker container from the layers, and then provide an interactive shell in the artifact container.

It may take a few minutes to download your first artifact, depending on your network connection. Thanks to the properties of Docker's layered file system, all subsequent artifact downloads will be much faster.
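
If you prefer to start the artifact from a script, the same command can be assembled in Python. This is only a minimal sketch: it uses just the bugswarm run flags shown above, and the helper names build_run_command and run_artifact are our own, not part of the BugSwarm client.

```python
import shutil
import subprocess


def build_run_command(image_tag, use_sandbox=True):
    """Return the argument list for the `bugswarm run` call used in this tutorial."""
    command = ["bugswarm", "run", "--image-tag", image_tag]
    if use_sandbox:
        command.append("--use-sandbox")
    return command


def run_artifact(image_tag, use_sandbox=True):
    """Start the artifact container (hypothetical helper).

    The interactive shell takes over the current terminal until you exit.
    """
    # Only checks that a docker executable is on PATH, not that the daemon is running.
    if shutil.which("docker") is None:
        raise RuntimeError("Docker does not appear to be installed or on PATH")
    return subprocess.call(build_run_command(image_tag, use_sandbox))
```

Calling run_artifact("nutzam-nutz-140438299") should behave like the command above, provided the client is installed and the Docker daemon is running.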

Interacting with an artifact

Now that we're "inside" an artifact container, we can view the source code, modify files, install tools, etc.

Also, since we specified the --use-sandbox flag when invoking bugswarm run, we have a shared "sandbox" directory that is accessible from within the artifact container and from the host machine. The sandbox directory is a convenient way to move files into and out of an artifact container.

Installing a tool

At this point, we could reproduce the failed or passed Travis jobs by invoking a script included with the artifact. (See Anatomy of an Artifact to learn more about the contents of an artifact.) However, before we reproduce the failed job, let's install a code coverage tool to be invoked when the project is built and tested.

Namely, let's install Cobertura.

First, cd to the version of the project that resulted in the failed Travis job.

$ cd ~/build/failed/nutzam/nutz/

Next, modify the POM file that holds configuration instructions for the Maven build system.

$ vim pom.xml

Jump to the end of the build section by typing ?/build and then Enter. Insert the following snippet right before the nearest closing </plugins> tag.

<plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>cobertura-maven-plugin</artifactId>
    <version>2.7</version>
    <configuration>
        <formats>
            <format>html</format>
            <format>xml</format>
        </formats>
    </configuration>
</plugin>

Hit Esc. Then jump to the end of the build section by typing ?/build and then Enter. Insert the following snippet right after the closing </build> tag.

<reporting>
    <plugins>
        <plugin>
            <groupId>org.codehaus.mojo</groupId>
            <artifactId>cobertura-maven-plugin</artifactId>
            <version>2.7</version>
            <configuration>
                <check></check>
                <formats>
                    <format>html</format>
                    <format>xml</format>
                </formats>
            </configuration>
        </plugin>
    </plugins>
</reporting>

Hit Esc. Then save and close the file by typing :wq and then Enter.
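
The two manual edits could also be scripted, which helps when the same tool must be installed in many artifacts. The following is a rough sketch under stated assumptions, not a robust POM editor: it treats the file as plain text, assumes the last </build> tag closes the main build section, and assumes the </plugins> tag immediately before it is the right insertion point. The function name add_cobertura is hypothetical.

```python
PLUGIN_SNIPPET = """<plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>cobertura-maven-plugin</artifactId>
    <version>2.7</version>
    <configuration>
        <formats>
            <format>html</format>
            <format>xml</format>
        </formats>
    </configuration>
</plugin>
"""

REPORTING_SNIPPET = """
<reporting>
    <plugins>
        <plugin>
            <groupId>org.codehaus.mojo</groupId>
            <artifactId>cobertura-maven-plugin</artifactId>
            <version>2.7</version>
            <configuration>
                <check></check>
                <formats>
                    <format>html</format>
                    <format>xml</format>
                </formats>
            </configuration>
        </plugin>
    </plugins>
</reporting>"""


def add_cobertura(pom_text):
    """Return pom_text with both Cobertura snippets inserted (text-based sketch)."""
    build_end = pom_text.rfind("</build>")
    if build_end == -1:
        raise ValueError("no </build> tag found")
    # Last </plugins> that appears before the closing </build> tag.
    plugins_end = pom_text.rfind("</plugins>", 0, build_end)
    if plugins_end == -1:
        raise ValueError("no </plugins> tag found before </build>")
    after_build = build_end + len("</build>")
    return (
        pom_text[:plugins_end]
        + PLUGIN_SNIPPET
        + pom_text[plugins_end:after_build]
        + REPORTING_SNIPPET
        + pom_text[after_build:]
    )
```

To apply it, read pom.xml into a string, pass it to add_cobertura, and write the result back. Check the output by hand for projects whose build section contains a pluginManagement block, since the simple text search cannot tell the two plugin lists apart.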

Now invoke Maven to build and test the project. The build now also runs Cobertura and produces a machine-readable code coverage report.

$ /usr/local/maven-3.2.5/bin/mvn cobertura:cobertura -Dcobertura.report.format=xml -Dhttps.protocols=TLSv1.2

Copy the report to the sandbox so it can be viewed later after exiting the artifact container.

$ cd ~/build/failed/nutzam/nutz/target/site/cobertura
$ cp coverage.xml /bugswarm-sandbox

Exit the artifact container.

$ exit

The code coverage report is available on your machine at ~/bugswarm-sandbox/coverage.xml. You could perform further analysis on the report, but for now we'll just view it.

$ vim ~/bugswarm-sandbox/coverage.xml
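
As one example of further analysis, the report can be summarized with Python's standard library. This sketch assumes the usual Cobertura XML layout, in which the root coverage element and each package element carry line-rate and branch-rate attributes. The function name summarize_coverage is our own.

```python
import xml.etree.ElementTree as ET


def summarize_coverage(report_path):
    """Return overall and per-package line coverage from a Cobertura XML report."""
    root = ET.parse(report_path).getroot()
    summary = {
        "line_rate": float(root.get("line-rate", 0.0)),
        "branch_rate": float(root.get("branch-rate", 0.0)),
        "packages": {},
    }
    # Each <package> element reports its own line-rate as a fraction in [0, 1].
    for package in root.iter("package"):
        summary["packages"][package.get("name")] = float(package.get("line-rate", 0.0))
    return summary
```

Pass it the full path to the report on your machine, for example the expanded form of ~/bugswarm-sandbox/coverage.xml.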

Automating tool installation

In our experience, running an experiment commonly involves three main steps:

  1. copy necessary scripts or executables into the artifact container
  2. invoke the scripts or executables inside the artifact container
  3. copy any produced files out of the artifact container via the sandbox directory.

Although we created a (soon to be public) framework to automate this workflow, which is available upon request, we performed the installation manually during this walkthrough for the sake of clarity.
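
For illustration only, steps 1 and 3 can be approximated on the host with a few lines of Python. This is not the framework mentioned above. It assumes the host side of the sandbox is ~/bugswarm-sandbox and that it appears as /bugswarm-sandbox inside the container, as in this walkthrough. The helper names stage_files and collect_files are hypothetical, and step 2 is still done by hand in the container's shell.

```python
import shutil
from pathlib import Path


def stage_files(paths, sandbox_dir):
    """Step 1: copy scripts or executables into the host side of the sandbox."""
    sandbox = Path(sandbox_dir).expanduser()
    sandbox.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(path, sandbox)) for path in paths]


def collect_files(sandbox_dir, pattern, dest_dir):
    """Step 3: copy files matching a glob pattern out of the sandbox."""
    sandbox = Path(sandbox_dir).expanduser()
    dest = Path(dest_dir).expanduser()
    dest.mkdir(parents=True, exist_ok=True)
    produced = [p for p in sorted(sandbox.glob(pattern)) if p.is_file()]
    return [Path(shutil.copy2(p, dest)) for p in produced]
```

A typical session would call stage_files before starting the artifact, run the staged script from /bugswarm-sandbox inside the container, and then call collect_files with a pattern such as "*.xml" after exiting.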

