ParkBench is an environment for running benchmarks. A benchmark is a set of experiments, each of which takes a set of input parameters, can be run, and produces a set of output values (the "results" of the experiments). Simple examples of benchmarks could be:
- Comparing the running time of various sorting algorithms. Each experiment corresponds to the sorting of a given list by a given algorithm. A benchmark contains multiple such experiments, varying the list length and the algorithm used in each.
- Comparing the precision of the solutions computed by various algebraic algorithms.
- Etc.
Quick Links
- Creating an experiment class
- Creating an experiment suite
- Compiling and running
- Using the web interface
- What you've got so far
Once you've finished reading this quick tutorial, you can have a look at some of the slightly more advanced features of ParkBench.
Creating an experiment class
Suppose you have a procedure called Procedure A, which takes as input two numerical parameters, x and y, and produces as an output a single numerical value (let us call it z). You would like to run it and compute its output value z for various combinations of input parameters.
You start by creating an empty class ProcedureA that extends the Experiment class, as follows:
import ca.uqac.lif.parkbench.*;
class ProcedureA extends Experiment {

  public ProcedureA() {
    super("Procedure A");
  }

  public Experiment newExperiment() {
    return new ProcedureA();
  }

  public void runExperiment(Parameters input, Parameters output) {
  }
}
The first two methods (the constructor and newExperiment()) are boilerplate methods that you can simply copy-paste, replacing ProcedureA with whatever name you use for your class. (The string passed to super() can be anything you like, as long as it uniquely identifies all experiments of type ProcedureA.) The processing of the experiment happens in the method runExperiment(). It receives two objects of type Parameters, which are Maps from Strings (parameter names) to any Java Object. The first contains the experiment's input parameters (you read from them), and the second is where you write the experiment's output once it has finished.
Suppose Procedure A simply computes as its output parameter z the sum of input parameters x and y. This can be written as follows:
public void runExperiment(Parameters input, Parameters output) {
  int x = input.getNumber("x").intValue();
  int y = input.getNumber("y").intValue();
  int result = x + y;
  output.put("z", result);
  stopWithStatus(Status.DONE);
}
The first two lines get the values of parameters x and y. Since we know they are numbers, we can call the getNumber() method, which readily casts them as Numbers (recall that otherwise, what you receive using plain get() is an Object). The third line computes the sum, and the fourth puts that result in the output Parameters object, giving it the name z. Finally, the last line indicates that the experiment has finished with success (another value can be used to indicate failure, as we shall see later).
Creating an experiment suite
Our procedure is now ready to be run with multiple values. To do so, we create an ExperimentSuite. An experiment suite contains a Benchmark, which coordinates the execution of multiple instances of Experiment objects. An empty experiment suite looks like this:
import ca.uqac.lif.parkbench.*;
class MyExperimentSuite extends ExperimentSuite {

  public static void main(String[] args) {
    new MyExperimentSuite().initialize(args);
  }

  public void setup(Benchmark b) {
  }
}
The main() method is again composed of a single line; just make sure the class you instantiate matches the name of your experiment suite. The setup of your benchmark occurs in the setup() method. This method receives as an argument an empty benchmark that you are about to configure and fill with experiments. The simplest thing you can do is add a single instance of Procedure A, say with x=2 and y=3:
public void setup(Benchmark b) {
  ProcedureA my_experiment = new ProcedureA();
  my_experiment.setParameter("x", 2).setParameter("y", 3);
  b.addExperiment(my_experiment);
}
Of course, you can write some more code to add multiple experiments to the benchmark. For example, you can create all experiments where x ranges between 1 and 4, and y ranges between 1 and 2, as follows (no big deal here):
public void setup(Benchmark b) {
  for (int x = 1; x <= 4; x++) {
    for (int y = 1; y <= 2; y++) {
      ProcedureA my_experiment = new ProcedureA();
      my_experiment.setParameter("x", x).setParameter("y", y);
      b.addExperiment(my_experiment);
    }
  }
}
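As a sanity check, the double loop above creates 4 × 2 = 8 experiments, one per (x, y) combination. The ParkBench-free sketch below enumerates the same combinations:

```java
import java.util.ArrayList;
import java.util.List;

public class SweepDemo {

  // Enumerates the same (x, y) pairs as the setup() method above.
  static List<int[]> combinations() {
    List<int[]> combos = new ArrayList<>();
    for (int x = 1; x <= 4; x++) {
      for (int y = 1; y <= 2; y++) {
        combos.add(new int[] { x, y });
      }
    }
    return combos;
  }

  public static void main(String[] args) {
    System.out.println(combinations().size()); // prints 8
  }
}
```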
Note that a single benchmark does not need to contain experiments of only one class. As a matter of fact, you could create a second class ProcedureB, and add instances of both kinds of experiments to the same benchmark. This is useful, for example, for comparing the results of various methods on the same inputs.
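As an illustration, a hypothetical ProcedureB could compute z = x × y on the same inputs. The minimal stand-in classes below replace the real ParkBench API so the sketch is self-contained; in actual code, ProcedureB would extend ca.uqac.lif.parkbench.Experiment and follow exactly the same pattern as ProcedureA above:

```java
import java.util.HashMap;

// Stand-in for ParkBench's Parameters class (assumption: the real class
// offers more than this).
class FakeParameters extends HashMap<String, Object> {
  public Number getNumber(String key) {
    return (Number) get(key);
  }
}

// Hypothetical second experiment: same input parameters as Procedure A,
// but its output z is the product of x and y instead of their sum.
class ProcedureB {
  public void runExperiment(FakeParameters input, FakeParameters output) {
    int x = input.getNumber("x").intValue();
    int y = input.getNumber("y").intValue();
    output.put("z", x * y);
  }
}

public class MixedBenchmarkDemo {
  public static void main(String[] args) {
    FakeParameters input = new FakeParameters();
    input.put("x", 2);
    input.put("y", 3);
    FakeParameters output = new FakeParameters();
    new ProcedureB().runExperiment(input, output);
    System.out.println(output.get("z")); // prints 6
  }
}
```

Instances of both experiment classes would then be added to the same Benchmark with b.addExperiment(), exactly as shown for ProcedureA.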
Compiling and running
You can run your experiment suite directly from the compiled files or from the IDE of your choice. However, if Ant is installed on your computer, you can simply type ant on the command line, and you will get a stand-alone, runnable JAR file (called ExperimentSuite.jar by default) that you can move around and run as you wish.
Run the experiment suite as follows:
java -jar MyExperimentSuite.jar
You should see an output like this:
ParkBench, a versatile benchmark environment
Running Untitled experiment suite in batch mode, using 2 threads
Saving results in Untitled.json
Queued Prereq Running Done Failed Time
10 0 2 1 0 2 s
As you can see, the experiment suite starts and runs all the experiments added to the benchmark one by one. Well, not exactly one by one: you can see that the experiment suite is actually using two threads, meaning that up to two experiments run at the same time. (We shall see later that the number of threads can be configured.)
Moreover, the results of the benchmark are saved to a file; in this case,
this file is called Untitled.json
(this is because we haven't given
a name to our benchmark, in which case it uses the default "Untitled"). Once the
experiment suite is over (which should be almost instantaneous), you can open that
file and see that its contents look roughly like this:
{
  "name" : "Untitled",
  "experiments" : [
    {
      "name" : "Procedure A",
      "id" : 4,
      "input" : {
        "x" : 3,
        "y" : 2
      },
      "output" : {
        "z" : 5
      }
    },
    …
  ]
}
This file uses the JSON notation to structure its data; it should be fairly intuitive. For each experiment in the benchmark, there is one structure in the element experiments that gives all the information about that particular experiment instance: its ID (whose value is only relevant to ParkBench), its name ("Procedure A"), as well as all its input and output parameters. If you look at the actual file, you will see that it contains much more data, such as each experiment's start and end time, information about its status, etc. More on that later. Still, from that point on, you can use that file to do whatever you like: parse it back and process it to generate graphs, etc. Yet more on that later.
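For instance, here is a deliberately simple sketch that pulls all values of the output parameter z out of the results file using a regular expression (an illustration only; a real post-processing script would use a proper JSON parser):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExtractZ {

  // Collects every integer that appears as the value of a "z" entry.
  static List<Integer> extractZ(String json) {
    List<Integer> values = new ArrayList<>();
    Matcher m = Pattern.compile("\"z\"\\s*:\\s*(\\d+)").matcher(json);
    while (m.find()) {
      values.add(Integer.parseInt(m.group(1)));
    }
    return values;
  }

  public static void main(String[] args) {
    // In practice, read the string from Untitled.json instead.
    String sample = "{ \"output\" : { \"z\" : 5 } }";
    System.out.println(extractZ(sample)); // prints [5]
  }
}
```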
It should be noted that running your experiment suite provides another benefit: it is "crash resistant". That is, information about all running experiments is periodically saved to Untitled.json, so that if your computer shuts down, or you hit Ctrl+C, whatever results were obtained up to that point will be in the JSON file. We shall see later that you can resume the execution of a benchmark from where it stopped, using that file.
Using the web interface
The benefits of using ParkBench extend beyond the multi-threaded execution of the experiments in your experiment suite. ParkBench also provides a user-friendly interface to monitor and control the execution of the experiments through a web browser. Simply type:
java -jar MyExperimentSuite.jar --interactive
You will see this output:
ParkBench, a versatile benchmark environment
Listening requests on port 21212
This time, your experiment suite does not execute any experiment. However, if you open the web browser of your choice and enter the URL http://localhost:21212/index.html, you will see a page that allows you to interact with all the experiments in your experiment suite.
The experiment list
The main part of the page is the list of experiments. Each experiment of the experiment suite is one line of the table. For each experiment, you get information about all of its input parameters, as well as buttons to start/interrupt that experiment individually. If the experiment is running or has finished, real-time information about its execution time is displayed in the column "Duration". You can sort experiments according to a parameter by clicking on its column header; hold Shift to sort according to multiple parameters.
Status | Duration | Name | x | y
---|---|---|---|---
 | | Procedure A | 1 | 1
 | | Procedure A | 1 | 2
 | | Procedure A | 2 | 1
 | | Procedure A | 2 | 2
The "Status" column shows a small square whose colour is explained in the key just above the list of experiments.
- The experiment has not run and its prerequisites are not fulfilled
- The experiment has not run but is ready to run
- The experiment is in the queue, waiting to be started
- The experiment is currently generating its prerequisites
- The experiment is currently running
- The experiment has completed successfully
- The execution of the experiment has failed or was manually cancelled
Try it. Select a few or all of the experiments (options above the list allow you to select all experiments, or only those with a specific status), and start/stop them using the buttons.
Since Procedure A does not do much, its execution is almost instantaneous. To simulate the execution of an experiment that does more work (and takes more time), add the following line in the method runExperiment() of ProcedureA, just before the last line:
waitFor(5);
This will suspend the execution of the experiment for five seconds before moving on to the last instruction, where the experiment indicates that it has finished. Now restart your experiment suite and refresh your browser. If you select all experiments and run them at once, you should see how the benchmark manages their execution: experiments are first put in the waiting queue, eventually get to the "running" state, and after a few seconds end up in the "finished" state. You can also see that two experiments are always running in parallel, as the benchmark uses two threads.
Remote control
Since your experiment suite is also a web server, you can access it from a different machine, so that the experiments run on one computer but are controlled from another. Suppose 10.10.10.1 is the IP address of the computer where MyExperimentSuite.jar is currently running. You can open a web browser on another computer and type http://10.10.10.1:21212/index.html to open the web interface. Voilà! No need to SSH into the machine to start/stop experiment scripts.
Through the web interface, you can also download the Untitled.json file that contains all the experiments results. Simply click on the "Download" button above the experiment list and you will receive a copy of the experiment suite's current state. No need to FTP into the machine to download experiment results.
What you've got so far
So far, we've written 25 lines of code (including boilerplate code we can simply copy-paste). Here's what we got in exchange:
- Queuing and multi-threaded execution of all experiments in your experiment suite from the command line
- Automated saving of all experiment information (input and results) into a JSON file you can reuse to create graphs, etc.
- A nice web interface to control and monitor the execution of your experiments from any computer
Ready for some more advanced features?