ParkBench is an environment for running benchmarks. A benchmark is a set of experiments, each of which takes a set of input parameters, can be run, and produces a set of output values (the "results" of the experiments). Simple examples of benchmarks could be:
- Comparing the running time of various sorting algorithms. Each experiment corresponds to the sorting of a given list by a given algorithm. A benchmark contains multiple such experiments, varying the list length and the algorithm used in each.
- Comparing the precision of the solutions computed by various algebraic algorithms.
- Etc.
Quick Links
- Creating an experiment class
- Creating an experiment suite
- Compiling and running
- Using the web interface
- What you've got so far
Once you've finished reading this quick tutorial, you can have a look at some of the slightly more advanced features of ParkBench.
Creating an experiment class
Suppose you have a procedure called Procedure A, which takes as input two numerical parameters, x and y, and produces as an output a single numerical value (let us call it z). You would like to run it and compute its output value z for various combinations of input parameters.
You start by creating an empty class ProcedureA that extends the Experiment class, as follows:
import ca.uqac.lif.parkbench.*;
class ProcedureA extends Experiment {

  public ProcedureA() {
    super("Procedure A");
  }

  public Experiment newExperiment() {
    return new ProcedureA();
  }

  public void runExperiment(Parameters input, Parameters output) {
  }
}
The first two methods (the constructor and newExperiment()) are boilerplate methods that you can simply copy-paste, replacing ProcedureA with whatever name you use for your class. (The string passed to super() can be anything you like, as long as it uniquely identifies all experiments of type ProcedureA.) The processing of the experiment happens in the method runExperiment(). It receives two objects of type Parameters, which are Maps from Strings (parameter names) to any Java Object. The first contains the experiment's input parameters (you read from them), and the second is where you write the experiment's output once it has finished.
Suppose Procedure A simply computes as its output parameter z the sum of input parameters x and y. This can be written as follows:
public void runExperiment(Parameters input, Parameters output) {
  int x = input.getNumber("x").intValue();
  int y = input.getNumber("y").intValue();
  int result = x + y;
  output.put("z", result);
  stopWithStatus(Status.DONE);
}
The first two lines get the values of parameters x and y. Since we know they are numbers, we can call the getNumber() method, which readily casts them as Numbers (recall that otherwise, what you receive using plain get() is an Object). The third line computes the sum, and the fourth puts that result in the output Parameters object, giving it the name z. Finally, the last line indicates that the experiment has finished with success (another value can be used to indicate failure, as we shall see later).
Creating an experiment suite
Our procedure is now ready to be run with multiple values. To do so, we create an ExperimentSuite. An experiment suite contains a Benchmark, which coordinates the execution of multiple instances of Experiment objects. An empty experiment suite looks like this:
import ca.uqac.lif.parkbench.*;
class MyExperimentSuite extends ExperimentSuite {

  public static void main(String[] args) {
    new MyExperimentSuite().initialize(args);
  }

  public void setup(Benchmark b) {
  }
}
The main() method is again composed of a single line; just make sure the class you instantiate matches the name of your experiment suite. The setup of your benchmark occurs in the setup() method. This method receives as an argument an empty benchmark that you are about to configure and fill with experiments. The simplest thing you can do is add a single instance of Procedure A, say with x=2 and y=3:
public void setup(Benchmark b) {
  ProcedureA my_experiment = new ProcedureA();
  my_experiment.setParameter("x", 2).setParameter("y", 3);
  b.addExperiment(my_experiment);
}
Of course, you can write some more code to add multiple experiments to the benchmark. For example, you can create all experiments where x ranges between 1 and 4, and y ranges between 1 and 2, as follows (no big deal here):
public void setup(Benchmark b) {
  for (int x = 1; x <= 4; x++) {
    for (int y = 1; y <= 2; y++) {
      ProcedureA my_experiment = new ProcedureA();
      my_experiment.setParameter("x", x).setParameter("y", y);
      b.addExperiment(my_experiment);
    }
  }
}
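As a sanity check, the double loop above creates 4 × 2 = 8 experiments, one per (x, y) combination. The ParkBench-free sketch below enumerates the same combinations:

```java
import java.util.ArrayList;
import java.util.List;

public class SweepDemo {

  // Enumerates the same (x, y) pairs as the setup() method above.
  static List<int[]> combinations() {
    List<int[]> combos = new ArrayList<>();
    for (int x = 1; x <= 4; x++) {
      for (int y = 1; y <= 2; y++) {
        combos.add(new int[] { x, y });
      }
    }
    return combos;
  }

  public static void main(String[] args) {
    System.out.println(combinations().size()); // prints 8
  }
}
```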
Note that a single benchmark does not need to contain experiments of only one class. As a matter of fact, you could create a second class ProcedureB, and add instances of both kinds of experiments to the same benchmark. This is useful, for example, for comparing the results of various methods on the same inputs.
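As an illustration, a hypothetical ProcedureB could compute z = x × y on the same inputs. The minimal stand-in classes below replace the real ParkBench API so the sketch is self-contained; in actual code, ProcedureB would extend ca.uqac.lif.parkbench.Experiment and follow exactly the same pattern as ProcedureA above:

```java
import java.util.HashMap;

// Stand-in for ParkBench's Parameters class (assumption: the real class
// offers more than this).
class FakeParameters extends HashMap<String, Object> {
  public Number getNumber(String key) {
    return (Number) get(key);
  }
}

// Hypothetical second experiment: same input parameters as Procedure A,
// but its output z is the product of x and y instead of their sum.
class ProcedureB {
  public void runExperiment(FakeParameters input, FakeParameters output) {
    int x = input.getNumber("x").intValue();
    int y = input.getNumber("y").intValue();
    output.put("z", x * y);
  }
}

public class MixedBenchmarkDemo {
  public static void main(String[] args) {
    FakeParameters input = new FakeParameters();
    input.put("x", 2);
    input.put("y", 3);
    FakeParameters output = new FakeParameters();
    new ProcedureB().runExperiment(input, output);
    System.out.println(output.get("z")); // prints 6
  }
}
```

Instances of both experiment classes would then be added to the same Benchmark with b.addExperiment(), exactly as shown for ProcedureA.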
Compiling and running
You can run your experiment suite directly from the compiled files or from the IDE of your choice. However, if Ant is installed on your computer, you can simply type ant on the command line, and you will get a stand-alone, runnable JAR file (called ExperimentSuite.jar by default) that you can move around and run as you wish.
Run the experiment suite as follows:
java -jar MyExperimentSuite.jar
You should see an output like this:
ParkBench, a versatile benchmark environment
Running Untitled experiment suite in batch mode, using 2 threads
Saving results in Untitled.json
Queued Prereq Running Done Failed Time
10 0 2 1 0 2 s
As you can see, the experiment suite starts and runs all the experiments added to the benchmark one by one. Well, not exactly one by one: you can see that the experiment suite is actually using two threads, meaning that up to two experiments run at the same time. (We shall see later that the number of threads can be configured.)
Moreover, the results of the benchmark are saved to a file; in this case,
this file is called Untitled.json
(this is because we haven't given
a name to our benchmark, in which case it uses the default "Untitled"). Once the
experiment suite is over (which should be almost instantaneous), you can open that
file and see that its contents look roughly like this:
{
  "name" : "Untitled",
  "experiments" : [
    {
      "name" : "Procedure A",
      "id" : 4,
      "input" : {
        "x" : 3,
        "y" : 2
      },
      "output" : {
        "z" : 5
      }
    },
    …
  ]
}
This file uses the JSON notation to structure its data; it should be fairly intuitive. For each experiment in the benchmark, there is one structure in the element experiments that gives all the information about that particular experiment instance: its ID (whose value is only relevant to ParkBench), its name ("Procedure A"), as well as all its input and output parameters. If you look at the actual file, you will see that it contains much more data, such as each experiment's start and end time, information about its status, etc. More on that later. Still, from that point on, you can use that file to do whatever you like: parse it back and process it to generate graphs, etc. Yet more on that later.
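For instance, here is a deliberately simple sketch that pulls all values of the output parameter z out of the results file using a regular expression (an illustration only; a real post-processing script would use a proper JSON parser):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExtractZ {

  // Collects every integer that appears as the value of a "z" entry.
  static List<Integer> extractZ(String json) {
    List<Integer> values = new ArrayList<>();
    Matcher m = Pattern.compile("\"z\"\\s*:\\s*(\\d+)").matcher(json);
    while (m.find()) {
      values.add(Integer.parseInt(m.group(1)));
    }
    return values;
  }

  public static void main(String[] args) {
    // In practice, read the string from Untitled.json instead.
    String sample = "{ \"output\" : { \"z\" : 5 } }";
    System.out.println(extractZ(sample)); // prints [5]
  }
}
```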
It should be noted that running your experiment suite provides another benefit: it is "crash resistant". That is, information about all running experiments is periodically saved to Untitled.json, so that if your computer shuts down, or you hit Ctrl+C, whatever results were obtained up to that point will be in the JSON file. We shall see later that you can resume the execution of a benchmark from where it stopped, using that file.
Using the web interface
The benefits of using ParkBench extend beyond the multi-threaded execution of the experiments in your experiment suite. ParkBench also provides a user-friendly interface to monitor and control the execution of the experiments through a web browser. Simply type:
java -jar MyExperimentSuite.jar --interactive
You will see this output:
ParkBench, a versatile benchmark environment
Listening requests on port 21212
This time, your experiment suite does not execute any experiment. However, if you open the web browser of your choice and enter the URL http://localhost:21212/index.html, you will see a page that allows you to interact with all the experiments in your experiment suite.
The experiment list
The main part of the page is the list of experiments. Each experiment of the experiment suite is one line of the table. For each experiment, you get information about all of its input parameters, as well as buttons to start/interrupt that experiment individually. If the experiment is running or has finished, real-time information about its execution time is displayed in the column "Duration". You can sort experiments according to a parameter by clicking on its column header; hold Shift to sort according to multiple parameters.
Status | Duration | Name | x | y
---|---|---|---|---
 | | Procedure A | 1 | 1
 | | Procedure A | 1 | 2
 | | Procedure A | 2 | 1
 | | Procedure A | 2 | 2
The "Status" column shows a small square whose colour is explained in the key just above the list of experiments.
- The experiment has not run and its prerequisites are not fulfilled
- The experiment has not run but is ready to run
- The experiment is in the queue, waiting to be started
- The experiment is currently generating its prerequisites
- The experiment is currently running
- The experiment has completed successfully
- The execution of the experiment has failed or was manually cancelled
Try it. Select a few or all of the experiments (options above the list allow you to select all experiments, or only those with a specific status), and start/stop them using the buttons.
Since Procedure A does not do much, its execution is almost instantaneous. To simulate the execution of an experiment that does more work (and takes more time), add the following line in the method runExperiment() of ProcedureA, just before the last line:
waitFor(5);
This will suspend the execution of the experiment for five seconds before moving on to the last instruction, where the experiment indicates that it has finished. Now restart your experiment suite and refresh your browser. If you select all experiments and run them at once, you should see how the benchmark manages their execution: experiments are first put in the waiting queue, eventually get to the "running" state, and after a few seconds end up in the "finished" state. You can also see that two experiments are always running in parallel, as the benchmark uses two threads.
Remote control
Since your experiment suite is also a web server, you can access it from a different machine, so that the experiments run on one computer but are controlled from another. Suppose 10.10.10.1 is the IP address of the computer where MyExperimentSuite.jar is currently running. You can open a web browser on another computer and type http://10.10.10.1:21212/index.html to open the web interface. Voilà! No need to SSH into the machine to start/stop experiment scripts.
Through the web interface, you can also download the Untitled.json file that contains all the experiments results. Simply click on the "Download" button above the experiment list and you will receive a copy of the experiment suite's current state. No need to FTP into the machine to download experiment results.
What you've got so far
So far, we've written 25 lines of code (including boilerplate code we can simply copy-paste). Here's what we got in exchange:
- Queuing and multi-threaded execution of all experiments in your experiment suite from the command line
- Automated saving of all experiment information (input and results) into a JSON file you can reuse to create graphs, etc.
- A nice web interface to control and monitor the execution of your experiments from any computer
Ready for some more advanced features?