In general, there are two ways to benchmark: microbenchmarking and benchmarking a whole query.


For microbenchmarking, the Google Benchmark framework is used. This document is a brief explainer on how to use gbenchmark and additionally shows some common configurations. For a fully detailed description, please have a look at the README of Google Benchmark.

Set Up

Setting up gbenchmark is as simple as adding -DNES_BUILD_BENCHMARKS=1 to your CMake options. Gbenchmark will then be set up automatically as soon as your CMake file is reloaded.
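For example, assuming an out-of-source build directory named build (the directory name is illustrative, not mandated by the project), the configuration might look like this:

```shell
# Configure with benchmarks enabled; "build" as the build directory is an assumption
cmake -B build -DNES_BUILD_BENCHMARKS=1
cmake --build build
```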

Adding a benchmark

Adding a benchmark is similar to adding a unit test. Create a .cpp file in nebulastream/benchmark. The skeleton file BenchmarkSkeleton.cpp is a good starting point, as it contains a minimal working example (MWE) that can be quickly adapted.
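As a rough sketch of what such a file might contain (the function name and the measured operation are illustrative, not the actual contents of BenchmarkSkeleton.cpp):

```cpp
#include <benchmark/benchmark.h>

#include <vector>

// Illustrative microbenchmark: measures appending to a std::vector.
static void BM_VectorPushBack(benchmark::State& state) {
  for (auto _ : state) {
    std::vector<int> v;
    v.push_back(42);
    benchmark::DoNotOptimize(v.data());  // keep the compiler from optimizing the work away
  }
}
BENCHMARK(BM_VectorPushBack);

BENCHMARK_MAIN();  // expands to a main() that runs all registered benchmarks
```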

After creating the new benchmark file, add the following to CMakeLists.txt in nebulastream/benchmark.

add_executable(<benchmark_name> "/path/to/benchmark_file.cpp")
target_link_libraries(<benchmark_name>  nes ${GBENCHMARK_LIBRARIES})
add_test(NAME <benchmark_name> COMMAND <benchmark_name>)

Afterwards, rebuild the project and execute the new benchmark.

Benchmark Info

Every benchmark is executed serially and outputs its time (CPU and wall time) and its number of iterations. By default, each benchmark is run at least once, but not more than 1e9 iterations -- for more details have a look at the Google Benchmark README. The average timing over all iterations is then reported for each benchmark. This behavior can be changed by setting the number of repetitions with ->Repetitions(n).

All benchmarks can be executed via an IDE or via the command line. For all possible CLI options, run the benchmark executable with --help. An example of a benchmark can be found in nebulastream/benchmark/BenchmarkSuite/SandBoxBenchmark.cpp.

Benchmark Arguments

As the goal of this guide is to be a brief explainer, only a handful of the possible arguments are shown. Instead of writing a new benchmark for every possible input, gbenchmark allows declaring arguments, e.g. BENCHMARK(BM_SomeWonderfulBenchmark)->Range(8, 1024);

  • ->Args(const vector<int64_t> &args) creates a vector that is passed to the benchmark
  • ->Range(a, b) creates a range of benchmark arguments from a to b with a default multiplier of 8
  • ->RangeMultiplier(n) changes the multiplier of the following range commands from default 8 to n
  • ->DenseRange(a, b, step) creates a range from a to b with a step size of step
  • ->Ranges(const vector<pair<int64_t, int64_t>> &ranges) creates ranges with multiple inputs

Accessing the inputs can be done through state.range(<input position>). An example with two inputs can be seen below.

// Multiple-input example, adapted from the Google Benchmark README
// (ConstructRandomSet and RandomNumber are helper functions, not shown)
static void BM_SetInsert(benchmark::State& state) {
  std::set<int> data;
  for (auto _ : state) {
    data = ConstructRandomSet(state.range(0));
    for (int j = 0; j < state.range(1); ++j)
      data.insert(RandomNumber());
  }
}
BENCHMARK(BM_SetInsert)->Ranges({{8, 1024}, {128, 512}});

Benchmark Statistics and Custom Counters

Per default, the average time (wall and CPU) and the number of iterations are shown for each benchmark on the console. Nonetheless, this can be extended, as every benchmark function has a state.counters member. state.counters is a std::map, so custom counters can be declared, incremented in the function to be timed, and then written to state.counters, as can be seen in the example below. Custom counters are also output for each benchmark.

// Custom counter example, adapted from the Google Benchmark README
static void UserCountersExample1(benchmark::State& state) {
  double numFoos = 0, numBars = 0, numBazs = 0;
  for (auto _ : state) {
    // ... count Foo, Bar, Baz events
  }
  state.counters["Foo"] = numFoos;
  state.counters["Bar"] = numBars;
  state.counters["Baz"] = numBazs;
}
BENCHMARK(UserCountersExample1);

As already stated above, the number of repetitions can be forced with ->Repetitions(n). If so, every repetition is printed, as well as the mean, median and stddev. With ->ReportAggregatesOnly(true), only the mean, median and stddev are printed.
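For instance, combining both options on the hypothetical benchmark used earlier in this guide:

```cpp
// Run the whole benchmark 10 times and print only mean, median and stddev
BENCHMARK(BM_SomeWonderfulBenchmark)->Repetitions(10)->ReportAggregatesOnly(true);
```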

Benchmark Output Format

Per default, the benchmark output is shown on the console. Other output formats can be obtained with --benchmark_format=<console|json|csv> or by setting the environment variable BENCHMARK_FORMAT=<console|json|csv>. For the CSV format, the context (date, CPU info, ...) is written to stderr and the CSV itself is written to stdout.

Instead of redirecting output via pipes to a file, either --benchmark_out=<filename> or the environment variable BENCHMARK_OUT can be set. Redirecting the output this way does not suppress console output. If no output format has been set, the default file format is JSON.
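For illustration, assuming a benchmark executable named my_benchmark (the name is hypothetical):

```shell
# CSV on stdout, context (date, CPU info, ...) on stderr
./my_benchmark --benchmark_format=csv

# Write results to a file; console output is still shown.
# The file format defaults to JSON unless --benchmark_out_format is set.
./my_benchmark --benchmark_out=results.json
```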

Benchmark Units and Pause Timing

The default time unit is nanoseconds. As some benchmarks may take a long time, milliseconds and microseconds are also available through ->Unit(benchmark::kMillisecond) or ->Unit(benchmark::kMicrosecond).
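Applied to the hypothetical benchmark used earlier in this guide, switching the reported unit looks like this:

```cpp
// Report timings in milliseconds instead of the default nanoseconds
BENCHMARK(BM_SomeWonderfulBenchmark)->Unit(benchmark::kMillisecond);
```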

Sometimes there is the need to pause the timing for some part of the benchmark code. Pausing and resuming the timer can be done with state.PauseTiming() and state.ResumeTiming():

state.PauseTiming();
<Some code that should not be timed>
state.ResumeTiming();

Benchmark Execution

All benchmarks can be executed in an IDE as well as via the command line. On the command line, a subset of all benchmarks can be executed with --benchmark_filter=<regex> or the environment variable BENCHMARK_FILTER=<regex>. This is useful if the whole benchmark suite should not be run.
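For example, with a hypothetical executable named my_benchmark, only benchmarks whose names match the regex are run:

```shell
# Run only benchmarks whose name starts with BM_Set
./my_benchmark --benchmark_filter='^BM_Set'
```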

Query Benchmarking

Benchmarking a whole query is also possible. The helper class BenchmarkUtils can be found in benchmark/include/util/BenchmarkUtils.hpp. Additionally, BenchmarkUtils.hpp defines the macro BM_AddBenchmark(benchmarkName, benchmarkQuery, benchmarkSource, benchmarkSink, csvHeaderString, customCSVOutputs), which can be used to create an experiment that will be run.

One experiment consists of running all possible combinations of the predefined ingestion rates, experimentDuration, periodLengths and workerThreads. For more information, please take a look at benchmark/include/util/BenchmarkUtils.hpp. Furthermore, two small examples (FilterQueryBenchmarks.cpp and MapQueryBenchmarks.cpp) can be found in benchmark/BenchmarkSuite/. These files are also a good starting point for creating a different query.

As the macro BM_AddBenchmark(...) requires a benchmark sink and source, a simple benchmark source and sink can be found in benchmark/include/util/. The benchmark sink does nothing but accept the incoming tuples and increment a counter. The benchmark source requires a schema and an ingestion rate. As long as the benchmark source is active, it produces tuples.

Python Script Automation

For automation, a Python script was created that runs all benchmarks and plots the results. Executing the script with -h prints a help message that includes a description of the parameters. In short, -b tells the script which benchmarks should be run, and -f provides the script with the location of the executables.

Custom Plotting

By default, a horizontal bar plot is used for benchmark results. A different plot can be achieved by creating a file <benchmark-executable>.py in /benchmark/scripts/. The script checks whether it can find such a file; if a custom plotting file is found, it is executed instead of the default plotting. Additionally, it is not necessary to rerun all benchmarks if one would just like to plot a given CSV file. This can be done by setting the -jp flag.

The location of the result CSV file is stored in resultCsvFile. Furthermore, there are certain helper functions that may be useful:

  • printFail(message), printSuccess(message) and printHighlight(message) print in color to stdout
  • print2Log(message, file=__file__) prints to the log file
  • millify(floatNumber) turns a number into a millified version of it, e.g. 1234566 --> 1.2M
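As an illustration of the last helper, here is a minimal sketch of what a millify-style function might look like (the actual implementation in the benchmark scripts may differ):

```python
import math

def millify(number: float) -> str:
    """Shorten a number with metric-style suffixes, e.g. 1234566 -> '1.2M'.

    Sketch only; the real helper in the benchmark scripts may behave differently.
    """
    suffixes = ["", "K", "M", "B", "T"]
    n = float(number)
    # Pick the suffix based on the number of thousands-groups in the value.
    idx = 0 if n == 0 else int(math.floor(math.log10(abs(n)) / 3))
    idx = max(0, min(len(suffixes) - 1, idx))
    scaled = n / 10 ** (3 * idx)
    # Keep one decimal place, dropping a trailing ".0".
    return f"{scaled:.1f}".rstrip("0").rstrip(".") + suffixes[idx]

print(millify(1234566))  # -> 1.2M
```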
how_to_add_a_benchmark.txt · Last modified: 2021/05/25 16:50