.. Licensed to the Apache Software Foundation (ASF) under one
.. or more contributor license agreements.  See the NOTICE file
.. distributed with this work for additional information
.. regarding copyright ownership.  The ASF licenses this file
.. to you under the Apache License, Version 2.0 (the
.. "License"); you may not use this file except in compliance
.. with the License.  You may obtain a copy of the License at

..   http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
.. software distributed under the License is distributed on an
.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
.. KIND, either express or implied.  See the License for the
.. specific language governing permissions and limitations
.. under the License.

.. _benchmarks:

==========
Benchmarks
==========

Setup
=====

First install the :ref:`Archery <archery>` utility to run the benchmark suite.

Running the benchmark suite
===========================

The benchmark suites can be run with the ``benchmark run`` sub-command.

.. code-block:: shell

   # Run benchmarks in the current git workspace
   archery benchmark run
   # Store the results in a file
   archery benchmark run --output=run.json

Sometimes it is necessary to pass custom CMake flags, for example:

.. code-block:: shell

   export CC=clang-8 CXX=clang++-8
   archery benchmark run --cmake-extras="-DARROW_SIMD_LEVEL=NONE"

Additionally, a full CMake build directory may be specified:

.. code-block:: shell

   archery benchmark run $HOME/arrow/cpp/release-build

Comparison
==========

One goal of benchmarking is to detect performance regressions. To this end,
``archery`` implements a benchmark comparison facility via the ``benchmark
diff`` sub-command.

In the default invocation, it compares the current source (known as the
current workspace in git) with the local master branch:

.. code-block:: shell

   archery --quiet benchmark diff --benchmark-filter=FloatParsing
   -----------------------------------------------------------------------------------
   Non-regressions: (1)
   -----------------------------------------------------------------------------------
                  benchmark            baseline           contender  change %  counters
    FloatParsing<FloatType>  105.983M items/sec  105.983M items/sec       0.0        {}

   ------------------------------------------------------------------------------------
   Regressions: (1)
   ------------------------------------------------------------------------------------
                   benchmark            baseline           contender  change %  counters
    FloatParsing<DoubleType>  209.941M items/sec  109.941M items/sec   -47.632        {}

For more information and further invocation examples, run
``archery benchmark diff --help``.

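The ``change %`` column in the sample output is consistent with a plain
relative difference of the contender against the baseline. The following is a
minimal sketch under that assumption; the formula is inferred from the numbers
shown, not taken from ``archery``'s source, and ``percent_change`` is a
hypothetical helper name.

```python
def percent_change(baseline: float, contender: float) -> float:
    """Relative change of the contender against the baseline, in percent."""
    return (contender - baseline) / baseline * 100.0


# Values from the FloatParsing sample output, in millions of items/sec.
float_change = percent_change(105.983, 105.983)
double_change = percent_change(209.941, 109.941)

print(round(float_change, 3))
print(round(double_change, 3))
```

Since the metric here is a throughput (items/sec), a negative change means
the contender is slower than the baseline.
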
Iterating efficiently
~~~~~~~~~~~~~~~~~~~~~

Iterating on benchmark development can be tedious due to long build times and
long run times. Several tricks can be used with ``archery benchmark diff`` to
reduce this overhead.

First, the benchmark command supports comparing existing build directories.
This can be paired with the ``--preserve`` flag to avoid rebuilding sources
from scratch.

.. code-block:: shell

   # The first invocation clones and checks out the sources in a temporary
   # directory. The directory is preserved with --preserve
   archery benchmark diff --preserve

   # Modify C++ sources

   # Re-run the benchmarks in the previously created build directories.
   archery benchmark diff /tmp/arrow-bench*/{WORKSPACE,master}/build

Second, the result of a benchmark run can be saved in a JSON file. This avoids
not only rebuilding the sources, but also re-executing the (sometimes) heavy
benchmarks. This technique can be used as a poor man's cache.

.. code-block:: shell

   # Run the benchmarks on a given commit and save the result
   archery benchmark run --output=run-head-1.json HEAD~1
   # Compare the previously captured result with HEAD
   archery benchmark diff HEAD run-head-1.json

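The caching idea can also be scripted. The sketch below is one possible
wrapper, not part of ``archery``: ``cached_benchmark_run`` is a hypothetical
helper that only invokes ``archery benchmark run`` when no result file exists
yet for the requested revision. The ``runner`` parameter is an assumption
added so the command can be replaced by a stub.

```python
import subprocess
from pathlib import Path


def cached_benchmark_run(rev, cache_dir, runner=subprocess.run):
    """Return the JSON result path for `rev`, running archery only if needed."""
    # Replace path separators so revisions such as "origin/master" give a flat file name.
    safe_name = rev.replace("/", "_")
    output = Path(cache_dir) / f"run-{safe_name}.json"
    if not output.exists():
        # Same command line as in the shell example: archery benchmark run --output=... REV
        runner(["archery", "benchmark", "run", f"--output={output}", rev], check=True)
    return output
```

A symbolic name such as ``HEAD~1`` points at a different commit once ``HEAD``
moves, so a real script would probably key the cache on a resolved commit hash
instead.
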
Third, the benchmark command supports filtering suites (``--suite-filter``)
and benchmarks (``--benchmark-filter``). Both options support regular
expressions.

.. code-block:: shell

   # Reuse a previous run, but only keep benchmarks matching `Kernel` and
   # suites matching `compute-aggregate`.
   archery benchmark diff \
     --suite-filter=compute-aggregate --benchmark-filter=Kernel \
     /tmp/arrow-bench*/{WORKSPACE,master}/build

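To get a feel for which names a filter would keep, a pattern can be tried
against a list of names before starting a long run. This is a rough sketch
using Python's ``re`` module and unanchored (search) matching; both are
assumptions, since the matching semantics used by ``archery`` and by the
benchmark binaries are not described here. All suite and benchmark names
below are made up for illustration.

```python
import re


def select(names, pattern):
    """Keep the names in which `pattern` matches anywhere (unanchored search)."""
    regex = re.compile(pattern)
    return [name for name in names if regex.search(name)]


# Hypothetical suite and benchmark names, for illustration only.
suites = ["arrow-compute-aggregate-benchmark", "arrow-csv-parser-benchmark"]
benchmarks = ["SumKernelFloat", "MinMaxKernelInt64", "FloatParsing<FloatType>"]

print(select(suites, "compute-aggregate"))
print(select(benchmarks, "Kernel"))
print(select(benchmarks, "^Float"))
```

Anchoring a pattern with ``^`` narrows the selection to names that start with
it, which is why ``^Float`` does not keep ``SumKernelFloat``.
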
Instead of rerunning benchmarks for each comparison, a JSON file (generated by
``archery benchmark run``) may be specified for the contender and/or the
baseline.

.. code-block:: shell

   archery benchmark run --output=baseline.json $HOME/arrow/cpp/release-build
   git checkout some-feature
   archery benchmark run --output=contender.json $HOME/arrow/cpp/release-build
   archery benchmark diff contender.json baseline.json

Regression detection
====================

Writing a benchmark
~~~~~~~~~~~~~~~~~~~

1. The benchmark command will filter (by default) benchmarks with the regular
   expression ``^Regression``. This way, not all benchmarks are run by default.
   Thus, if you want your benchmark to be verified for regression
   automatically, its name must match this expression.

2. The benchmark command will run with the ``--benchmark_repetitions=K``
   option for statistical significance. Thus, a benchmark should not override
   the repetitions in the (C++) benchmark's arguments definition.

3. Due to #2, a benchmark should run sufficiently fast. Often, when the input
   does not fit in the CPU cache (L2/L3), the benchmark will be memory bound
   instead of CPU bound. In this case, the input can be downsized.

4. By default, Google's benchmark library uses the cputime metric, which is
   the sum of the time spent on the CPU by all threads of the process. By
   contrast, realtime is the wall clock time, i.e. the difference
   ``end_time - start_time``. In a single-threaded model, cputime is
   preferable since it is less affected by context switching. In a
   multi-threaded scenario, cputime gives misleading results since it is
   inflated by the number of threads and can be far off realtime. Thus, if
   the benchmark is multi-threaded, it might be better to use
   ``UseRealTime()``, see this `example <https://github.com/apache/arrow/blob/a9582ea6ab2db055656809a2c579165fe6a811ba/cpp/src/arrow/io/memory-benchmark.cc#L223-L227>`_.

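The difference between the two clocks in item 4 can be observed outside of
any benchmark framework. The sketch below is only an analogy written with
Python's standard library, not Google Benchmark code: ``time.process_time()``
stands in for cputime (CPU time of the whole process, all threads included)
and ``time.perf_counter()`` for realtime. It assumes that hashing large
buffers lets CPython threads run in parallel, which depends on the
interpreter build and on the number of available cores.

```python
import hashlib
import threading
import time


def busy_work(payload: bytes, rounds: int) -> None:
    """CPU-bound work: repeatedly hash a large buffer."""
    for _ in range(rounds):
        hashlib.sha256(payload).digest()


def measure(num_threads: int, rounds: int = 20):
    """Return (wall_seconds, cpu_seconds) for `num_threads` concurrent workers."""
    payload = b"x" * (4 * 1024 * 1024)
    threads = [
        threading.Thread(target=busy_work, args=(payload, rounds))
        for _ in range(num_threads)
    ]
    wall_start = time.perf_counter()  # wall clock, analogous to realtime
    cpu_start = time.process_time()   # process CPU time, summed over all threads
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - wall_start, time.process_time() - cpu_start


wall, cpu = measure(num_threads=4)
print(f"wall={wall:.3f}s cpu={cpu:.3f}s ratio={cpu / wall:.1f}")
```

On a machine with several idle cores the printed ratio may approach the
number of threads, which is the inflation described in item 4. On a single
core it stays close to 1.
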
Scripting
=========

``archery`` is written as a Python library with a command line frontend. The
library can be imported to automate some tasks.

Some invocations of the command line interface can be quite verbose due to
build output. This can be suppressed with the ``--quiet`` option, or the
results can be written to a file with ``--output=<file>``, e.g.

.. code-block:: shell

   archery benchmark diff --benchmark-filter=Kernel --output=compare.json ...
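
Since the Python API of the library is not shown in this document, the sketch
below automates the command line frontend instead. ``diff_command`` is a
hypothetical helper, not an ``archery`` function. It only assembles options
that appear in the examples of this page, with the contender given before the
baseline as in ``archery benchmark diff contender.json baseline.json``.

```python
def diff_command(contender=None, baseline=None, *, benchmark_filter=None,
                 suite_filter=None, output=None, quiet=False):
    """Build the argument list for an `archery benchmark diff` invocation."""
    if baseline is not None and contender is None:
        raise ValueError("a baseline can only be given together with a contender")
    cmd = ["archery"]
    if quiet:
        cmd.append("--quiet")  # placed before the sub-command, as in the examples
    cmd += ["benchmark", "diff"]
    if suite_filter:
        cmd.append(f"--suite-filter={suite_filter}")
    if benchmark_filter:
        cmd.append(f"--benchmark-filter={benchmark_filter}")
    if output:
        cmd.append(f"--output={output}")
    # Positional arguments: contender first, then baseline.
    cmd += [arg for arg in (contender, baseline) if arg is not None]
    return cmd


cmd = diff_command(benchmark_filter="Kernel", output="compare.json", quiet=True)
print(" ".join(cmd))
```

The resulting list can be passed to ``subprocess.run(cmd, check=True)`` on a
machine where ``archery`` is installed.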