Benchmarks

Setup

First, install the Archery utility, which is used to run the benchmark suite.

Running the benchmark suite

The benchmark suites can be run with the benchmark run sub-command.

# Run benchmarks in the current git workspace
archery benchmark run
# Storing the results in a file
archery benchmark run --output=run.json

Sometimes it is necessary to pass custom CMake flags, e.g.:

export CC=clang-8 CXX=clang++-8
archery benchmark run --cmake-extras="-DARROW_SIMD_LEVEL=NONE"

Additionally, a full CMake build directory may be specified.

archery benchmark run $HOME/arrow/cpp/release-build

Comparison

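One goal of benchmarking is to detect performance regressions. To this end, archery implements a benchmark comparison facility via the benchmark diff sub-command.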

In the default invocation, it compares the current source (known in git as the current workspace) with the local main branch:

archery --quiet benchmark diff --benchmark-filter=FloatParsing
-----------------------------------------------------------------------------------
Non-regressions: (1)
-----------------------------------------------------------------------------------
               benchmark            baseline           contender  change % counters
 FloatParsing<FloatType>  105.983M items/sec  105.983M items/sec       0.0       {}

------------------------------------------------------------------------------------
Regressions: (1)
------------------------------------------------------------------------------------
                benchmark            baseline           contender  change % counters
 FloatParsing<DoubleType>  209.941M items/sec  109.941M items/sec   -47.632       {}

For more information, run archery benchmark diff --help, which lists several example invocations.
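
As a worked example, the change % column in the output above is consistent with the relative difference between contender and baseline. The sketch below reproduces that arithmetic; the function names, the `threshold_pct` tolerance, and the `higher_is_better` switch are assumptions made for illustration and are not taken from archery itself.

```python
def percent_change(baseline: float, contender: float) -> float:
    """Relative change of the contender versus the baseline, in percent."""
    return (contender - baseline) / baseline * 100.0


def is_regression(baseline: float, contender: float,
                  threshold_pct: float = 5.0,
                  higher_is_better: bool = True) -> bool:
    """Flag a regression when the result worsens by more than threshold_pct.

    threshold_pct is a hypothetical tolerance chosen for this sketch.
    """
    change = percent_change(baseline, contender)
    if not higher_is_better:
        # For time-like metrics a larger value is worse, so flip the sign.
        change = -change
    return change < -threshold_pct


# Throughput values (M items/sec) from the two FloatParsing rows above.
print(round(percent_change(209.941, 109.941), 3))  # -47.632
print(is_regression(209.941, 109.941))             # True
print(is_regression(105.983, 105.983))             # False
```

For throughput metrics such as items/sec, a negative change means the contender is slower than the baseline.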

Iterating efficiently

Iterating on benchmark development can be tedious because of long build and run times. Several tricks can be used with archery benchmark diff to reduce this overhead.

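First, the benchmark command supports comparing existing build directories. This can be paired with the --preserve flag to avoid rebuilding the sources from scratch.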

# The first invocation clones and checks out the sources in a temporary
# directory. The directory is kept thanks to --preserve
archery benchmark diff --preserve

# Modify C++ sources

# Re-run benchmark in the previously created build directory.
archery benchmark diff /tmp/arrow-bench*/{WORKSPACE,master}/build

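Second, the result of a benchmark run can be saved to a JSON file. This avoids not only rebuilding the sources but also re-executing the (sometimes) heavy benchmarks. This technique can serve as a poor man's cache.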

# Run the benchmarks on a given commit and save the result
archery benchmark run --output=run-head-1.json HEAD~1
# Compare the previously captured result with HEAD
archery benchmark diff HEAD run-head-1.json
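
The caching idea can be scripted. The following is a minimal sketch, assuming results are stored as run-<commit>.json in a cache directory; that naming scheme and the helper name are invented for this example. It only builds the archery benchmark run command shown above when no saved result exists yet.

```python
from pathlib import Path
from typing import List, Optional


def run_command_if_missing(commit: str, cache_dir: Path) -> Optional[List[str]]:
    """Return the archery command that populates the cache, or None.

    The file name run-<commit>.json is a convention made up for this sketch.
    """
    result = cache_dir / f"run-{commit}.json"
    if result.exists():
        # Cached: neither a rebuild nor a rerun is needed.
        return None
    return ["archery", "benchmark", "run", f"--output={result}", commit]


if __name__ == "__main__":
    cmd = run_command_if_missing("abc1234", Path("."))
    print(cmd if cmd is not None else "result already cached")
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Keying the cache on a full commit hash rather than a moving reference such as HEAD~1 keeps stale results from being reused.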

Third, the benchmark command supports filtering suites (--suite-filter) and benchmarks (--benchmark-filter); both options accept regular expressions.

# Reuse a previous run, but only keep benchmarks matching `Kernel` in
# suites matching `compute-aggregate`.
archery benchmark diff                                       \
  --suite-filter=compute-aggregate --benchmark-filter=Kernel \
  /tmp/arrow-bench*/{WORKSPACE,master}/build
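
To illustrate how the two filters combine, here is a small sketch using Python's `re` module. It assumes unanchored (search) matching, which may differ from what archery does internally, and the suite and benchmark names in `SAMPLE` are invented for the example.

```python
import re
from typing import Dict, List

# Hypothetical suites and benchmark names, for illustration only.
SAMPLE: Dict[str, List[str]] = {
    "arrow-compute-aggregate-benchmark": [
        "SumKernelFloat", "MinMaxKernelInt32", "ReferenceSum",
    ],
    "arrow-csv-parser-benchmark": ["ParseCSV", "ChunkCSV"],
}


def select(benchmarks: Dict[str, List[str]],
           suite_filter: str,
           benchmark_filter: str) -> Dict[str, List[str]]:
    """Keep benchmarks whose suite and name both match their regex."""
    suite_re = re.compile(suite_filter)
    bench_re = re.compile(benchmark_filter)
    selected: Dict[str, List[str]] = {}
    for suite, names in benchmarks.items():
        if not suite_re.search(suite):
            continue  # whole suite filtered out
        kept = [name for name in names if bench_re.search(name)]
        if kept:
            selected[suite] = kept
    return selected


print(select(SAMPLE, "compute-aggregate", "Kernel"))
```

A benchmark is kept only when both its suite and its own name match, so narrowing the suite first is what saves most of the run time.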

Instead of rerunning benchmarks at comparison time, a JSON file (generated by archery benchmark run) may be specified for the contender and/or the baseline.

archery benchmark run --output=baseline.json $HOME/arrow/cpp/release-build
git checkout some-feature
archery benchmark run --output=contender.json $HOME/arrow/cpp/release-build
archery benchmark diff contender.json baseline.json

Regression detection

Writing a benchmark

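1. By default, the benchmark command filters benchmarks with the regular expression ^Regression, so not all benchmarks are run by default. If you want your benchmark to be checked for regressions automatically, its name must match.
2. The benchmark command runs with the --benchmark_repetitions=K option for statistical significance. A benchmark should therefore not override the repetitions in the (C++) benchmark's argument definition.
3. Because of #2, a benchmark should run sufficiently fast. Often, when the input does not fit in the CPU caches (L2/L3), the benchmark is memory bound rather than CPU bound. In that case, the input can be downsized.
4. By default, Google's benchmark library uses the cputime metric, which is the total time spent on the CPU by all threads of the process. By contrast, realtime is the wall-clock time, i.e. the difference end_time - start_time. In a single-threaded model, cputime is preferable because it is less affected by context switching. In a multi-threaded scenario, cputime gives incorrect results because it is inflated by the number of threads and can be far from realtime. Thus, if the benchmark is multi-threaded, it may be better to use SetRealtime(); see this example.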
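
The difference between the two metrics can be observed outside the benchmark library. The sketch below is a Python analogy rather than Google Benchmark code: it uses `time.process_time()` (CPU time summed over the threads of the process) and `time.perf_counter()` (wall-clock time) around threaded hashing work, relying on `hashlib` releasing the GIL for large buffers. The function name and payload size are arbitrary choices for this example.

```python
import hashlib
import threading
import time
from typing import Tuple


def measure(num_threads: int, payload_mb: int = 16) -> Tuple[float, float]:
    """Return (cpu_seconds, wall_seconds) for hashing in num_threads threads."""
    payload = bytes(payload_mb * 1024 * 1024)

    def work() -> None:
        # hashlib releases the GIL for large inputs, so threads can overlap.
        hashlib.sha256(payload).digest()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu, wall


if __name__ == "__main__":
    for n in (1, 4):
        cpu, wall = measure(n)
        print(f"threads={n} cpu={cpu:.3f}s wall={wall:.3f}s")
```

On a multi-core machine, the CPU time for the four-thread run will typically exceed the wall-clock time, which is the inflation described in item 4; exact numbers depend on the hardware.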

Scripting

archery is written as a Python library with a command-line frontend. The library can be imported to automate some tasks.

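Some invocations of the command-line interface can be quite verbose because of build output. This can be controlled or avoided with the --quiet option or with --output=<file>, e.g.: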

archery benchmark diff --benchmark-filter=Kernel --output=compare.json ...
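
For automation, one option is to drive the command-line frontend from a script. The sketch below does not use archery's Python API, whose module layout is not described here; it only assembles the flags shown in this document (--quiet, --suite-filter, --benchmark-filter, --output) into an argument list. The helper names are hypothetical.

```python
import subprocess
from typing import List, Optional, Sequence


def diff_command(benchmark_filter: Optional[str] = None,
                 suite_filter: Optional[str] = None,
                 output: Optional[str] = None,
                 quiet: bool = True,
                 extra: Sequence[str] = ()) -> List[str]:
    """Build an `archery benchmark diff` argument list."""
    cmd = ["archery"]
    if quiet:
        cmd.append("--quiet")  # global option, placed before the sub-command
    cmd += ["benchmark", "diff"]
    if suite_filter is not None:
        cmd.append(f"--suite-filter={suite_filter}")
    if benchmark_filter is not None:
        cmd.append(f"--benchmark-filter={benchmark_filter}")
    if output is not None:
        cmd.append(f"--output={output}")
    cmd += list(extra)  # e.g. contender and baseline arguments
    return cmd


def run_diff(**kwargs) -> int:
    """Run the command and return its exit code (requires archery on PATH)."""
    return subprocess.run(diff_command(**kwargs)).returncode


if __name__ == "__main__":
    print(" ".join(diff_command(benchmark_filter="Kernel",
                                output="compare.json")))
```

Building the argument list separately from running it keeps the command easy to log and to check without installing archery.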