LLVM Clang benchmarks for a future article.
Clang 13
Processor: 2 x Intel Xeon Platinum 8380 @ 3.40GHz (80 Cores / 160 Threads), Motherboard: Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS), Chipset: Intel Device 0998, Memory: 504GB, Disk: 7682GB INTEL SSDPF2KX076TZ, Graphics: ASPEED, Monitor: VE228, Network: 2 x Intel X710 for 10GBASE-T + 2 x Intel E810-C for QSFP
OS: Ubuntu 21.04, Kernel: 5.14.0-rc1-folio (x86_64) 20210715, Desktop: GNOME Shell 3.38.4, Display Server: X Server 1.20.11, Compiler: Clang 13.0.0-++20210820072921+23ba3732246a-1~exp1~20210820174536.53, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Processor Notes: Scaling Governor: intel_pstate performance - CPU Microcode: 0xd0002a0
Python Notes: Python 3.9.5
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Clang 12
OS: Ubuntu 21.04, Kernel: 5.14.0-rc1-folio (x86_64) 20210715, Desktop: GNOME Shell 3.38.4, Display Server: X Server 1.20.11, Compiler: Clang 12.0.0-3ubuntu1~21.04.1, File-System: ext4, Screen Resolution: 1920x1080
Clang 11
OS: Ubuntu 21.04, Kernel: 5.14.0-rc1-folio (x86_64) 20210715, Desktop: GNOME Shell 3.38.4, Display Server: X Server 1.20.11, Compiler: Clang 11.0.1-2ubuntu4, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Processor Notes: Scaling Governor: intel_pstate performance - CPU Microcode: 0xd0002a0
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Aircrack-ng
Aircrack-ng is a tool for assessing WiFi/WLAN network security. Learn more via the OpenBenchmarking.org test page.
Aircrack-ng 1.5.2 (k/s, More Is Better): Clang 11: 211799.58 (SE +/- 607.21, N = 3); Clang 12: 210157.65 (SE +/- 504.25, N = 3); Clang 13: 212825.68 (SE +/- 467.06, N = 3)
1. (CXX) g++ options: -O3 -fvisibility=hidden -masm=intel -fcommon -rdynamic -lsqlite3 -lpthread -lz -lcrypto -lhwloc -ldl -lm -pthread
AOBench
AOBench is a lightweight ambient occlusion renderer, written in C. The test profile is using a size of 2048 x 2048.
AOBench, Size: 2048 x 2048 - Total Time (Seconds, Fewer Is Better): Clang 11: 36.17 (SE +/- 0.03, N = 3); Clang 12: 36.43 (SE +/- 0.24, N = 3); Clang 13: 36.00 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -lm -O3 -march=native
AOM AV1
This is a test of the AOMedia AV1 encoder (libaom) developed by AOMedia and Google.
AOM AV1 3.1, Encoder Mode: Speed 6 Realtime - Input: Bosphorus 4K (Frames Per Second, More Is Better): Clang 11: 19.32 (SE +/- 0.23, N = 3); Clang 12: 19.39 (SE +/- 0.28, N = 3); Clang 13: 19.60 (SE +/- 0.24, N = 3)
AOM AV1 3.1, Encoder Mode: Speed 8 Realtime - Input: Bosphorus 4K (Frames Per Second, More Is Better): Clang 11: 44.06 (SE +/- 0.36, N = 9); Clang 12: 44.84 (SE +/- 0.40, N = 3); Clang 13: 45.30 (SE +/- 0.42, N = 7)
AOM AV1 3.1, Encoder Mode: Speed 9 Realtime - Input: Bosphorus 4K (Frames Per Second, More Is Better): Clang 11: 54.44 (SE +/- 0.16, N = 3); Clang 12: 54.47 (SE +/- 0.12, N = 3); Clang 13: 56.65 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Apache HTTP Server
This is a test of the Apache HTTPD web server. This test profile uses the Golang "Bombardier" program to issue HTTP requests over a fixed period of time with a configurable number of concurrent clients.
Apache HTTP Server 2.4.48, Concurrent Requests: 500 (Requests Per Second, More Is Better): Clang 11: 122576.13 (SE +/- 798.53, N = 3); Clang 12: 121643.80 (SE +/- 1145.12, N = 15); Clang 13: 145589.94 (SE +/- 843.46, N = 3)
Apache HTTP Server 2.4.48, Concurrent Requests: 1000 (Requests Per Second, More Is Better): Clang 11: 120827.45 (SE +/- 924.52, N = 15); Clang 12: 112287.17 (SE +/- 534.15, N = 3); Clang 13: 112600.02 (SE +/- 485.15, N = 3)
1. (CC) gcc options: -shared -fPIC -pthread -O3 -march=native
Botan
Botan is a BSD-licensed, cross-platform, open-source C++ cryptography toolkit that supports most publicly known cryptographic algorithms. The project's stated goal is to be "the best option for cryptography in C++ by offering the tools necessary to implement a range of practical systems, such as TLS protocol, X.509 certificates, modern AEAD ciphers, PKCS#11 and TPM hardware support, password hashing, and post quantum crypto schemes."
Botan 2.17.3, Test: KASUMI (MiB/s, More Is Better): Clang 11: 74.40 (SE +/- 0.02, N = 3); Clang 12: 76.92 (SE +/- 0.02, N = 3); Clang 13: 77.09 (SE +/- 0.02, N = 3)
Botan 2.17.3, Test: KASUMI - Decrypt (MiB/s, More Is Better): Clang 11: 73.56 (SE +/- 0.02, N = 3); Clang 12: 76.24 (SE +/- 0.23, N = 3); Clang 13: 76.25 (SE +/- 0.02, N = 3)
Botan 2.17.3, Test: AES-256 (MiB/s, More Is Better): Clang 11: 5743.35 (SE +/- 0.30, N = 3); Clang 12: 5735.25 (SE +/- 5.15, N = 3); Clang 13: 5761.30 (SE +/- 0.33, N = 3)
Botan 2.17.3, Test: AES-256 - Decrypt (MiB/s, More Is Better): Clang 11: 5714.92 (SE +/- 0.51, N = 3); Clang 12: 5700.33 (SE +/- 0.79, N = 3); Clang 13: 5753.64 (SE +/- 0.20, N = 3)
Botan 2.17.3, Test: Twofish (MiB/s, More Is Better): Clang 11: 286.03 (SE +/- 0.21, N = 3); Clang 12: 293.73 (SE +/- 0.97, N = 3); Clang 13: 288.81 (SE +/- 0.22, N = 3)
Botan 2.17.3, Test: Twofish - Decrypt (MiB/s, More Is Better): Clang 11: 279.96 (SE +/- 0.36, N = 3); Clang 12: 288.61 (SE +/- 0.24, N = 3); Clang 13: 290.47 (SE +/- 0.27, N = 3)
Botan 2.17.3, Test: Blowfish (MiB/s, More Is Better): Clang 11: 326.77 (SE +/- 0.04, N = 3); Clang 12: 329.53 (SE +/- 0.04, N = 3); Clang 13: 327.93 (SE +/- 0.02, N = 3)
Botan 2.17.3, Test: Blowfish - Decrypt (MiB/s, More Is Better): Clang 11: 325.70 (SE +/- 0.09, N = 3); Clang 12: 326.80 (SE +/- 3.35, N = 3); Clang 13: 333.28 (SE +/- 0.04, N = 3)
Botan 2.17.3, Test: CAST-256 (MiB/s, More Is Better): Clang 11: 114.75 (SE +/- 0.01, N = 3); Clang 12: 115.33 (SE +/- 0.00, N = 3); Clang 13: 116.10 (SE +/- 0.02, N = 3)
Botan 2.17.3, Test: CAST-256 - Decrypt (MiB/s, More Is Better): Clang 11: 115.65 (SE +/- 0.01, N = 3); Clang 12: 114.83 (SE +/- 0.00, N = 3); Clang 13: 114.50 (SE +/- 0.03, N = 3)
Botan 2.17.3, Test: ChaCha20Poly1305 (MiB/s, More Is Better): Clang 11: 869.35 (SE +/- 0.40, N = 3); Clang 12: 882.17 (SE +/- 0.87, N = 3); Clang 13: 855.76 (SE +/- 4.73, N = 3)
Botan 2.17.3, Test: ChaCha20Poly1305 - Decrypt (MiB/s, More Is Better): Clang 11: 866.10 (SE +/- 0.04, N = 3); Clang 12: 874.91 (SE +/- 0.95, N = 3); Clang 13: 849.96 (SE +/- 1.46, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
C-Ray
This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image.
C-Ray 1.1, Total Time - 4K, 16 Rays Per Pixel (Seconds, Fewer Is Better): Clang 11: 14.80 (SE +/- 0.00, N = 3); Clang 12: 15.22 (SE +/- 0.01, N = 3); Clang 13: 12.59 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -lm -lpthread -O3 -march=native
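To put the C-Ray numbers in relative terms, here is a minimal Python sketch that expresses each newer compiler's total time as a signed percentage change against Clang 11. The helper name `percent_change` and the choice of Clang 11 as the baseline are illustrative assumptions, not part of the result file. For a "Fewer Is Better" metric, a negative value is an improvement.

```python
def percent_change(old: float, new: float) -> float:
    """Signed change from old to new, as a percentage of old."""
    return (new - old) / old * 100.0


# C-Ray 1.1 total time in seconds (fewer is better), copied from the result above.
c_ray_seconds = {"Clang 11": 14.80, "Clang 12": 15.22, "Clang 13": 12.59}

# Baseline choice (Clang 11) is an assumption made for illustration.
baseline = c_ray_seconds["Clang 11"]
for compiler in ("Clang 12", "Clang 13"):
    delta = percent_change(baseline, c_ray_seconds[compiler])
    print(f"{compiler} vs Clang 11: {delta:+.1f}% total time")
```

The same helper works for "More Is Better" metrics; only the interpretation of the sign flips.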
dav1d
dav1d 0.9.1, Video Input: Summer Nature 4K (FPS, More Is Better): Clang 11: 528.63 (SE +/- 2.65, N = 3; MIN: 176.65 / MAX: 587.68); Clang 12: 532.51 (SE +/- 0.77, N = 3; MIN: 186.64 / MAX: 587.12); Clang 13: 533.02 (SE +/- 0.44, N = 3; MIN: 186.74 / MAX: 587.2)
dav1d 0.9.1, Video Input: Chimera 1080p 10-bit (FPS, More Is Better): Clang 11: 842.95 (SE +/- 0.82, N = 3; MIN: 515.35 / MAX: 1115.74); Clang 12: 844.78 (SE +/- 1.52, N = 3; MIN: 517.09 / MAX: 1121.33); Clang 13: 843.96 (SE +/- 2.33, N = 3; MIN: 503.8 / MAX: 1131.86)
1. (CC) gcc options: -O3 -march=native -pthread
FFTW
FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions.
FFTW 3.3.6, Build: Float + SSE - Size: 2D FFT Size 4096 (Mflops, More Is Better): Clang 11: 18135 (SE +/- 187.54, N = 4); Clang 12: 18443 (SE +/- 34.47, N = 3); Clang 13: 18489 (SE +/- 43.73, N = 3)
1. (CC) gcc options: -pthread -O3 -march=native -lm
FinanceBench
FinanceBench is a collection of financial program benchmarks with support for benchmarking on the GPU via OpenCL and CPU benchmarking with OpenMP. The FinanceBench test cases are focused on Black-Scholes-Merton Process with Analytic European Option engine, QMC (Sobol) Monte-Carlo method (Equity Option Example), Bonds Fixed-rate bond with flat forward curve, and Repo Securities repurchase agreement. FinanceBench was originally written by the Cavazos Lab at University of Delaware.
FinanceBench 2016-07-25, Benchmark: Repo OpenMP (ms, Fewer Is Better): Clang 11: 37221.00 (SE +/- 19.15, N = 3); Clang 12: 37760.89 (SE +/- 290.04, N = 3); Clang 13: 38153.40 (SE +/- 450.24, N = 4)
FinanceBench 2016-07-25, Benchmark: Bonds OpenMP (ms, Fewer Is Better): Clang 11: 58558.26 (SE +/- 38.93, N = 3); Clang 12: 58502.85 (SE +/- 82.28, N = 3); Clang 13: 58281.79 (SE +/- 17.71, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp
Google Draco
Draco is a library developed by Google for compressing/decompressing 3D geometric meshes and point clouds. This test profile uses some Artec3D PLY models as the sample 3D model input formats for Draco compression/decompression.
Google Draco 1.4.1, Model: Lion (ms, Fewer Is Better): Clang 11: 5827; Clang 12: 5870; Clang 13: 5967; SE +/- 1.73, N = 3 (only one value reported, compiler not identified)
Google Draco 1.4.1, Model: Church Facade (ms, Fewer Is Better): Clang 11: 7000; Clang 12: 7057; Clang 13: 7202; SE +/- 3.18, N = 3 and SE +/- 4.26, N = 3 (only two values reported, compilers not identified)
1. (CXX) g++ options: -O3 -march=native
Google SynthMark
SynthMark is a cross platform tool for benchmarking CPU performance under a variety of real-time audio workloads. It uses a polyphonic synthesizer model to provide standardized tests for latency, jitter and computational throughput.
Google SynthMark 20201109, Test: VoiceMark_100 (Voices, More Is Better): Clang 11: 491.03 (SE +/- 0.01, N = 3); Clang 12: 503.93 (SE +/- 0.04, N = 3); Clang 13: 513.01 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -lm -lpthread -std=c++11 -Ofast
GraphicsMagick
This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image.
GraphicsMagick 1.3.33, Operation: Swirl (Iterations Per Minute, More Is Better): Clang 11: 1860 (SE +/- 13.09, N = 3); Clang 12: 1864 (SE +/- 8.51, N = 3); Clang 13: 1874 (SE +/- 8.69, N = 3)
GraphicsMagick 1.3.33, Operation: Rotate (Iterations Per Minute, More Is Better): Clang 11: 728 (SE +/- 2.40, N = 3); Clang 12: 723 (SE +/- 2.65, N = 3); Clang 13: 728 (SE +/- 7.80, N = 3)
GraphicsMagick 1.3.33, Operation: Sharpen (Iterations Per Minute, More Is Better): Clang 11: 745 (SE +/- 3.18, N = 3); Clang 12: 758 (SE +/- 0.67, N = 3); Clang 13: 877 (SE +/- 5.24, N = 3)
GraphicsMagick 1.3.33, Operation: Enhanced (Iterations Per Minute, More Is Better): Clang 11: 1104 (SE +/- 0.67, N = 3); Clang 12: 1104 (SE +/- 5.49, N = 3); Clang 13: 1174 (SE +/- 5.78, N = 3)
GraphicsMagick 1.3.33, Operation: Resizing (Iterations Per Minute, More Is Better): Clang 11: 473 (SE +/- 14.35, N = 12); Clang 12: 452 (SE +/- 9.74, N = 15); Clang 13: 485 (SE +/- 8.56, N = 15)
GraphicsMagick 1.3.33, Operation: Noise-Gaussian (Iterations Per Minute, More Is Better): Clang 11: 623 (SE +/- 2.33, N = 3); Clang 12: 600 (SE +/- 4.10, N = 3); Clang 13: 600 (SE +/- 3.06, N = 3)
GraphicsMagick 1.3.33, Operation: HWB Color Space (Iterations Per Minute, More Is Better): Clang 11: 852 (SE +/- 6.84, N = 3); Clang 12: 852 (SE +/- 5.60, N = 15); Clang 13: 866 (SE +/- 8.76, N = 3)
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
John The Ripper
John The Ripper 1.9.0-jumbo-1, Test: MD5 (Real C/S, More Is Better): Clang 11: 8880000 (SE +/- 8504.90, N = 3); Clang 12: 10275333 (SE +/- 16973.84, N = 3); Clang 13: 10430000 (SE +/- 3785.94, N = 3)
1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -pthread -lm -lz -ldl -lcrypt -lbz2
LibRaw
LibRaw is a RAW image decoder for digital camera photos. This test profile runs LibRaw's post-processing benchmark.
LibRaw 0.20, Post-Processing Benchmark (Mpix/sec, More Is Better): Clang 11: 34.55 (SE +/- 0.13, N = 3); Clang 12: 36.65 (SE +/- 0.23, N = 3); Clang 13: 39.99 (SE +/- 0.19, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp -ljpeg -lz -lm
MariaDB
This is a MariaDB MySQL database server benchmark making use of mysqlslap.
MariaDB 10.6.4, Clients: 2048 (Queries Per Second, More Is Better): Clang 11: 700 (SE +/- 7.30, N = 9); Clang 12: 698 (SE +/- 6.96, N = 9); Clang 13: 712 (SE +/- 1.71, N = 3)
MariaDB 10.6.4, Clients: 4096 (Queries Per Second, More Is Better): Clang 11: 319 (SE +/- 1.96, N = 3); Clang 12: 319 (SE +/- 1.07, N = 3); Clang 13: 321 (SE +/- 1.31, N = 3)
-lbz2 -lsnappy -lpthread -lm -lstdc++ -lpthread -lm -lstdc++ 1. (CXX) g++ options: -fPIC -O3 -march=native -fstack-protector -shared -pthread -ldl -lz -lrt
NCNN
NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent.
NCNN 20210720, Target: CPU - Model: mobilenet (ms, Fewer Is Better): Clang 11: 17.30 (SE +/- 0.27, N = 14; MIN: 15.79 / MAX: 37.29); Clang 12: 17.65 (SE +/- 0.16, N = 3; MIN: 16.51 / MAX: 19.27); Clang 13: 17.38 (SE +/- 0.11, N = 3; MIN: 16.17 / MAX: 40.05)
NCNN 20210720, Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better): Clang 11: 11.46 (SE +/- 0.67, N = 14; MIN: 7.56 / MAX: 31.76); Clang 12: 11.62 (SE +/- 1.05, N = 3; MIN: 7.71 / MAX: 25.56); Clang 13: 7.84 (SE +/- 0.01, N = 3; MIN: 7.49 / MAX: 13.76)
NCNN 20210720, Target: CPU - Model: mnasnet (ms, Fewer Is Better): Clang 11: 10.90 (SE +/- 0.66, N = 14; MIN: 6.9 / MAX: 32.84); Clang 12: 11.01 (SE +/- 1.31, N = 3; MIN: 7.09 / MAX: 20.3); Clang 13: 8.79 (SE +/- 1.33, N = 3; MIN: 6.69 / MAX: 26.58)
NCNN 20210720, Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better): Clang 11: 15.32 (SE +/- 1.01, N = 14; MIN: 8.59 / MAX: 45.64); Clang 12: 13.75 (SE +/- 2.00, N = 3; MIN: 9.12 / MAX: 41.76); Clang 13: 12.66 (SE +/- 2.13, N = 3; MIN: 8.73 / MAX: 27.07)
NCNN 20210720, Target: CPU - Model: blazeface (ms, Fewer Is Better): Clang 11: 6.07 (SE +/- 0.47, N = 14; MIN: 4.35 / MAX: 27.78); Clang 12: 6.94 (SE +/- 1.08, N = 3; MIN: 4.49 / MAX: 19.88); Clang 13: 4.86 (SE +/- 0.01, N = 3; MIN: 4.57 / MAX: 9.45)
NCNN 20210720, Target: CPU - Model: googlenet (ms, Fewer Is Better): Clang 11: 21.59 (SE +/- 1.02, N = 14; MIN: 16.19 / MAX: 50.35); Clang 12: 21.24 (SE +/- 1.87, N = 3; MIN: 17.28 / MAX: 39.77); Clang 13: 17.87 (SE +/- 0.31, N = 3; MIN: 16.58 / MAX: 37.88)
NCNN 20210720, Target: CPU - Model: alexnet (ms, Fewer Is Better): Clang 11: 9.97 (SE +/- 0.16, N = 14; MIN: 8.85 / MAX: 29.23); Clang 12: 11.04 (SE +/- 1.31, N = 3; MIN: 9.24 / MAX: 14.73); Clang 13: 9.64 (SE +/- 0.17, N = 3; MIN: 8.9 / MAX: 11.05)
NCNN 20210720, Target: CPU - Model: resnet50 (ms, Fewer Is Better): Clang 11: 22.36 (SE +/- 0.61, N = 14; MIN: 18.98 / MAX: 71.34); Clang 12: 21.48 (SE +/- 0.14, N = 3; MIN: 19.9 / MAX: 40.42); Clang 13: 21.71 (SE +/- 0.29, N = 3; MIN: 19.54 / MAX: 44.52)
NCNN 20210720, Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better): Clang 11: 26.79 (SE +/- 0.26, N = 14; MIN: 23.92 / MAX: 56.1); Clang 12: 28.35 (SE +/- 0.55, N = 3; MIN: 25.63 / MAX: 60.64); Clang 13: 27.94 (SE +/- 0.54, N = 3; MIN: 25.71 / MAX: 39.58)
NCNN 20210720, Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better): Clang 11: 19.72 (SE +/- 0.53, N = 14; MIN: 17.67 / MAX: 58.39); Clang 12: 19.26 (SE +/- 0.22, N = 3; MIN: 18.05 / MAX: 35.83); Clang 13: 19.21 (SE +/- 0.23, N = 3; MIN: 18.04 / MAX: 38.36)
NCNN 20210720, Target: CPU - Model: regnety_400m (ms, Fewer Is Better): Clang 11: 36.26 (SE +/- 2.48, N = 14; MIN: 18.78 / MAX: 124.57); Clang 12: 37.11 (SE +/- 3.84, N = 3; MIN: 19.88 / MAX: 96.76); Clang 13: 23.98 (SE +/- 1.77, N = 3; MIN: 20.12 / MAX: 66.47)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread
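Several NCNN results carry large standard errors, so a gap between two means is not automatically meaningful. The Python sketch below is one rough way to check: it divides the difference of two means by their combined standard error. The helper `z_score`, the pairing of Clang 12 against Clang 13, and the assumption that the SE values in the source are listed in the same order as the compilers are illustrative choices, not something the result file states. With only N = 3 runs per side, treat the output as a coarse indicator rather than a formal test.

```python
import math


def z_score(mean_a: float, se_a: float, mean_b: float, se_b: float) -> float:
    """Difference of two means divided by their combined standard error."""
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# (mean ms, SE) pairs for Clang 12 and Clang 13, copied from the NCNN results above.
# Assumption: SE values appear in the same compiler order as the means.
ncnn_ms = {
    "shufflenet-v2": ((11.62, 1.05), (7.84, 0.01)),
    "mnasnet": ((11.01, 1.31), (8.79, 1.33)),
    "regnety_400m": ((37.11, 3.84), (23.98, 1.77)),
}

for model, ((m12, se12), (m13, se13)) in ncnn_ms.items():
    z = z_score(m12, se12, m13, se13)
    print(f"{model}: Clang 12 minus Clang 13 = {m12 - m13:.2f} ms, z = {z:.2f}")
```

A larger ratio means the gap is less likely to be run-to-run noise; the mnasnet difference, for example, is small relative to its standard errors.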
nginx
This is a benchmark of the lightweight Nginx HTTP(S) web server. This test profile uses the Golang "Bombardier" program to issue HTTP requests over a fixed period of time with a configurable number of concurrent clients.
nginx 1.21.1, Concurrent Requests: 1 (Requests Per Second, More Is Better): Clang 11: 74606.79 (SE +/- 131.50, N = 3); Clang 12: 75015.29 (SE +/- 102.54, N = 3); Clang 13: 75270.65 (SE +/- 160.67, N = 3)
nginx 1.21.1, Concurrent Requests: 20 (Requests Per Second, More Is Better): Clang 11: 254323.92 (SE +/- 2398.86, N = 3); Clang 12: 251352.56 (SE +/- 1222.93, N = 3); Clang 13: 254133.55 (SE +/- 1379.02, N = 3)
nginx 1.21.1, Concurrent Requests: 100 (Requests Per Second, More Is Better): Clang 11: 205660.13 (SE +/- 1506.43, N = 3); Clang 12: 206297.92 (SE +/- 431.14, N = 3); Clang 13: 208590.36 (SE +/- 987.48, N = 3)
nginx 1.21.1, Concurrent Requests: 200 (Requests Per Second, More Is Better): Clang 11: 196465.21 (SE +/- 2823.39, N = 3); Clang 12: 194567.38 (SE +/- 312.30, N = 3); Clang 13: 196023.59 (SE +/- 911.83, N = 3)
nginx 1.21.1, Concurrent Requests: 500 (Requests Per Second, More Is Better): Clang 11: 204519.00 (SE +/- 554.68, N = 3); Clang 12: 204058.28 (SE +/- 63.72, N = 3); Clang 13: 204880.02 (SE +/- 936.49, N = 3)
nginx 1.21.1, Concurrent Requests: 1000 (Requests Per Second, More Is Better): Clang 11: 205224.87 (SE +/- 947.34, N = 3); Clang 12: 205740.39 (SE +/- 632.41, N = 3); Clang 13: 205977.68 (SE +/- 1305.06, N = 3)
1. (CC) gcc options: -ldl -lpthread -lcrypt -lz -O3 -march=native
oneDNN
This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative.
oneDNN 2.1.2, Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): Clang 11: 2.78642 (SE +/- 0.00451, N = 3; MIN: 2.65); Clang 12: 2.79658 (SE +/- 0.00591, N = 3; MIN: 2.65); Clang 13: 2.78910 (SE +/- 0.00435, N = 3; MIN: 2.65)
oneDNN 2.1.2, Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): Clang 11: 1.71640 (SE +/- 0.00619, N = 3; MIN: 1.53); Clang 12: 1.72558 (SE +/- 0.00427, N = 3; MIN: 1.57); Clang 13: 1.71770 (SE +/- 0.00225, N = 3; MIN: 1.55)
oneDNN 2.1.2, Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): Clang 11: 2.08089 (SE +/- 0.00072, N = 3; MIN: 1.99); Clang 12: 2.06918 (SE +/- 0.00197, N = 3; MIN: 1.99); Clang 13: 2.07168 (SE +/- 0.00196, N = 3; MIN: 1.99)
oneDNN 2.1.2, Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): Clang 11: 3.05315 (SE +/- 0.00886, N = 3; MIN: 2.87); Clang 12: 3.05821 (SE +/- 0.00384, N = 3; MIN: 2.86); Clang 13: 3.28114 (SE +/- 0.00367, N = 3; MIN: 3.08)
oneDNN 2.1.2, Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): Clang 11: 3.59463 (SE +/- 0.00881, N = 3; MIN: 3.5); Clang 12: 3.57952 (SE +/- 0.00151, N = 3; MIN: 3.5); Clang 13: 3.61489 (SE +/- 0.00211, N = 3; MIN: 3.52)
oneDNN 2.1.2, Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): Clang 11: 608.60 (SE +/- 5.71, N = 13; MIN: 570.79); Clang 12: 611.93 (SE +/- 6.83, N = 4; MIN: 569.44); Clang 13: 598.02 (SE +/- 0.92, N = 3; MIN: 575.43)
oneDNN 2.1.2, Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): Clang 11: 368.25 (SE +/- 2.59, N = 3; MIN: 349.99); Clang 12: 368.45 (SE +/- 1.79, N = 3; MIN: 347.27); Clang 13: 365.73 (SE +/- 0.40, N = 3; MIN: 355.4)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl
Opus Codec Encoding
Opus is an open audio codec. Opus is a lossy audio compression format designed primarily for interactive real-time applications over the Internet. This test uses Opus-Tools and measures the time required to encode a WAV file to Opus.
Opus Codec Encoding 1.3.1, WAV To Opus Encode (Seconds, Fewer Is Better): Clang 11: 9.423 (SE +/- 0.013, N = 5); Clang 12: 9.908 (SE +/- 0.006, N = 5); Clang 13: 9.632 (SE +/- 0.069, N = 5)
1. (CXX) g++ options: -O3 -march=native -logg -lm
PJSIP
PJSIP is a free and open source multimedia communication library written in C that implements standards-based protocols such as SIP, SDP, RTP, STUN, TURN, and ICE. It combines a signaling protocol (SIP) with a rich multimedia framework and NAT traversal functionality into a high-level API that is portable and suitable for almost any type of system, ranging from desktops and embedded systems to mobile handsets. This test profile makes use of pjsip-perf with both the client and server on the system. More details on the PJSIP benchmark at https://www.pjsip.org/high-performance-sip.htm
PJSIP 2.11, Method: INVITE (Responses Per Second, More Is Better): Clang 11: 5289 (SE +/- 8.50, N = 3); Clang 12: 5291 (SE +/- 10.68, N = 3); Clang 13: 5271 (SE +/- 11.10, N = 3)
PJSIP 2.11, Method: OPTIONS, Stateful (Responses Per Second, More Is Better): Clang 11: 10195 (SE +/- 4.48, N = 3); Clang 12: 10185 (SE +/- 11.92, N = 3); Clang 13: 10140 (SE +/- 43.11, N = 3)
PJSIP 2.11, Method: OPTIONS, Stateless (Responses Per Second, More Is Better): Clang 11: 41305 (SE +/- 575.37, N = 3); Clang 12: 40364 (SE +/- 510.13, N = 3); Clang 13: 41871 (SE +/- 380.52, N = 3)
1. (CC) gcc options: -lSDL2 -lavformat -lavcodec -lswscale -lavutil -lstdc++ -lssl -lcrypto -luuid -lm -lrt -lpthread -lasound -O3 -march=native
PostgreSQL pgbench
This is a benchmark of PostgreSQL using pgbench for facilitating the database benchmarks.
PostgreSQL pgbench 13.0, Scaling Factor: 100 - Clients: 250 - Mode: Read Only (TPS, More Is Better): Clang 11: 967214 (SE +/- 16009.81, N = 15); Clang 12: 981525 (SE +/- 3346.94, N = 3); Clang 13: 973432 (SE +/- 10672.94, N = 3)
PostgreSQL pgbench 13.0, Scaling Factor: 100 - Clients: 250 - Mode: Read Only - Average Latency (ms, Fewer Is Better): Clang 11: 0.260 (SE +/- 0.005, N = 15); Clang 12: 0.255 (SE +/- 0.001, N = 3); Clang 13: 0.257 (SE +/- 0.003, N = 3)
PostgreSQL pgbench 13.0, Scaling Factor: 100 - Clients: 250 - Mode: Read Write (TPS, More Is Better): Clang 11: 85162 (SE +/- 786.14, N = 3); Clang 12: 84132 (SE +/- 85.43, N = 3); Clang 13: 84854 (SE +/- 119.61, N = 3)
PostgreSQL pgbench 13.0, Scaling Factor: 100 - Clients: 250 - Mode: Read Write - Average Latency (ms, Fewer Is Better): Clang 11: 2.938 (SE +/- 0.027, N = 3); Clang 12: 2.973 (SE +/- 0.003, N = 3); Clang 13: 2.948 (SE +/- 0.004, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
QuantLib
QuantLib is an open-source library/framework around quantitative finance for modeling, trading and risk management scenarios. QuantLib is written in C++ with Boost, and its built-in benchmark reports the QuantLib Benchmark Index score.
QuantLib 1.21 (MFLOPS, More Is Better): Clang 11: 2589.5 (SE +/- 5.39, N = 3); Clang 12: 2606.3 (SE +/- 4.59, N = 3); Clang 13: 2657.3 (SE +/- 8.72, N = 3)
1. (CXX) g++ options: -O3 -march=native -rdynamic
SecureMark
SecureMark is an objective, standardized benchmarking framework for measuring the efficiency of cryptographic processing solutions developed by EEMBC. SecureMark-TLS is benchmarking Transport Layer Security performance with a focus on IoT/edge computing.
SecureMark 1.0.4, Benchmark: SecureMark-TLS (marks, More Is Better): Clang 11: 252089 (SE +/- 87.48, N = 3); Clang 12: 250849 (SE +/- 339.82, N = 3); Clang 13: 240447 (SE +/- 40.95, N = 3)
1. (CC) gcc options: -pedantic -O3
SVT-AV1
This is a benchmark of the SVT-AV1 open-source video encoder/decoder. SVT-AV1 was originally developed by Intel as part of their Open Visual Cloud / Scalable Video Technology (SVT). Development of SVT-AV1 has since moved to the Alliance for Open Media as part of upstream AV1 development. SVT-AV1 is a CPU-based multi-threaded video encoder for the AV1 video format; this test encodes a sample YUV video file. Learn more via the OpenBenchmarking.org test page.

SVT-AV1 0.8.7, Encoder Mode: Preset 4 - Input: Bosphorus 4K (Frames Per Second, more is better)
Clang 11: 4.895 (SE +/- 0.020, N = 3)
Clang 12: 4.849 (SE +/- 0.004, N = 3)
Clang 13: 4.967 (SE +/- 0.027, N = 3)
(CXX) g++ options: -O3 -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie

SVT-AV1 0.8.7, Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, more is better)
Clang 11: 57.84 (SE +/- 0.15, N = 3)
Clang 12: 57.92 (SE +/- 0.14, N = 3)
Clang 13: 58.98 (SE +/- 0.38, N = 3)
(CXX) g++ options: -O3 -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie
SVT-HEVC
This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-HEVC CPU-based multi-threaded video encoder for the HEVC / H.265 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.

SVT-HEVC 1.5.0, Tuning: 7 - Input: Bosphorus 1080p (Frames Per Second, more is better)
Clang 11: 344.59 (SE +/- 2.06, N = 3)
Clang 12: 340.21 (SE +/- 1.14, N = 3)
Clang 13: 353.72 (SE +/- 3.85, N = 3)
(CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

SVT-HEVC 1.5.0, Tuning: 10 - Input: Bosphorus 1080p (Frames Per Second, more is better)
Clang 11: 616.45 (SE +/- 2.02, N = 3)
Clang 12: 604.45 (SE +/- 2.53, N = 3)
Clang 13: 626.33 (SE +/- 3.02, N = 3)
(CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt
SVT-VP9
This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-VP9 CPU-based multi-threaded video encoder for the VP9 video format with a sample YUV input video file. Learn more via the OpenBenchmarking.org test page.

SVT-VP9 0.3, Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p (Frames Per Second, more is better)
Clang 11: 460.20 (SE +/- 4.56, N = 3)
Clang 12: 456.18 (SE +/- 4.57, N = 3)
Clang 13: 476.72 (SE +/- 5.26, N = 3)
(CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-VP9 0.3, Tuning: Visual Quality Optimized - Input: Bosphorus 1080p (Frames Per Second, more is better)
Clang 11: 362.67 (SE +/- 1.62, N = 3)
Clang 12: 365.79 (SE +/- 2.53, N = 3)
Clang 13: 374.91 (SE +/- 4.27, N = 3)
(CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
Tachyon
This is a test of the threaded build of Tachyon, a parallel ray-tracing system, measuring the time to ray-trace a sample scene. Learn more via the OpenBenchmarking.org test page.

Tachyon 0.99b6, Total Time (Seconds, fewer is better)
Clang 11: 14.25 (SE +/- 0.02, N = 3)
Clang 12: 13.95 (SE +/- 0.06, N = 3)
Clang 13: 13.47 (SE +/- 0.04, N = 3)
(CC) gcc options: -m64 -O3 -fomit-frame-pointer -ffast-math -ltachyon -lm -lpthread
TNN
TNN is an open-source deep learning inference framework developed by Tencent. Learn more via the OpenBenchmarking.org test page.

TNN 0.3, Target: CPU - Model: DenseNet (ms, fewer is better)
Clang 11: 6912.12 (SE +/- 3.50, N = 3; MIN: 6888.33 / MAX: 6947.75)
Clang 12: 4353.39 (SE +/- 1.66, N = 3; MIN: 4332.87 / MAX: 4442)
Clang 13: 4370.10 (SE +/- 2.30, N = 3; MIN: 4351.72 / MAX: 4450.96)
(CXX) g++ options: -O3 -march=native -fopenmp=libomp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

TNN 0.3, Target: CPU - Model: MobileNet v2 (ms, fewer is better)
Clang 11: 773.47 (SE +/- 0.69, N = 3; MIN: 690.55 / MAX: 820.36)
Clang 12: 541.93 (SE +/- 0.70, N = 3; MIN: 536.56 / MAX: 554.17)
Clang 13: 539.65 (SE +/- 0.66, N = 3; MIN: 536.06 / MAX: 559.33)
(CXX) g++ options: -O3 -march=native -fopenmp=libomp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

TNN 0.3, Target: CPU - Model: SqueezeNet v2 (ms, fewer is better)
Clang 11: 105.40 (SE +/- 0.15, N = 3; MIN: 104.83 / MAX: 106.1)
Clang 12: 83.86 (SE +/- 0.05, N = 3; MIN: 83.28 / MAX: 84.63)
Clang 13: 84.61 (SE +/- 0.00, N = 3; MIN: 84.23 / MAX: 85)
(CXX) g++ options: -O3 -march=native -fopenmp=libomp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

TNN 0.3, Target: CPU - Model: SqueezeNet v1.1 (ms, fewer is better)
Clang 11: 657.10 (SE +/- 0.05, N = 3; MIN: 656.47 / MAX: 657.89)
Clang 12: 402.26 (SE +/- 0.19, N = 3; MIN: 401.13 / MAX: 403.41)
Clang 13: 400.26 (SE +/- 0.27, N = 3; MIN: 399.7 / MAX: 401.33)
(CXX) g++ options: -O3 -march=native -fopenmp=libomp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl
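The TNN results above show the largest compiler-to-compiler change in this section. As a rough illustration, the sketch below turns the mean latencies into percentage reductions relative to Clang 11. The function name `reduction_pct` is hypothetical, and the calculation is plain arithmetic on the published means, ignoring the reported standard errors.

```python
# Mean latency in ms (fewer is better), copied from the TNN 0.3 results above
tnn_ms = {
    "DenseNet": {"Clang 11": 6912.12, "Clang 12": 4353.39, "Clang 13": 4370.10},
    "MobileNet v2": {"Clang 11": 773.47, "Clang 12": 541.93, "Clang 13": 539.65},
    "SqueezeNet v2": {"Clang 11": 105.40, "Clang 12": 83.86, "Clang 13": 84.61},
    "SqueezeNet v1.1": {"Clang 11": 657.10, "Clang 12": 402.26, "Clang 13": 400.26},
}


def reduction_pct(old_ms, new_ms):
    """Percentage by which latency dropped going from old_ms to new_ms."""
    return (old_ms - new_ms) / old_ms * 100.0


if __name__ == "__main__":
    for model, ms in tnn_ms.items():
        to_12 = reduction_pct(ms["Clang 11"], ms["Clang 12"])
        to_13 = reduction_pct(ms["Clang 11"], ms["Clang 13"])
        print(f"{model}: Clang 12 {to_12:.1f}% lower, Clang 13 {to_13:.1f}% lower than Clang 11")
```

Because these are "fewer is better" metrics, a positive reduction means the newer compiler produced faster code.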
TSCP
This is a performance test of TSCP, Tom Kerrigan's Simple Chess Program, which has a built-in performance benchmark. Learn more via the OpenBenchmarking.org test page.

TSCP 1.81, AI Chess Performance (Nodes Per Second, more is better)
Clang 11: 1477410 (SE +/- 794.00, N = 5)
Clang 12: 1481388 (SE +/- 1489.72, N = 5)
Clang 13: 1491816 (SE +/- 806.80, N = 5)
(CC) gcc options: -O3 -march=native
VP9 libvpx Encoding
This is a standard video encoding performance test of Google's libvpx library and the vpxenc command for the VP9 video format. Learn more via the OpenBenchmarking.org test page.

VP9 libvpx Encoding 1.10.0, Speed: Speed 0 - Input: Bosphorus 4K (Frames Per Second, more is better)
Clang 11: 5.60 (SE +/- 0.01, N = 3)
Clang 12: 5.62 (SE +/- 0.01, N = 3)
Clang 13: 5.71 (SE +/- 0.06, N = 3)
(CXX) g++ options: -m64 -lm -lpthread -O3 -march=native -fPIC -U_FORTIFY_SOURCE -std=gnu++11

VP9 libvpx Encoding 1.10.0, Speed: Speed 5 - Input: Bosphorus 4K (Frames Per Second, more is better)
Clang 11: 14.32 (SE +/- 0.13, N = 3)
Clang 12: 14.11 (SE +/- 0.09, N = 3)
Clang 13: 15.35 (SE +/- 0.14, N = 15)
(CXX) g++ options: -m64 -lm -lpthread -O3 -march=native -fPIC -U_FORTIFY_SOURCE -std=gnu++11
Zstd Compression
This test measures the time needed to compress/decompress a sample file (a FreeBSD disk image - FreeBSD-12.2-RELEASE-amd64-memstick.img) using Zstd compression with options for different compression levels / settings. Learn more via the OpenBenchmarking.org test page.

Zstd Compression 1.5.0, Compression Level: 3 - Compression Speed (MB/s, more is better)
Clang 11: 6433.3 (SE +/- 78.33, N = 3)
Clang 12: 6321.0 (SE +/- 60.99, N = 3)
Clang 13: 6697.9 (SE +/- 47.01, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression 1.5.0, Compression Level: 8 - Compression Speed (MB/s, more is better)
Clang 11: 2729.0 (SE +/- 32.58, N = 3)
Clang 12: 2560.6 (SE +/- 34.31, N = 15)
Clang 13: 2775.9 (SE +/- 7.16, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression 1.5.0, Compression Level: 19 - Compression Speed (MB/s, more is better)
Clang 11: 84.8 (SE +/- 0.85, N = 15)
Clang 12: 80.4 (SE +/- 0.45, N = 3)
Clang 13: 83.6 (SE +/- 0.64, N = 10)
(CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression 1.5.0, Compression Level: 3, Long Mode - Compression Speed (MB/s, more is better)
Clang 11: 826.7 (SE +/- 1.30, N = 3)
Clang 12: 865.6 (SE +/- 2.11, N = 3)
Clang 13: 863.1 (SE +/- 1.29, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression 1.5.0, Compression Level: 8, Long Mode - Compression Speed (MB/s, more is better)
Clang 11: 798.8 (SE +/- 0.29, N = 3)
Clang 12: 833.6 (SE +/- 1.68, N = 3)
Clang 13: 828.0 (SE +/- 5.79, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression 1.5.0, Compression Level: 19, Long Mode - Compression Speed (MB/s, more is better)
Clang 11: 45.5 (SE +/- 0.51, N = 4)
Clang 12: 45.7 (SE +/- 0.55, N = 15)
Clang 13: 47.0 (SE +/- 0.53, N = 15)
(CC) gcc options: -O3 -march=native -pthread -lz -llzma
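The six Zstd compression-speed results above span very different absolute ranges (tens of MB/s up to thousands), so a single summary is easier to read if each result is first normalized to a baseline. The sketch below is one possible way to do that, assuming Clang 11 as the baseline and a geometric mean of the per-test ratios; the helper name `relative_geomean` is hypothetical and the summary is not a figure published in the original results.

```python
from statistics import geometric_mean

# Compression speed in MB/s (more is better), copied from the Zstd 1.5.0 results above.
# Each tuple is ordered (Clang 11, Clang 12, Clang 13).
zstd_mb_s = {
    "Level 3": (6433.3, 6321.0, 6697.9),
    "Level 8": (2729.0, 2560.6, 2775.9),
    "Level 19": (84.8, 80.4, 83.6),
    "Level 3, Long Mode": (826.7, 865.6, 863.1),
    "Level 8, Long Mode": (798.8, 833.6, 828.0),
    "Level 19, Long Mode": (45.5, 45.7, 47.0),
}


def relative_geomean(column):
    """Geometric mean of per-test speed ratios against Clang 11 (column 0)."""
    ratios = [row[column] / row[0] for row in zstd_mb_s.values()]
    return geometric_mean(ratios)


if __name__ == "__main__":
    for column, name in enumerate(("Clang 11", "Clang 12", "Clang 13")):
        print(f"{name}: {relative_geomean(column):.3f}x of Clang 11")
```

A geometric mean is used instead of an arithmetic mean so that the fast Level 3 test does not dominate the slower Level 19 tests; values above 1.0 indicate higher throughput than the Clang 11 baseline.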
Clang 13 Processor: 2 x Intel Xeon Platinum 8380 @ 3.40GHz (80 Cores / 160 Threads), Motherboard: Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS), Chipset: Intel Device 0998, Memory: 504GB, Disk: 7682GB INTEL SSDPF2KX076TZ, Graphics: ASPEED, Monitor: VE228, Network: 2 x Intel X710 for 10GBASE-T + 2 x Intel E810-C for QSFP
OS: Ubuntu 21.04, Kernel: 5.14.0-rc1-folio (x86_64) 20210715, Desktop: GNOME Shell 3.38.4, Display Server: X Server 1.20.11, Compiler: Clang 13.0.0-++20210820072921+23ba3732246a-1~exp1~20210820174536.53, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Processor Notes: Scaling Governor: intel_pstate performance - CPU Microcode: 0xd0002a0
Python Notes: Python 3.9.5
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 22 August 2021 07:45 by user phoronix.
Clang 12 Processor: 2 x Intel Xeon Platinum 8380 @ 3.40GHz (80 Cores / 160 Threads), Motherboard: Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS), Chipset: Intel Device 0998, Memory: 504GB, Disk: 7682GB INTEL SSDPF2KX076TZ, Graphics: ASPEED, Monitor: VE228, Network: 2 x Intel X710 for 10GBASE-T + 2 x Intel E810-C for QSFP
OS: Ubuntu 21.04, Kernel: 5.14.0-rc1-folio (x86_64) 20210715, Desktop: GNOME Shell 3.38.4, Display Server: X Server 1.20.11, Compiler: Clang 12.0.0-3ubuntu1~21.04.1, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Processor Notes: Scaling Governor: intel_pstate performance - CPU Microcode: 0xd0002a0
Python Notes: Python 3.9.5
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 22 August 2021 16:19 by user phoronix.
Clang 11 Processor: 2 x Intel Xeon Platinum 8380 @ 3.40GHz (80 Cores / 160 Threads), Motherboard: Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS), Chipset: Intel Device 0998, Memory: 504GB, Disk: 7682GB INTEL SSDPF2KX076TZ, Graphics: ASPEED, Monitor: VE228, Network: 2 x Intel X710 for 10GBASE-T + 2 x Intel E810-C for QSFP
OS: Ubuntu 21.04, Kernel: 5.14.0-rc1-folio (x86_64) 20210715, Desktop: GNOME Shell 3.38.4, Display Server: X Server 1.20.11, Compiler: Clang 11.0.1-2ubuntu4, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Processor Notes: Scaling Governor: intel_pstate performance - CPU Microcode: 0xd0002a0
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 23 August 2021 05:54 by user phoronix.