AMD EPYC 7763 64-Core testing with a Supermicro H12SSL-i v1.01 (2.0 BIOS) and ASPEED on Ubuntu 20.04 via the Phoronix Test Suite.
Clang 12.0
Processor: AMD EPYC 7763 64-Core @ 2.45GHz (64 Cores / 128 Threads), Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS), Chipset: AMD Starship/Matisse, Memory: 126GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe
OS: Ubuntu 20.04, Kernel: 5.12.0-051200rc6daily20210408-generic (x86_64) 20210407, Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Compiler: Clang 12.0.0-++20210409092622+fa0971b87fb2-1~exp1~20210409193326.73, File-System: ext4, Screen Resolution: 1024x768
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Processor Notes: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0xa001119
Python Notes: Python 3.8.2
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Clang 11.0
OS: Ubuntu 20.04, Kernel: 5.12.0-051200rc6daily20210408-generic (x86_64) 20210407, Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Compiler: Clang 11.0.0-2~ubuntu20.04.1, File-System: ext4, Screen Resolution: 1024x768
Clang 12.0 LTO
OS: Ubuntu 20.04, Kernel: 5.12.0-051200rc6daily20210408-generic (x86_64) 20210407, Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Compiler: Clang 12.0.0-++20210409092622+fa0971b87fb2-1~exp1~20210409193326.73, File-System: ext4, Screen Resolution: 1024x768
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native -flto" CFLAGS="-O3 -march=native -flto"
Processor Notes: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0xa001119
Python Notes: Python 3.8.2
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
GCC 9.3
OS: Ubuntu 20.04, Kernel: 5.12.0-051200rc6daily20210408-generic (x86_64) 20210407, Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Compiler: GCC 9.3.0, File-System: ext4, Screen Resolution: 1024x768
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0xa001119
Python Notes: Python 3.8.2
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
GCC 10.3
OS: Ubuntu 20.04, Kernel: 5.12.0-051200rc6daily20210408-generic (x86_64) 20210407, Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Compiler: GCC 10.3.0, File-System: ext4, Screen Resolution: 1024x768
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0xa001119
Python Notes: Python 3.8.2
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
GCC 11.0.1
OS: Ubuntu 20.04, Kernel: 5.12.0-051200rc6daily20210408-generic (x86_64) 20210407, Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Compiler: GCC 11.0.1 20210413, File-System: ext4, Screen Resolution: 1024x768
AMD AOCC 3.0
OS: Ubuntu 20.04, Kernel: 5.12.0-051200rc6daily20210408-generic (x86_64) 20210407, Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Compiler: Clang 12.0.0, File-System: ext4, Screen Resolution: 1024x768
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Compiler Notes: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: (unknown)
Processor Notes: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0xa001119
Python Notes: Python 3.8.2
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
oneDNN This is a test of Intel oneDNN, an Intel-optimized library for deep neural networks, using its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: oneDNN 2.1.2, Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
Unit: ms, Fewer Is Better
GCC 10.3: 7.23686 (SE +/- 0.03687, N = 3; -fopenmp; MIN: 6.18)
GCC 9.3: 7.19213 (SE +/- 0.02683, N = 3; -fopenmp; MIN: 6.14)
Clang 11.0: 1.45757 (SE +/- 0.00568, N = 3; -fopenmp=libomp; MIN: 1.35)
Clang 12.0: 1.44425 (SE +/- 0.00123, N = 3; -fopenmp=libomp; MIN: 1.34)
AMD AOCC 3.0: 1.37059 (SE +/- 0.00485, N = 3; -fopenmp=libomp; MIN: 1.28)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
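The gap between two compilers in a fewer-is-better result can be worked out directly from the reported means. The snippet below is a minimal sketch of that arithmetic. The helper name `speedup` is hypothetical; only the two timings (7.23686 ms for GCC 10.3 and 1.37059 ms for AMD AOCC 3.0) are taken from the result above.

```python
def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """Return how many times faster the candidate is on a fewer-is-better metric."""
    if baseline_ms <= 0 or candidate_ms <= 0:
        raise ValueError("timings must be positive")
    return baseline_ms / candidate_ms


# Means reported for oneDNN Deconvolution Batch shapes_1d (f32).
gcc_10_3_ms = 7.23686
aocc_3_0_ms = 1.37059

ratio = speedup(gcc_10_3_ms, aocc_3_0_ms)
print(f"AMD AOCC 3.0 is about {ratio:.2f}x faster than GCC 10.3 on this harness")
```

For more-is-better results (GB/s, FPS, Mpx/s) the same comparison would divide the candidate by the baseline instead.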
ViennaCL ViennaCL is an open-source linear algebra library written in C++ and with support for OpenCL and OpenMP. This test profile makes use of ViennaCL's built-in benchmarks. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: ViennaCL 1.7.1, Test: CPU BLAS - dCOPY
Unit: GB/s, More Is Better
Clang 12.0: 604.0 (SE +/- 15.32, N = 11; -fopenmp=libomp)
GCC 10.3: 1461.2 (SE +/- 131.59, N = 12; -fopenmp)
GCC 9.3: 1587.0 (SE +/- 9.19, N = 15; -fopenmp)
GCC 11.0.1: 1599.0 (SE +/- 2.67, N = 15; -fopenmp)
Clang 11.0: 1877.0 (SE +/- 8.32, N = 15; -fopenmp=libomp)
AMD AOCC 3.0: 1944.0 (SE +/- 9.88, N = 12; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
OpenBenchmarking.org: ViennaCL 1.7.1, Test: CPU BLAS - dAXPY
Unit: GB/s, More Is Better
Clang 12.0: 878.0 (SE +/- 20.06, N = 12; -fopenmp=libomp)
AMD AOCC 3.0: 1017.0 (SE +/- 3.59, N = 12; -fopenmp=libomp)
Clang 11.0: 1043.0 (SE +/- 1.59, N = 15; -fopenmp=libomp)
GCC 9.3: 1521.0 (SE +/- 2.06, N = 15; -fopenmp)
GCC 10.3: 2158.4 (SE +/- 194.35, N = 12; -fopenmp)
GCC 11.0.1: 2359.0 (SE +/- 2.74, N = 15; -fopenmp)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
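For readers unfamiliar with the BLAS level-1 names, dCOPY copies one double-precision vector into another and dAXPY computes y = alpha * x + y. The following is a plain-Python sketch of what the two kernels compute. It is not ViennaCL code, and the function names `dcopy` and `daxpy` are only illustrative.

```python
from typing import List


def dcopy(x: List[float], y: List[float]) -> None:
    """Copy every element of x into the matching slot of y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    for i in range(len(x)):
        y[i] = x[i]


def daxpy(alpha: float, x: List[float], y: List[float]) -> None:
    """Update y in place so that y[i] becomes alpha * x[i] + y[i]."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    for i in range(len(x)):
        y[i] = alpha * x[i] + y[i]


x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
daxpy(2.0, x, y)
print(y)
```

Both kernels do very little arithmetic per byte moved, which is consistent with these results being reported in GB/s rather than in floating-point operations.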
Etcpak Etcpak is the self-proclaimed "fastest ETC compressor on the planet", focused on providing open-source, very fast ETC and S3 texture compression support. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: Etcpak 0.7, Configuration: DXT1
Unit: Mpx/s, More Is Better
GCC 9.3: 1082.37 (SE +/- 0.16, N = 3)
GCC 10.3: 1114.60 (SE +/- 0.48, N = 3)
Clang 11.0: 1872.76 (SE +/- 1.69, N = 3)
AMD AOCC 3.0: 2654.72 (SE +/- 8.09, N = 3)
Clang 12.0: 2718.53 (SE +/- 2.64, N = 3)
Clang 12.0 LTO: 2719.99 (SE +/- 6.09, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
ViennaCL ViennaCL is an open-source linear algebra library written in C++ and with support for OpenCL and OpenMP. This test profile makes use of ViennaCL's built-in benchmarks. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-NN
Unit: GFLOPs/s, More Is Better
Clang 12.0: 48.6 (SE +/- 0.05, N = 12; -fopenmp=libomp)
Clang 11.0: 83.6 (SE +/- 0.06, N = 15; -fopenmp=libomp)
AMD AOCC 3.0: 84.0 (SE +/- 0.04, N = 12; -fopenmp=libomp)
GCC 9.3: 98.5 (SE +/- 0.16, N = 15; -fopenmp)
GCC 10.3: 98.7 (SE +/- 1.05, N = 12; -fopenmp)
GCC 11.0.1: 100.5 (SE +/- 0.29, N = 15; -fopenmp)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
OpenBenchmarking.org: ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-TN
Unit: GFLOPs/s, More Is Better
Clang 12.0: 51.9 (-fopenmp=libomp)
Clang 11.0: 88.3 (-fopenmp=libomp)
AMD AOCC 3.0: 90.0 (-fopenmp=libomp)
GCC 9.3: 100.9 (-fopenmp)
GCC 10.3: 104.0 (-fopenmp)
GCC 11.0.1: 104.0 (-fopenmp)
SE as listed (five entries for six results): SE +/- 0.09, N = 12; SE +/- 0.02, N = 15; SE +/- 0.05, N = 12; SE +/- 0.08, N = 15; SE +/- 0.62, N = 12
1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
dav1d Dav1d is an open-source, speedy AV1 video decoder. This test profile times how long it takes to decode sample AV1 video content. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: dav1d 0.8.2, Video Input: Chimera 1080p 10-bit
Unit: FPS, More Is Better
Clang 11.0: 184.19 (SE +/- 0.48, N = 3; -lm; MIN: 114.52 / MAX: 310.5)
AMD AOCC 3.0: 192.00 (SE +/- 0.39, N = 3; -lm; MIN: 118.57 / MAX: 324.98)
GCC 9.3: 305.36 (SE +/- 0.71, N = 3; -lm; MIN: 210.86 / MAX: 493.21)
Clang 12.0: 308.32 (SE +/- 0.93, N = 3; MIN: 220.53 / MAX: 490.51)
GCC 10.3: 316.14 (SE +/- 0.21, N = 3; -lm; MIN: 218.19 / MAX: 515.85)
GCC 11.0.1: 334.35 (SE +/- 1.11, N = 3; -lm; MIN: 234.24 / MAX: 544.9)
1. (CC) gcc options: -O3 -march=native -pthread
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: GraphicsMagick 1.3.33, Operation: Resizing
Unit: Iterations Per Minute, More Is Better
GCC 11.0.1: 1188 (SE +/- 17.34, N = 3)
GCC 10.3: 1208 (SE +/- 14.93, N = 3)
GCC 9.3: 1238 (SE +/- 18.77, N = 3)
AMD AOCC 3.0: 1866 (SE +/- 52.84, N = 15)
Clang 11.0: 2034 (SE +/- 27.29, N = 3)
Clang 12.0: 2136 (SE +/- 41.63, N = 12)
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
Botan Botan is a BSD-licensed cross-platform open-source C++ crypto library "cryptography toolkit" that supports most publicly known cryptographic algorithms. The project's stated goal is to be "the best option for cryptography in C++ by offering the tools necessary to implement a range of practical systems, such as TLS protocol, X.509 certificates, modern AEAD ciphers, PKCS#11 and TPM hardware support, password hashing, and post quantum crypto schemes." Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: Botan 2.17.3, Test: ChaCha20Poly1305 - Decrypt
Unit: MiB/s, More Is Better
GCC 10.3: 476.18 (SE +/- 0.02, N = 3)
GCC 9.3: 611.98 (SE +/- 0.40, N = 3)
AMD AOCC 3.0: 838.09 (SE +/- 3.17, N = 3)
Clang 11.0: 840.64 (SE +/- 0.16, N = 3)
Clang 12.0: 843.40 (SE +/- 4.64, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
C-Ray This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: C-Ray 1.1, Total Time - 4K, 16 Rays Per Pixel
Unit: Seconds, Fewer Is Better
Clang 12.0: 15.870 (SE +/- 0.023, N = 3)
AMD AOCC 3.0: 15.649 (SE +/- 0.063, N = 3)
Clang 11.0: 15.599 (SE +/- 0.009, N = 3)
GCC 11.0.1: 9.227 (SE +/- 0.027, N = 3)
GCC 9.3: 9.158 (SE +/- 0.014, N = 3)
GCC 10.3: 9.029 (SE +/- 0.014, N = 3)
1. (CC) gcc options: -lm -lpthread -O3 -march=native
Botan Botan is a BSD-licensed cross-platform open-source C++ crypto library "cryptography toolkit" that supports most publicly known cryptographic algorithms. The project's stated goal is to be "the best option for cryptography in C++ by offering the tools necessary to implement a range of practical systems, such as TLS protocol, X.509 certificates, modern AEAD ciphers, PKCS#11 and TPM hardware support, password hashing, and post quantum crypto schemes." Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: Botan 2.17.3, Test: ChaCha20Poly1305
Unit: MiB/s, More Is Better
GCC 10.3: 485.02 (SE +/- 0.28, N = 3)
GCC 9.3: 616.10 (SE +/- 0.13, N = 3)
AMD AOCC 3.0: 845.14 (SE +/- 3.15, N = 3)
Clang 11.0: 848.24 (SE +/- 0.62, N = 3)
Clang 12.0: 850.50 (SE +/- 4.85, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
oneDNN This is a test of Intel oneDNN, an Intel-optimized library for deep neural networks, using its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: oneDNN 2.1.2, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
Unit: ms, Fewer Is Better
Clang 12.0: 1.172580 (SE +/- 0.004576, N = 3; -fopenmp=libomp; MIN: 1.12)
AMD AOCC 3.0: 1.170440 (SE +/- 0.004625, N = 3; -fopenmp=libomp; MIN: 1.11)
Clang 11.0: 1.151400 (SE +/- 0.006530, N = 3; -fopenmp=libomp; MIN: 1.09)
GCC 10.3: 0.788192 (SE +/- 0.005622, N = 3; -fopenmp; MIN: 0.74)
GCC 9.3: 0.717782 (SE +/- 0.003430, N = 3; -fopenmp; MIN: 0.67)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
LibRaw LibRaw is a RAW image decoder for digital camera photos. This test profile runs LibRaw's post-processing benchmark. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: LibRaw 0.20, Post-Processing Benchmark
Unit: Mpix/sec, More Is Better
Clang 11.0: 38.71 (SE +/- 0.33, N = 3)
AMD AOCC 3.0: 41.64 (SE +/- 0.04, N = 3)
Clang 12.0: 41.78 (SE +/- 0.12, N = 3)
GCC 11.0.1: 57.24 (SE +/- 0.16, N = 3)
GCC 10.3: 58.90 (SE +/- 0.23, N = 3)
GCC 9.3: 60.20 (SE +/- 0.19, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp -ljpeg -lz -lm
FinanceBench FinanceBench is a collection of financial program benchmarks with support for benchmarking on the GPU via OpenCL and CPU benchmarking with OpenMP. The FinanceBench test cases are focused on Black-Scholes-Merton Process with Analytic European Option engine, QMC (Sobol) Monte-Carlo method (Equity Option Example), Bonds Fixed-rate bond with flat forward curve, and Repo Securities repurchase agreement. FinanceBench was originally written by the Cavazos Lab at University of Delaware. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: FinanceBench 2016-07-25, Benchmark: Bonds OpenMP
Unit: ms, Fewer Is Better
GCC 9.3: 76805.58 (SE +/- 971.24, N = 3)
Clang 11.0: 51900.43 (SE +/- 4.51, N = 3)
AMD AOCC 3.0: 51885.52 (SE +/- 242.64, N = 3)
GCC 10.3: 51770.51 (SE +/- 23.55, N = 3)
Clang 12.0: 51596.87 (SE +/- 10.95, N = 3)
GCC 11.0.1: 51376.82 (SE +/- 42.62, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp
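As background for the Black-Scholes-Merton test case named in the FinanceBench description, the sketch below prices a European call with the standard closed-form formula. It is an illustration of the kind of arithmetic involved, not FinanceBench's own code and not the Bonds workload timed above. The function names and the input values are assumptions chosen for the example.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def european_call(spot: float, strike: float, rate: float,
                  vol: float, years: float) -> float:
    """Closed-form Black-Scholes price of a European call without dividends."""
    sqrt_t = math.sqrt(years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * years) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * years) * norm_cdf(d2)


# Illustrative inputs: at-the-money option, 5% rate, 20% volatility, one year.
price = european_call(spot=100.0, strike=100.0, rate=0.05, vol=0.2, years=1.0)
print(f"call price: {price:.4f}")
```

The per-option work is a handful of `log`, `exp`, and `erf` evaluations, so pricing many options is largely a test of how well a compiler handles math-library calls in a loop.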
oneDNN This is a test of Intel oneDNN, an Intel-optimized library for deep neural networks, using its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: oneDNN 2.1.2, Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
Unit: ms, Fewer Is Better
Clang 12.0: 1.221320 (SE +/- 0.018279, N = 4; -fopenmp=libomp; MIN: 1.13)
GCC 10.3: 0.870784 (SE +/- 0.001247, N = 3; -fopenmp; MIN: 0.84)
GCC 9.3: 0.869308 (SE +/- 0.001032, N = 3; -fopenmp; MIN: 0.84)
Clang 11.0: 0.841169 (SE +/- 0.000480, N = 3; -fopenmp=libomp; MIN: 0.82)
AMD AOCC 3.0: 0.833921 (SE +/- 0.000645, N = 3; -fopenmp=libomp; MIN: 0.81)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
ViennaCL ViennaCL is an open-source linear algebra library written in C++ and with support for OpenCL and OpenMP. This test profile makes use of ViennaCL's built-in benchmarks. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-NT
Unit: GFLOPs/s, More Is Better
Clang 12.0: 65.7 (SE +/- 0.56, N = 12; -fopenmp=libomp)
AMD AOCC 3.0: 78.8 (SE +/- 0.07, N = 12; -fopenmp=libomp)
Clang 11.0: 79.3 (SE +/- 0.03, N = 15; -fopenmp=libomp)
GCC 10.3: 94.4 (SE +/- 0.59, N = 12; -fopenmp)
GCC 11.0.1: 95.0 (SE +/- 0.08, N = 15; -fopenmp)
GCC 9.3: 95.3 (SE +/- 0.07, N = 15; -fopenmp)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
OpenBenchmarking.org: ViennaCL 1.7.1, Test: CPU BLAS - dDOT
Unit: GB/s, More Is Better
Clang 12.0: 819.00 (SE +/- 17.06, N = 12; -fopenmp=libomp)
Clang 11.0: 933.00 (SE +/- 1.49, N = 15; -fopenmp=libomp)
GCC 10.3: 1056.42 (SE +/- 95.41, N = 12; -fopenmp)
GCC 9.3: 1133.00 (SE +/- 1.59, N = 15; -fopenmp)
GCC 11.0.1: 1153.00 (SE +/- 1.87, N = 15; -fopenmp)
AMD AOCC 3.0: 1165.00 (SE +/- 2.61, N = 12; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
SVT-AV1 This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-AV1 CPU-based multi-threaded video encoder for the AV1 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: SVT-AV1 0.8, Encoder Mode: Enc Mode 0 - Input: 1080p
Unit: Frames Per Second, More Is Better
GCC 9.3: 0.129 (SE +/- 0.000, N = 3)
GCC 10.3: 0.169 (SE +/- 0.000, N = 3)
GCC 11.0.1: 0.176 (SE +/- 0.000, N = 3)
Clang 11.0: 0.181 (SE +/- 0.000, N = 3)
Clang 12.0: 0.183 (SE +/- 0.000, N = 3)
AMD AOCC 3.0: 0.183 (SE +/- 0.000, N = 3)
1. (CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie
toyBrot Fractal Generator ToyBrot is a Mandelbrot fractal generator supporting C++ threads/tasks, OpenMP, Intel Threaded Building Blocks (TBB), and other targets. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: toyBrot Fractal Generator 2020-11-18, Implementation: C++ Threads
Unit: ms, Fewer Is Better
Clang 12.0: 7220 (SE +/- 30.90, N = 3)
AMD AOCC 3.0: 7144 (SE +/- 24.26, N = 3)
Clang 12.0 LTO: 7143 (SE +/- 15.06, N = 3)
Clang 11.0: 6395 (SE +/- 25.04, N = 3)
GCC 10.3: 5383 (SE +/- 6.12, N = 3)
GCC 9.3: 5142 (SE +/- 8.33, N = 3)
Per-result flags as listed: -lm -lgcc -lgcc_s -lc -lm -lgcc -lgcc_s -lc -flto -lm -lgcc -lgcc_s -lc -lm -lgcc -lgcc_s -lc -lm -lgcc -lgcc_s -lc
1. (CXX) g++ options: -O3 -march=native -lpthread
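A Mandelbrot generator such as toyBrot spends its time in one small loop per pixel: iterate z = z*z + c and count how long |z| stays bounded. The sketch below shows that escape-time loop in plain Python. It is not toyBrot's implementation, and the names `escape_time` and `render`, the viewport, and the iteration cap are all assumptions made for illustration.

```python
from typing import List


def escape_time(c: complex, max_iter: int = 100) -> int:
    """Count iterations of z = z*z + c before |z| exceeds 2, capped at max_iter."""
    z = 0j
    for i in range(max_iter):
        if abs(z) > 2.0:
            return i
        z = z * z + c
    return max_iter


def render(width: int = 40, height: int = 20, max_iter: int = 50) -> List[str]:
    """Return an ASCII picture: '#' where the point never escaped, '.' elsewhere."""
    rows = []
    for row in range(height):
        im = -1.2 + 2.4 * row / (height - 1)
        line = ""
        for col in range(width):
            re = -2.0 + 3.0 * col / (width - 1)
            inside = escape_time(complex(re, im), max_iter) == max_iter
            line += "#" if inside else "."
        rows.append(line)
    return rows


print("\n".join(render()))
```

Each pixel is independent of every other, which is why the same workload can be split across C++ threads, tasks, OpenMP, or TBB as in the toyBrot implementations benchmarked here.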
Etcpak Etcpak is the self-proclaimed "fastest ETC compressor on the planet", focused on providing open-source, very fast ETC and S3 texture compression support. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: Etcpak 0.7, Configuration: ETC1
Unit: Mpx/s, More Is Better
Clang 11.0: 205.07 (SE +/- 0.03, N = 3)
AMD AOCC 3.0: 211.73 (SE +/- 0.01, N = 3)
GCC 9.3: 269.67 (SE +/- 0.00, N = 3)
GCC 10.3: 281.15 (SE +/- 0.03, N = 3)
Clang 12.0: 284.64 (SE +/- 0.11, N = 3)
Clang 12.0 LTO: 284.76 (SE +/- 0.06, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
toyBrot Fractal Generator ToyBrot is a Mandelbrot fractal generator supporting C++ threads/tasks, OpenMP, Intel Threaded Building Blocks (TBB), and other targets. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: toyBrot Fractal Generator 2020-11-18, Implementation: TBB
Unit: ms, Fewer Is Better
Clang 12.0 LTO: 7085 (SE +/- 86.43, N = 3)
AMD AOCC 3.0: 6945 (SE +/- 52.54, N = 3)
Clang 12.0: 6780 (SE +/- 87.21, N = 3)
Clang 11.0: 6247 (SE +/- 67.11, N = 7)
GCC 10.3: 5181 (SE +/- 67.68, N = 3)
GCC 9.3: 5107 (SE +/- 74.84, N = 3)
Per-result flags as listed: -flto -lm -lgcc -lgcc_s -lc -lm -lgcc -lgcc_s -lc -lm -lgcc -lgcc_s -lc -lm -lgcc -lgcc_s -lc -lm -lgcc -lgcc_s -lc
1. (CXX) g++ options: -O3 -march=native -lpthread
OpenBenchmarking.org: toyBrot Fractal Generator 2020-11-18, Implementation: OpenMP
Unit: ms, Fewer Is Better
Clang 12.0: 7507 (SE +/- 14.89, N = 3)
AMD AOCC 3.0: 7477 (SE +/- 22.73, N = 3)
Clang 11.0: 7029 (SE +/- 20.42, N = 3)
GCC 10.3: 5524 (SE +/- 2.60, N = 3)
GCC 9.3: 5451 (SE +/- 3.18, N = 3)
1. (CXX) g++ options: -O3 -march=native -lpthread -lm -lgcc -lgcc_s -lc
OpenBenchmarking.org: toyBrot Fractal Generator 2020-11-18, Implementation: C++ Tasks
Unit: ms, Fewer Is Better
Clang 12.0: 7437 (SE +/- 33.67, N = 3)
Clang 12.0 LTO: 7367 (SE +/- 17.21, N = 3)
AMD AOCC 3.0: 7189 (SE +/- 41.46, N = 3)
Clang 11.0: 6836 (SE +/- 7.31, N = 3)
GCC 10.3: 5610 (SE +/- 31.52, N = 3)
GCC 9.3: 5414 (SE +/- 49.08, N = 3)
Per-result flags as listed: -lm -lgcc -lgcc_s -lc -flto -lm -lgcc -lgcc_s -lc -lm -lgcc -lgcc_s -lc -lm -lgcc -lgcc_s -lc -lm -lgcc -lgcc_s -lc
1. (CXX) g++ options: -O3 -march=native -lpthread
ViennaCL ViennaCL is an open-source linear algebra library written in C++ and with support for OpenCL and OpenMP. This test profile makes use of ViennaCL's built-in benchmarks. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-TT
Unit: GFLOPs/s, More Is Better
Clang 12.0: 73.0 (SE +/- 0.07, N = 12; -fopenmp=libomp)
Clang 11.0: 84.0 (SE +/- 0.02, N = 14; -fopenmp=libomp)
AMD AOCC 3.0: 84.4 (SE +/- 0.08, N = 12; -fopenmp=libomp)
GCC 9.3: 97.9 (SE +/- 0.05, N = 15; -fopenmp)
GCC 10.3: 98.5 (SE +/- 0.60, N = 12; -fopenmp)
GCC 11.0.1: 99.3 (SE +/- 0.05, N = 15; -fopenmp)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
SciMark This test runs the ANSI C version of SciMark 2.0, which is a benchmark for scientific and numerical computing developed by programmers at the National Institute of Standards and Technology. This test is made up of Fast Fourier Transform, Jacobi Successive Over-relaxation, Monte Carlo, Sparse Matrix Multiply, and dense LU matrix factorization benchmarks. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: SciMark 2.0, Computational Test: Sparse Matrix Multiply
Unit: Mflops, More Is Better
GCC 11.0.1: 3462.66 (SE +/- 0.39, N = 3)
GCC 9.3: 3765.88 (SE +/- 1.69, N = 3)
GCC 10.3: 3820.77 (SE +/- 0.86, N = 3)
Clang 12.0: 4280.22 (SE +/- 10.41, N = 3)
Clang 11.0: 4590.37 (SE +/- 3.87, N = 3)
AMD AOCC 3.0: 4594.27 (SE +/- 5.98, N = 3)
1. (CC) gcc options: -O3 -march=native -lm
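A sparse matrix multiply kernel is dominated by indirect memory accesses rather than raw arithmetic. The sketch below multiplies a matrix in compressed sparse row (CSR) form by a dense vector. The CSR layout and the name `csr_matvec` are assumptions for illustration; this is not SciMark's source code.

```python
from typing import List


def csr_matvec(values: List[float], col_idx: List[int],
               row_ptr: List[int], x: List[float]) -> List[float]:
    """Multiply a CSR matrix by the dense vector x and return the result."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        total = 0.0
        # row_ptr[row] .. row_ptr[row + 1] delimits this row's non-zeros.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            total += values[k] * x[col_idx[k]]
        y[row] = total
    return y


# The 3x3 matrix [[4, 0, 1], [0, 3, 0], [2, 0, 5]] stored as CSR.
values = [4.0, 1.0, 3.0, 2.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]

y = csr_matvec(values, col_idx, row_ptr, x)
print(y)
```

The inner loop reads `x` through `col_idx`, so how a compiler schedules those gathers and unrolls the loop can matter more here than in the dense kernels.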
Botan Botan is a BSD-licensed cross-platform open-source C++ crypto library "cryptography toolkit" that supports most publicly known cryptographic algorithms. The project's stated goal is to be "the best option for cryptography in C++ by offering the tools necessary to implement a range of practical systems, such as TLS protocol, X.509 certificates, modern AEAD ciphers, PKCS#11 and TPM hardware support, password hashing, and post quantum crypto schemes." Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: Botan 2.17.3, Test: Blowfish
Unit: MiB/s, More Is Better
Clang 11.0: 319.23 (SE +/- 1.73, N = 3)
AMD AOCC 3.0: 319.79 (SE +/- 1.14, N = 3)
Clang 12.0: 380.05 (SE +/- 0.05, N = 3)
GCC 9.3: 412.85 (SE +/- 0.09, N = 3)
GCC 10.3: 422.14 (SE +/- 0.11, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: GraphicsMagick 1.3.33, Operation: Sharpen
Unit: Iterations Per Minute, More Is Better
Clang 11.0: 613
Clang 12.0: 614
AMD AOCC 3.0: 617
GCC 9.3: 806
GCC 10.3: 807
GCC 11.0.1: 809
SE as listed (two entries for six results): SE +/- 2.03, N = 3; SE +/- 0.58, N = 3
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
oneDNN This is a test of Intel oneDNN, an Intel-optimized library for deep neural networks, using its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: oneDNN 2.1.2, Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
Unit: ms, Fewer Is Better
GCC 10.3: 3.00341 (SE +/- 0.00883, N = 3; -fopenmp; MIN: 2.35)
GCC 9.3: 2.99759 (SE +/- 0.00845, N = 3; -fopenmp; MIN: 2.24)
Clang 12.0: 2.36797 (SE +/- 0.02100, N = 3; -fopenmp=libomp; MIN: 2.01)
Clang 11.0: 2.31859 (SE +/- 0.02389, N = 3; -fopenmp=libomp; MIN: 1.92)
AMD AOCC 3.0: 2.28755 (SE +/- 0.00564, N = 3; -fopenmp=libomp; MIN: 1.91)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org: oneDNN 2.1.2, Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
Unit: ms, Fewer Is Better
GCC 10.3: 0.602155 (SE +/- 0.001964, N = 3; -fopenmp; MIN: 0.57)
GCC 9.3: 0.599140 (SE +/- 0.001469, N = 3; -fopenmp; MIN: 0.56)
Clang 12.0: 0.491940 (SE +/- 0.002843, N = 3; -fopenmp=libomp; MIN: 0.47)
Clang 11.0: 0.489278 (SE +/- 0.001652, N = 3; -fopenmp=libomp; MIN: 0.46)
AMD AOCC 3.0: 0.459724 (SE +/- 0.000365, N = 3; -fopenmp=libomp; MIN: 0.44)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: GraphicsMagick 1.3.33, Operation: HWB Color Space
Unit: Iterations Per Minute, More Is Better
Clang 12.0: 605
AMD AOCC 3.0: 614
Clang 11.0: 616
GCC 11.0.1: 771
GCC 10.3: 772
GCC 9.3: 785
SE as listed (four entries for six results): SE +/- 0.67, N = 3; SE +/- 1.20, N = 3; SE +/- 0.88, N = 3; SE +/- 0.67, N = 3
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
oneDNN This is a test of Intel oneDNN, an Intel-optimized library for deep neural networks, using its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: oneDNN 2.1.2, Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU
Unit: ms, Fewer Is Better
Clang 12.0: 0.710124 (SE +/- 0.011383, N = 3; -fopenmp=libomp; MIN: 0.64)
GCC 9.3: 0.654010 (SE +/- 0.003317, N = 3; -fopenmp; MIN: 0.59)
GCC 10.3: 0.646252 (SE +/- 0.003112, N = 3; -fopenmp; MIN: 0.6)
Clang 11.0: 0.594729 (SE +/- 0.008914, N = 3; -fopenmp=libomp; MIN: 0.53)
AMD AOCC 3.0: 0.554231 (SE +/- 0.000764, N = 3; -fopenmp=libomp; MIN: 0.5)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
FinanceBench FinanceBench is a collection of financial program benchmarks with support for benchmarking on the GPU via OpenCL and CPU benchmarking with OpenMP. The FinanceBench test cases are focused on Black-Scholes-Merton Process with Analytic European Option engine, QMC (Sobol) Monte-Carlo method (Equity Option Example), Bonds Fixed-rate bond with flat forward curve, and Repo Securities repurchase agreement. FinanceBench was originally written by the Cavazos Lab at University of Delaware. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: FinanceBench 2016-07-25, Benchmark: Repo OpenMP
Unit: ms, Fewer Is Better
GCC 9.3: 42399.81 (SE +/- 453.41, N = 14)
GCC 10.3: 34979.29 (SE +/- 102.23, N = 3)
GCC 11.0.1: 34199.60 (SE +/- 3.94, N = 3)
Clang 12.0: 33246.84 (SE +/- 64.93, N = 3)
Clang 11.0: 33178.50 (SE +/- 0.81, N = 3)
AMD AOCC 3.0: 33146.03 (SE +/- 9.32, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp
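Each result on this page is reported with an "SE +/-" figure and a run count N. Assuming SE denotes the usual standard error of the mean (sample standard deviation divided by the square root of N), it can be reproduced from raw run times as sketched below. Whether the Phoronix Test Suite computes it in exactly this way is an assumption, and the run times in the example are hypothetical, not taken from this page.

```python
import math
import statistics
from typing import Sequence


def standard_error(samples: Sequence[float]) -> float:
    """Sample standard deviation divided by the square root of the run count."""
    if len(samples) < 2:
        raise ValueError("need at least two runs")
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical run times in milliseconds, for illustration only.
runs_ms = [10.0, 12.0, 14.0]

se = standard_error(runs_ms)
mean = statistics.mean(runs_ms)
print(f"{mean:.2f} ms, SE +/- {se:.4f}, N = {len(runs_ms)}")
```

Under this definition, a large SE next to a large N (such as the GCC 9.3 result above with N = 14) indicates run-to-run variation that additional runs did not average away.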
SVT-AV1 This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-AV1 CPU-based multi-threaded video encoder for the AV1 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: SVT-AV1 0.8, Encoder Mode: Enc Mode 4 - Input: 1080p
Unit: Frames Per Second, More Is Better
GCC 9.3: 9.325 (SE +/- 0.086, N = 3)
GCC 10.3: 11.230 (SE +/- 0.111, N = 9)
Clang 12.0: 11.474 (SE +/- 0.170, N = 3)
AMD AOCC 3.0: 11.690 (SE +/- 0.189, N = 3)
Clang 11.0: 11.821 (SE +/- 0.164, N = 4)
GCC 11.0.1: 11.905 (SE +/- 0.139, N = 3)
1. (CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie
oneDNN This is a test of Intel oneDNN, an Intel-optimized library for deep neural networks, using its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: oneDNN 2.1.2, Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
Unit: ms, Fewer Is Better
Clang 12.0: 2.03606 (SE +/- 0.01922, N = 12; -fopenmp=libomp; MIN: 1.81)
GCC 9.3: 1.66260 (SE +/- 0.01150, N = 3; -fopenmp; MIN: 1.59)
GCC 10.3: 1.64268 (SE +/- 0.00384, N = 3; -fopenmp; MIN: 1.58)
Clang 11.0: 1.60540 (SE +/- 0.00118, N = 3; -fopenmp=libomp; MIN: 1.55)
AMD AOCC 3.0: 1.59597 (SE +/- 0.00195, N = 3; -fopenmp=libomp; MIN: 1.54)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
ViennaCL ViennaCL is an open-source linear algebra library written in C++ and with support for OpenCL and OpenMP. This test profile makes use of ViennaCL's built-in benchmarks. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: ViennaCL 1.7.1, Test: CPU BLAS - dGEMV-T
Unit: GB/s, More Is Better
Clang 12.0: 626.0 (SE +/- 4.04, N = 12; -fopenmp=libomp)
Clang 11.0: 677.0 (SE +/- 1.41, N = 14; -fopenmp=libomp)
GCC 10.3: 741.4 (SE +/- 66.49, N = 12; -fopenmp)
AMD AOCC 3.0: 783.0 (SE +/- 1.94, N = 12; -fopenmp=libomp)
GCC 11.0.1: 794.0 (SE +/- 2.88, N = 15; -fopenmp)
GCC 9.3: 798.0 (SE +/- 2.10, N = 14; -fopenmp)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
SVT-AV1 This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-AV1 CPU-based multi-threaded video encoder for the AV1 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: SVT-AV1 0.8, Encoder Mode: Enc Mode 8 - Input: 1080p
Unit: Frames Per Second, More Is Better
GCC 9.3: 92.98 (SE +/- 0.83, N = 3)
GCC 10.3: 109.70 (SE +/- 1.05, N = 3)
GCC 11.0.1: 110.70 (SE +/- 0.18, N = 3)
AMD AOCC 3.0: 116.49 (SE +/- 0.33, N = 3)
Clang 11.0: 117.39 (SE +/- 0.46, N = 3)
Clang 12.0: 118.07 (SE +/- 0.10, N = 3)
1. (CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie
Coremark This is a test of EEMBC CoreMark processor benchmark. Learn more via the OpenBenchmarking.org test page.
Coremark 1.0 - CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better)
AMD AOCC 3.0: 1720060.44 (SE +/- 3670.84, N = 3)
Clang 12.0: 1785466.28 (SE +/- 984.68, N = 3)
Clang 11.0: 1790837.01 (SE +/- 971.31, N = 3)
GCC 9.3: 2086609.98 (SE +/- 4791.32, N = 3)
GCC 10.3: 2110880.43 (SE +/- 2170.85, N = 3)
GCC 11.0.1: 2176407.67 (SE +/- 5755.65, N = 3)
1. (CC) gcc options: -O2 -O3 -march=native -lrt" -lrt
ASTC Encoder ASTC Encoder (astcenc) is for the Adaptive Scalable Texture Compression (ASTC) format commonly used with OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile does a coding test of both compression/decompression. Learn more via the OpenBenchmarking.org test page.
ASTC Encoder 2.4 - Preset: Medium (Seconds, Fewer Is Better)
GCC 9.3: 4.8745 (SE +/- 0.0035, N = 3)
GCC 10.3: 4.8699 (SE +/- 0.0047, N = 3)
GCC 11.0.1: 4.8160 (SE +/- 0.0099, N = 3)
Clang 12.0: 4.0058 (SE +/- 0.0116, N = 3)
Clang 11.0: 3.9837 (SE +/- 0.0013, N = 3)
AMD AOCC 3.0: 3.8811 (SE +/- 0.0042, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto -pthread
oneDNN This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.
oneDNN 2.1.2 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
GCC 10.3: 0.377733 (SE +/- 0.004341, N = 3; MIN: 0.36; -fopenmp)
GCC 9.3: 0.376992 (SE +/- 0.000576, N = 3; MIN: 0.36; -fopenmp)
Clang 11.0: 0.315522 (SE +/- 0.000247, N = 3; MIN: 0.3; -fopenmp=libomp)
Clang 12.0: 0.313689 (SE +/- 0.000321, N = 3; MIN: 0.3; -fopenmp=libomp)
AMD AOCC 3.0: 0.301885 (SE +/- 0.000492, N = 3; MIN: 0.29; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
FFTW FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.
FFTW 3.3.6 - Build: Float + SSE - Size: 1D FFT Size 2048 (Mflops, More Is Better)
AMD AOCC 3.0: 44412 (SE +/- 756.91, N = 3)
Clang 11.0: 50084 (SE +/- 582.34, N = 3)
Clang 12.0: 51254 (SE +/- 439.50, N = 3)
GCC 9.3: 52749 (SE +/- 725.00, N = 3)
GCC 10.3: 53497 (SE +/- 743.81, N = 3)
GCC 11.0.1: 54710 (SE +/- 156.75, N = 3)
1. (CC) gcc options: -pthread -O3 -march=native -lm
Liquid-DSP LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.
Liquid-DSP 2021.01.31 - Threads: 128 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
GCC 10.3: 3005033333 (SE +/- 1679616.36, N = 3)
GCC 9.3: 3012066667 (SE +/- 3384441.53, N = 3)
GCC 11.0.1: 3055766667 (SE +/- 6016181.88, N = 3)
Clang 11.0: 3596533333 (SE +/- 1559202.08, N = 3)
AMD AOCC 3.0: 3606466667 (SE +/- 1543084.93, N = 3)
Clang 12.0: 3643766667 (SE +/- 883804.91, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid
oneDNN This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.
oneDNN 2.1.2 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
GCC 10.3: 659.27 (SE +/- 0.61, N = 3; MIN: 642.67; -fopenmp)
GCC 9.3: 658.66 (SE +/- 0.64, N = 3; MIN: 639.86; -fopenmp)
Clang 12.0: 593.97 (SE +/- 9.50, N = 3; MIN: 570.44; -fopenmp=libomp)
Clang 11.0: 563.20 (SE +/- 0.83, N = 3; MIN: 550.23; -fopenmp=libomp)
AMD AOCC 3.0: 544.10 (SE +/- 0.53, N = 3; MIN: 532.32; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
GCC 9.3: 659.19 (SE +/- 1.25, N = 3; MIN: 642.05; -fopenmp)
GCC 10.3: 658.28 (SE +/- 0.83, N = 3; MIN: 639.78; -fopenmp)
Clang 12.0: 590.18 (SE +/- 1.89, N = 3; MIN: 575.41; -fopenmp=libomp)
Clang 11.0: 562.97 (SE +/- 0.25, N = 3; MIN: 551.49; -fopenmp=libomp)
AMD AOCC 3.0: 544.31 (SE +/- 0.90, N = 3; MIN: 531.9; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
GCC 10.3: 658.04 (SE +/- 1.86, N = 3; MIN: 635.78; -fopenmp)
GCC 9.3: 657.88 (SE +/- 0.52, N = 3; MIN: 638.35; -fopenmp)
Clang 12.0: 597.48 (SE +/- 3.02, N = 3; MIN: 580.8; -fopenmp=libomp)
Clang 11.0: 563.25 (SE +/- 0.10, N = 3; MIN: 551.31; -fopenmp=libomp)
AMD AOCC 3.0: 544.60 (SE +/- 0.62, N = 3; MIN: 532.91; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
SciMark This test runs the ANSI C version of SciMark 2.0, which is a benchmark for scientific and numerical computing developed by programmers at the National Institute of Standards and Technology. This test is made up of Fast Fourier Transform, Jacobi Successive Over-relaxation, Monte Carlo, Sparse Matrix Multiply, and dense LU matrix factorization benchmarks. Learn more via the OpenBenchmarking.org test page.
SciMark 2.0 - Computational Test: Jacobi Successive Over-Relaxation (Mflops, More Is Better)
Clang 11.0: 1785.42 (SE +/- 0.12, N = 3)
AMD AOCC 3.0: 1785.45 (SE +/- 0.02, N = 3)
Clang 12.0: 1785.50 (SE +/- 0.08, N = 3)
GCC 10.3: 2038.15 (SE +/- 0.07, N = 3)
GCC 11.0.1: 2148.84 (SE +/- 0.10, N = 3)
GCC 9.3: 2149.15 (SE +/- 0.12, N = 3)
1. (CC) gcc options: -O3 -march=native -lm
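Successive over-relaxation moves each interior grid cell toward the average of its four neighbours, scaled by a relaxation factor. The following is a minimal pure-Python sketch of that idea, not SciMark's own C kernel; the grid size, sweep count, relaxation factor `omega=1.25`, and the function names `make_grid` and `sor_sweep` are all illustrative assumptions.

```python
def make_grid(size, boundary_value):
    """Square grid with fixed boundary cells and a zero interior."""
    grid = [[0.0] * size for _ in range(size)]
    for k in range(size):
        grid[0][k] = grid[size - 1][k] = boundary_value
        grid[k][0] = grid[k][size - 1] = boundary_value
    return grid


def sor_sweep(grid, omega):
    """One in-place SOR sweep over the interior cells."""
    rows, cols = len(grid), len(grid[0])
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            neighbour_mean = 0.25 * (
                grid[i - 1][j] + grid[i + 1][j] + grid[i][j - 1] + grid[i][j + 1]
            )
            # Blend the old value with the neighbour average; omega > 1 over-relaxes.
            grid[i][j] = omega * neighbour_mean + (1.0 - omega) * grid[i][j]


grid = make_grid(8, 1.0)
for _ in range(200):  # assumed sweep count, chosen only for this demo
    sor_sweep(grid, omega=1.25)

interior_error = max(
    abs(grid[i][j] - 1.0) for i in range(1, 7) for j in range(1, 7)
)
print(f"largest interior deviation from the boundary value: {interior_error:.2e}")
```

With a constant boundary the interior should settle at the boundary value, which makes the sketch easy to sanity-check. The inner loop is a short chain of loads, multiplies and adds, so a benchmark of this kind is sensitive to how each compiler schedules and vectorises that loop.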
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.
GraphicsMagick 1.3.33 - Operation: Noise-Gaussian (Iterations Per Minute, More Is Better)
Clang 12.0: 457
Clang 11.0: 463
AMD AOCC 3.0: 466
GCC 10.3: 544
GCC 9.3: 547
GCC 11.0.1: 550
Standard errors are listed for only three of the six builds, without build labels: SE +/- 1.00, N = 3; SE +/- 0.33, N = 3; SE +/- 1.00, N = 3
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
ONNX Runtime ONNX Runtime is developed by Microsoft and partners as an open-source, cross-platform, high-performance machine learning inferencing and training accelerator. This test profile runs the ONNX Runtime with various models available from the ONNX Zoo. Learn more via the OpenBenchmarking.org test page.
ONNX Runtime 1.6 - Model: shufflenet-v2-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
GCC 9.3: 9419 (SE +/- 138.76, N = 3; -fopenmp)
Clang 11.0: 9797 (SE +/- 102.76, N = 8; -fopenmp=libomp)
Clang 12.0: 9904 (SE +/- 88.25, N = 12; -fopenmp=libomp)
GCC 10.3: 10197 (SE +/- 7.52, N = 3; -fopenmp)
AMD AOCC 3.0: 11325 (SE +/- 171.77, N = 3; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -ldl -lrt
Botan Botan is a BSD-licensed cross-platform open-source C++ crypto library "cryptography toolkit" that supports most publicly known cryptographic algorithms. The project's stated goal is to be "the best option for cryptography in C++ by offering the tools necessary to implement a range of practical systems, such as TLS protocol, X.509 certificates, modern AEAD ciphers, PKCS#11 and TPM hardware support, password hashing, and post quantum crypto schemes." Learn more via the OpenBenchmarking.org test page.
Botan 2.17.3 - Test: Blowfish - Decrypt (MiB/s, More Is Better)
Clang 11.0: 351.08 (SE +/- 2.03, N = 3)
Clang 12.0: 351.28 (SE +/- 0.04, N = 3)
AMD AOCC 3.0: 355.06 (SE +/- 1.17, N = 3)
GCC 9.3: 412.07 (SE +/- 0.12, N = 3)
GCC 10.3: 420.85 (SE +/- 0.95, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Etcpak Etcpak is the self-proclaimed "fastest ETC compressor on the planet", focused on providing open-source, very fast ETC and S3 texture compression support. Learn more via the OpenBenchmarking.org test page.
Etcpak 0.7 - Configuration: ETC2 (Mpx/s, More Is Better)
Clang 11.0: 168.82 (SE +/- 0.02, N = 3)
GCC 10.3: 173.23 (SE +/- 0.04, N = 3)
GCC 9.3: 174.81 (SE +/- 0.09, N = 3)
AMD AOCC 3.0: 178.85 (SE +/- 0.03, N = 3)
Clang 12.0: 202.09 (SE +/- 0.02, N = 3)
Clang 12.0 LTO: 202.10 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
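The reported standard errors give a rough way to judge whether two builds in the result above really differ. The sketch below divides the gap between two averages by their combined standard error; with N = 3 per build this is only a coarse screen, not a formal significance test, and the function name `gap_in_standard_errors` is hypothetical.

```python
import math


def gap_in_standard_errors(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means divided by their combined standard error."""
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


# Values (Mpx/s) and standard errors copied from the Etcpak ETC2 result above.
lto_gap = gap_in_standard_errors(202.09, 0.02, 202.10, 0.04)
version_gap = gap_in_standard_errors(168.82, 0.02, 202.09, 0.02)

print(f"Clang 12.0 vs Clang 12.0 LTO: {lto_gap:.2f} combined SEs apart")
print(f"Clang 11.0 vs Clang 12.0: {version_gap:.0f} combined SEs apart")
```

A gap well below one combined standard error, as for Clang 12.0 with and without LTO, is indistinguishable from run-to-run noise. The Clang 11.0 to Clang 12.0 gap is far larger than the noise.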
Botan Botan is a BSD-licensed cross-platform open-source C++ crypto library "cryptography toolkit" that supports most publicly known cryptographic algorithms. The project's stated goal is to be "the best option for cryptography in C++ by offering the tools necessary to implement a range of practical systems, such as TLS protocol, X.509 certificates, modern AEAD ciphers, PKCS#11 and TPM hardware support, password hashing, and post quantum crypto schemes." Learn more via the OpenBenchmarking.org test page.
Botan 2.17.3 - Test: AES-256 (MiB/s, More Is Better)
Clang 12.0: 4659.34 (SE +/- 2.14, N = 3)
AMD AOCC 3.0: 4891.07 (SE +/- 0.05, N = 3)
Clang 11.0: 4901.13 (SE +/- 2.16, N = 3)
GCC 9.3: 5484.68 (SE +/- 42.69, N = 3)
GCC 10.3: 5525.71 (SE +/- 4.47, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
ASTC Encoder ASTC Encoder (astcenc) is for the Adaptive Scalable Texture Compression (ASTC) format commonly used with OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile does a coding test of both compression/decompression. Learn more via the OpenBenchmarking.org test page.
ASTC Encoder 2.4 - Preset: Thorough (Seconds, Fewer Is Better)
GCC 9.3: 7.8537 (SE +/- 0.0029, N = 3)
GCC 10.3: 7.8370 (SE +/- 0.0011, N = 3)
GCC 11.0.1: 7.6989 (SE +/- 0.0034, N = 3)
Clang 11.0: 6.7674 (SE +/- 0.0026, N = 3)
Clang 12.0: 6.7647 (SE +/- 0.0028, N = 3)
AMD AOCC 3.0: 6.6409 (SE +/- 0.0015, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto -pthread
FLAC Audio Encoding This test times how long it takes to encode a sample WAV file to FLAC format five times. Learn more via the OpenBenchmarking.org test page.
FLAC Audio Encoding 1.3.2 - WAV To FLAC (Seconds, Fewer Is Better)
AMD AOCC 3.0: 9.280 (SE +/- 0.006, N = 5)
GCC 11.0.1: 8.709 (SE +/- 0.006, N = 5)
GCC 10.3: 8.567 (SE +/- 0.008, N = 5)
GCC 9.3: 8.534 (SE +/- 0.011, N = 5)
Clang 11.0: 7.979 (SE +/- 0.006, N = 5)
Clang 12.0: 7.854 (SE +/- 0.007, N = 5)
The extra flag -fvisibility=hidden is listed for three of the six builds, without build labels.
1. (CXX) g++ options: -O3 -march=native -logg -lm
Botan Botan is a BSD-licensed cross-platform open-source C++ crypto library "cryptography toolkit" that supports most publicly known cryptographic algorithms. The project's stated goal is to be "the best option for cryptography in C++ by offering the tools necessary to implement a range of practical systems, such as TLS protocol, X.509 certificates, modern AEAD ciphers, PKCS#11 and TPM hardware support, password hashing, and post quantum crypto schemes." Learn more via the OpenBenchmarking.org test page.
Botan 2.17.3 - Test: AES-256 - Decrypt (MiB/s, More Is Better)
Clang 12.0: 4682.46 (SE +/- 4.78, N = 3)
AMD AOCC 3.0: 4887.57 (SE +/- 3.70, N = 3)
Clang 11.0: 4895.56 (SE +/- 1.35, N = 3)
GCC 9.3: 5391.99 (SE +/- 11.31, N = 3)
GCC 10.3: 5529.40 (SE +/- 5.42, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
LAME MP3 Encoding LAME is an MP3 encoder licensed under the LGPL. This test measures the time required to encode a WAV file to MP3 format. Learn more via the OpenBenchmarking.org test page.
LAME MP3 Encoding 3.100 - WAV To MP3 (Seconds, Fewer Is Better)
Clang 12.0: 8.256 (SE +/- 0.003, N = 3)
Clang 11.0: 8.250 (SE +/- 0.021, N = 3)
AMD AOCC 3.0: 8.142 (SE +/- 0.008, N = 3)
GCC 11.0.1: 7.473 (SE +/- 0.005, N = 3)
GCC 10.3: 7.231 (SE +/- 0.006, N = 3)
GCC 9.3: 7.011 (SE +/- 0.019, N = 3)
The extra flags -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr are listed for three of the six builds, without build labels.
1. (CC) gcc options: -O3 -pipe -march=native -lncurses -lm
TSCP This is a performance test of TSCP, Tom Kerrigan's Simple Chess Program, which has a built-in performance benchmark. Learn more via the OpenBenchmarking.org test page.
TSCP 1.81 - AI Chess Performance (Nodes Per Second, More Is Better)
GCC 9.3: 1446372 (SE +/- 760.80, N = 5)
GCC 10.3: 1467179 (SE +/- 956.77, N = 5)
GCC 11.0.1: 1494250 (SE +/- 1626.80, N = 5)
Clang 12.0: 1570966 (SE +/- 1798.40, N = 5)
Clang 11.0: 1638265 (SE +/- 2852.59, N = 5)
AMD AOCC 3.0: 1697846 (SE +/- 2098.00, N = 5)
1. (CC) gcc options: -O3 -march=native
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.
GraphicsMagick 1.3.33 - Operation: Enhanced (Iterations Per Minute, More Is Better)
GCC 10.3: 1039
AMD AOCC 3.0: 1057
Clang 11.0: 1068
Clang 12.0: 1076
GCC 11.0.1: 1082
GCC 9.3: 1217
Standard errors are listed for only five of the six builds, without build labels: SE +/- 0.88, N = 3; SE +/- 1.53, N = 3; SE +/- 1.86, N = 3; SE +/- 0.33, N = 3; SE +/- 1.53, N = 3
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
Ngspice Ngspice is an open-source SPICE circuit simulator. Ngspice was originally based on the Berkeley SPICE electronic circuit simulator. Ngspice supports basic threading using OpenMP. This test profile is making use of the ISCAS 85 benchmark circuits. Learn more via the OpenBenchmarking.org test page.
Ngspice 34 - Circuit: C2670 (Seconds, Fewer Is Better)
Clang 12.0: 118.87 (SE +/- 0.53, N = 3)
AMD AOCC 3.0: 103.93 (SE +/- 0.22, N = 3)
Clang 11.0: 103.83 (SE +/- 0.06, N = 3)
GCC 10.3: 103.60 (SE +/- 0.48, N = 3)
GCC 11.0.1: 103.01 (SE +/- 1.53, N = 3)
GCC 9.3: 101.54 (SE +/- 1.32, N = 3)
The extra flag -lstdc++ is listed for four of the six builds, without build labels.
1. (CC) gcc options: -O3 -march=native -fopenmp -lm -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lSM -lICE
simdjson This is a benchmark of SIMDJSON, a high performance JSON parser. SIMDJSON aims to be the fastest JSON parser and is used by projects like Microsoft FishStore, Yandex ClickHouse, Shopify, and others. Learn more via the OpenBenchmarking.org test page.
simdjson 0.8.2 - Throughput Test: PartialTweets (GB/s, More Is Better)
GCC 9.3: 3.93 (SE +/- 0.00, N = 3)
GCC 10.3: 4.02 (SE +/- 0.00, N = 3)
AMD AOCC 3.0: 4.33 (SE +/- 0.01, N = 3)
Clang 11.0: 4.41 (SE +/- 0.01, N = 3)
Clang 12.0: 4.60 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -march=native -pthread
QuantLib QuantLib is an open-source library/framework around quantitative finance for modeling, trading and risk management scenarios. QuantLib is written in C++ with Boost, and its built-in benchmark reports the QuantLib Benchmark Index score. Learn more via the OpenBenchmarking.org test page.
QuantLib 1.21 (MFLOPS, More Is Better)
GCC 9.3: 2338.9 (SE +/- 4.53, N = 3)
GCC 10.3: 2392.6 (SE +/- 2.06, N = 3)
Clang 11.0: 2640.2 (SE +/- 1.01, N = 3)
Clang 12.0: 2653.8 (SE +/- 1.92, N = 3)
Clang 12.0 LTO: 2657.8 (SE +/- 1.62, N = 3)
AMD AOCC 3.0: 2725.7 (SE +/- 2.28, N = 3)
1. (CXX) g++ options: -O3 -march=native -rdynamic
simdjson This is a benchmark of SIMDJSON, a high performance JSON parser. SIMDJSON aims to be the fastest JSON parser and is used by projects like Microsoft FishStore, Yandex ClickHouse, Shopify, and others. Learn more via the OpenBenchmarking.org test page.
simdjson 0.8.2 - Throughput Test: DistinctUserID (GB/s, More Is Better)
GCC 9.3: 3.98 (SE +/- 0.00, N = 3)
GCC 10.3: 4.13 (SE +/- 0.01, N = 3)
Clang 11.0: 4.41 (SE +/- 0.00, N = 3)
AMD AOCC 3.0: 4.47 (SE +/- 0.00, N = 3)
Clang 12.0: 4.62 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -march=native -pthread
simdjson 0.8.2 - Throughput Test: LargeRandom (GB/s, More Is Better)
Clang 11.0: 0.81 (SE +/- 0.00, N = 3)
AMD AOCC 3.0: 0.82 (SE +/- 0.00, N = 3)
Clang 12.0: 0.84 (SE +/- 0.00, N = 3)
GCC 10.3: 0.90 (SE +/- 0.00, N = 3)
GCC 9.3: 0.94 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -march=native -pthread
ONNX Runtime ONNX Runtime is developed by Microsoft and partners as an open-source, cross-platform, high-performance machine learning inferencing and training accelerator. This test profile runs the ONNX Runtime with various models available from the ONNX Zoo. Learn more via the OpenBenchmarking.org test page.
ONNX Runtime 1.6 - Model: yolov4 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
Clang 12.0: 333 (SE +/- 4.15, N = 4; -fopenmp=libomp)
Clang 11.0: 346 (SE +/- 1.42, N = 3; -fopenmp=libomp)
GCC 9.3: 351 (SE +/- 0.50, N = 3; -fopenmp)
GCC 10.3: 351 (SE +/- 0.17, N = 3; -fopenmp)
AMD AOCC 3.0: 386 (SE +/- 2.50, N = 3; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -ldl -lrt
libavif avifenc This is a test of the AOMedia libavif library testing the encoding of a JPEG image to AV1 Image Format (AVIF). Learn more via the OpenBenchmarking.org test page.
libavif avifenc 0.9.0 - Encoder Speed: 6, Lossless (Seconds, Fewer Is Better)
GCC 9.3: 29.08 (SE +/- 0.06, N = 3)
GCC 11.0.1: 27.06 (SE +/- 0.05, N = 3)
GCC 10.3: 26.91 (SE +/- 0.04, N = 3)
Clang 11.0: 26.03 (SE +/- 0.22, N = 3)
AMD AOCC 3.0: 25.78 (SE +/- 0.06, N = 3)
Clang 12.0: 25.22 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -O3 -fPIC -lm
FFTW FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.
FFTW 3.3.6 - Build: Float + SSE - Size: 1D FFT Size 4096 (Mflops, More Is Better)
Clang 12.0: 45428 (SE +/- 671.66, N = 15)
AMD AOCC 3.0: 45521 (SE +/- 542.47, N = 15)
Clang 11.0: 46676 (SE +/- 413.24, N = 15)
GCC 11.0.1: 51391 (SE +/- 227.13, N = 3)
GCC 9.3: 52099 (SE +/- 228.68, N = 3)
GCC 10.3: 52130 (SE +/- 844.19, N = 3)
1. (CC) gcc options: -pthread -O3 -march=native -lm
oneDNN This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.
oneDNN 2.1.2 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
GCC 10.3: 1.19747 (SE +/- 0.00438, N = 3; MIN: 0.98; -fopenmp)
GCC 9.3: 1.17434 (SE +/- 0.00597, N = 3; MIN: 0.96; -fopenmp)
Clang 11.0: 1.07577 (SE +/- 0.00395, N = 3; MIN: 0.86; -fopenmp=libomp)
Clang 12.0: 1.07507 (SE +/- 0.00286, N = 3; MIN: 0.87; -fopenmp=libomp)
AMD AOCC 3.0: 1.04484 (SE +/- 0.00668, N = 3; MIN: 0.83; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
FFTW FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.
FFTW 3.3.6 - Build: Stock - Size: 1D FFT Size 32 (Mflops, More Is Better)
GCC 10.3: 12576 (SE +/- 16.05, N = 3)
GCC 11.0.1: 12765 (SE +/- 45.16, N = 3)
AMD AOCC 3.0: 13192 (SE +/- 41.35, N = 3)
Clang 11.0: 13324 (SE +/- 20.33, N = 3)
Clang 12.0: 13333 (SE +/- 24.25, N = 3)
GCC 9.3: 14399 (SE +/- 67.28, N = 3)
1. (CC) gcc options: -pthread -O3 -march=native -lm
Botan Botan is a BSD-licensed cross-platform open-source C++ crypto library "cryptography toolkit" that supports most publicly known cryptographic algorithms. The project's stated goal is to be "the best option for cryptography in C++ by offering the tools necessary to implement a range of practical systems, such as TLS protocol, X.509 certificates, modern AEAD ciphers, PKCS#11 and TPM hardware support, password hashing, and post quantum crypto schemes." Learn more via the OpenBenchmarking.org test page.
Botan 2.17.3 - Test: Twofish (MiB/s, More Is Better)
Clang 11.0: 299.21 (SE +/- 0.09, N = 3)
AMD AOCC 3.0: 305.00 (SE +/- 0.03, N = 3)
Clang 12.0: 315.41 (SE +/- 0.13, N = 3)
GCC 9.3: 337.36 (SE +/- 0.04, N = 3)
GCC 10.3: 341.85 (SE +/- 0.52, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
FFTW FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.
FFTW 3.3.6 - Build: Float + SSE - Size: 1D FFT Size 32 (Mflops, More Is Better)
Clang 11.0: 14590 (SE +/- 129.55, N = 3)
Clang 12.0: 15649 (SE +/- 48.79, N = 3)
AMD AOCC 3.0: 16146 (SE +/- 5.33, N = 3)
GCC 9.3: 16590 (SE +/- 170.19, N = 8)
GCC 11.0.1: 16590 (SE +/- 168.99, N = 3)
GCC 10.3: 16650 (SE +/- 108.41, N = 3)
1. (CC) gcc options: -pthread -O3 -march=native -lm
oneDNN This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.
oneDNN 2.1.2 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
GCC 10.3: 1.17894 (SE +/- 0.00296, N = 3; MIN: 1.12; -fopenmp)
GCC 9.3: 1.17486 (SE +/- 0.00349, N = 3; MIN: 1.12; -fopenmp)
Clang 11.0: 1.08011 (SE +/- 0.00127, N = 3; MIN: 1.03; -fopenmp=libomp)
Clang 12.0: 1.07701 (SE +/- 0.00199, N = 3; MIN: 1.04; -fopenmp=libomp)
AMD AOCC 3.0: 1.03899 (SE +/- 0.00160, N = 3; MIN: 0.99; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
WebP Image Encode This is a test of Google's libwebp with the cwebp image encode utility and using a sample 6000x4000 pixel JPEG image as the input. Learn more via the OpenBenchmarking.org test page.
WebP Image Encode 1.1 - Encode Settings: Quality 100, Highest Compression (Encode Time - Seconds, Fewer Is Better)
GCC 10.3: 7.078 (SE +/- 0.006, N = 3)
GCC 9.3: 7.053 (SE +/- 0.009, N = 3)
GCC 11.0.1: 7.003 (SE +/- 0.021, N = 3)
AMD AOCC 3.0: 6.578 (SE +/- 0.009, N = 3)
Clang 12.0: 6.309 (SE +/- 0.004, N = 3)
Clang 11.0: 6.243 (SE +/- 0.018, N = 3)
The extra flag -ltiff is listed for one of the six builds, without a build label.
1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -pthread -lm -lpng16 -ljpeg
ONNX Runtime ONNX Runtime is developed by Microsoft and partners as an open-source, cross-platform, high-performance machine learning inferencing and training accelerator. This test profile runs the ONNX Runtime with various models available from the ONNX Zoo. Learn more via the OpenBenchmarking.org test page.
ONNX Runtime 1.6 - Model: fcn-resnet101-11 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
Clang 11.0: 108 (SE +/- 0.29, N = 3; -fopenmp=libomp)
Clang 12.0: 112 (SE +/- 0.50, N = 3; -fopenmp=libomp)
GCC 10.3: 115 (SE +/- 0.17, N = 3; -fopenmp)
GCC 9.3: 116 (SE +/- 0.44, N = 3; -fopenmp)
AMD AOCC 3.0: 122 (SE +/- 0.50, N = 3; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -ldl -lrt
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.
GraphicsMagick 1.3.33 - Operation: Swirl (Iterations Per Minute, More Is Better)
Clang 11.0: 1915 (SE +/- 12.41, N = 3)
AMD AOCC 3.0: 1929 (SE +/- 4.63, N = 3)
Clang 12.0: 1993 (SE +/- 6.57, N = 3)
GCC 10.3: 2112 (SE +/- 1.20, N = 3)
GCC 9.3: 2129 (SE +/- 1.20, N = 3)
GCC 11.0.1: 2161 (SE +/- 4.81, N = 3)
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
Liquid-DSP LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.
Liquid-DSP 2021.01.31 - Threads: 1 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
Clang 12.0: 55663000 (SE +/- 790005.27, N = 3)
Clang 11.0: 56307000 (SE +/- 40360.87, N = 3)
AMD AOCC 3.0: 57411333 (SE +/- 47026.00, N = 3)
GCC 11.0.1: 60886333 (SE +/- 318169.94, N = 3)
GCC 9.3: 61404000 (SE +/- 870702.21, N = 3)
GCC 10.3: 62467333 (SE +/- 6887.99, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid
Botan Botan is a BSD-licensed cross-platform open-source C++ crypto library "cryptography toolkit" that supports most publicly known cryptographic algorithms. The project's stated goal is to be "the best option for cryptography in C++ by offering the tools necessary to implement a range of practical systems, such as TLS protocol, X.509 certificates, modern AEAD ciphers, PKCS#11 and TPM hardware support, password hashing, and post quantum crypto schemes." Learn more via the OpenBenchmarking.org test page.
Botan 2.17.3 - Test: Twofish - Decrypt (MiB/s, More Is Better)
Clang 11.0: 302.41 (SE +/- 0.15, N = 3)
AMD AOCC 3.0: 303.81 (SE +/- 0.06, N = 3)
Clang 12.0: 321.19 (SE +/- 0.16, N = 3)
GCC 10.3: 325.39 (SE +/- 0.44, N = 3)
GCC 9.3: 339.07 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
oneDNN This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.
oneDNN 2.1.2 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
GCC 9.3: 3.67278 (SE +/- 0.03246, N = 3; MIN: 3.39; -fopenmp)
GCC 10.3: 3.61144 (SE +/- 0.02637, N = 3; MIN: 3.37; -fopenmp)
Clang 11.0: 3.52787 (SE +/- 0.04735, N = 3; MIN: 3.29; -fopenmp=libomp)
AMD AOCC 3.0: 3.41583 (SE +/- 0.02018, N = 3; MIN: 3.24; -fopenmp=libomp)
Clang 12.0: 3.28507 (SE +/- 0.01639, N = 3; MIN: 3.15; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
FFTW FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.
FFTW 3.3.6 - Build: Stock - Size: 1D FFT Size 4096 (Mflops, More Is Better)
Clang 11.0: 9438.6 (SE +/- 15.16, N = 3)
AMD AOCC 3.0: 9603.2 (SE +/- 43.38, N = 3)
Clang 12.0: 9862.0 (SE +/- 101.36, N = 3)
GCC 10.3: 10179.0 (SE +/- 57.26, N = 3)
GCC 11.0.1: 10205.0 (SE +/- 48.56, N = 3)
GCC 9.3: 10548.0 (SE +/- 20.21, N = 3)
1. (CC) gcc options: -pthread -O3 -march=native -lm
FFTW 3.3.6 - Build: Stock - Size: 2D FFT Size 1024 (Mflops, More Is Better)
Clang 11.0: 8809.6 (SE +/- 45.95, N = 3)
AMD AOCC 3.0: 8902.1 (SE +/- 14.28, N = 3)
Clang 12.0: 9088.3 (SE +/- 48.25, N = 3)
GCC 11.0.1: 9238.4 (SE +/- 25.87, N = 3)
GCC 10.3: 9247.3 (SE +/- 41.68, N = 3)
GCC 9.3: 9798.6 (SE +/- 19.46, N = 3)
1. (CC) gcc options: -pthread -O3 -march=native -lm
SecureMark SecureMark is an objective, standardized benchmarking framework for measuring the efficiency of cryptographic processing solutions developed by EEMBC. SecureMark-TLS is benchmarking Transport Layer Security performance with a focus on IoT/edge computing. Learn more via the OpenBenchmarking.org test page.
SecureMark 1.0.4 - Benchmark: SecureMark-TLS (marks, More Is Better)
GCC 9.3: 238935 (SE +/- 537.86, N = 3)
GCC 10.3: 242700 (SE +/- 1024.96, N = 3)
GCC 11.0.1: 243861 (SE +/- 675.55, N = 3)
Clang 11.0: 260119 (SE +/- 407.86, N = 3)
AMD AOCC 3.0: 264637 (SE +/- 251.99, N = 3)
Clang 12.0: 265204 (SE +/- 1778.47, N = 3)
1. (CC) gcc options: -pedantic -O3
AOM AV1 This is a test of the AOMedia AV1 encoder (libaom) developed by AOMedia and Google. Learn more via the OpenBenchmarking.org test page.
AOM AV1 3.0 - Encoder Mode: Speed 9 Realtime - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
Clang 11.0: 100.55 (SE +/- 0.53, N = 3)
Clang 12.0: 103.17 (SE +/- 0.31, N = 3)
GCC 9.3: 106.55 (SE +/- 1.10, N = 8)
GCC 10.3: 107.46 (SE +/- 1.76, N = 3)
GCC 11.0.1: 111.27 (SE +/- 1.15, N = 8)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
WebP2 Image Encode This is a test of Google's libwebp2 library with the WebP2 image encode utility and using a sample 6000x4000 pixel JPEG image as the input, similar to the WebP/libwebp test profile. WebP2 is currently experimental and under heavy development as ultimately the successor to WebP. WebP2 supports 10-bit HDR, more efficient lossy compression, improved lossless compression, animation support, and full multi-threading support compared to WebP. Learn more via the OpenBenchmarking.org test page.
WebP2 Image Encode 20210126 - Encode Settings: Quality 100, Compression Effort 5 (Seconds, Fewer Is Better)
AMD AOCC 3.0: 7.403 (SE +/- 0.028, N = 3)
Clang 11.0: 7.366 (SE +/- 0.022, N = 3)
GCC 10.3: 6.934 (SE +/- 0.017, N = 3)
GCC 9.3: 6.753 (SE +/- 0.017, N = 3)
Clang 12.0: 6.690 (SE +/- 0.006, N = 3)
1. (CXX) g++ options: -O3 -march=native -fno-rtti -rdynamic -lpthread -ljpeg -lgif -lwebp -lwebpdemux
FFTW FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.
FFTW 3.3.6 - Build: Stock - Size: 1D FFT Size 1024 (Mflops, More Is Better)
Clang 11.0: 10564 (SE +/- 35.53, N = 3)
AMD AOCC 3.0: 10669 (SE +/- 34.64, N = 3)
Clang 12.0: 10805 (SE +/- 27.10, N = 3)
GCC 11.0.1: 11044 (SE +/- 189.35, N = 3)
GCC 10.3: 11319 (SE +/- 32.26, N = 3)
GCC 9.3: 11689 (SE +/- 44.20, N = 3)
1. (CC) gcc options: -pthread -O3 -march=native -lm
PostgreSQL pgbench This is a benchmark of PostgreSQL using pgbench for facilitating the database benchmarks. Learn more via the OpenBenchmarking.org test page.
PostgreSQL pgbench 13.0 - Scaling Factor: 100 - Clients: 100 - Mode: Read Write - Average Latency (ms, Fewer Is Better)
GCC 11.0.1: 1.777 (SE +/- 0.029, N = 3)
GCC 10.3: 1.701 (SE +/- 0.013, N = 3)
GCC 9.3: 1.688 (SE +/- 0.028, N = 3)
Clang 11.0: 1.626 (SE +/- 0.011, N = 3)
Clang 12.0: 1.607 (SE +/- 0.004, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
PostgreSQL pgbench 13.0 - Scaling Factor: 100 - Clients: 100 - Mode: Read Write (TPS, More Is Better)
GCC 11.0.1: 56369 (SE +/- 899.82, N = 3)
GCC 10.3: 58894 (SE +/- 469.64, N = 3)
GCC 9.3: 59364 (SE +/- 994.44, N = 3)
Clang 11.0: 61616 (SE +/- 400.92, N = 3)
Clang 12.0: 62319 (SE +/- 162.92, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
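The average latency and TPS figures above are two views of the same runs. Assuming each of the 100 clients keeps exactly one transaction in flight at a time (an assumption about how the benchmark was driven, not something the results state), Little's law gives latency ≈ clients / throughput. The sketch below checks that relationship against the reported numbers; the names `tps`, `reported_latency_ms` and `estimated_latency_ms` are illustrative.

```python
CLIENTS = 100  # from the test configuration "Clients: 100"

# Throughput (transactions per second) and average latency (ms) as reported above.
tps = {
    "GCC 11.0.1": 56369,
    "GCC 10.3": 58894,
    "GCC 9.3": 59364,
    "Clang 11.0": 61616,
    "Clang 12.0": 62319,
}
reported_latency_ms = {
    "GCC 11.0.1": 1.777,
    "GCC 10.3": 1.701,
    "GCC 9.3": 1.688,
    "Clang 11.0": 1.626,
    "Clang 12.0": 1.607,
}


def estimated_latency_ms(clients, transactions_per_second):
    """Little's law: time per transaction = concurrency / throughput."""
    return clients / transactions_per_second * 1000.0


estimates = {name: estimated_latency_ms(CLIENTS, value) for name, value in tps.items()}
for name, estimate in estimates.items():
    print(f"{name}: estimated {estimate:.3f} ms, reported {reported_latency_ms[name]:.3f} ms")
```

The estimates land within a few thousandths of a millisecond of the reported latencies, so the two charts rank the compilers identically and carry essentially the same information.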
FFTW FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: FFTW 3.3.6 (Mflops, More Is Better)
Build: Stock - Size: 1D FFT Size 2048
  Clang 11.0: 10004.2 (SE +/- 28.76, N = 3)
  AMD AOCC 3.0: 10227.0 (SE +/- 39.89, N = 3)
  Clang 12.0: 10467.0 (SE +/- 7.75, N = 3)
  GCC 11.0.1: 10675.0 (SE +/- 55.19, N = 3)
  GCC 10.3: 10711.0 (SE +/- 14.75, N = 3)
  GCC 9.3: 11053.0 (SE +/- 37.69, N = 3)
1. (CC) gcc options: -pthread -O3 -march=native -lm
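For readers unfamiliar with the operation being timed, the following is a minimal reference sketch of the discrete Fourier transform, written directly from its definition. It is not FFTW's algorithm: FFTW uses fast O(N log N) methods, while this illustrative `naive_dft` helper is O(N^2) and is useful only for checking small inputs.

```python
import cmath


def naive_dft(signal):
    """Direct O(N^2) evaluation of X[k] = sum_t x[t] * exp(-2*pi*i*k*t/N)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


# A constant signal has all of its energy in the zero-frequency bin.
spectrum = naive_dft([1.0, 1.0, 1.0, 1.0])
print([round(abs(value), 6) for value in spectrum])
```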
libavif avifenc This is a test of the AOMedia libavif library testing the encoding of a JPEG image to AV1 Image Format (AVIF). Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: libavif avifenc 0.9.0 (Seconds, Fewer Is Better)
Encoder Speed: 2
  GCC 9.3: 27.78 (SE +/- 0.04, N = 3)
  GCC 10.3: 27.39 (SE +/- 0.07, N = 3)
  GCC 11.0.1: 27.10 (SE +/- 0.03, N = 3)
  AMD AOCC 3.0: 25.60 (SE +/- 0.01, N = 3)
  Clang 11.0: 25.47 (SE +/- 0.06, N = 3)
  Clang 12.0: 25.18 (SE +/- 0.06, N = 3)
1. (CXX) g++ options: -O3 -fPIC -lm
Liquid-DSP LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: Liquid-DSP 2021.01.31 (samples/s, More Is Better)
Threads: 32 - Buffer Length: 256 - Filter Length: 57
  Clang 12.0: 1564833333 (SE +/- 2255610.29, N = 3)
  Clang 11.0: 1578400000 (SE +/- 1331665.62, N = 3)
  AMD AOCC 3.0: 1609633333 (SE +/- 2130988.29, N = 3)
  GCC 11.0.1: 1679800000 (SE +/- 17297784.06, N = 3)
  GCC 10.3: 1718000000 (SE +/- 15763988.50, N = 3)
  GCC 9.3: 1721900000 (SE +/- 4864497.23, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid
FFTW FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: FFTW 3.3.6 (Mflops, More Is Better)
Build: Float + SSE - Size: 2D FFT Size 4096
  Clang 12.0: 22797 (SE +/- 348.10, N = 9)
  Clang 11.0: 22913 (SE +/- 220.77, N = 3)
  AMD AOCC 3.0: 23111 (SE +/- 349.17, N = 9)
  GCC 10.3: 23774 (SE +/- 538.47, N = 9)
  GCC 11.0.1: 24888 (SE +/- 160.97, N = 3)
  GCC 9.3: 25068 (SE +/- 106.49, N = 3)
1. (CC) gcc options: -pthread -O3 -march=native -lm
SciMark This test runs the ANSI C version of SciMark 2.0, a benchmark for scientific and numerical computing developed by programmers at the National Institute of Standards and Technology. The test is made up of Fast Fourier Transform, Jacobi Successive Over-relaxation, Monte Carlo, Sparse Matrix Multiply, and dense LU matrix factorization benchmarks. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: SciMark 2.0 (Mflops, More Is Better)
Computational Test: Fast Fourier Transform
  Clang 12.0: 363.85 (SE +/- 0.46, N = 3)
  GCC 9.3: 384.03 (SE +/- 0.66, N = 3)
  GCC 11.0.1: 388.88 (SE +/- 1.03, N = 3)
  GCC 10.3: 388.98 (SE +/- 0.25, N = 3)
  AMD AOCC 3.0: 398.96 (SE +/- 0.70, N = 3)
  Clang 11.0: 399.16 (SE +/- 0.67, N = 3)
1. (CC) gcc options: -O3 -march=native -lm
AOM AV1 This is a test of the AOMedia AV1 encoder (libaom) developed by AOMedia and Google. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: AOM AV1 3.0 (Frames Per Second, More Is Better)
Encoder Mode: Speed 8 Realtime - Input: Bosphorus 1080p
  Clang 11.0: 86.09 (SE +/- 0.51, N = 3)
  Clang 12.0: 88.78 (SE +/- 1.07, N = 3)
  GCC 9.3: 91.97 (SE +/- 0.89, N = 3)
  GCC 10.3: 93.05 (SE +/- 0.65, N = 3)
  GCC 11.0.1: 94.40 (SE +/- 0.47, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
libavif avifenc This is a test of the AOMedia libavif library testing the encoding of a JPEG image to AV1 Image Format (AVIF). Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: libavif avifenc 0.9.0 (Seconds, Fewer Is Better)
Encoder Speed: 6
  GCC 10.3: 10.417 (SE +/- 0.032, N = 3)
  GCC 9.3: 10.399 (SE +/- 0.031, N = 3)
  GCC 11.0.1: 10.291 (SE +/- 0.052, N = 3)
  AMD AOCC 3.0: 9.725 (SE +/- 0.016, N = 3)
  Clang 11.0: 9.536 (SE +/- 0.022, N = 3)
  Clang 12.0: 9.510 (SE +/- 0.014, N = 3)
1. (CXX) g++ options: -O3 -fPIC -lm
oneDNN This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: oneDNN 2.1.2 (ms, Fewer Is Better)
Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU
  GCC 10.3: 1379.51 (SE +/- 1.75, N = 3; MIN: 1361.6; -fopenmp)
  GCC 9.3: 1358.56 (SE +/- 3.05, N = 3; MIN: 1337.17; -fopenmp)
  Clang 12.0: 1307.49 (SE +/- 3.61, N = 3; MIN: 1293.38; -fopenmp=libomp)
  Clang 11.0: 1277.62 (SE +/- 7.11, N = 3; MIN: 1252.39; -fopenmp=libomp)
  AMD AOCC 3.0: 1259.59 (SE +/- 1.97, N = 3; MIN: 1247.29; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org: oneDNN 2.1.2 (ms, Fewer Is Better)
Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
  GCC 10.3: 1382.41 (SE +/- 3.72, N = 3; MIN: 1360.58; -fopenmp)
  GCC 9.3: 1357.29 (SE +/- 4.44, N = 3; MIN: 1335.63; -fopenmp)
  Clang 12.0: 1302.70 (SE +/- 3.92, N = 3; MIN: 1289.86; -fopenmp=libomp)
  Clang 11.0: 1276.04 (SE +/- 9.46, N = 3; MIN: 1249.65; -fopenmp=libomp)
  AMD AOCC 3.0: 1267.18 (SE +/- 5.94, N = 3; MIN: 1248.35; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
libavif avifenc This is a test of the AOMedia libavif library testing the encoding of a JPEG image to AV1 Image Format (AVIF). Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: libavif avifenc 0.9.0 (Seconds, Fewer Is Better)
Encoder Speed: 0
  GCC 9.3: 52.22 (SE +/- 0.08, N = 3)
  GCC 10.3: 51.45 (SE +/- 0.05, N = 3)
  GCC 11.0.1: 51.03 (SE +/- 0.07, N = 3)
  AMD AOCC 3.0: 48.13 (SE +/- 0.06, N = 3)
  Clang 11.0: 47.89 (SE +/- 0.07, N = 3)
  Clang 12.0: 47.88 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -O3 -fPIC -lm
OpenBenchmarking.org: libavif avifenc 0.9.0 (Seconds, Fewer Is Better)
Encoder Speed: 10
  GCC 9.3: 3.659 (SE +/- 0.016, N = 3)
  GCC 10.3: 3.643 (SE +/- 0.022, N = 3)
  GCC 11.0.1: 3.607 (SE +/- 0.002, N = 3)
  AMD AOCC 3.0: 3.543 (SE +/- 0.004, N = 3)
  Clang 11.0: 3.429 (SE +/- 0.010, N = 3)
  Clang 12.0: 3.361 (SE +/- 0.014, N = 3)
1. (CXX) g++ options: -O3 -fPIC -lm
AOM AV1 This is a test of the AOMedia AV1 encoder (libaom) developed by AOMedia and Google. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: AOM AV1 3.0 (Frames Per Second, More Is Better)
Encoder Mode: Speed 6 Realtime - Input: Bosphorus 1080p
  GCC 9.3: 24.84 (SE +/- 0.13, N = 3)
  GCC 10.3: 26.49 (SE +/- 0.25, N = 3)
  Clang 11.0: 26.61 (SE +/- 0.13, N = 3)
  Clang 12.0: 26.85 (SE +/- 0.27, N = 3)
  GCC 11.0.1: 27.01 (SE +/- 0.28, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
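Several results in this chart sit within a fraction of a frame per second of each other, so the reported standard errors matter when reading the ranking. One rough way to judge whether two means are distinguishable is to divide their difference by the combined standard error. The sketch below applies that heuristic to the Speed 6 Realtime 1080p figures above; with only N = 3 runs per result it is a sanity check rather than a formal significance test, and the function name is illustrative.

```python
import math


def separation(mean_a, se_a, mean_b, se_b):
    """Difference of two means in units of their combined standard error."""
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Values copied from the AOM AV1 Speed 6 Realtime, Bosphorus 1080p result.
top_two = separation(27.01, 0.28, 26.85, 0.27)   # GCC 11.0.1 vs Clang 12.0
extremes = separation(27.01, 0.28, 24.84, 0.13)  # GCC 11.0.1 vs GCC 9.3

print(f"GCC 11.0.1 vs Clang 12.0: {top_two:.2f} combined SEs apart")
print(f"GCC 11.0.1 vs GCC 9.3:    {extremes:.2f} combined SEs apart")
```

A separation well below 2 suggests the two results are effectively tied on this test, while the gap to GCC 9.3 is far larger than the run-to-run noise.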
WebP2 Image Encode This is a test of Google's libwebp2 library with the WebP2 image encode utility, using a sample 6000x4000 pixel JPEG image as the input, similar to the WebP/libwebp test profile. WebP2 is currently experimental and under heavy development as the eventual successor to WebP. Compared to WebP, WebP2 supports 10-bit HDR, more efficient lossy compression, improved lossless compression, animation, and full multi-threading. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: WebP2 Image Encode 20210126 (Seconds, Fewer Is Better)
Encode Settings: Quality 100, Lossless Compression
  GCC 10.3: 406.03 (SE +/- 3.10, N = 3)
  Clang 11.0: 392.85 (SE +/- 0.17, N = 3)
  GCC 9.3: 388.95 (SE +/- 1.92, N = 3)
  AMD AOCC 3.0: 382.99 (SE +/- 0.39, N = 3)
  Clang 12.0: 374.04 (SE +/- 0.49, N = 3)
1. (CXX) g++ options: -O3 -march=native -fno-rtti -rdynamic -lpthread -ljpeg -lgif -lwebp -lwebpdemux
OpenBenchmarking.org: WebP2 Image Encode 20210126 (Seconds, Fewer Is Better)
Encode Settings: Quality 95, Compression Effort 7
  GCC 9.3: 220.94 (SE +/- 1.32, N = 3)
  GCC 10.3: 215.57 (SE +/- 0.46, N = 3)
  Clang 12.0: 207.01 (SE +/- 0.07, N = 3)
  AMD AOCC 3.0: 205.03 (SE +/- 0.17, N = 3)
  Clang 11.0: 203.63 (SE +/- 0.66, N = 3)
1. (CXX) g++ options: -O3 -march=native -fno-rtti -rdynamic -lpthread -ljpeg -lgif -lwebp -lwebpdemux
oneDNN This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: oneDNN 2.1.2 (ms, Fewer Is Better)
Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
  GCC 10.3: 1375.71 (SE +/- 2.65, N = 3; MIN: 1355.68; -fopenmp)
  GCC 9.3: 1356.91 (SE +/- 4.57, N = 3; MIN: 1335.04; -fopenmp)
  Clang 12.0: 1305.10 (SE +/- 1.78, N = 3; MIN: 1294.76; -fopenmp=libomp)
  Clang 11.0: 1271.91 (SE +/- 9.75, N = 3; MIN: 1252.33; -fopenmp=libomp)
  AMD AOCC 3.0: 1268.08 (SE +/- 0.58, N = 3; MIN: 1257.35; -fopenmp=libomp)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
WebP2 Image Encode This is a test of Google's libwebp2 library with the WebP2 image encode utility, using a sample 6000x4000 pixel JPEG image as the input, similar to the WebP/libwebp test profile. WebP2 is currently experimental and under heavy development as the eventual successor to WebP. Compared to WebP, WebP2 supports 10-bit HDR, more efficient lossy compression, improved lossless compression, animation, and full multi-threading. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: WebP2 Image Encode 20210126 (Seconds, Fewer Is Better)
Encode Settings: Quality 75, Compression Effort 7
  GCC 9.3: 118.45 (SE +/- 0.10, N = 3)
  GCC 10.3: 116.66 (SE +/- 0.16, N = 3)
  AMD AOCC 3.0: 109.81 (SE +/- 0.09, N = 3)
  Clang 11.0: 109.64 (SE +/- 0.10, N = 3)
  Clang 12.0: 109.53 (SE +/- 0.10, N = 3)
1. (CXX) g++ options: -O3 -march=native -fno-rtti -rdynamic -lpthread -ljpeg -lgif -lwebp -lwebpdemux
LZ4 Compression This test measures the time needed to compress/decompress a sample file (an Ubuntu ISO) using LZ4 compression. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: LZ4 Compression 1.9.3 (MB/s, More Is Better)
Compression Level: 9 - Compression Speed
  Clang 12.0 LTO: 48.47 (SE +/- 0.74, N = 3)
  Clang 12.0: 48.50 (SE +/- 0.42, N = 3)
  Clang 11.0: 49.01 (SE +/- 0.46, N = 3)
  AMD AOCC 3.0: 50.32 (SE +/- 0.26, N = 3)
  GCC 11.0.1: 51.17 (SE +/- 0.65, N = 3)
  GCC 9.3: 51.97 (SE +/- 0.65, N = 5)
  GCC 10.3: 52.36 (SE +/- 0.72, N = 4)
1. (CC) gcc options: -O3
FFTW FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: FFTW 3.3.6 (Mflops, More Is Better)
Build: Stock - Size: 2D FFT Size 2048
  AMD AOCC 3.0: 7784.8 (SE +/- 19.99, N = 3)
  Clang 12.0: 7789.9 (SE +/- 65.76, N = 3)
  Clang 11.0: 7878.5 (SE +/- 27.38, N = 3)
  GCC 10.3: 8134.5 (SE +/- 56.49, N = 3)
  GCC 11.0.1: 8231.1 (SE +/- 36.00, N = 3)
  GCC 9.3: 8408.5 (SE +/- 50.36, N = 3)
1. (CC) gcc options: -pthread -O3 -march=native -lm
Timed MrBayes Analysis This test performs a bayesian analysis of a set of primate genome sequences in order to estimate their phylogeny. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: Timed MrBayes Analysis 3.2.7 (Seconds, Fewer Is Better)
Primate Phylogeny Analysis
  GCC 10.3: 93.66 (SE +/- 1.29, N = 4)
  Clang 12.0 LTO: 93.63 (SE +/- 1.09, N = 3)
  GCC 11.0.1: 89.43 (SE +/- 0.33, N = 3)
  GCC 9.3: 89.16 (SE +/- 0.16, N = 3)
  Clang 12.0: 89.12 (SE +/- 0.98, N = 3)
  Clang 11.0: 88.62 (SE +/- 0.98, N = 3)
  AMD AOCC 3.0: 86.74 (SE +/- 0.26, N = 3)
Per-result extra flags as listed in the source (not attributed to specific results): -mabm -flto -mabm -mabm
1. (CC) gcc options: -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -msha -maes -mavx -mfma -mavx2 -mrdrnd -mbmi -mbmi2 -madx -O3 -std=c99 -pedantic -march=native -lm
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
Operation: Rotate
  AMD AOCC 3.0: 660
  Clang 11.0: 665
  GCC 10.3: 689
  GCC 11.0.1: 694
  GCC 9.3: 709
  Clang 12.0: 712
Standard errors as listed in the source (four values for six results, not attributed to specific results): SE +/- 1.33, N = 3; SE +/- 5.21, N = 3; SE +/- 6.43, N = 3; SE +/- 2.60, N = 3
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
SVT-HEVC This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-HEVC CPU-based multi-threaded video encoder for the HEVC / H.265 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)
Tuning: 10 - Input: Bosphorus 1080p
  GCC 9.3: 605.50 (SE +/- 3.83, N = 3)
  GCC 11.0.1: 611.73 (SE +/- 5.75, N = 3)
  GCC 10.3: 615.62 (SE +/- 2.42, N = 3)
  AMD AOCC 3.0: 638.10 (SE +/- 3.03, N = 3)
  Clang 12.0: 643.58 (SE +/- 3.01, N = 3)
  Clang 11.0: 652.74 (SE +/- 5.55, N = 3)
1. (CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt
Ngspice Ngspice is an open-source SPICE circuit simulator. Ngspice was originally based on the Berkeley SPICE electronic circuit simulator. Ngspice supports basic threading using OpenMP. This test profile is making use of the ISCAS 85 benchmark circuits. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: Ngspice 34 (Seconds, Fewer Is Better)
Circuit: C7552
  Clang 12.0: 95.96 (SE +/- 1.11, N = 6)
  AMD AOCC 3.0: 91.99 (SE +/- 0.12, N = 3)
  Clang 11.0: 90.53 (SE +/- 1.37, N = 3)
  GCC 10.3: 90.43 (SE +/- 0.12, N = 3)
  GCC 11.0.1: 90.26 (SE +/- 0.43, N = 3)
  GCC 9.3: 89.09 (SE +/- 0.60, N = 3)
Per-result extra flags as listed in the source (not attributed to specific results): -lstdc++ -lstdc++ -lstdc++ -lstdc++
1. (CC) gcc options: -O3 -march=native -fopenmp -lm -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lSM -lICE
AOM AV1 This is a test of the AOMedia AV1 encoder (libaom) developed by AOMedia and Google. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: AOM AV1 3.0 (Frames Per Second, More Is Better)
Encoder Mode: Speed 4 Two-Pass - Input: Bosphorus 1080p
  GCC 9.3: 6.69 (SE +/- 0.03, N = 3)
  GCC 10.3: 6.87 (SE +/- 0.00, N = 3)
  GCC 11.0.1: 6.95 (SE +/- 0.01, N = 3)
  Clang 12.0: 7.10 (SE +/- 0.04, N = 3)
  Clang 11.0: 7.20 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
SVT-HEVC This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-HEVC CPU-based multi-threaded video encoder for the HEVC / H.265 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)
Tuning: 7 - Input: Bosphorus 1080p
  GCC 9.3: 322.42 (SE +/- 1.20, N = 3)
  GCC 11.0.1: 329.32 (SE +/- 1.51, N = 3)
  GCC 10.3: 330.53 (SE +/- 1.54, N = 3)
  AMD AOCC 3.0: 343.85 (SE +/- 1.09, N = 3)
  Clang 12.0: 345.30 (SE +/- 1.56, N = 3)
  Clang 11.0: 346.89 (SE +/- 3.43, N = 3)
1. (CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt
Botan Botan is a BSD-licensed cross-platform open-source C++ crypto library "cryptography toolkit" that supports most publicly known cryptographic algorithms. The project's stated goal is to be "the best option for cryptography in C++ by offering the tools necessary to implement a range of practical systems, such as TLS protocol, X.509 certificates, modern AEAD ciphers, PKCS#11 and TPM hardware support, password hashing, and post quantum crypto schemes." Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: Botan 2.17.3 (MiB/s, More Is Better)
Test: KASUMI
  GCC 10.3: 79.12 (SE +/- 0.01, N = 3)
  Clang 11.0: 79.15 (SE +/- 0.06, N = 3)
  Clang 12.0: 82.64 (SE +/- 0.01, N = 3)
  AMD AOCC 3.0: 82.83 (SE +/- 0.02, N = 3)
  GCC 9.3: 84.86 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
POV-Ray This is a test of POV-Ray, the Persistence of Vision Raytracer. POV-Ray is used to create 3D graphics using ray-tracing. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: POV-Ray 3.7.0.7 (Seconds, Fewer Is Better)
Trace Time
  GCC 9.3: 9.968 (SE +/- 0.053, N = 3)
  GCC 10.3: 9.570 (SE +/- 0.049, N = 3)
  AMD AOCC 3.0: 9.494 (SE +/- 0.026, N = 3)
  Clang 11.0: 9.408 (SE +/- 0.032, N = 3)
  Clang 12.0: 9.296 (SE +/- 0.041, N = 3)
1. (CXX) g++ options: -pipe -O3 -ffast-math -march=native -pthread -lSDL -lXpm -lSM -lICE -lX11 -lIlmImf -lImath -lHalf -lIex -lIexMath -lIlmThread -lpthread -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system
FFTW FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: FFTW 3.3.6 (Mflops, More Is Better)
Build: Float + SSE - Size: 1D FFT Size 1024
  AMD AOCC 3.0: 49685 (SE +/- 621.84, N = 15)
  Clang 12.0: 50350 (SE +/- 952.64, N = 12)
  Clang 11.0: 50740 (SE +/- 585.78, N = 3)
  GCC 11.0.1: 51706 (SE +/- 568.96, N = 3)
  GCC 10.3: 52054 (SE +/- 439.64, N = 15)
  GCC 9.3: 53275 (SE +/- 788.42, N = 3)
1. (CC) gcc options: -pthread -O3 -march=native -lm
libavif avifenc This is a test of the AOMedia libavif library testing the encoding of a JPEG image to AV1 Image Format (AVIF). Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: libavif avifenc 0.9.0 (Seconds, Fewer Is Better)
Encoder Speed: 10, Lossless
  GCC 11.0.1: 6.149 (SE +/- 0.017, N = 3)
  GCC 9.3: 6.131 (SE +/- 0.022, N = 3)
  GCC 10.3: 6.107 (SE +/- 0.007, N = 3)
  AMD AOCC 3.0: 5.948 (SE +/- 0.022, N = 3)
  Clang 11.0: 5.879 (SE +/- 0.011, N = 3)
  Clang 12.0: 5.746 (SE +/- 0.013, N = 3)
1. (CXX) g++ options: -O3 -fPIC -lm
SVT-HEVC This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-HEVC CPU-based multi-threaded video encoder for the HEVC / H.265 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)
Tuning: 1 - Input: Bosphorus 1080p
  GCC 9.3: 38.41 (SE +/- 0.18, N = 3)
  GCC 11.0.1: 38.86 (SE +/- 0.17, N = 3)
  GCC 10.3: 39.03 (SE +/- 0.05, N = 3)
  AMD AOCC 3.0: 40.95 (SE +/- 0.08, N = 3)
  Clang 11.0: 41.01 (SE +/- 0.09, N = 3)
  Clang 12.0: 41.09 (SE +/- 0.17, N = 3)
1. (CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt
PostgreSQL pgbench This is a benchmark of PostgreSQL using pgbench for facilitating the database benchmarks. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: PostgreSQL pgbench 13.0 (TPS, More Is Better)
Scaling Factor: 100 - Clients: 250 - Mode: Read Write
  GCC 10.3: 53019 (SE +/- 591.89, N = 7)
  GCC 11.0.1: 53102 (SE +/- 211.73, N = 3)
  GCC 9.3: 53825 (SE +/- 396.40, N = 3)
  Clang 11.0: 54488 (SE +/- 883.12, N = 3)
  Clang 12.0: 56684 (SE +/- 702.52, N = 15)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
JPEG XL The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities with JPEG XL offering better image quality and compression over legacy JPEG. This test profile is currently focused on the multi-threaded JPEG XL image encode performance. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: JPEG XL 0.3.3 (MP/s, More Is Better)
Input: PNG - Encode Speed: 7
  AMD AOCC 3.0: 11.37 (SE +/- 0.08, N = 3)
  Clang 11.0: 12.01 (SE +/- 0.02, N = 3)
  Clang 12.0: 12.15 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -O3 -march=native -funwind-tables -Xclang -mrelax-all -O2 -fPIE -pie -pthread -ldl
PostgreSQL pgbench This is a benchmark of PostgreSQL using pgbench for facilitating the database benchmarks. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: PostgreSQL pgbench 13.0 (ms, Fewer Is Better)
Scaling Factor: 100 - Clients: 250 - Mode: Read Write - Average Latency
  GCC 10.3: 4.731 (SE +/- 0.052, N = 7)
  GCC 11.0.1: 4.722 (SE +/- 0.021, N = 3)
  GCC 9.3: 4.657 (SE +/- 0.034, N = 3)
  Clang 11.0: 4.603 (SE +/- 0.074, N = 3)
  Clang 12.0: 4.431 (SE +/- 0.054, N = 15)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
JPEG XL The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities with JPEG XL offering better image quality and compression over legacy JPEG. This test profile is currently focused on the multi-threaded JPEG XL image encode performance. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: JPEG XL 0.3.3 (MP/s, More Is Better)
Input: PNG - Encode Speed: 5
  Clang 12.0: 74.27 (SE +/- 0.17, N = 3)
  Clang 11.0: 78.41 (SE +/- 0.24, N = 3)
  AMD AOCC 3.0: 79.23 (SE +/- 0.41, N = 3)
1. (CXX) g++ options: -O3 -march=native -funwind-tables -Xclang -mrelax-all -O2 -fPIE -pie -pthread -ldl
SciMark This test runs the ANSI C version of SciMark 2.0, a benchmark for scientific and numerical computing developed by programmers at the National Institute of Standards and Technology. The test is made up of Fast Fourier Transform, Jacobi Successive Over-relaxation, Monte Carlo, Sparse Matrix Multiply, and dense LU matrix factorization benchmarks. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: SciMark 2.0 (Mflops, More Is Better)
Computational Test: Monte Carlo
  GCC 11.0.1: 647.82 (SE +/- 0.29, N = 3)
  GCC 9.3: 668.10 (SE +/- 0.14, N = 3)
  Clang 11.0: 674.86 (SE +/- 0.40, N = 3)
  Clang 12.0: 675.13 (SE +/- 0.40, N = 3)
  GCC 10.3: 682.87 (SE +/- 1.71, N = 3)
  AMD AOCC 3.0: 690.94 (SE +/- 0.18, N = 3)
1. (CC) gcc options: -O3 -march=native -lm
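As background on what this kernel exercises: SciMark's Monte Carlo test estimates pi by sampling random points in the unit square and counting how many fall inside the quarter circle. The sketch below shows that idea in Python. It is only an illustration of the technique, not the SciMark source: the function name, seed, and sample count are assumptions, and Python's `random` module stands in for SciMark's own random number generator.

```python
import random


def monte_carlo_pi(num_samples, seed=0):
    """Estimate pi from the fraction of random points inside the unit quarter circle."""
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / num_samples


estimate = monte_carlo_pi(200_000)
print(f"pi is approximately {estimate:.4f}")
```

The inner loop is dominated by random number generation and a few multiply-adds, so compiler differences on this test largely reflect how well that small loop is inlined and scheduled.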
AOM AV1 This is a test of the AOMedia AV1 encoder (libaom) developed by AOMedia and Google. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: AOM AV1 3.0 (Frames Per Second, More Is Better)
Encoder Mode: Speed 6 Realtime - Input: Bosphorus 4K
  GCC 9.3: 16.29 (SE +/- 0.05, N = 3)
  GCC 10.3: 17.03 (SE +/- 0.05, N = 3)
  Clang 11.0: 17.13 (SE +/- 0.11, N = 3)
  Clang 12.0: 17.22 (SE +/- 0.11, N = 3)
  GCC 11.0.1: 17.37 (SE +/- 0.08, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
WebP2 Image Encode This is a test of Google's libwebp2 library with the WebP2 image encode utility, using a sample 6000x4000 pixel JPEG image as the input, similar to the WebP/libwebp test profile. WebP2 is currently experimental and under heavy development as the eventual successor to WebP. Compared to WebP, WebP2 supports 10-bit HDR, more efficient lossy compression, improved lossless compression, animation, and full multi-threading. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: WebP2 Image Encode 20210126 (Seconds, Fewer Is Better)
Encode Settings: Default
  GCC 10.3: 2.918 (SE +/- 0.032, N = 7)
  AMD AOCC 3.0: 2.816 (SE +/- 0.010, N = 3)
  GCC 9.3: 2.778 (SE +/- 0.038, N = 3)
  Clang 11.0: 2.743 (SE +/- 0.031, N = 3)
  Clang 12.0: 2.739 (SE +/- 0.027, N = 3)
1. (CXX) g++ options: -O3 -march=native -fno-rtti -rdynamic -lpthread -ljpeg -lgif -lwebp -lwebpdemux
AOM AV1 This is a test of the AOMedia AV1 encoder (libaom) developed by AOMedia and Google. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: AOM AV1 3.0 (Frames Per Second, More Is Better)
Encoder Mode: Speed 9 Realtime - Input: Bosphorus 4K
  Clang 11.0: 37.28 (SE +/- 0.31, N = 3)
  Clang 12.0: 38.11 (SE +/- 0.43, N = 3)
  GCC 9.3: 39.12 (SE +/- 0.38, N = 3)
  GCC 10.3: 39.32 (SE +/- 0.19, N = 3)
  GCC 11.0.1: 39.71 (SE +/- 0.29, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
OpenBenchmarking.org: AOM AV1 3.0 (Frames Per Second, More Is Better)
Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 4K
  Clang 12.0: 8.99 (SE +/- 0.10, N = 3)
  GCC 10.3: 9.10 (SE +/- 0.04, N = 3)
  Clang 11.0: 9.14 (SE +/- 0.03, N = 3)
  GCC 11.0.1: 9.41 (SE +/- 0.01, N = 3)
  GCC 9.3: 9.57 (SE +/- 0.11, N = 6)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
x265 This is a simple test of the x265 encoder run on the CPU with 1080p and 4K options for H.265 video encode performance with x265. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: x265 3.4 (Frames Per Second, More Is Better)
Video Input: Bosphorus 4K
  GCC 10.3: 28.60 (SE +/- 0.09, N = 3)
  GCC 11.0.1: 28.79 (SE +/- 0.07, N = 3)
  GCC 9.3: 28.91 (SE +/- 0.09, N = 3)
  Clang 11.0: 29.94 (SE +/- 0.25, N = 3)
  Clang 12.0: 30.32 (SE +/- 0.23, N = 3)
  AMD AOCC 3.0: 30.44 (SE +/- 0.13, N = 3)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lpthread -lrt -ldl -lnuma
AOM AV1 This is a test of the AOMedia AV1 encoder (libaom) developed by AOMedia and Google. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: AOM AV1 3.0 (Frames Per Second, More Is Better)
Encoder Mode: Speed 8 Realtime - Input: Bosphorus 4K
  Clang 11.0: 33.14 (SE +/- 0.22, N = 3)
  Clang 12.0: 33.39 (SE +/- 0.48, N = 3)
  GCC 9.3: 34.56 (SE +/- 0.12, N = 3)
  GCC 10.3: 35.26 (SE +/- 0.19, N = 3)
  GCC 11.0.1: 35.26 (SE +/- 0.47, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
OpenBenchmarking.org: AOM AV1 3.0 (Frames Per Second, More Is Better)
Encoder Mode: Speed 0 Two-Pass - Input: Bosphorus 1080p
  GCC 9.3: 0.50 (SE +/- 0.00, N = 3)
  GCC 10.3: 0.52 (SE +/- 0.00, N = 3)
  GCC 11.0.1: 0.52 (SE +/- 0.00, N = 3)
  Clang 12.0: 0.53 (SE +/- 0.00, N = 3)
  Clang 11.0: 0.53 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Tachyon This is a test of the threaded Tachyon, a parallel ray-tracing system, measuring the time to ray-trace a sample scene. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: Tachyon 0.99b6 (Seconds, Fewer Is Better)
Total Time
  Clang 11.0: 16.41 (SE +/- 0.02, N = 3)
  GCC 10.3: 16.15 (SE +/- 0.02, N = 3)
  AMD AOCC 3.0: 16.06 (SE +/- 0.01, N = 3)
  Clang 12.0: 16.05 (SE +/- 0.06, N = 3)
  GCC 9.3: 15.68 (SE +/- 0.02, N = 3)
  GCC 11.0.1: 15.50 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -m64 -O3 -fomit-frame-pointer -ffast-math -ltachyon -lm -lpthread
LZ4 Compression This test measures the time needed to compress/decompress a sample file (an Ubuntu ISO) using LZ4 compression. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: LZ4 Compression 1.9.3 (MB/s, More Is Better)
Compression Level: 3 - Compression Speed
  Clang 12.0 LTO: 50.93 (SE +/- 0.02, N = 3)
  GCC 11.0.1: 51.32 (SE +/- 0.73, N = 4)
  Clang 12.0: 52.07 (SE +/- 0.80, N = 3)
  Clang 11.0: 52.35 (SE +/- 0.33, N = 3)
  GCC 10.3: 52.87 (SE +/- 0.01, N = 3)
  AMD AOCC 3.0: 53.77 (SE +/- 0.48, N = 3)
  GCC 9.3: 53.83 (SE +/- 0.77, N = 4)
1. (CC) gcc options: -O3
SVT-VP9 This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-VP9 CPU-based multi-threaded video encoder for the VP9 video format with a sample YUV input video file. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: SVT-VP9 0.3 (Frames Per Second, More Is Better)
Tuning: Visual Quality Optimized - Input: Bosphorus 1080p
  GCC 9.3: 354.21 (SE +/- 3.83, N = 3)
  GCC 10.3: 364.12 (SE +/- 0.47, N = 3)
  GCC 11.0.1: 366.39 (SE +/- 0.70, N = 3)
  Clang 12.0: 372.49 (SE +/- 1.11, N = 3)
  AMD AOCC 3.0: 373.89 (SE +/- 2.72, N = 3)
  Clang 11.0: 373.99 (SE +/- 1.91, N = 3)
1. (CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
Liquid-DSP LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: Liquid-DSP 2021.01.31 (samples/s, More Is Better)
Threads: 64 - Buffer Length: 256 - Filter Length: 57
  GCC 9.3: 2940466667 (SE +/- 2961043.36, N = 3)
  GCC 10.3: 2942866667 (SE +/- 4643753.27, N = 3)
  GCC 11.0.1: 2989400000 (SE +/- 1154700.54, N = 3)
  Clang 11.0: 3051366667 (SE +/- 2452436.43, N = 3)
  Clang 12.0: 3070633333 (SE +/- 6045475.81, N = 3)
  AMD AOCC 3.0: 3100400000 (SE +/- 1234233.91, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid
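The compiler ranking for Liquid-DSP reverses between the 32-thread and 64-thread runs: GCC leads at 32 threads, while the Clang-based compilers lead at 64. One way to see this is to compute each compiler's throughput ratio when the thread count doubles. The sketch below does so using the samples/s figures reported for the two Liquid-DSP results in this article; the dictionary and function names are illustrative.

```python
# samples/s copied from the Liquid-DSP results (Buffer Length 256, Filter Length 57).
samples_32_threads = {
    "Clang 12.0": 1564833333,
    "Clang 11.0": 1578400000,
    "AMD AOCC 3.0": 1609633333,
    "GCC 11.0.1": 1679800000,
    "GCC 10.3": 1718000000,
    "GCC 9.3": 1721900000,
}
samples_64_threads = {
    "GCC 9.3": 2940466667,
    "GCC 10.3": 2942866667,
    "GCC 11.0.1": 2989400000,
    "Clang 11.0": 3051366667,
    "Clang 12.0": 3070633333,
    "AMD AOCC 3.0": 3100400000,
}


def scaling_factor(compiler):
    """Throughput ratio when going from 32 to 64 threads (2.0 would be ideal)."""
    return samples_64_threads[compiler] / samples_32_threads[compiler]


for compiler in sorted(samples_32_threads, key=scaling_factor):
    print(f"{compiler}: {scaling_factor(compiler):.2f}x from 32 to 64 threads")
```

On these numbers the GCC builds gain roughly 1.7x from the doubling while the Clang-based builds gain over 1.9x, which accounts for the change in order.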
PostgreSQL pgbench This is a benchmark of PostgreSQL using pgbench for facilitating the database benchmarks. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: PostgreSQL pgbench 13.0 (TPS, More Is Better)
Scaling Factor: 100 - Clients: 1 - Mode: Read Only
  GCC 11.0.1: 23661 (SE +/- 281.76, N = 15)
  GCC 9.3: 23895 (SE +/- 41.57, N = 3)
  Clang 12.0: 24310 (SE +/- 303.43, N = 3)
  GCC 10.3: 24845 (SE +/- 118.05, N = 3)
  Clang 11.0: 24943 (SE +/- 289.16, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
WebP Image Encode This is a test of Google's libwebp with the cwebp image encode utility and using a sample 6000x4000 pixel JPEG image as the input. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: WebP Image Encode 1.1 (Encode Time - Seconds, Fewer Is Better)
Encode Settings: Quality 100, Lossless
  GCC 9.3: 19.30 (SE +/- 0.13, N = 3)
  AMD AOCC 3.0: 19.13 (SE +/- 0.03, N = 3)
  Clang 12.0: 19.02 (SE +/- 0.02, N = 3)
  GCC 10.3: 18.88 (SE +/- 0.05, N = 3)
  Clang 11.0: 18.57 (SE +/- 0.13, N = 3)
  GCC 11.0.1: 18.31 (SE +/- 0.04, N = 3)
Per-result extra flag as listed in the source (not attributed to a specific result): -ltiff
1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -pthread -lm -lpng16 -ljpeg
SVT-VP9 This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-VP9 CPU-based multi-threaded video encoder for the VP9 video format with a sample YUV input video file. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: SVT-VP9 0.3 (Frames Per Second, More Is Better)
Tuning: VMAF Optimized - Input: Bosphorus 1080p
  GCC 9.3: 463.12 (SE +/- 0.82, N = 3)
  GCC 11.0.1: 472.32 (SE +/- 1.15, N = 3)
  GCC 10.3: 472.61 (SE +/- 0.24, N = 3)
  AMD AOCC 3.0: 476.95 (SE +/- 2.67, N = 3)
  Clang 11.0: 481.05 (SE +/- 0.23, N = 3)
  Clang 12.0: 487.43 (SE +/- 1.37, N = 3)
1. (CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
OpenBenchmarking.org: SVT-VP9 0.3 (Frames Per Second, More Is Better)
Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p
  GCC 9.3: 464.57 (SE +/- 0.32, N = 3)
  GCC 10.3: 477.67 (SE +/- 2.08, N = 3)
  GCC 11.0.1: 478.16 (SE +/- 1.13, N = 3)
  AMD AOCC 3.0: 478.62 (SE +/- 1.94, N = 3)
  Clang 11.0: 482.02 (SE +/- 1.76, N = 3)
  Clang 12.0: 488.23 (SE +/- 0.73, N = 3)
1. (CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
PostgreSQL pgbench This is a benchmark of PostgreSQL using pgbench for facilitating the database benchmarks. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: PostgreSQL pgbench 13.0 (ms, Fewer Is Better)
Scaling Factor: 100 - Clients: 1 - Mode: Read Only - Average Latency
  GCC 11.0.1: 0.042 (SE +/- 0.001, N = 15)
  GCC 9.3: 0.042 (SE +/- 0.000, N = 3)
  Clang 12.0: 0.041 (SE +/- 0.001, N = 3)
  GCC 10.3: 0.040 (SE +/- 0.000, N = 3)
  Clang 11.0: 0.040 (SE +/- 0.001, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
AOM AV1 This is a test of the AOMedia AV1 encoder (libaom) developed by AOMedia and Google. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org: AOM AV1 3.0 (Frames Per Second, More Is Better)
Encoder Mode: Speed 0 Two-Pass - Input: Bosphorus 4K
  GCC 9.3: 0.20 (SE +/- 0.00, N = 3)
  Clang 12.0: 0.21 (SE +/- 0.00, N = 3)
  Clang 11.0: 0.21 (SE +/- 0.00, N = 3)
  GCC 10.3: 0.21 (SE +/- 0.00, N = 3)
  GCC 11.0.1: 0.21 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Botan Botan is a BSD-licensed cross-platform open-source C++ crypto library "cryptography toolkit" that supports most publicly known cryptographic algorithms. The project's stated goal is to be "the best option for cryptography in C++ by offering the tools necessary to implement a range of practical systems, such as TLS protocol, X.509 certificates, modern AEAD ciphers, PKCS#11 and TPM hardware support, password hashing, and post quantum crypto schemes." Learn more via the OpenBenchmarking.org test page.
Botan 2.17.3 - Test: KASUMI - Decrypt (MiB/s, More Is Better)
  Clang 11.0: 80.22 (SE +/- 0.04, N = 3)
  GCC 10.3: 81.45 (SE +/- 0.01, N = 3)
  AMD AOCC 3.0: 82.95 (SE +/- 0.02, N = 3)
  GCC 9.3: 84.13 (SE +/- 0.01, N = 3)
  Clang 12.0: 84.23 (SE +/- 0.06, N = 3)
  1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
WebP Image Encode This is a test of Google's libwebp with the cwebp image encode utility and using a sample 6000x4000 pixel JPEG image as the input. Learn more via the OpenBenchmarking.org test page.
WebP Image Encode 1.1 - Encode Settings: Default (Encode Time - Seconds, Fewer Is Better)
  GCC 9.3: 1.397 (SE +/- 0.000, N = 3)
  GCC 11.0.1: 1.386 (SE +/- 0.000, N = 3)
  GCC 10.3: 1.372 (SE +/- 0.002, N = 3)
  AMD AOCC 3.0: 1.351 (SE +/- 0.002, N = 3)
  Clang 11.0: 1.336 (SE +/- 0.001, N = 3)
  Clang 12.0: 1.331 (SE +/- 0.001, N = 3)
  -ltiff 1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -pthread -lm -lpng16 -ljpeg
SciMark This test runs the ANSI C version of SciMark 2.0, which is a benchmark for scientific and numerical computing developed by programmers at the National Institute of Standards and Technology. This test is made up of Fast Fourier Transform, Jacobi Successive Over-relaxation, Monte Carlo, Sparse Matrix Multiply, and dense LU matrix factorization benchmarks. Learn more via the OpenBenchmarking.org test page.
SciMark 2.0 - Computational Test: Dense LU Matrix Factorization (Mflops, More Is Better)
  Clang 12.0: 8848.40 (SE +/- 7.16, N = 3)
  AMD AOCC 3.0: 9021.83 (SE +/- 0.22, N = 3)
  Clang 11.0: 9146.88 (SE +/- 77.81, N = 3)
  GCC 9.3: 9178.97 (SE +/- 28.39, N = 3)
  GCC 10.3: 9248.89 (SE +/- 33.93, N = 3)
  GCC 11.0.1: 9263.55 (SE +/- 25.06, N = 3)
  1. (CC) gcc options: -O3 -march=native -lm
dav1d Dav1d is an open-source, speedy AV1 video decoder. This test profile times how long it takes to decode sample AV1 video content. Learn more via the OpenBenchmarking.org test page.
dav1d 0.8.2 - Video Input: Chimera 1080p (FPS, More Is Better)
  GCC 9.3: 1145.50 (SE +/- 5.12, N = 3; MIN: 664.19 / MAX: 1441.54; -lm)
  GCC 10.3: 1171.04 (SE +/- 3.74, N = 3; MIN: 683.28 / MAX: 1473.51; -lm)
  GCC 11.0.1: 1180.44 (SE +/- 1.75, N = 3; MIN: 680.31 / MAX: 1485.74; -lm)
  AMD AOCC 3.0: 1188.43 (SE +/- 0.97, N = 3; MIN: 703.73 / MAX: 1484.94; -lm)
  Clang 11.0: 1190.41 (SE +/- 6.69, N = 3; MIN: 685.16 / MAX: 1496.36; -lm)
  Clang 12.0: 1198.22 (SE +/- 2.95, N = 3; MIN: 700.24 / MAX: 1494.16)
  1. (CC) gcc options: -O3 -march=native -pthread
Botan 2.17.3 - Test: CAST-256 - Decrypt (MiB/s, More Is Better)
  GCC 9.3: 127.34 (SE +/- 0.08, N = 3)
  Clang 11.0: 127.74 (SE +/- 0.01, N = 3)
  GCC 10.3: 127.78 (SE +/- 0.32, N = 3)
  AMD AOCC 3.0: 128.01 (SE +/- 0.08, N = 3)
  Clang 12.0: 133.05 (SE +/- 0.01, N = 3)
  1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3 - Test: CAST-256 (MiB/s, More Is Better)
  GCC 9.3: 127.30 (SE +/- 0.09, N = 3)
  GCC 10.3: 127.74 (SE +/- 0.33, N = 3)
  AMD AOCC 3.0: 127.77 (SE +/- 0.06, N = 3)
  Clang 11.0: 128.59 (SE +/- 0.02, N = 3)
  Clang 12.0: 132.82 (SE +/- 0.02, N = 3)
  1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
SciMark 2.0 - Computational Test: Composite (Mflops, More Is Better)
  GCC 11.0.1: 3182.35 (SE +/- 5.19, N = 3)
  Clang 12.0: 3190.62 (SE +/- 1.11, N = 3)
  GCC 9.3: 3229.22 (SE +/- 5.86, N = 3)
  GCC 10.3: 3235.94 (SE +/- 6.50, N = 3)
  AMD AOCC 3.0: 3298.29 (SE +/- 1.29, N = 3)
  Clang 11.0: 3319.34 (SE +/- 15.12, N = 3)
  1. (CC) gcc options: -O3 -march=native -lm
Gcrypt Library Libgcrypt is a general-purpose cryptographic library developed as part of the GnuPG project. This test runs libgcrypt's integrated benchmark and measures the time to complete the benchmark command with the cipher/mac/hash repetition count set to 50, as a simple, high-level look at the overall crypto performance of the system under test. Learn more via the OpenBenchmarking.org test page.
Gcrypt Library 1.9 (Seconds, Fewer Is Better)
  AMD AOCC 3.0: 240.41 (SE +/- 0.82, N = 3)
  Clang 11.0: 240.21 (SE +/- 0.28, N = 3)
  Clang 12.0: 236.92 (SE +/- 0.44, N = 3)
  GCC 11.0.1: 233.51 (SE +/- 0.18, N = 3)
  GCC 9.3: 232.57 (SE +/- 0.54, N = 3)
  GCC 10.3: 231.24 (SE +/- 0.32, N = 3)
  1. (CC) gcc options: -O3 -march=native -fvisibility=hidden
FFTW FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.
FFTW 3.3.6 - Build: Stock - Size: 2D FFT Size 4096 (Mflops, More Is Better)
  Clang 12.0: 6744.1 (SE +/- 35.20, N = 3)
  Clang 11.0: 6823.8 (SE +/- 60.67, N = 3)
  AMD AOCC 3.0: 6875.3 (SE +/- 65.81, N = 3)
  GCC 11.0.1: 6948.2 (SE +/- 23.67, N = 3)
  GCC 10.3: 6974.0 (SE +/- 25.90, N = 3)
  GCC 9.3: 7007.3 (SE +/- 30.40, N = 3)
  1. (CC) gcc options: -pthread -O3 -march=native -lm
ASTC Encoder ASTC Encoder (astcenc) is for the Adaptive Scalable Texture Compression (ASTC) format commonly used with OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile does a coding test of both compression/decompression. Learn more via the OpenBenchmarking.org test page.
ASTC Encoder 2.4 - Preset: Exhaustive (Seconds, Fewer Is Better)
  GCC 11.0.1: 19.62 (SE +/- 0.00, N = 3)
  GCC 9.3: 19.48 (SE +/- 0.01, N = 3)
  GCC 10.3: 19.46 (SE +/- 0.01, N = 3)
  Clang 11.0: 19.03 (SE +/- 0.01, N = 3)
  Clang 12.0: 18.99 (SE +/- 0.01, N = 3)
  AMD AOCC 3.0: 18.91 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -march=native -flto -pthread
WebP Image Encode 1.1 - Encode Settings: Quality 100, Lossless, Highest Compression (Encode Time - Seconds, Fewer Is Better)
  GCC 9.3: 39.07 (SE +/- 0.03, N = 3)
  GCC 10.3: 38.55 (SE +/- 0.06, N = 3)
  Clang 12.0: 38.45 (SE +/- 0.07, N = 3)
  AMD AOCC 3.0: 38.34 (SE +/- 0.05, N = 3)
  GCC 11.0.1: 37.95 (SE +/- 0.07, N = 3)
  Clang 11.0: 37.73 (SE +/- 0.08, N = 3)
  -ltiff 1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -pthread -lm -lpng16 -ljpeg
AOM AV1 3.0 - Encoder Mode: Speed 4 Two-Pass - Input: Bosphorus 4K (Frames Per Second, More Is Better)
  GCC 9.3: 4.78 (SE +/- 0.01, N = 3)
  GCC 10.3: 4.84 (SE +/- 0.04, N = 3)
  GCC 11.0.1: 4.84 (SE +/- 0.06, N = 3)
  Clang 12.0: 4.87 (SE +/- 0.04, N = 3)
  Clang 11.0: 4.95 (SE +/- 0.07, N = 3)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
WebP Image Encode 1.1 - Encode Settings: Quality 100 (Encode Time - Seconds, Fewer Is Better)
  GCC 11.0.1: 2.274 (SE +/- 0.007, N = 3)
  GCC 9.3: 2.273 (SE +/- 0.005, N = 3)
  AMD AOCC 3.0: 2.262 (SE +/- 0.001, N = 3)
  Clang 11.0: 2.240 (SE +/- 0.000, N = 3)
  GCC 10.3: 2.225 (SE +/- 0.010, N = 3)
  Clang 12.0: 2.199 (SE +/- 0.001, N = 3)
  -ltiff 1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -pthread -lm -lpng16 -ljpeg
FFTW 3.3.6 - Build: Float + SSE - Size: 2D FFT Size 2048 (Mflops, More Is Better)
  AMD AOCC 3.0: 31013 (SE +/- 378.89, N = 6)
  GCC 9.3: 31341 (SE +/- 37.37, N = 3)
  GCC 11.0.1: 31662 (SE +/- 209.56, N = 3)
  Clang 11.0: 31741 (SE +/- 146.10, N = 3)
  Clang 12.0: 31935 (SE +/- 77.17, N = 3)
  GCC 10.3: 32061 (SE +/- 14.99, N = 3)
  1. (CC) gcc options: -pthread -O3 -march=native -lm
simdjson This is a benchmark of SIMDJSON, a high performance JSON parser. SIMDJSON aims to be the fastest JSON parser and is used by projects like Microsoft FishStore, Yandex ClickHouse, Shopify, and others. Learn more via the OpenBenchmarking.org test page.
simdjson 0.8.2 - Throughput Test: Kostya (GB/s, More Is Better)
  Clang 11.0: 2.68 (SE +/- 0.00, N = 3)
  AMD AOCC 3.0: 2.73 (SE +/- 0.00, N = 3)
  Clang 12.0: 2.75 (SE +/- 0.01, N = 3)
  GCC 9.3: 2.75 (SE +/- 0.00, N = 3)
  GCC 10.3: 2.77 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -march=native -pthread
AOM AV1 3.0 - Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
  GCC 9.3: 21.42 (SE +/- 0.11, N = 3)
  GCC 10.3: 21.64 (SE +/- 0.09, N = 3)
  Clang 11.0: 22.00 (SE +/- 0.15, N = 3)
  GCC 11.0.1: 22.11 (SE +/- 0.06, N = 3)
  Clang 12.0: 22.13 (SE +/- 0.05, N = 3)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
JPEG XL The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities with JPEG XL offering better image quality and compression over legacy JPEG. This test profile is currently focused on the multi-threaded JPEG XL image encode performance. Learn more via the OpenBenchmarking.org test page.
JPEG XL 0.3.3 - Input: JPEG - Encode Speed: 8 (MP/s, More Is Better)
  Clang 11.0: 27.24 (SE +/- 0.01, N = 3)
  AMD AOCC 3.0: 27.29 (SE +/- 0.06, N = 3)
  Clang 12.0: 28.13 (SE +/- 0.03, N = 3)
  1. (CXX) g++ options: -O3 -march=native -funwind-tables -Xclang -mrelax-all -O2 -fPIE -pie -pthread -ldl
PostgreSQL pgbench 13.0 - Scaling Factor: 100 - Clients: 100 - Mode: Read Only - Average Latency (ms, Fewer Is Better)
  GCC 9.3: 0.095 (SE +/- 0.000, N = 3)
  Clang 11.0: 0.094 (SE +/- 0.000, N = 3)
  Clang 12.0: 0.094 (SE +/- 0.000, N = 3)
  GCC 10.3: 0.093 (SE +/- 0.000, N = 3)
  GCC 11.0.1: 0.092 (SE +/- 0.000, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
PostgreSQL pgbench 13.0 - Scaling Factor: 100 - Clients: 100 - Mode: Read Only (TPS, More Is Better)
  GCC 9.3: 1057125 (SE +/- 1623.23, N = 3)
  Clang 12.0: 1069022 (SE +/- 720.87, N = 3)
  Clang 11.0: 1069367 (SE +/- 1740.88, N = 3)
  GCC 10.3: 1076357 (SE +/- 183.22, N = 3)
  GCC 11.0.1: 1090824 (SE +/- 1514.63, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
PostgreSQL pgbench 13.0 - Scaling Factor: 100 - Clients: 1 - Mode: Read Write (TPS, More Is Better)
  Clang 12.0: 3281 (SE +/- 3.48, N = 3)
  GCC 9.3: 3298 (SE +/- 4.79, N = 3)
  Clang 11.0: 3312 (SE +/- 14.62, N = 3)
  GCC 10.3: 3369 (SE +/- 11.40, N = 3)
  GCC 11.0.1: 3383 (SE +/- 28.00, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
x265 This is a simple test of the x265 encoder run on the CPU with 1080p and 4K options for H.265 video encode performance with x265. Learn more via the OpenBenchmarking.org test page.
x265 3.4 - Video Input: Bosphorus 1080p (Frames Per Second, More Is Better)
  GCC 11.0.1: 71.79 (SE +/- 0.56, N = 3)
  GCC 9.3: 72.14 (SE +/- 0.26, N = 3)
  GCC 10.3: 72.60 (SE +/- 0.32, N = 3)
  Clang 11.0: 73.36 (SE +/- 0.49, N = 3)
  AMD AOCC 3.0: 73.51 (SE +/- 0.63, N = 3)
  Clang 12.0: 74.00 (SE +/- 0.49, N = 3)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lpthread -lrt -ldl -lnuma
PostgreSQL pgbench 13.0 - Scaling Factor: 100 - Clients: 1 - Mode: Read Write - Average Latency (ms, Fewer Is Better)
  Clang 12.0: 0.305 (SE +/- 0.000, N = 3)
  GCC 9.3: 0.303 (SE +/- 0.001, N = 3)
  Clang 11.0: 0.302 (SE +/- 0.002, N = 3)
  GCC 10.3: 0.297 (SE +/- 0.001, N = 3)
  GCC 11.0.1: 0.296 (SE +/- 0.002, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
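The pgbench TPS and average-latency charts are two views of the same runs. As a rough cross-check, assuming average latency is approximately the client count divided by TPS (an assumption about how pgbench derives the figure, not something this page states), the single-client Read Write latencies can be recovered from the reported TPS values. The helper name `latency_ms` is hypothetical.

```python
def latency_ms(clients, tps):
    """Approximate average latency in milliseconds as clients / TPS."""
    return clients / tps * 1000.0


# TPS values reported on this page for Clients: 1, Mode: Read Write.
reported_tps = {
    "Clang 12.0": 3281,
    "GCC 9.3": 3298,
    "Clang 11.0": 3312,
    "GCC 10.3": 3369,
    "GCC 11.0.1": 3383,
}

for compiler, tps in reported_tps.items():
    print(f"{compiler}: {latency_ms(1, tps):.3f} ms")
```

For this configuration the computed values round to the reported 0.305, 0.303, 0.302, 0.297 and 0.296 ms. The match is only approximate in general, because each chart averages its own per-run figures; the 250-client results, for example, differ from this estimate in the last digit.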
LZ4 Compression This test measures the time needed to compress/decompress a sample file (an Ubuntu ISO) using LZ4 compression. Learn more via the OpenBenchmarking.org test page.
LZ4 Compression 1.9.3 - Compression Level: 9 - Decompression Speed (MB/s, More Is Better)
  AMD AOCC 3.0: 13561.5 (SE +/- 33.89, N = 3)
  Clang 12.0 LTO: 13698.7 (SE +/- 46.50, N = 3)
  GCC 10.3: 13806.6 (SE +/- 6.60, N = 4)
  GCC 11.0.1: 13857.4 (SE +/- 62.74, N = 3)
  GCC 9.3: 13895.3 (SE +/- 17.75, N = 5)
  Clang 12.0: 13926.5 (SE +/- 65.90, N = 3)
  Clang 11.0: 13927.9 (SE +/- 23.21, N = 3)
  1. (CC) gcc options: -O3
LZ4 Compression 1.9.3 - Compression Level: 3 - Decompression Speed (MB/s, More Is Better)
  AMD AOCC 3.0: 13562.5 (SE +/- 73.30, N = 3)
  Clang 12.0 LTO: 13715.0 (SE +/- 60.82, N = 3)
  GCC 9.3: 13793.4 (SE +/- 37.19, N = 4)
  Clang 11.0: 13840.3 (SE +/- 15.91, N = 3)
  GCC 11.0.1: 13882.2 (SE +/- 34.44, N = 4)
  GCC 10.3: 13906.1 (SE +/- 42.32, N = 3)
  Clang 12.0: 13911.5 (SE +/- 71.01, N = 3)
  1. (CC) gcc options: -O3
Opus Codec Encoding Opus is an open audio codec. Opus is a lossy audio compression format designed primarily for interactive real-time applications over the Internet. This test uses Opus-Tools and measures the time required to encode a WAV file to Opus. Learn more via the OpenBenchmarking.org test page.
Opus Codec Encoding 1.3.1 - WAV To Opus Encode (Seconds, Fewer Is Better)
  Clang 12.0: 7.567 (SE +/- 0.013, N = 5)
  GCC 9.3: 7.504 (SE +/- 0.002, N = 5)
  GCC 10.3: 7.469 (SE +/- 0.003, N = 5)
  Clang 11.0: 7.392 (SE +/- 0.002, N = 5)
  GCC 11.0.1: 7.381 (SE +/- 0.002, N = 5)
  -fvisibility=hidden -fvisibility=hidden -fvisibility=hidden 1. (CXX) g++ options: -O3 -march=native -logg -lm
JPEG XL 0.3.3 - Input: PNG - Encode Speed: 8 (MP/s, More Is Better)
  Clang 11.0: 0.80 (SE +/- 0.00, N = 3)
  AMD AOCC 3.0: 0.81 (SE +/- 0.00, N = 3)
  Clang 12.0: 0.82 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -march=native -funwind-tables -Xclang -mrelax-all -O2 -fPIE -pie -pthread -ldl
dav1d 0.8.2 - Video Input: Summer Nature 4K (FPS, More Is Better)
  GCC 9.3: 530.82 (SE +/- 1.35, N = 3; MIN: 248.84 / MAX: 574.28; -lm)
  GCC 10.3: 536.71 (SE +/- 0.67, N = 3; MIN: 256.44 / MAX: 577.82; -lm)
  GCC 11.0.1: 538.28 (SE +/- 2.51, N = 3; MIN: 251.6 / MAX: 584.38; -lm)
  Clang 12.0: 541.56 (SE +/- 1.79, N = 3; MIN: 252.01 / MAX: 587.53)
  AMD AOCC 3.0: 541.58 (SE +/- 1.13, N = 3; MIN: 259.4 / MAX: 585.8; -lm)
  Clang 11.0: 543.43 (SE +/- 1.43, N = 3; MIN: 256.75 / MAX: 593.99; -lm)
  1. (CC) gcc options: -O3 -march=native -pthread
PostgreSQL pgbench 13.0 - Scaling Factor: 100 - Clients: 250 - Mode: Read Only (TPS, More Is Better)
  Clang 11.0: 1065506 (SE +/- 13844.42, N = 3)
  GCC 9.3: 1067486 (SE +/- 8843.08, N = 3)
  Clang 12.0: 1071209 (SE +/- 6289.60, N = 3)
  GCC 10.3: 1089731 (SE +/- 8859.63, N = 3)
  GCC 11.0.1: 1090160 (SE +/- 8885.95, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
PostgreSQL pgbench 13.0 - Scaling Factor: 100 - Clients: 250 - Mode: Read Only - Average Latency (ms, Fewer Is Better)
  GCC 9.3: 0.235 (SE +/- 0.002, N = 3)
  Clang 11.0: 0.235 (SE +/- 0.003, N = 3)
  Clang 12.0: 0.234 (SE +/- 0.001, N = 3)
  GCC 11.0.1: 0.230 (SE +/- 0.002, N = 3)
  GCC 10.3: 0.230 (SE +/- 0.002, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
dav1d 0.8.2 - Video Input: Summer Nature 1080p (FPS, More Is Better)
  GCC 9.3: 1228.63 (SE +/- 2.25, N = 3; MIN: 555.28 / MAX: 1361.68; -lm)
  Clang 12.0: 1244.11 (SE +/- 7.87, N = 3; MIN: 549.81 / MAX: 1390.03)
  GCC 10.3: 1245.11 (SE +/- 8.15, N = 3; MIN: 539.07 / MAX: 1398.87; -lm)
  GCC 11.0.1: 1249.74 (SE +/- 1.96, N = 3; MIN: 559.74 / MAX: 1387.11; -lm)
  Clang 11.0: 1251.25 (SE +/- 2.13, N = 3; MIN: 556.46 / MAX: 1394.06; -lm)
  AMD AOCC 3.0: 1251.91 (SE +/- 4.95, N = 3; MIN: 543.89 / MAX: 1394.16; -lm)
  1. (CC) gcc options: -O3 -march=native -pthread
oneDNN This is a test of Intel oneDNN, an Intel-optimized library for deep neural networks, using its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.
oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  GCC 9.3: 0.786762 (SE +/- 0.002405, N = 3; MIN: 0.75; -fopenmp)
  GCC 10.3: 0.782476 (SE +/- 0.002532, N = 3; MIN: 0.73; -fopenmp)
  Clang 12.0: 0.779776 (SE +/- 0.004246, N = 3; MIN: 0.73; -fopenmp=libomp)
  Clang 11.0: 0.779101 (SE +/- 0.001200, N = 3; MIN: 0.73; -fopenmp=libomp)
  AMD AOCC 3.0: 0.773233 (SE +/- 0.001713, N = 3; MIN: 0.72; -fopenmp=libomp)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
FFTW 3.3.6 - Build: Float + SSE - Size: 2D FFT Size 1024 (Mflops, More Is Better)
  GCC 11.0.1: 35718 (SE +/- 442.82, N = 3)
  GCC 10.3: 35973 (SE +/- 301.69, N = 3)
  AMD AOCC 3.0: 36100 (SE +/- 455.21, N = 12)
  Clang 11.0: 36181 (SE +/- 530.09, N = 4)
  Clang 12.0: 36239 (SE +/- 165.99, N = 3)
  GCC 9.3: 36321 (SE +/- 79.87, N = 3)
  1. (CC) gcc options: -pthread -O3 -march=native -lm
JPEG XL 0.3.3 - Input: JPEG - Encode Speed: 5 (MP/s, More Is Better)
  AMD AOCC 3.0: 65.57 (SE +/- 0.17, N = 3)
  Clang 11.0: 65.58 (SE +/- 0.20, N = 3)
  Clang 12.0: 66.66 (SE +/- 0.14, N = 3)
  1. (CXX) g++ options: -O3 -march=native -funwind-tables -Xclang -mrelax-all -O2 -fPIE -pie -pthread -ldl
JPEG XL 0.3.3 - Input: JPEG - Encode Speed: 7 (MP/s, More Is Better)
  Clang 11.0: 65.43 (SE +/- 0.08, N = 3)
  AMD AOCC 3.0: 65.68 (SE +/- 0.05, N = 3)
  Clang 12.0: 66.38 (SE +/- 0.16, N = 3)
  1. (CXX) g++ options: -O3 -march=native -funwind-tables -Xclang -mrelax-all -O2 -fPIE -pie -pthread -ldl
ONNX Runtime ONNX Runtime is developed by Microsoft and partners as an open-source, cross-platform, high-performance machine learning inferencing and training accelerator. This test profile runs the ONNX Runtime with various models available from the ONNX Zoo. Learn more via the OpenBenchmarking.org test page.
ONNX Runtime 1.6 - Model: super-resolution-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  AMD AOCC 3.0: 4383 (SE +/- 174.98, N = 12; -fopenmp=libomp)
  Clang 12.0: 4456 (SE +/- 126.29, N = 12; -fopenmp=libomp)
  Clang 11.0: 4523 (SE +/- 169.87, N = 9; -fopenmp=libomp)
  GCC 9.3: 5183 (SE +/- 2.40, N = 3; -fopenmp)
  GCC 10.3: 5559 (SE +/- 17.50, N = 3; -fopenmp)
  1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -ldl -lrt
ONNX Runtime 1.6 - Model: bertsquad-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  AMD AOCC 3.0: 459 (SE +/- 10.39, N = 12; -fopenmp=libomp)
  Clang 11.0: 471 (SE +/- 5.55, N = 3; -fopenmp=libomp)
  GCC 9.3: 495 (SE +/- 4.64, N = 12; -fopenmp)
  Clang 12.0: 498 (SE +/- 10.30, N = 12; -fopenmp=libomp)
  GCC 10.3: 505 (SE +/- 0.87, N = 3; -fopenmp)
  1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -ldl -lrt
ViennaCL ViennaCL is an open-source linear algebra library written in C++ and with support for OpenCL and OpenMP. This test profile makes use of ViennaCL's built-in benchmarks. Learn more via the OpenBenchmarking.org test page.
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMV-N (GB/s, More Is Better)
  Clang 11.0: 51.2 (SE +/- 3.65, N = 15; -fopenmp=libomp)
  AMD AOCC 3.0: 55.2 (SE +/- 3.44, N = 12; -fopenmp=libomp)
  GCC 10.3: 56.2 (SE +/- 5.30, N = 12; -fopenmp)
  GCC 11.0.1: 63.9 (SE +/- 3.83, N = 15; -fopenmp)
  GCC 9.3: 65.0 (SE +/- 4.17, N = 15; -fopenmp)
  Clang 12.0: 69.1 (SE +/- 2.22, N = 12; -fopenmp=libomp)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - sDOT (GB/s, More Is Better)
  Clang 12.0: 434.00 (SE +/- 35.24, N = 12; -fopenmp=libomp)
  Clang 11.0: 462.00 (SE +/- 38.96, N = 15; -fopenmp=libomp)
  AMD AOCC 3.0: 477.00 (SE +/- 37.59, N = 12; -fopenmp=libomp)
  GCC 10.3: 592.97 (SE +/- 53.43, N = 12; -fopenmp)
  GCC 9.3: 636.00 (SE +/- 0.80, N = 15; -fopenmp)
  GCC 11.0.1: 649.00 (SE +/- 2.60, N = 15; -fopenmp)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - sAXPY (GB/s, More Is Better)
  AMD AOCC 3.0: 326.0 (SE +/- 26.90, N = 12; -fopenmp=libomp)
  Clang 12.0: 357.0 (SE +/- 15.69, N = 12; -fopenmp=libomp)
  Clang 11.0: 412.0 (SE +/- 34.43, N = 15; -fopenmp=libomp)
  GCC 9.3: 813.0 (SE +/- 2.85, N = 15; -fopenmp)
  GCC 10.3: 1350.0 (SE +/- 132.58, N = 12; -fopenmp)
  GCC 11.0.1: 1496.0 (SE +/- 62.40, N = 15; -fopenmp)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - sCOPY (GB/s, More Is Better)
  Clang 12.0: 471.00 (SE +/- 15.30, N = 12; -fopenmp=libomp)
  Clang 11.0: 495.00 (SE +/- 36.50, N = 15; -fopenmp=libomp)
  AMD AOCC 3.0: 531.00 (SE +/- 32.29, N = 12; -fopenmp=libomp)
  GCC 10.3: 1065.60 (SE +/- 101.07, N = 12; -fopenmp)
  GCC 11.0.1: 1210.00 (SE +/- 25.34, N = 15; -fopenmp)
  GCC 9.3: 1217.00 (SE +/- 25.85, N = 15; -fopenmp)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
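The ViennaCL results show the widest spread between compilers on this page. One way to read such a chart is to normalize every result to a chosen baseline. The sketch below does this for the reported sAXPY means; the helper name `relative_to_baseline` and the choice of AMD AOCC 3.0 as baseline are illustrative assumptions, not part of the Phoronix Test Suite.

```python
def relative_to_baseline(results, baseline):
    """Return each result divided by the baseline configuration's result."""
    base = results[baseline]
    return {name: value / base for name, value in results.items()}


# Mean GB/s reported on this page for ViennaCL 1.7.1, CPU BLAS - sAXPY.
saxpy = {
    "AMD AOCC 3.0": 326.0,
    "Clang 12.0": 357.0,
    "Clang 11.0": 412.0,
    "GCC 9.3": 813.0,
    "GCC 10.3": 1350.0,
    "GCC 11.0.1": 1496.0,
}

ratios = relative_to_baseline(saxpy, "AMD AOCC 3.0")
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name}: {ratio:.2f}x")
```

These are ratios of means only. Several ViennaCL results carry large standard errors (for example SE +/- 132.58 on the GCC 10.3 sAXPY figure), so small differences between neighbouring entries should not be over-interpreted.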
Clang 12.0 Processor: AMD EPYC 7763 64-Core @ 2.45GHz (64 Cores / 128 Threads), Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS), Chipset: AMD Starship/Matisse, Memory: 126GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe
OS: Ubuntu 20.04, Kernel: 5.12.0-051200rc6daily20210408-generic (x86_64) 20210407, Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Compiler: Clang 12.0.0-++20210409092622+fa0971b87fb2-1~exp1~20210409193326.73, File-System: ext4, Screen Resolution: 1024x768
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Processor Notes: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0xa001119
Python Notes: Python 3.8.2
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 10 April 2021 12:16 by user phoronix.
Clang 11.0 Processor: AMD EPYC 7763 64-Core @ 2.45GHz (64 Cores / 128 Threads), Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS), Chipset: AMD Starship/Matisse, Memory: 126GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe
OS: Ubuntu 20.04, Kernel: 5.12.0-051200rc6daily20210408-generic (x86_64) 20210407, Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Compiler: Clang 11.0.0-2~ubuntu20.04.1, File-System: ext4, Screen Resolution: 1024x768
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Processor Notes: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0xa001119
Python Notes: Python 3.8.2
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 11 April 2021 06:09 by user phoronix.
Clang 12.0 LTO Processor: AMD EPYC 7763 64-Core @ 2.45GHz (64 Cores / 128 Threads), Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS), Chipset: AMD Starship/Matisse, Memory: 126GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe
OS: Ubuntu 20.04, Kernel: 5.12.0-051200rc6daily20210408-generic (x86_64) 20210407, Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Compiler: Clang 12.0.0-++20210409092622+fa0971b87fb2-1~exp1~20210409193326.73, File-System: ext4, Screen Resolution: 1024x768
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native -flto" CFLAGS="-O3 -march=native -flto"
Processor Notes: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0xa001119
Python Notes: Python 3.8.2
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 12 April 2021 09:50 by user phoronix.
GCC 9.3 Processor: AMD EPYC 7763 64-Core @ 2.45GHz (64 Cores / 128 Threads), Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS), Chipset: AMD Starship/Matisse, Memory: 126GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe
OS: Ubuntu 20.04, Kernel: 5.12.0-051200rc6daily20210408-generic (x86_64) 20210407, Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Compiler: GCC 9.3.0, File-System: ext4, Screen Resolution: 1024x768
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0xa001119
Python Notes: Python 3.8.2
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 12 April 2021 18:26 by user phoronix.
GCC 10.3 Processor: AMD EPYC 7763 64-Core @ 2.45GHz (64 Cores / 128 Threads), Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS), Chipset: AMD Starship/Matisse, Memory: 126GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe
OS: Ubuntu 20.04, Kernel: 5.12.0-051200rc6daily20210408-generic (x86_64) 20210407, Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Compiler: GCC 10.3.0, File-System: ext4, Screen Resolution: 1024x768
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0xa001119
Python Notes: Python 3.8.2
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 13 April 2021 09:13 by user phoronix.
GCC 11.0.1 Processor: AMD EPYC 7763 64-Core @ 2.45GHz (64 Cores / 128 Threads), Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS), Chipset: AMD Starship/Matisse, Memory: 126GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe
OS: Ubuntu 20.04, Kernel: 5.12.0-051200rc6daily20210408-generic (x86_64) 20210407, Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Compiler: GCC 11.0.1 20210413, File-System: ext4, Screen Resolution: 1024x768
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0xa001119
Python Notes: Python 3.8.2
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 13 April 2021 18:14 by user phoronix.
AMD AOCC 3.0 Processor: AMD EPYC 7763 64-Core @ 2.45GHz (64 Cores / 128 Threads), Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS), Chipset: AMD Starship/Matisse, Memory: 126GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe
OS: Ubuntu 20.04, Kernel: 5.12.0-051200rc6daily20210408-generic (x86_64) 20210407, Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Compiler: Clang 12.0.0, File-System: ext4, Screen Resolution: 1024x768
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Compiler Notes: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: (unknown)
Processor Notes: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0xa001119
Python Notes: Python 3.8.2
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 14 April 2021 07:23 by user phoronix.