Core i9 7960X 2021 Intel Core i9-7960X testing with a MSI X299 SLI PLUS (MS-7A93) v1.0 (1.A0 BIOS) and Gigabyte AMD Radeon R7 370 R9 270X/370X 2GB on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2101126-HA-COREI979660&gru .
Core i9 7960X 2021
Runs: 1, 2, 3, 4
Processor: Intel Core i9-7960X @ 4.40GHz (16 Cores / 32 Threads)
Motherboard: MSI X299 SLI PLUS (MS-7A93) v1.0 (1.A0 BIOS)
Chipset: Intel Sky Lake-E DMI3 Registers
Memory: 16GB
Disk: 256GB INTEL SSDPEKKW256G8
Graphics: Gigabyte AMD Radeon R7 370 R9 270X/370X 2GB
Audio: Realtek ALC1220
Monitor: G237HL
Network: Intel I219-V + Intel I211
OS: Ubuntu 20.04
Kernel: 5.4.0-58-generic (x86_64)
Display Server: X Server 1.20.8
Display Driver: modesetting 1.20.8
Compiler: GCC 9.3.0
File-System: ext4
Screen Resolution: 1920x1080
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details:
  Scaling Governor: intel_pstate powersave
  CPU Microcode: 0x2006a08
Security Details:
  itlb_multihit: KVM: Mitigation of Split huge pages
  l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable
  mds: Mitigation of Clear buffers; SMT vulnerable
  meltdown: Mitigation of PTI
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling
  srbds: Not affected
  tsx_async_abort: Mitigation of Clear buffers; SMT vulnerable
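The Processor Details and Security Details above correspond to values that the Linux kernel exposes through sysfs. The following is a minimal sketch for collecting the same kind of information on another machine, assuming a Linux system with the usual /sys/devices/system/cpu layout. The function names and the `root` parameters are hypothetical and are not part of the Phoronix Test Suite; the raw kernel strings may also be worded slightly differently from the report text.

```python
from pathlib import Path


def read_mitigations(root="/sys/devices/system/cpu/vulnerabilities"):
    """Return {vulnerability name: kernel status string} read from sysfs.

    `root` is a hypothetical parameter so the function can be pointed at a
    test directory; the default is the standard Linux location.
    """
    base = Path(root)
    if not base.is_dir():
        return {}
    result = {}
    for entry in sorted(base.iterdir()):
        if entry.is_file():
            try:
                result[entry.name] = entry.read_text().strip()
            except OSError:
                # Skip entries that cannot be read.
                continue
    return result


def read_cpufreq(cpu=0, root="/sys/devices/system/cpu"):
    """Return the scaling driver and governor for one CPU (None if missing)."""
    base = Path(root) / f"cpu{cpu}" / "cpufreq"
    values = {}
    for name in ("scaling_driver", "scaling_governor"):
        try:
            values[name] = (base / name).read_text().strip()
        except OSError:
            values[name] = None
    return values


if __name__ == "__main__":
    for name, status in read_mitigations().items():
        print(f"{name}: {status}")
    print(read_cpufreq())
```

On a system without these sysfs paths the functions return an empty dictionary or `None` values rather than raising.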
Core i9 7960X 2021
Test | 1 | 2 | 3 | 4
amg: | 451046600 | 454801300 | 455200400 | 455774200
dav1d: Chimera 1080p | 775.54 | 774.42 | 774.33 | 773.57
dav1d: Summer Nature 4K | 234.16 | 233.77 | 233.81 | 234.04
dav1d: Summer Nature 1080p | 664.26 | 660.72 | 661.89 | 652.08
dav1d: Chimera 1080p 10-bit | 115.80 | 115.62 | 115.80 | 115.74
cryptsetup: PBKDF2-sha512 | 1796533 | 1798586 | 1798586 | 1796533
cryptsetup: PBKDF2-whirlpool | 764640 | 765011 | 765011 | 764640
cryptsetup: AES-XTS 256b Encryption | 2340.6 | 2368.7 | 2366.7 | 2368.0
cryptsetup: AES-XTS 256b Decryption | 2350.4 | 2364.8 | 2364.9 | 2357.2
cryptsetup: Serpent-XTS 256b Encryption | 780.9 | 789.4 | 789.1 | 786.9
cryptsetup: Serpent-XTS 256b Decryption | 808.7 | 806.0 | 809.0 | 809.5
cryptsetup: Twofish-XTS 256b Encryption | 438.2 | 440.8 | 441.1 | 440.2
cryptsetup: Twofish-XTS 256b Decryption | 443.3 | 443.1 | 437.1 | 443.1
cryptsetup: AES-XTS 512b Encryption | 2168.0 | 2175.6 | 2174.6 | 2176.5
cryptsetup: AES-XTS 512b Decryption | 2176.0 | 2189.4 | 2182.8 | 2189.9
cryptsetup: Serpent-XTS 512b Encryption | 786.8 | 788.5 | 789.4 | 788.3
cryptsetup: Serpent-XTS 512b Decryption | 808.8 | 809.0 | 805.4 | 806.6
cryptsetup: Twofish-XTS 512b Encryption | 440.2 | 440.8 | 440.5 | 440.6
cryptsetup: Twofish-XTS 512b Decryption | 442.6 | 443.4 | 442.9 | 443.6
lammps: Rhodopsin Protein | 10.759 | 10.758 | 10.772 | 10.737
clomp: Static OMP Speedup | 23.2 | 23.2 | 23.2 | 23.2
kripke: | 64854833 | 64827057 | | 64370483
lulesh: | 450.54776 | 454.34511 | 454.44403 | 454.05166
onednn: IP Shapes 1D - f32 - CPU | 2.14939 | 2.15213 | 2.15001 | 2.15120
onednn: IP Shapes 3D - f32 - CPU | 4.14079 | 4.16477 | 4.21074 | 4.17684
onednn: IP Shapes 1D - u8s8f32 - CPU | 0.762049 | 0.763675 | 0.762372 | 0.763387
onednn: IP Shapes 3D - u8s8f32 - CPU | 1.07587 | 1.08143 | 1.08145 | 1.08081
onednn: IP Shapes 1D - bf16bf16bf16 - CPU | 4.74363 | 4.74629 | 4.74742 | 4.74848
onednn: IP Shapes 3D - bf16bf16bf16 - CPU | 2.34004 | 2.33914 | 2.35709 | 2.35504
onednn: Convolution Batch Shapes Auto - f32 - CPU | 9.31601 | 9.33021 | 9.34375 | 9.33002
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 2.31268 | 2.32967 | 2.30843 | 2.31694
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 2.68731 | 2.69347 | 2.72724 | 2.68022
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 8.84027 | 8.85607 | 8.85744 | 8.83959
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 1.02346 | 1.02380 | 1.02339 | 1.02349
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 1.61596 | 1.61541 | 1.62464 | 1.60273
onednn: Recurrent Neural Network Training - f32 - CPU | 1428.49 | 1433.27 | 1443.10 | 1429.61
onednn: Recurrent Neural Network Inference - f32 - CPU | 831.021 | 837.283 | 837.390 | 822.725
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 1436.77 | 1426.35 | 1440.06 | 1576.25
onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU | 8.67790 | 8.68640 | 8.67758 | 8.67773
onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU | 10.5044 | 10.4998 | 10.5051 | 10.5074
onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU | 11.8035 | 11.8010 | 11.8063 | 11.7967
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 836.775 | 821.689 | 838.854 | 822.889
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 1.48423 | 1.48580 | 1.48983 | 1.48589
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 1432.10 | 1434.02 | 1438.52 | 1449.46
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 831.888 | 815.833 | 841.775 | 819.975
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 0.537078 | 0.539013 | 0.542373 | 0.536757
onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU | 1.91750 | 1.91998 | 1.91412 | 1.92441
mnn: SqueezeNetV1.0 | 7.725 | 7.870 | 7.783 | 7.819
mnn: resnet-v2-50 | 40.551 | 40.796 | 40.471 | 40.502
mnn: MobileNetV2_224 | 4.531 | 4.611 | 4.519 | 4.582
mnn: mobilenet-v1-1.0 | 2.753 | 2.712 | 2.717 | 2.727
mnn: inception-v3 | 51.724 | 52.553 | 51.542 | 52.116
cp2k: Fayalite-FIST Data | 1257.875 | 1278.27 | 1260.811 | 1259.577
openfoam: Motorbike 30M | 100.83 | 100.73 | 100.74 | 100.57
openfoam: Motorbike 60M | 656.35 | 653.80 | 653.57 | 654.47
qe: AUSURF112 | 1812.29 | 1833.34 | 1828.94 | 1856.05
build-eigen: Time To Compile | 74.890 | 74.733 | 74.887 | 74.845
encode-ape: WAV To APE | 11.196 | 11.197 | 11.197 | 11.249
encode-opus: WAV To Opus Encode | 10.417 | 10.409 | 10.420 | 10.428
encode-wavpack: WAV To WavPack | 14.791 | 14.791 | 14.790 | 14.786
unpack-firefox: firefox-84.0.source.tar.xz | 19.715 | 19.771 | | 19.686
Algebraic Multi-Grid Benchmark 1.2 (Figure Of Merit, More Is Better)
  1: 451046600 (SE +/- 6385919.79, N = 4)
  2: 454801300 (SE +/- 3250074.73, N = 3)
  3: 455200400 (SE +/- 2882257.48, N = 3)
  4: 455774200 (SE +/- 2349402.30, N = 3)
  (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -pthread -lmpi
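Because all four runs share one system configuration, the gap between the lowest and highest result gives a rough sense of run-to-run variation. A minimal sketch of that arithmetic for the Algebraic Multi-Grid figures above; the helper name `relative_spread_percent` is hypothetical, and expressing the spread relative to the lowest value is only one possible convention:

```python
def relative_spread_percent(values):
    """Spread between the highest and lowest value, as a percentage of the lowest."""
    lo, hi = min(values), max(values)
    return (hi - lo) / lo * 100.0


# Figure Of Merit per run, copied from the Algebraic Multi-Grid result above.
amg = {"1": 451046600, "2": 454801300, "3": 455200400, "4": 455774200}

spread = relative_spread_percent(list(amg.values()))
print(f"AMG spread across runs: {spread:.2f}%")
```

The same helper applies to "Fewer Is Better" results, since it only compares the extremes.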
dav1d 0.8.1 - Video Input: Chimera 1080p (FPS, More Is Better)
  1: 775.54 (SE +/- 0.90, N = 3; MIN: 587.11 / MAX: 994.76)
  2: 774.42 (SE +/- 0.89, N = 3; MIN: 586.49 / MAX: 988.95)
  3: 774.33 (SE +/- 0.56, N = 3; MIN: 586.86 / MAX: 988.64)
  4: 773.57 (SE +/- 1.56, N = 3; MIN: 585.79 / MAX: 989.33)
  (CC) gcc options: -pthread
dav1d 0.8.1 - Video Input: Summer Nature 4K (FPS, More Is Better)
  1: 234.16 (SE +/- 0.14, N = 3; MIN: 193.15 / MAX: 262.47)
  2: 233.77 (SE +/- 0.19, N = 3; MIN: 195.31 / MAX: 261.8)
  3: 233.81 (SE +/- 0.45, N = 3; MIN: 194.08 / MAX: 262.55)
  4: 234.04 (SE +/- 0.20, N = 3; MIN: 194.97 / MAX: 261)
  (CC) gcc options: -pthread
dav1d 0.8.1 - Video Input: Summer Nature 1080p (FPS, More Is Better)
  1: 664.26 (SE +/- 1.26, N = 3; MIN: 486.53 / MAX: 729.82)
  2: 660.72 (SE +/- 3.33, N = 3; MIN: 460.61 / MAX: 730.43)
  3: 661.89 (SE +/- 1.00, N = 3; MIN: 481.21 / MAX: 728.36)
  4: 652.08 (SE +/- 9.63, N = 4; MIN: 303.69 / MAX: 727.58)
  (CC) gcc options: -pthread
dav1d 0.8.1 - Video Input: Chimera 1080p 10-bit (FPS, More Is Better)
  1: 115.80 (SE +/- 0.15, N = 3; MIN: 75.91 / MAX: 259.68)
  2: 115.62 (SE +/- 0.24, N = 3; MIN: 75.89 / MAX: 255.43)
  3: 115.80 (SE +/- 0.13, N = 3; MIN: 75.96 / MAX: 255.51)
  4: 115.74 (SE +/- 0.02, N = 3; MIN: 76 / MAX: 253.42)
  (CC) gcc options: -pthread
Cryptsetup - PBKDF2-sha512 (Iterations Per Second, More Is Better)
  1: 1796533
  2: 1798586
  3: 1798586
  4: 1796533
  SE +/- 1026.67, N = 3 (listed for two runs only; the export does not indicate which)
Cryptsetup - PBKDF2-whirlpool (Iterations Per Second, More Is Better)
  1: 764640 (SE +/- 371.67, N = 3)
  2: 765011 (SE +/- 371.67, N = 3)
  3: 765011 (SE +/- 371.67, N = 3)
  4: 764640 (SE +/- 371.67, N = 3)
Cryptsetup - AES-XTS 256b Encryption (MiB/s, More Is Better)
  1: 2340.6 (SE +/- 21.94, N = 3)
  2: 2368.7 (SE +/- 1.99, N = 3)
  3: 2366.7 (SE +/- 5.72, N = 3)
  4: 2368.0 (SE +/- 4.75, N = 3)
Cryptsetup - AES-XTS 256b Decryption (MiB/s, More Is Better)
  1: 2350.4 (SE +/- 7.25, N = 3)
  2: 2364.8 (SE +/- 1.51, N = 3)
  3: 2364.9 (SE +/- 3.07, N = 3)
  4: 2357.2 (SE +/- 2.71, N = 3)
Cryptsetup - Serpent-XTS 256b Encryption (MiB/s, More Is Better)
  1: 780.9 (SE +/- 7.03, N = 3)
  2: 789.4 (SE +/- 1.33, N = 3)
  3: 789.1 (SE +/- 0.46, N = 3)
  4: 786.9 (SE +/- 2.89, N = 3)
Cryptsetup - Serpent-XTS 256b Decryption (MiB/s, More Is Better)
  1: 808.7 (SE +/- 0.61, N = 3)
  2: 806.0 (SE +/- 2.92, N = 3)
  3: 809.0 (SE +/- 1.27, N = 3)
  4: 809.5 (SE +/- 0.21, N = 3)
Cryptsetup - Twofish-XTS 256b Encryption (MiB/s, More Is Better)
  1: 438.2 (SE +/- 1.59, N = 3)
  2: 440.8 (SE +/- 0.82, N = 3)
  3: 441.1 (SE +/- 0.63, N = 3)
  4: 440.2 (SE +/- 1.27, N = 3)
Cryptsetup - Twofish-XTS 256b Decryption (MiB/s, More Is Better)
  1: 443.3 (SE +/- 0.49, N = 3)
  2: 443.1 (SE +/- 0.40, N = 3)
  3: 437.1 (SE +/- 6.51, N = 3)
  4: 443.1 (SE +/- 0.42, N = 3)
Cryptsetup - AES-XTS 512b Encryption (MiB/s, More Is Better)
  1: 2168.0 (SE +/- 3.99, N = 3)
  2: 2175.6 (SE +/- 1.28, N = 3)
  3: 2174.6 (SE +/- 4.02, N = 3)
  4: 2176.5 (SE +/- 3.99, N = 3)
Cryptsetup - AES-XTS 512b Decryption (MiB/s, More Is Better)
  1: 2176.0 (SE +/- 4.89, N = 3)
  2: 2189.4 (SE +/- 2.58, N = 3)
  3: 2182.8 (SE +/- 3.86, N = 3)
  4: 2189.9 (SE +/- 1.49, N = 3)
Cryptsetup - Serpent-XTS 512b Encryption (MiB/s, More Is Better)
  1: 786.8 (SE +/- 2.69, N = 3)
  2: 788.5 (SE +/- 0.27, N = 3)
  3: 789.4 (SE +/- 0.83, N = 3)
  4: 788.3 (SE +/- 2.08, N = 3)
Cryptsetup - Serpent-XTS 512b Decryption (MiB/s, More Is Better)
  1: 808.8 (SE +/- 0.90, N = 3)
  2: 809.0 (SE +/- 0.87, N = 3)
  3: 805.4 (SE +/- 5.20, N = 3)
  4: 806.6 (SE +/- 2.02, N = 3)
Cryptsetup - Twofish-XTS 512b Encryption (MiB/s, More Is Better)
  1: 440.2 (SE +/- 0.45, N = 3)
  2: 440.8 (SE +/- 0.59, N = 3)
  3: 440.5 (SE +/- 0.72, N = 3)
  4: 440.6 (SE +/- 0.62, N = 3)
Cryptsetup - Twofish-XTS 512b Decryption (MiB/s, More Is Better)
  1: 442.6 (SE +/- 0.31, N = 3)
  2: 443.4 (SE +/- 0.12, N = 3)
  3: 442.9 (SE +/- 0.96, N = 3)
  4: 443.6 (SE +/- 0.23, N = 3)
LAMMPS Molecular Dynamics Simulator 29Oct2020 - Model: Rhodopsin Protein (ns/day, More Is Better)
  1: 10.76 (SE +/- 0.01, N = 3)
  2: 10.76 (SE +/- 0.05, N = 3)
  3: 10.77 (SE +/- 0.03, N = 3)
  4: 10.74 (SE +/- 0.06, N = 3)
  (CXX) g++ options: -O3 -pthread -lm
CLOMP 1.2 - Static OMP Speedup (Speedup, More Is Better)
  1: 23.2 (SE +/- 0.15, N = 3)
  2: 23.2 (SE +/- 0.12, N = 3)
  3: 23.2 (SE +/- 0.06, N = 3)
  4: 23.2 (SE +/- 0.18, N = 3)
  (CC) gcc options: -fopenmp -O3 -lm
Kripke 1.2.4 (Throughput FoM, More Is Better)
  1: 64854833 (SE +/- 218961.39, N = 3)
  2: 64827057 (SE +/- 103830.55, N = 3)
  4: 64370483 (SE +/- 151605.59, N = 3)
  (CXX) g++ options: -O3 -fopenmp
LULESH 2.0.3 (z/s, More Is Better)
  1: 450.55 (SE +/- 4.79, N = 15)
  2: 454.35 (SE +/- 4.41, N = 3)
  3: 454.44 (SE +/- 5.20, N = 15)
  4: 454.05 (SE +/- 4.97, N = 15)
  (CXX) g++ options: -O3 -fopenmp -lm -pthread -lmpi_cxx -lmpi
oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  1: 2.14939 (SE +/- 0.00371, N = 3; MIN: 2.07)
  2: 2.15213 (SE +/- 0.00131, N = 3; MIN: 2.07)
  3: 2.15001 (SE +/- 0.00152, N = 3; MIN: 2.08)
  4: 2.15120 (SE +/- 0.00198, N = 3; MIN: 2.07)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  1: 4.14079 (SE +/- 0.00022, N = 3; MIN: 4.12)
  2: 4.16477 (SE +/- 0.00296, N = 3; MIN: 4.14)
  3: 4.21074 (SE +/- 0.00208, N = 3; MIN: 4.18)
  4: 4.17684 (SE +/- 0.00345, N = 3; MIN: 4.15)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  1: 0.762049 (SE +/- 0.000436, N = 3; MIN: 0.75)
  2: 0.763675 (SE +/- 0.000665, N = 3; MIN: 0.75)
  3: 0.762372 (SE +/- 0.000065, N = 3; MIN: 0.74)
  4: 0.763387 (SE +/- 0.000337, N = 3; MIN: 0.75)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  1: 1.07587 (SE +/- 0.00117, N = 3; MIN: 1.02)
  2: 1.08143 (SE +/- 0.00761, N = 3; MIN: 1.02)
  3: 1.08145 (SE +/- 0.00449, N = 3; MIN: 1.02)
  4: 1.08081 (SE +/- 0.00168, N = 3; MIN: 1.03)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  1: 4.74363 (SE +/- 0.00139, N = 3; MIN: 4.71)
  2: 4.74629 (SE +/- 0.00206, N = 3; MIN: 4.71)
  3: 4.74742 (SE +/- 0.00151, N = 3; MIN: 4.72)
  4: 4.74848 (SE +/- 0.00138, N = 3; MIN: 4.71)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  1: 2.34004 (SE +/- 0.00038, N = 3; MIN: 2.32)
  2: 2.33914 (SE +/- 0.00250, N = 3; MIN: 2.31)
  3: 2.35709 (SE +/- 0.00205, N = 3; MIN: 2.33)
  4: 2.35504 (SE +/- 0.00098, N = 3; MIN: 2.33)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  1: 9.31601 (SE +/- 0.00605, N = 3; MIN: 9.26)
  2: 9.33021 (SE +/- 0.00254, N = 3; MIN: 9.29)
  3: 9.34375 (SE +/- 0.00470, N = 3; MIN: 9.3)
  4: 9.33002 (SE +/- 0.00932, N = 3; MIN: 9.28)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  1: 2.31268 (SE +/- 0.00191, N = 3; MIN: 2.25)
  2: 2.32967 (SE +/- 0.00830, N = 3; MIN: 2.26)
  3: 2.30843 (SE +/- 0.00249, N = 3; MIN: 2.25)
  4: 2.31694 (SE +/- 0.01010, N = 3; MIN: 2.25)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  1: 2.68731 (SE +/- 0.00788, N = 3; MIN: 2.65)
  2: 2.69347 (SE +/- 0.00136, N = 3; MIN: 2.67)
  3: 2.72724 (SE +/- 0.01559, N = 3; MIN: 2.66)
  4: 2.68022 (SE +/- 0.00266, N = 3; MIN: 2.65)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  1: 8.84027 (SE +/- 0.00176, N = 3; MIN: 8.81)
  2: 8.85607 (SE +/- 0.00611, N = 3; MIN: 8.82)
  3: 8.85744 (SE +/- 0.01395, N = 3; MIN: 8.82)
  4: 8.83959 (SE +/- 0.00242, N = 3; MIN: 8.8)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  1: 1.02346 (SE +/- 0.00031, N = 3; MIN: 1.01)
  2: 1.02380 (SE +/- 0.00042, N = 3; MIN: 1.01)
  3: 1.02339 (SE +/- 0.00009, N = 3; MIN: 1.01)
  4: 1.02349 (SE +/- 0.00002, N = 3; MIN: 1.01)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  1: 1.61596 (SE +/- 0.00518, N = 3; MIN: 1.6)
  2: 1.61541 (SE +/- 0.00230, N = 3; MIN: 1.59)
  3: 1.62464 (SE +/- 0.00499, N = 3; MIN: 1.6)
  4: 1.60273 (SE +/- 0.00846, N = 3; MIN: 1.58)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  1: 1428.49 (SE +/- 4.41, N = 3; MIN: 1418.66)
  2: 1433.27 (SE +/- 5.17, N = 3; MIN: 1420.15)
  3: 1443.10 (SE +/- 2.99, N = 3; MIN: 1434.66)
  4: 1429.61 (SE +/- 1.78, N = 3; MIN: 1424.44)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  1: 831.02 (SE +/- 8.53, N = 3; MIN: 811.33)
  2: 837.28 (SE +/- 4.38, N = 3; MIN: 822.43)
  3: 837.39 (SE +/- 4.97, N = 3; MIN: 824.9)
  4: 822.73 (SE +/- 3.15, N = 3; MIN: 814.12)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  1: 1436.77 (SE +/- 10.67, N = 3; MIN: 1418.03)
  2: 1426.35 (SE +/- 3.40, N = 3; MIN: 1420.45)
  3: 1440.06 (SE +/- 2.92, N = 3; MIN: 1434.34)
  4: 1576.25 (SE +/- 141.79, N = 15; MIN: 1420.11)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
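Run 4 of the u8s8f32 RNN training result stands out with a mean of 1576.25 ms, but it also carries a much larger standard error (SE +/- 141.79, N = 15). A minimal sketch of how one might judge whether that difference is within noise, assuming the reported SE is the standard error of the mean and using a normal approximation (z = 1.96) for a rough 95% interval. With so few samples a t-distribution would give wider intervals, so this is illustrative only, and the helper names are hypothetical:

```python
import math


def std_dev_from_se(se, n):
    """Recover the sample standard deviation from a standard error of the mean."""
    return se * math.sqrt(n)


def approx_interval(mean, se, z=1.96):
    """Rough interval mean +/- z * SE (normal approximation, an assumption here)."""
    return (mean - z * se, mean + z * se)


# (mean in ms, SE, N) per run, copied from the result above.
runs = {
    "1": (1436.77, 10.67, 3),
    "2": (1426.35, 3.40, 3),
    "3": (1440.06, 2.92, 3),
    "4": (1576.25, 141.79, 15),
}

for run, (mean, se, n) in runs.items():
    lo, hi = approx_interval(mean, se)
    sd = std_dev_from_se(se, n)
    print(f"run {run}: mean={mean:.2f} ms, SD~{sd:.2f}, interval~[{lo:.2f}, {hi:.2f}]")
```

Under these assumptions the interval for run 4 is wide enough to include the means of runs 1 to 3, which suggests a few slow samples rather than a consistent slowdown.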
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  1: 8.67790 (SE +/- 0.00272, N = 3; MIN: 8.51)
  2: 8.68640 (SE +/- 0.00560, N = 3; MIN: 8.51)
  3: 8.67758 (SE +/- 0.00489, N = 3; MIN: 8.51)
  4: 8.67773 (SE +/- 0.00097, N = 3; MIN: 8.5)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  1: 10.50 (SE +/- 0.00, N = 3; MIN: 10.42)
  2: 10.50 (SE +/- 0.00, N = 3; MIN: 10.42)
  3: 10.51 (SE +/- 0.00, N = 3; MIN: 10.42)
  4: 10.51 (SE +/- 0.00, N = 3; MIN: 10.42)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  1: 11.80 (SE +/- 0.00, N = 3; MIN: 11.76)
  2: 11.80 (SE +/- 0.00, N = 3; MIN: 11.75)
  3: 11.81 (SE +/- 0.00, N = 3; MIN: 11.76)
  4: 11.80 (SE +/- 0.00, N = 3; MIN: 11.75)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  1: 836.78 (SE +/- 6.86, N = 3; MIN: 820.62)
  2: 821.69 (SE +/- 4.30, N = 3; MIN: 814.01)
  3: 838.85 (SE +/- 7.04, N = 3; MIN: 824.47)
  4: 822.89 (SE +/- 4.25, N = 3; MIN: 815.02)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  1: 1.48423 (SE +/- 0.00521, N = 3; MIN: 1.45)
  2: 1.48580 (SE +/- 0.00306, N = 3; MIN: 1.46)
  3: 1.48983 (SE +/- 0.00259, N = 3; MIN: 1.46)
  4: 1.48589 (SE +/- 0.00235, N = 3; MIN: 1.46)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  1: 1432.10 (SE +/- 4.88, N = 3; MIN: 1419.94)
  2: 1434.02 (SE +/- 2.70, N = 3; MIN: 1426.23)
  3: 1438.52 (SE +/- 3.80, N = 3; MIN: 1430.66)
  4: 1449.46 (SE +/- 17.05, N = 3; MIN: 1428.2)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  1: 831.89 (SE +/- 8.12, N = 3; MIN: 811.19)
  2: 815.83 (SE +/- 0.42, N = 3; MIN: 812.11)
  3: 841.78 (SE +/- 7.53, N = 3; MIN: 824.7)
  4: 819.98 (SE +/- 1.07, N = 3; MIN: 815.22)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  1: 0.537078 (SE +/- 0.001168, N = 3; MIN: 0.51)
  2: 0.539013 (SE +/- 0.003335, N = 3; MIN: 0.51)
  3: 0.542373 (SE +/- 0.001215, N = 3; MIN: 0.52)
  4: 0.536757 (SE +/- 0.002601, N = 3; MIN: 0.51)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  1: 1.91750 (SE +/- 0.00613, N = 3; MIN: 1.83)
  2: 1.91998 (SE +/- 0.00771, N = 3; MIN: 1.83)
  3: 1.91412 (SE +/- 0.00384, N = 3; MIN: 1.83)
  4: 1.92441 (SE +/- 0.00189, N = 3; MIN: 1.85)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Mobile Neural Network 1.1.1 - Model: SqueezeNetV1.0 (ms, Fewer Is Better)
  1: 7.725 (SE +/- 0.107, N = 3; MIN: 7.32 / MAX: 9.8)
  2: 7.870 (SE +/- 0.065, N = 3; MIN: 7.5 / MAX: 9.92)
  3: 7.783 (SE +/- 0.030, N = 3; MIN: 7.32 / MAX: 9.9)
  4: 7.819 (SE +/- 0.048, N = 3; MIN: 7.44 / MAX: 9.22)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.1 - Model: resnet-v2-50 (ms, Fewer Is Better)
  1: 40.55 (SE +/- 0.17, N = 3; MIN: 39.97 / MAX: 78.97)
  2: 40.80 (SE +/- 0.23, N = 3; MIN: 40.35 / MAX: 75.92)
  3: 40.47 (SE +/- 0.12, N = 3; MIN: 39.9 / MAX: 48.6)
  4: 40.50 (SE +/- 0.05, N = 3; MIN: 40.25 / MAX: 45.96)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.1 - Model: MobileNetV2_224 (ms, Fewer Is Better)
  1: 4.531 (SE +/- 0.048, N = 3; MIN: 4.02 / MAX: 5.98)
  2: 4.611 (SE +/- 0.012, N = 3; MIN: 4.17 / MAX: 5.73)
  3: 4.519 (SE +/- 0.041, N = 3; MIN: 4.04 / MAX: 6.86)
  4: 4.582 (SE +/- 0.024, N = 3; MIN: 4.08 / MAX: 5.99)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.1 - Model: mobilenet-v1-1.0 (ms, Fewer Is Better)
  1: 2.753 (SE +/- 0.032, N = 3; MIN: 2.41 / MAX: 4.24)
  2: 2.712 (SE +/- 0.012, N = 3; MIN: 2.52 / MAX: 2.89)
  3: 2.717 (SE +/- 0.023, N = 3; MIN: 2.44 / MAX: 4.33)
  4: 2.727 (SE +/- 0.049, N = 3; MIN: 2.39 / MAX: 4.04)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.1 - Model: inception-v3 (ms, Fewer Is Better)
  1: 51.72 (SE +/- 0.23, N = 3; MIN: 50.92 / MAX: 61.6)
  2: 52.55 (SE +/- 0.05, N = 3; MIN: 52.14 / MAX: 58.92)
  3: 51.54 (SE +/- 0.30, N = 3; MIN: 50.65 / MAX: 54.82)
  4: 52.12 (SE +/- 0.11, N = 3; MIN: 51.67 / MAX: 56.33)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
CP2K Molecular Dynamics 8.1 - Fayalite-FIST Data (Seconds, Fewer Is Better)
  1: 1257.88
  2: 1278.27
  3: 1260.81
  4: 1259.58
OpenFOAM 8 - Input: Motorbike 30M (Seconds, Fewer Is Better)
  1: 100.83 (SE +/- 0.12, N = 3)
  2: 100.73 (SE +/- 0.02, N = 3)
  3: 100.74 (SE +/- 0.05, N = 3)
  4: 100.57 (SE +/- 0.05, N = 3)
  (CXX) g++ options: -std=c++11 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm
OpenFOAM 8 - Input: Motorbike 60M (Seconds, Fewer Is Better)
  1: 656.35 (SE +/- 0.07, N = 3)
  2: 653.80 (SE +/- 0.20, N = 3)
  3: 653.57 (SE +/- 0.32, N = 3)
  4: 654.47 (SE +/- 0.12, N = 3)
  (CXX) g++ options: -std=c++11 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm
Quantum ESPRESSO 6.7 - Input: AUSURF112 (Seconds, Fewer Is Better)
  1: 1812.29 (SE +/- 20.09, N = 3)
  2: 1833.34 (SE +/- 7.36, N = 3)
  3: 1828.94 (SE +/- 10.93, N = 3)
  4: 1856.05 (SE +/- 2.48, N = 3)
  (F9X) gfortran options: -lopenblas -lFoX_dom -lFoX_sax -lFoX_wxml -lFoX_common -lFoX_utils -lFoX_fsys -lfftw3 -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
Timed Eigen Compilation 3.3.9 - Time To Compile (Seconds, Fewer Is Better)
  1: 74.89 (SE +/- 0.06, N = 3)
  2: 74.73 (SE +/- 0.02, N = 3)
  3: 74.89 (SE +/- 0.06, N = 3)
  4: 74.85 (SE +/- 0.05, N = 3)
Monkey Audio Encoding 3.99.6 - WAV To APE (Seconds, Fewer Is Better)
  1: 11.20 (SE +/- 0.02, N = 5)
  2: 11.20 (SE +/- 0.03, N = 5)
  3: 11.20 (SE +/- 0.03, N = 5)
  4: 11.25 (SE +/- 0.01, N = 5)
  (CXX) g++ options: -O3 -pedantic -rdynamic -lrt
Opus Codec Encoding 1.3.1 - WAV To Opus Encode (Seconds, Fewer Is Better)
  1: 10.42 (SE +/- 0.02, N = 5)
  2: 10.41 (SE +/- 0.02, N = 5)
  3: 10.42 (SE +/- 0.02, N = 5)
  4: 10.43 (SE +/- 0.02, N = 5)
  (CXX) g++ options: -fvisibility=hidden -logg -lm
WavPack Audio Encoding 5.3 - WAV To WavPack (Seconds, Fewer Is Better)
  1: 14.79 (SE +/- 0.00, N = 5)
  2: 14.79 (SE +/- 0.00, N = 5)
  3: 14.79 (SE +/- 0.00, N = 5)
  4: 14.79 (SE +/- 0.00, N = 5)
  (CXX) g++ options: -rdynamic
Unpacking Firefox 84.0 - Extracting: firefox-84.0.source.tar.xz (Seconds, Fewer Is Better)
  1: 19.72 (SE +/- 0.15, N = 4)
  2: 19.77 (SE +/- 0.16, N = 19)
  4: 19.69 (SE +/- 0.11, N = 4)
Phoronix Test Suite v10.8.4