5950X ASUS ROG CROSSHAIR VIII HERO WiFi BIOS
AMD Ryzen 9 5950X 16-Core testing with an ASUS ROG CROSSHAIR VIII HERO (WI-FI) (3202 BIOS) and AMD Radeon RX 5600 OEM/5600 XT / 5700/5700 8GB on Ubuntu 20.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2101203-HA-5950XASUS43&sor&grt

System details: 3003
  Processor: AMD Ryzen 9 5950X 16-Core @ 3.40GHz (16 Cores / 32 Threads)
  Motherboard: ASUS ROG CROSSHAIR VIII HERO (WI-FI) (3003 BIOS)
  Chipset: AMD Starship/Matisse
  Memory: 32GB
  Disk: 2000GB Corsair Force MP600 + 2000GB
  Graphics: AMD Radeon RX 5600 OEM/5600 XT / 5700/5700 8GB (2100/875MHz)
  Audio: AMD Navi 10 HDMI Audio
  Monitor: ASUS MG28U
  Network: Realtek RTL8125 2.5GbE + Intel I211 + Intel Wi-Fi 6 AX200
  OS: Ubuntu 20.10
  Kernel: 5.11.0-051100rc2daily20210108-generic (x86_64) 20210107
  Desktop: GNOME Shell 3.38.1
  Display Server: X Server 1.20.9
  Display Driver: amdgpu 19.1.0
  OpenGL: 4.6 Mesa 21.0.0-devel (git-f01bca8 2021-01-08 groovy-oibaf-ppa) (LLVM 11.0.1)
  Vulkan: 1.2.164
  Compiler: GCC 10.2.0
  File-System: ext4
  Screen Resolution: 3840x2160

System details: 3202 (only the differing field is listed)
  Motherboard: ASUS ROG CROSSHAIR VIII HERO (WI-FI) (3202 BIOS)

Compiler Details
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
  Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
  CPU Microcode: 0xa201009

Graphics Details
  GLAMOR

Python Details
  Python 3.8.6

Security Details
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling
  srbds: Not affected
  tsx_async_abort: Not affected

Disk Details
  3202: NONE / errors=remount-ro,relatime,rw / Block Size: 4096

Result overview
Test | 3003 | 3202
amg: | 207261167 | 210165200
astcenc: Fast | 4.11 | 4.10
astcenc: Medium | 5.35 | 5.43
astcenc: Thorough | 12.47 | 12.67
astcenc: Exhaustive | 99.00 | 100.90
brl-cad: VGR Performance Metric | 265419 | 262742
build2: Time To Compile | 79.997 | 81.883
cloverleaf: Lagrangian-Eulerian Hydrodynamics | 136.80 | 134.74
coremark: CoreMark Size 666 - Iterations Per Second | 829763.126732 | 815726.122263
cp2k: Fayalite-FIST Data | 787.495 | 801.638
crafty: Elapsed Time | 11736507 | 11427830
dav1d: Chimera 1080p | 590.32 | 592.56
dav1d: Summer Nature 4K | 224.40 | 228.62
dav1d: Summer Nature 1080p | 534.95 | 535.52
dav1d: Chimera 1080p 10-bit | 96.35 | 96.66
deepspeech: CPU | 70.48086 | 70.53470
dolfyn: Computational Fluid Dynamics | 12.898 | 13.292
espeak: Text-To-Speech Synthesis | 21.560 | 21.692
etlegacy: Renderer2 - 3840 x 2160 | 224.3 |
etcpak: DXT1 | 1532.411 | 1562.887
etcpak: ETC1 | 382.932 | 387.141
etcpak: ETC2 | 236.507 | 244.998
etcpak: ETC1 + Dithering | 349.700 | 355.439
synthmark: VoiceMark_100 | 958.662 | 957.951
gromacs: Water Benchmark | 1.275 | 1.258
hpcc: G-HPL | 53.16620 | 53.16863
hpcc: G-Ffte | 6.23452 | 6.55050
hpcc: EP-DGEMM | 16.85917 | 16.57023
hpcc: G-Ptrans | 2.45326 | 2.47841
hpcc: EP-STREAM Triad | 1.42468 | 1.47956
hpcc: G-Rand Access | 0.04998 | 0.04973
hpcc: Rand Ring Latency | 0.48933 | 0.49204
hpcc: Rand Ring Bandwidth | 1.92562 | 2.02474
hpcc: Max Ping Pong Bandwidth | 34205.275 | 34166.839
indigobench: CPU - Bedroom | 4.175 | 4.132
indigobench: CPU - Supercar | 8.803 | 8.669
ior: 2MB - Default Test Directory | 1540.02 | 1388.54
ior: 4MB - Default Test Directory | 1580.25 | 1482.95
ior: 8MB - Default Test Directory | 1601.21 | 1461.18
ior: 256MB - Default Test Directory | 1339.91 | 1251.65
ior: 512MB - Default Test Directory | 1748.21 | 1682.82
kripke: | 72385553 | 72580800
lammps: 20k Atoms | 13.539 | 13.390
lammps: Rhodopsin Protein | 13.104 | 13.113
libraw: Post-Processing Benchmark | 52.99 | 53.09
lulesh: | 5041.8561 | 5066.7382
compress-lz4: 1 - Compression Speed | 11911.06 | 11854.52
compress-lz4: 1 - Decompression Speed | 13410.2 | 13319.6
compress-lz4: 3 - Compression Speed | 68.88 | 69.10
compress-lz4: 3 - Decompression Speed | 13085.1 | 12946.4
compress-lz4: 9 - Compression Speed | 68.57 | 66.95
compress-lz4: 9 - Decompression Speed | 13093.1 | 12949.0
mnn: SqueezeNetV1.0 | 5.232 | 5.153
mnn: resnet-v2-50 | 24.036 | 23.630
mnn: MobileNetV2_224 | 3.402 | 3.325
mnn: mobilenet-v1-1.0 | 2.501 | 2.481
mnn: inception-v3 | 31.005 | 30.332
encode-ape: WAV To APE | 9.805 | 9.851
namd: ATPase Simulation - 327,506 Atoms | 1.07519 | 1.08736
ncnn: CPU - mobilenet | 12.04 | 11.99
ncnn: CPU-v2-v2 - mobilenet-v2 | 4.38 | 4.46
ncnn: CPU-v3-v3 - mobilenet-v3 | 4.07 | 4.15
ncnn: CPU - shufflenet-v2 | 4.38 | 4.42
ncnn: CPU - mnasnet | 3.90 | 3.96
ncnn: CPU - efficientnet-b0 | 5.29 | 5.37
ncnn: CPU - blazeface | 1.80 | 1.82
ncnn: CPU - googlenet | 12.99 | 13.00
ncnn: CPU - vgg16 | 60.60 | 59.90
ncnn: CPU - resnet18 | 14.51 | 14.57
ncnn: CPU - alexnet | 11.04 | 11.26
ncnn: CPU - resnet50 | 25.05 | 25.00
ncnn: CPU - yolov4-tiny | 21.14 | 21.29
ncnn: CPU - squeezenet_ssd | 14.64 | 14.60
ncnn: CPU - regnety_400m | 17.67 | 17.95
numpy: | 514.13 | 514.08
onednn: IP Shapes 1D - f32 - CPU | 3.98022 | 3.97992
onednn: IP Shapes 3D - f32 - CPU | 9.48452 | 9.51868
onednn: IP Shapes 1D - u8s8f32 - CPU | 0.816147 | 0.832251
onednn: IP Shapes 3D - u8s8f32 - CPU | 0.477138 | 0.484368
onednn: Convolution Batch Shapes Auto - f32 - CPU | 17.2340 | 17.2901
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 2.38690 | 2.43213
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 3.50445 | 3.53182
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 19.1307 | 19.2565
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 2.95002 | 2.99402
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 1.53431 | 1.59818
onednn: Recurrent Neural Network Training - f32 - CPU | 2753.77 | 2749.23
onednn: Recurrent Neural Network Inference - f32 - CPU | 1809.76 | 1823.19
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 2752.88 | 2763.90
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 1783.66 | 1789.23
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 0.625862 | 0.636735
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 2769.09 | 2761.96
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 1818.11 | 1795.37
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 1.45970 | 1.48786
onnx: yolov4 - OpenMP CPU | 438 | 428
onnx: bertsquad-10 - OpenMP CPU | 705 | 665
onnx: fcn-resnet101-11 - OpenMP CPU | 99 | 98
onnx: shufflenet-v2-10 - OpenMP CPU | 16013 | 15863
onnx: super-resolution-10 - OpenMP CPU | 7056 | 7324
openfoam: Motorbike 30M | 104.48 | 97.87
openfoam: Motorbike 60M | 1391.39 | 1380.20
encode-opus: WAV To Opus Encode | 6.126 | 6.150
phpbench: PHP Benchmark Suite | 831939 | 834019
qmcpack: simple-H2O | 22.312 | 22.371
qe: AUSURF112 | 1199.39 | 1221.19
rav1e: 5 | 1.497 | 1.505
rav1e: 6 | 1.958 | 1.965
rav1e: 10 | 3.362 | 3.446
relion: Basic - CPU | 1892.609 | 1875.48
rnnoise: | 15.181 | 15.234
sqlite-speedtest: Timed Time - Size 1,000 | 41.631 | 42.862
tesseract: 3840 x 2160 | 356.1715 | 353.5707
build-eigen: Time To Compile | 60.052 | 61.251
build-godot: Time To Compile | 79.198 | 80.072
hmmer: Pfam Database Search | 82.704 | 82.928
build-linux-kernel: Time To Compile | 45.787 | 46.410
tnn: CPU - MobileNet v2 | 218.293 | 220.329
tnn: CPU - SqueezeNet v1.1 | 211.459 | 212.620
warsow: 3840 x 2160 | 429.6 | 430.5
encode-wavpack: WAV To WavPack | 10.947 | 11.112
webp: Quality 100 | 1.741 | 1.729
webp: Quality 100, Lossless | 13.039 | 12.873
webp: Quality 100, Highest Compression | 5.465 | 5.360
webp: Quality 100, Lossless, Highest Compression | 27.225 | 27.522
x265: Bosphorus 4K | 24.57 | 24.18
x265: Bosphorus 1080p | 47.87 | 47.62
xonotic: 3840 x 2160 - Ultimate | 257.4557090 | 288.9672535
yquake2: OpenGL 3.x - 3840 x 2160 | 979.3 | 986.5
compress-zstd: 3 | 4723.9 | 4738.4
compress-zstd: 19 | 43.4 | 43.3
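
One way to read the side-by-side overview is as a percent gain of BIOS 3202 over BIOS 3003 for each test. The sketch below is not part of the exported result: the helper name `relative_gain` and the choice of five sample rows are illustrative assumptions, and the "more is better" or "fewer is better" direction is taken from each test's chart header.

```python
# Sample rows copied from the result overview: (3003 value, 3202 value, higher_is_better).
# The selection of rows is arbitrary and only for illustration.
results = {
    "amg": (207261167, 210165200, True),
    "build2: Time To Compile": (79.997, 81.883, False),
    "coremark": (829763.126732, 815726.122263, True),
    "hpcc: G-Ffte": (6.23452, 6.55050, True),
    "ior: 2MB": (1540.02, 1388.54, True),
}


def relative_gain(old, new, higher_is_better):
    """Percent change from old to new, signed so that positive means new is better."""
    change = (new - old) / old * 100.0
    # For time-like metrics a decrease is an improvement, so flip the sign.
    return change if higher_is_better else -change


for name, (bios_3003, bios_3202, higher_is_better) in results.items():
    gain = relative_gain(bios_3003, bios_3202, higher_is_better)
    print(f"{name}: {gain:+.2f}% for 3202 relative to 3003")
```

A positive figure means 3202 did better on that test; small gains should be weighed against the standard errors reported with each chart.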

Algebraic Multi-Grid Benchmark 1.2
Figure Of Merit, More Is Better
  3202: 210165200 (SE +/- 569155.55, N = 3)
  3003: 207261167 (SE +/- 2159740.57, N = 3)
1. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -pthread -lmpi

ASTC Encoder 2.0
Preset: Fast (Seconds, Fewer Is Better)
  3202: 4.10 (SE +/- 0.03, N = 3)
  3003: 4.11 (SE +/- 0.01, N = 3)
Preset: Medium (Seconds, Fewer Is Better)
  3003: 5.35 (SE +/- 0.04, N = 3)
  3202: 5.43 (SE +/- 0.02, N = 3)
Preset: Thorough (Seconds, Fewer Is Better)
  3003: 12.47 (SE +/- 0.03, N = 3)
  3202: 12.67 (SE +/- 0.02, N = 3)
Preset: Exhaustive (Seconds, Fewer Is Better)
  3003: 99.00 (SE +/- 0.10, N = 3)
  3202: 100.90 (SE +/- 0.15, N = 3)
1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread

BRL-CAD 7.30.8 VGR Performance Metric
VGR Performance Metric, More Is Better
  3003: 265419
  3202: 262742
1. (CXX) g++ options: -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -rdynamic -lSM -lICE -lXi -lGLU -lGL -lGLdispatch -lX11 -lXext -lXrender -lpthread -ldl -luuid -lm

Build2 0.13 Time To Compile
Seconds, Fewer Is Better
  3003: 80.00 (SE +/- 0.10, N = 3)
  3202: 81.88 (SE +/- 0.12, N = 3)

CloverLeaf Lagrangian-Eulerian Hydrodynamics
Seconds, Fewer Is Better
  3202: 134.74 (SE +/- 0.24, N = 3)
  3003: 136.80 (SE +/- 0.03, N = 3)
1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp

Coremark 1.0 CoreMark Size 666 - Iterations Per Second
Iterations/Sec, More Is Better
  3003: 829763.13 (SE +/- 1722.61, N = 3)
  3202: 815726.12 (SE +/- 552.83, N = 3)
1. (CC) gcc options: -O2 -lrt" -lrt

CP2K Molecular Dynamics 8.1 Fayalite-FIST Data
Seconds, Fewer Is Better
  3003: 787.50
  3202: 801.64

Crafty 25.2 Elapsed Time
Nodes Per Second, More Is Better
  3003: 11736507 (SE +/- 108514.09, N = 3)
  3202: 11427830 (SE +/- 40276.62, N = 3)
1. (CC) gcc options: -pthread -lstdc++ -fprofile-use -lm

dav1d 0.8.1
Video Input: Chimera 1080p (FPS, More Is Better)
  3202: 592.56 (SE +/- 0.77, N = 3; MIN: 447.8 / MAX: 754.79)
  3003: 590.32 (SE +/- 0.69, N = 3; MIN: 447.67 / MAX: 749.27)
Video Input: Summer Nature 4K (FPS, More Is Better)
  3202: 228.62 (SE +/- 0.36, N = 3; MIN: 172.54 / MAX: 238.82)
  3003: 224.40 (SE +/- 0.49, N = 3; MIN: 172.58 / MAX: 234.67)
Video Input: Summer Nature 1080p (FPS, More Is Better)
  3202: 535.52 (SE +/- 1.59, N = 3; MIN: 453.14 / MAX: 589.94)
  3003: 534.95 (SE +/- 4.29, N = 3; MIN: 432.57 / MAX: 611.74)
Video Input: Chimera 1080p 10-bit (FPS, More Is Better)
  3202: 96.66 (SE +/- 0.05, N = 3; MIN: 61.56 / MAX: 221.22)
  3003: 96.35 (SE +/- 0.06, N = 3; MIN: 61.49 / MAX: 217.11)
1. (CC) gcc options: -pthread

DeepSpeech 0.6 Acceleration: CPU
Seconds, Fewer Is Better
  3003: 70.48 (SE +/- 0.10, N = 3)
  3202: 70.53 (SE +/- 0.04, N = 3)

Dolfyn 0.527 Computational Fluid Dynamics
Seconds, Fewer Is Better
  3003: 12.90 (SE +/- 0.10, N = 3)
  3202: 13.29 (SE +/- 0.02, N = 3)

eSpeak-NG Speech Engine 20200907 Text-To-Speech Synthesis
Seconds, Fewer Is Better
  3003: 21.56 (SE +/- 0.08, N = 4)
  3202: 21.69 (SE +/- 0.08, N = 4)
1. (CC) gcc options: -O2 -std=c99

ET: Legacy 2.75 Renderer: Renderer2 - Resolution: 3840 x 2160
Frames Per Second, More Is Better
  3003: 224.3

Etcpak 0.7
Configuration: DXT1 (Mpx/s, More Is Better)
  3202: 1562.89 (SE +/- 21.94, N = 3)
  3003: 1532.41 (SE +/- 2.32, N = 3)
Configuration: ETC1 (Mpx/s, More Is Better)
  3202: 387.14 (SE +/- 1.60, N = 3)
  3003: 382.93 (SE +/- 1.98, N = 3)
Configuration: ETC2 (Mpx/s, More Is Better)
  3202: 245.00 (SE +/- 1.66, N = 3)
  3003: 236.51 (SE +/- 0.59, N = 3)
Configuration: ETC1 + Dithering (Mpx/s, More Is Better)
  3202: 355.44 (SE +/- 3.87, N = 3)
  3003: 349.70 (SE +/- 0.36, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread

Google SynthMark 20201109 Test: VoiceMark_100
Voices, More Is Better
  3003: 958.66 (SE +/- 3.00, N = 3)
  3202: 957.95 (SE +/- 4.96, N = 3)
1. (CXX) g++ options: -lm -lpthread -std=c++11 -Ofast

GROMACS 2020.3 Water Benchmark
Ns Per Day, More Is Better
  3003: 1.275 (SE +/- 0.001, N = 3)
  3202: 1.258 (SE +/- 0.001, N = 3)
1. (CXX) g++ options: -O3 -pthread -lrt -lpthread -lm

HPC Challenge 1.5.0
Test / Class: G-HPL (GFLOPS, More Is Better)
  3202: 53.17 (SE +/- 0.05, N = 3)
  3003: 53.17 (SE +/- 0.08, N = 3)
Test / Class: G-Ffte (GFLOPS, More Is Better)
  3202: 6.55050 (SE +/- 0.02613, N = 3)
  3003: 6.23452 (SE +/- 0.10692, N = 3)
Test / Class: EP-DGEMM (GFLOPS, More Is Better)
  3003: 16.86 (SE +/- 0.11, N = 3)
  3202: 16.57 (SE +/- 0.17, N = 3)
Test / Class: G-Ptrans (GB/s, More Is Better)
  3202: 2.47841 (SE +/- 0.01009, N = 3)
  3003: 2.45326 (SE +/- 0.00427, N = 3)
Test / Class: EP-STREAM Triad (GB/s, More Is Better)
  3202: 1.47956 (SE +/- 0.00089, N = 3)
  3003: 1.42468 (SE +/- 0.00070, N = 3)
Test / Class: G-Random Access (GUP/s, More Is Better)
  3003: 0.04998 (SE +/- 0.00046, N = 3)
  3202: 0.04973 (SE +/- 0.00017, N = 3)
Test / Class: Random Ring Latency (usecs, Fewer Is Better)
  3003: 0.48933 (SE +/- 0.00404, N = 3)
  3202: 0.49204 (SE +/- 0.00234, N = 3)
Test / Class: Random Ring Bandwidth (GB/s, More Is Better)
  3202: 2.02474 (SE +/- 0.02919, N = 3)
  3003: 1.92562 (SE +/- 0.02644, N = 3)
Test / Class: Max Ping Pong Bandwidth (MB/s, More Is Better)
  3003: 34205.28 (SE +/- 146.47, N = 3)
  3202: 34166.84 (SE +/- 130.59, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. OpenBLAS + Open MPI 4.0.3

IndigoBench 4.4
Acceleration: CPU - Scene: Bedroom (M samples/s, More Is Better)
  3003: 4.175 (SE +/- 0.006, N = 3)
  3202: 4.132 (SE +/- 0.017, N = 3)
Acceleration: CPU - Scene: Supercar (M samples/s, More Is Better)
  3003: 8.803 (SE +/- 0.008, N = 3)
  3202: 8.669 (SE +/- 0.011, N = 3)

IOR 3.3.0
Block Size: 2MB - Disk Target: Default Test Directory (MB/s, More Is Better)
  3003: 1540.02 (SE +/- 2.40, N = 3; MIN: 1034.78 / MAX: 2149.91)
  3202: 1388.54 (SE +/- 14.36, N = 3; MIN: 890.97 / MAX: 2113.75)
Block Size: 4MB - Disk Target: Default Test Directory (MB/s, More Is Better)
  3003: 1580.25 (SE +/- 5.96, N = 3; MIN: 1161.4 / MAX: 2244.12)
  3202: 1482.95 (SE +/- 11.11, N = 11; MIN: 955.9 / MAX: 2484.92)
Block Size: 8MB - Disk Target: Default Test Directory (MB/s, More Is Better)
  3003: 1601.21 (SE +/- 6.28, N = 3; MIN: 1005.82 / MAX: 2534.37)
  3202: 1461.18 (SE +/- 28.69, N = 13; MIN: 491.21 / MAX: 2711.8)
Block Size: 256MB - Disk Target: Default Test Directory (MB/s, More Is Better)
  3003: 1339.91 (SE +/- 19.35, N = 9; MIN: 282.98 / MAX: 2236.64)
  3202: 1251.65 (SE +/- 14.35, N = 9; MIN: 354.68 / MAX: 2107.13)
Block Size: 512MB - Disk Target: Default Test Directory (MB/s, More Is Better)
  3003: 1748.21 (SE +/- 11.67, N = 3; MIN: 534.9 / MAX: 2360.08)
  3202: 1682.82 (SE +/- 22.61, N = 9; MIN: 251.69 / MAX: 2253.72)
1. (CC) gcc options: -O2 -lm -pthread -lmpi
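
The IOR charts report a standard error (SE) alongside each mean, so the gap between the two BIOS versions can be compared against the run-to-run noise. The following is a rough sketch under stated assumptions, not part of the exported result: the helper name `z_score` is hypothetical, each SE is assumed to belong to the configuration listed in the same position in the chart, and with so few runs per configuration the ratio is only a heuristic, not a formal significance test.

```python
import math

# (mean MB/s, standard error) for BIOS 3003 and BIOS 3202, copied from the IOR charts.
# Assumption: the i-th SE in a chart belongs to the i-th listed configuration.
ior = {
    "2MB": ((1540.02, 2.40), (1388.54, 14.36)),
    "4MB": ((1580.25, 5.96), (1482.95, 11.11)),
    "8MB": ((1601.21, 6.28), (1461.18, 28.69)),
    "256MB": ((1339.91, 19.35), (1251.65, 14.35)),
    "512MB": ((1748.21, 11.67), (1682.82, 22.61)),
}


def z_score(a, b):
    """Difference of two means divided by their combined standard error."""
    (mean_a, se_a), (mean_b, se_b) = a, b
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)


for block_size, (bios_3003, bios_3202) in ior.items():
    z = z_score(bios_3003, bios_3202)
    print(f"IOR {block_size}: 3003 minus 3202 is {z:.1f} combined standard errors")
```

Larger absolute ratios suggest the difference is less likely to be run-to-run noise; disk benchmarks are also sensitive to drive state, which this calculation does not account for.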

Kripke 1.2.4
Throughput FoM, More Is Better
  3202: 72580800 (SE +/- 319563.42, N = 3)
  3003: 72385553 (SE +/- 1023469.37, N = 3)
1. (CXX) g++ options: -O3 -fopenmp

LAMMPS Molecular Dynamics Simulator 29Oct2020
Model: 20k Atoms (ns/day, More Is Better)
  3003: 13.54 (SE +/- 0.08, N = 3)
  3202: 13.39 (SE +/- 0.03, N = 3)
Model: Rhodopsin Protein (ns/day, More Is Better)
  3202: 13.11 (SE +/- 0.13, N = 15)
  3003: 13.10 (SE +/- 0.15, N = 15)
1. (CXX) g++ options: -O3 -pthread -lm

LibRaw 0.20 Post-Processing Benchmark
Mpix/sec, More Is Better
  3202: 53.09 (SE +/- 0.24, N = 3)
  3003: 52.99 (SE +/- 0.08, N = 3)
1. (CXX) g++ options: -O2 -fopenmp -ljpeg -lz -lm

LULESH 2.0.3
z/s, More Is Better
  3202: 5066.74 (SE +/- 66.75, N = 3)
  3003: 5041.86 (SE +/- 59.80, N = 3)
1. (CXX) g++ options: -O3 -fopenmp -lm -pthread -lmpi_cxx -lmpi

LZ4 Compression 1.9.3
Compression Level: 1 - Compression Speed (MB/s, More Is Better)
  3003: 11911.06 (SE +/- 49.00, N = 3)
  3202: 11854.52 (SE +/- 59.07, N = 3)
Compression Level: 1 - Decompression Speed (MB/s, More Is Better)
  3003: 13410.2 (SE +/- 19.80, N = 3)
  3202: 13319.6 (SE +/- 25.73, N = 3)
Compression Level: 3 - Compression Speed (MB/s, More Is Better)
  3202: 69.10 (SE +/- 0.20, N = 3)
  3003: 68.88 (SE +/- 0.92, N = 3)
Compression Level: 3 - Decompression Speed (MB/s, More Is Better)
  3003: 13085.1 (SE +/- 7.36, N = 3)
  3202: 12946.4 (SE +/- 2.16, N = 3)
Compression Level: 9 - Compression Speed (MB/s, More Is Better)
  3003: 68.57 (SE +/- 0.80, N = 12)
  3202: 66.95 (SE +/- 0.20, N = 3)
Compression Level: 9 - Decompression Speed (MB/s, More Is Better)
  3003: 13093.1 (SE +/- 11.50, N = 12)
  3202: 12949.0 (SE +/- 11.91, N = 3)
1. (CC) gcc options: -O3

Mobile Neural Network 1.1.1
Model: SqueezeNetV1.0 (ms, Fewer Is Better)
  3202: 5.153 (SE +/- 0.022, N = 3; MIN: 5.02 / MAX: 14.32)
  3003: 5.232 (SE +/- 0.062, N = 3; MIN: 5.02 / MAX: 8.4)
Model: resnet-v2-50 (ms, Fewer Is Better)
  3202: 23.63 (SE +/- 0.19, N = 3; MIN: 22.31 / MAX: 33.12)
  3003: 24.04 (SE +/- 0.25, N = 3; MIN: 21.95 / MAX: 33.08)
Model: MobileNetV2_224 (ms, Fewer Is Better)
  3202: 3.325 (SE +/- 0.039, N = 3; MIN: 3.16 / MAX: 4.02)
  3003: 3.402 (SE +/- 0.047, N = 3; MIN: 3.23 / MAX: 5.81)
Model: mobilenet-v1-1.0 (ms, Fewer Is Better)
  3202: 2.481 (SE +/- 0.027, N = 3; MIN: 2.42 / MAX: 2.71)
  3003: 2.501 (SE +/- 0.094, N = 3; MIN: 2.32 / MAX: 4.53)
Model: inception-v3 (ms, Fewer Is Better)
  3202: 30.33 (SE +/- 0.26, N = 3; MIN: 29.28 / MAX: 56.48)
  3003: 31.01 (SE +/- 0.18, N = 3; MIN: 29.92 / MAX: 38.55)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Monkey Audio Encoding 3.99.6 WAV To APE
Seconds, Fewer Is Better
  3003: 9.805 (SE +/- 0.045, N = 5)
  3202: 9.851 (SE +/- 0.041, N = 5)
1. (CXX) g++ options: -O3 -pedantic -rdynamic -lrt

NAMD 2.14 ATPase Simulation - 327,506 Atoms
days/ns, Fewer Is Better
  3003: 1.07519 (SE +/- 0.00258, N = 3)
  3202: 1.08736 (SE +/- 0.00500, N = 3)

NCNN 20201218
Target: CPU - Model: mobilenet (ms, Fewer Is Better)
  3202: 11.99 (SE +/- 0.01, N = 3; MIN: 11.79 / MAX: 12.19)
  3003: 12.04 (SE +/- 0.15, N = 3; MIN: 11.72 / MAX: 14.17)
Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  3003: 4.38 (SE +/- 0.01, N = 3; MIN: 4.24 / MAX: 7.32)
  3202: 4.46 (SE +/- 0.01, N = 3; MIN: 4.31 / MAX: 5.3)
Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  3003: 4.07 (SE +/- 0.01, N = 3; MIN: 4.03 / MAX: 5.2)
  3202: 4.15 (SE +/- 0.00, N = 3; MIN: 4.11 / MAX: 5.03)
Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  3003: 4.38 (SE +/- 0.01, N = 3; MIN: 4.33 / MAX: 4.89)
  3202: 4.42 (SE +/- 0.01, N = 3; MIN: 4.35 / MAX: 5.28)
Target: CPU - Model: mnasnet (ms, Fewer Is Better)
  3003: 3.90 (SE +/- 0.00, N = 3; MIN: 3.78 / MAX: 4.83)
  3202: 3.96 (SE +/- 0.01, N = 3; MIN: 3.84 / MAX: 5.26)
Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  3003: 5.29 (SE +/- 0.00, N = 3; MIN: 5.24 / MAX: 5.79)
  3202: 5.37 (SE +/- 0.01, N = 3; MIN: 5.31 / MAX: 7.18)
Target: CPU - Model: blazeface (ms, Fewer Is Better)
  3003: 1.80 (SE +/- 0.00, N = 3; MIN: 1.78 / MAX: 2.28)
  3202: 1.82 (SE +/- 0.00, N = 3; MIN: 1.79 / MAX: 2.65)
Target: CPU - Model: googlenet (ms, Fewer Is Better)
  3003: 12.99 (SE +/- 0.00, N = 3; MIN: 12.62 / MAX: 13.57)
  3202: 13.00 (SE +/- 0.02, N = 3; MIN: 12.64 / MAX: 20.99)
Target: CPU - Model: vgg16 (ms, Fewer Is Better)
  3202: 59.90 (SE +/- 0.09, N = 3; MIN: 58.71 / MAX: 61.76)
  3003: 60.60 (SE +/- 0.13, N = 3; MIN: 59.43 / MAX: 62.32)
Target: CPU - Model: resnet18 (ms, Fewer Is Better)
  3003: 14.51 (SE +/- 0.01, N = 3; MIN: 14.39 / MAX: 15.06)
  3202: 14.57 (SE +/- 0.01, N = 3; MIN: 14.45 / MAX: 23.09)
Target: CPU - Model: alexnet (ms, Fewer Is Better)
  3003: 11.04 (SE +/- 0.01, N = 3; MIN: 10.95 / MAX: 11.89)
  3202: 11.26 (SE +/- 0.24, N = 3; MIN: 10.91 / MAX: 19.04)
Target: CPU - Model: resnet50 (ms, Fewer Is Better)
  3202: 25.00 (SE +/- 0.17, N = 3; MIN: 24.58 / MAX: 26.31)
  3003: 25.05 (SE +/- 0.13, N = 3; MIN: 24.65 / MAX: 26.23)
Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better)
  3003: 21.14 (SE +/- 0.11, N = 3; MIN: 20.72 / MAX: 29.47)
  3202: 21.29 (SE +/- 0.08, N = 3; MIN: 20.98 / MAX: 21.78)
Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better)
  3202: 14.60 (SE +/- 0.04, N = 3; MIN: 14.21 / MAX: 15.07)
  3003: 14.64 (SE +/- 0.03, N = 3; MIN: 14.31 / MAX: 15.31)
Target: CPU - Model: regnety_400m (ms, Fewer Is Better)
  3003: 17.67 (SE +/- 0.04, N = 3; MIN: 17.47 / MAX: 18.02)
  3202: 17.95 (SE +/- 0.13, N = 3; MIN: 17.66 / MAX: 19.3)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Numpy Benchmark
Score, More Is Better
  3003: 514.13 (SE +/- 0.48, N = 3)
  3202: 514.08 (SE +/- 4.15, N = 3)
oneDNN Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU 3202 3003 0.8955 1.791 2.6865 3.582 4.4775 SE +/- 0.00985, N = 3 SE +/- 0.01644, N = 3 3.97992 3.98022 MIN: 3.76 MIN: 3.72 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU 3003 3202 3 6 9 12 15 SE +/- 0.01431, N = 3 SE +/- 0.00745, N = 3 9.48452 9.51868 MIN: 9.38 MIN: 9.44 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU 3003 3202 0.1873 0.3746 0.5619 0.7492 0.9365 SE +/- 0.001756, N = 3 SE +/- 0.001536, N = 3 0.816147 0.832251 MIN: 0.74 MIN: 0.75 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU 3003 3202 0.109 0.218 0.327 0.436 0.545 SE +/- 0.002170, N = 3 SE +/- 0.003401, N = 15 0.477138 0.484368 MIN: 0.44 MIN: 0.43 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): BIOS 3003: 17.23 (SE +/- 0.01, N = 3, MIN: 16.81); BIOS 3202: 17.29 (SE +/- 0.02, N = 3, MIN: 16.66). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): BIOS 3003: 2.38690 (SE +/- 0.00251, N = 3, MIN: 2.28); BIOS 3202: 2.43213 (SE +/- 0.00987, N = 3, MIN: 2.3). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): BIOS 3003: 3.50445 (SE +/- 0.00613, N = 3, MIN: 3.38); BIOS 3202: 3.53182 (SE +/- 0.00421, N = 3, MIN: 3.41). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): BIOS 3003: 19.13 (SE +/- 0.02, N = 3, MIN: 18.77); BIOS 3202: 19.26 (SE +/- 0.04, N = 3, MIN: 18.8). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): BIOS 3003: 2.95002 (SE +/- 0.00671, N = 3, MIN: 2.81); BIOS 3202: 2.99402 (SE +/- 0.00911, N = 3, MIN: 2.83). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): BIOS 3003: 1.53431 (SE +/- 0.00153, N = 3, MIN: 1.41); BIOS 3202: 1.59818 (SE +/- 0.00175, N = 3, MIN: 1.48). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): BIOS 3202: 2749.23 (SE +/- 2.89, N = 3, MIN: 2735.56); BIOS 3003: 2753.77 (SE +/- 13.63, N = 3, MIN: 2727.8). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): BIOS 3003: 1809.76 (SE +/- 15.33, N = 3, MIN: 1773.21); BIOS 3202: 1823.19 (SE +/- 9.60, N = 3, MIN: 1798.09). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): BIOS 3003: 2752.88 (SE +/- 11.23, N = 3, MIN: 2722.06); BIOS 3202: 2763.90 (SE +/- 9.00, N = 3, MIN: 2736.15). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): BIOS 3003: 1783.66 (SE +/- 8.25, N = 3, MIN: 1765.38); BIOS 3202: 1789.23 (SE +/- 17.01, N = 3, MIN: 1761.64). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): BIOS 3003: 0.625862 (SE +/- 0.000749, N = 3, MIN: 0.6); BIOS 3202: 0.636735 (SE +/- 0.000300, N = 3, MIN: 0.6). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): BIOS 3202: 2761.96 (SE +/- 4.41, N = 3, MIN: 2740.57); BIOS 3003: 2769.09 (SE +/- 9.46, N = 3, MIN: 2739.1). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): BIOS 3202: 1795.37 (SE +/- 19.66, N = 4, MIN: 1755.37); BIOS 3003: 1818.11 (SE +/- 20.56, N = 3, MIN: 1764.89). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): BIOS 3003: 1.45970 (SE +/- 0.00059, N = 3, MIN: 1.39); BIOS 3202: 1.48786 (SE +/- 0.00214, N = 3, MIN: 1.39). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
ONNX Runtime 1.6 - Model: yolov4 - Device: OpenMP CPU (Inferences Per Minute, More Is Better): BIOS 3003: 438 (SE +/- 1.17, N = 3); BIOS 3202: 428 (SE +/- 3.35, N = 3). (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
ONNX Runtime 1.6 - Model: bertsquad-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better): BIOS 3003: 705 (SE +/- 1.32, N = 3); BIOS 3202: 665 (SE +/- 11.02, N = 12). (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
ONNX Runtime 1.6 - Model: fcn-resnet101-11 - Device: OpenMP CPU (Inferences Per Minute, More Is Better): BIOS 3003: 99; BIOS 3202: 98. SE +/- 0.33, N = 3. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
ONNX Runtime 1.6 - Model: shufflenet-v2-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better): BIOS 3003: 16013 (SE +/- 20.67, N = 3); BIOS 3202: 15863 (SE +/- 68.50, N = 3). (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
ONNX Runtime 1.6 - Model: super-resolution-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better): BIOS 3202: 7324 (SE +/- 175.44, N = 12); BIOS 3003: 7056 (SE +/- 202.78, N = 12). (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
OpenFOAM 8 - Input: Motorbike 30M (Seconds, Fewer Is Better): BIOS 3202: 97.87 (SE +/- 0.15, N = 3); BIOS 3003: 104.48 (SE +/- 0.09, N = 3). (CXX) g++ options: -std=c++11 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm
OpenFOAM 8 - Input: Motorbike 60M (Seconds, Fewer Is Better): BIOS 3202: 1380.20 (SE +/- 0.41, N = 3); BIOS 3003: 1391.39 (SE +/- 0.57, N = 3). (CXX) g++ options: -std=c++11 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm
Opus Codec Encoding 1.3.1 - WAV To Opus Encode (Seconds, Fewer Is Better): BIOS 3003: 6.126 (SE +/- 0.032, N = 5); BIOS 3202: 6.150 (SE +/- 0.042, N = 5). (CXX) g++ options: -fvisibility=hidden -logg -lm
PHPBench 0.8.1 - PHP Benchmark Suite (Score, More Is Better): BIOS 3202: 834019 (SE +/- 7522.35, N = 3); BIOS 3003: 831939 (SE +/- 8175.01, N = 3).
QMCPACK 3.10 - Input: simple-H2O (Total Execution Time - Seconds, Fewer Is Better): BIOS 3003: 22.31 (SE +/- 0.21, N = 7); BIOS 3202: 22.37 (SE +/- 0.14, N = 3). (CXX) g++ options: -fopenmp -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -march=native -O3 -fomit-frame-pointer -ffast-math -pthread -lm
Quantum ESPRESSO 6.7 - Input: AUSURF112 (Seconds, Fewer Is Better): BIOS 3003: 1199.39 (SE +/- 0.92, N = 3); BIOS 3202: 1221.19 (SE +/- 3.58, N = 3). (F9X) gfortran options: -lopenblas -lFoX_dom -lFoX_sax -lFoX_wxml -lFoX_common -lFoX_utils -lFoX_fsys -lfftw3 -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz
rav1e 0.4 - Speed: 5 (Frames Per Second, More Is Better): BIOS 3202: 1.505 (SE +/- 0.007, N = 3); BIOS 3003: 1.497 (SE +/- 0.005, N = 3).
rav1e 0.4 - Speed: 6 (Frames Per Second, More Is Better): BIOS 3202: 1.965 (SE +/- 0.016, N = 3); BIOS 3003: 1.958 (SE +/- 0.005, N = 3).
rav1e 0.4 - Speed: 10 (Frames Per Second, More Is Better): BIOS 3202: 3.446 (SE +/- 0.037, N = 15); BIOS 3003: 3.362 (SE +/- 0.037, N = 3).
RELION 3.1.1 - Test: Basic - Device: CPU (Seconds, Fewer Is Better): BIOS 3202: 1875.48 (SE +/- 6.24, N = 3); BIOS 3003: 1892.61 (SE +/- 3.84, N = 3). (CXX) g++ options: -fopenmp -std=c++0x -O3 -rdynamic -ldl -ltiff -lfftw3f -lfftw3 -lpng -pthread -lmpi_cxx -lmpi
RNNoise 2020-06-28 (Seconds, Fewer Is Better): BIOS 3003: 15.18 (SE +/- 0.07, N = 3); BIOS 3202: 15.23 (SE +/- 0.18, N = 3). (CC) gcc options: -O2 -pedantic -fvisibility=hidden
SQLite Speedtest 3.30 - Timed Time - Size 1,000 (Seconds, Fewer Is Better): BIOS 3003: 41.63 (SE +/- 0.34, N = 3); BIOS 3202: 42.86 (SE +/- 0.24, N = 15). (CC) gcc options: -O2 -ldl -lz -lpthread
Tesseract 2014-05-12 - Resolution: 3840 x 2160 (Frames Per Second, More Is Better): BIOS 3003: 356.17 (SE +/- 3.53, N = 15); BIOS 3202: 353.57 (SE +/- 4.21, N = 15).
Timed Eigen Compilation 3.3.9 - Time To Compile (Seconds, Fewer Is Better): BIOS 3003: 60.05 (SE +/- 0.21, N = 3); BIOS 3202: 61.25 (SE +/- 0.01, N = 3).
Timed Godot Game Engine Compilation 3.2.3 - Time To Compile (Seconds, Fewer Is Better): BIOS 3003: 79.20 (SE +/- 0.15, N = 3); BIOS 3202: 80.07 (SE +/- 0.31, N = 3).
Timed HMMer Search 3.3.1 - Pfam Database Search (Seconds, Fewer Is Better): BIOS 3003: 82.70 (SE +/- 0.15, N = 3); BIOS 3202: 82.93 (SE +/- 0.07, N = 3). (CC) gcc options: -O3 -pthread -lhmmer -leasel -lm
Timed Linux Kernel Compilation 5.4 - Time To Compile (Seconds, Fewer Is Better): BIOS 3003: 45.79 (SE +/- 0.43, N = 3); BIOS 3202: 46.41 (SE +/- 0.49, N = 3).
TNN 0.2.3 - Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better): BIOS 3003: 218.29 (SE +/- 0.32, N = 3, MIN: 208.52 / MAX: 289.05); BIOS 3202: 220.33 (SE +/- 0.67, N = 3, MIN: 216.95 / MAX: 261.2). (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
TNN 0.2.3 - Target: CPU - Model: SqueezeNet v1.1 (ms, Fewer Is Better): BIOS 3003: 211.46 (SE +/- 0.43, N = 3, MIN: 210.71 / MAX: 212.37); BIOS 3202: 212.62 (SE +/- 0.34, N = 3, MIN: 211.89 / MAX: 213.24). (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
Warsow 2.5 Beta - Resolution: 3840 x 2160 (Frames Per Second, More Is Better): BIOS 3202: 430.5 (SE +/- 0.67, N = 3); BIOS 3003: 429.6 (SE +/- 0.72, N = 3).
WavPack Audio Encoding 5.3 - WAV To WavPack (Seconds, Fewer Is Better): BIOS 3003: 10.95 (SE +/- 0.06, N = 5); BIOS 3202: 11.11 (SE +/- 0.03, N = 5). (CXX) g++ options: -rdynamic
WebP Image Encode 1.1 - Encode Settings: Quality 100 (Encode Time - Seconds, Fewer Is Better): BIOS 3202: 1.729 (SE +/- 0.004, N = 3); BIOS 3003: 1.741 (SE +/- 0.001, N = 3). (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
WebP Image Encode 1.1 - Encode Settings: Quality 100, Lossless (Encode Time - Seconds, Fewer Is Better): BIOS 3202: 12.87 (SE +/- 0.05, N = 3); BIOS 3003: 13.04 (SE +/- 0.01, N = 3). (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
WebP Image Encode 1.1 - Encode Settings: Quality 100, Highest Compression (Encode Time - Seconds, Fewer Is Better): BIOS 3202: 5.360 (SE +/- 0.062, N = 3); BIOS 3003: 5.465 (SE +/- 0.059, N = 3). (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
WebP Image Encode 1.1 - Encode Settings: Quality 100, Lossless, Highest Compression (Encode Time - Seconds, Fewer Is Better): BIOS 3003: 27.23 (SE +/- 0.05, N = 3); BIOS 3202: 27.52 (SE +/- 0.04, N = 3). (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
x265 3.4 - Video Input: Bosphorus 4K (Frames Per Second, More Is Better): BIOS 3003: 24.57 (SE +/- 0.35, N = 3); BIOS 3202: 24.18 (SE +/- 0.26, N = 4). (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
x265 3.4 - Video Input: Bosphorus 1080p (Frames Per Second, More Is Better): BIOS 3003: 47.87 (SE +/- 0.08, N = 3); BIOS 3202: 47.62 (SE +/- 0.22, N = 3). (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Xonotic 0.8.2 - Resolution: 3840 x 2160 - Effects Quality: Ultimate (Frames Per Second, More Is Better): BIOS 3202: 288.97 (SE +/- 3.79, N = 3, MIN: 60 / MAX: 571); BIOS 3003: 257.46 (SE +/- 4.83, N = 15, MIN: 55 / MAX: 623).
yquake2 7.45 - Renderer: OpenGL 3.x - Resolution: 3840 x 2160 (Frames Per Second, More Is Better): BIOS 3202: 986.5 (SE +/- 1.35, N = 3); BIOS 3003: 979.3 (SE +/- 1.03, N = 3). (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
Zstd Compression 1.4.5 - Compression Level: 3 (MB/s, More Is Better): BIOS 3202: 4738.4 (SE +/- 12.71, N = 3); BIOS 3003: 4723.9 (SE +/- 8.02, N = 3). (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression 1.4.5 - Compression Level: 19 (MB/s, More Is Better): BIOS 3003: 43.4 (SE +/- 0.03, N = 3); BIOS 3202: 43.3 (SE +/- 0.06, N = 3). (CC) gcc options: -O3 -pthread -lz -llzma
Phoronix Test Suite v10.8.5