5950X ASUS ROG CROSSHAIR VIII HERO WiFi BIOS
AMD Ryzen 9 5950X 16-Core testing with an ASUS ROG CROSSHAIR VIII HERO (WI-FI) (3202 BIOS) and AMD Radeon RX 5600 OEM/5600 XT / 5700/5700 8GB on Ubuntu 20.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2101203-HA-5950XASUS43&rdt&gru
System details (result identifiers: 3003, 3202). Only the Motherboard entry is listed separately for 3202.
  Processor: AMD Ryzen 9 5950X 16-Core @ 3.40GHz (16 Cores / 32 Threads)
  Motherboard (3003): ASUS ROG CROSSHAIR VIII HERO (WI-FI) (3003 BIOS)
  Motherboard (3202): ASUS ROG CROSSHAIR VIII HERO (WI-FI) (3202 BIOS)
  Chipset: AMD Starship/Matisse
  Memory: 32GB
  Disk: 2000GB Corsair Force MP600 + 2000GB
  Graphics: AMD Radeon RX 5600 OEM/5600 XT / 5700/5700 8GB (2100/875MHz)
  Audio: AMD Navi 10 HDMI Audio
  Monitor: ASUS MG28U
  Network: Realtek RTL8125 2.5GbE + Intel I211 + Intel Wi-Fi 6 AX200
  OS: Ubuntu 20.10
  Kernel: 5.11.0-051100rc2daily20210108-generic (x86_64) 20210107
  Desktop: GNOME Shell 3.38.1
  Display Server: X Server 1.20.9
  Display Driver: amdgpu 19.1.0
  OpenGL: 4.6 Mesa 21.0.0-devel (git-f01bca8 2021-01-08 groovy-oibaf-ppa) (LLVM 11.0.1)
  Vulkan: 1.2.164
  Compiler: GCC 10.2.0
  File-System: ext4
  Screen Resolution: 3840x2160
Compiler Details
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details
  Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
  CPU Microcode: 0xa201009
Graphics Details
  GLAMOR
Python Details
  Python 3.8.6
Security Details
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling
  srbds: Not affected
  tsx_async_abort: Not affected
Disk Details
  3202: NONE / errors=remount-ro,relatime,rw / Block Size: 4096
Result overview
Test | 3003 | 3202
amg | 207261167 | 210165200
dav1d: Chimera 1080p | 590.32 | 592.56
dav1d: Summer Nature 4K | 224.40 | 228.62
dav1d: Summer Nature 1080p | 534.95 | 535.52
dav1d: Chimera 1080p 10-bit | 96.35 | 96.66
etlegacy: Renderer2 - 3840 x 2160 | 224.3 | (no result)
tesseract: 3840 x 2160 | 356.1715 | 353.5707
warsow: 3840 x 2160 | 429.6 | 430.5
xonotic: 3840 x 2160 - Ultimate | 257.4557090 | 288.9672535
yquake2: OpenGL 3.x - 3840 x 2160 | 979.3 | 986.5
rav1e: 5 | 1.497 | 1.505
rav1e: 6 | 1.958 | 1.965
rav1e: 10 | 3.362 | 3.446
x265: Bosphorus 4K | 24.57 | 24.18
x265: Bosphorus 1080p | 47.87 | 47.62
hpcc: G-Ptrans | 2.45326 | 2.47841
hpcc: EP-STREAM Triad | 1.42468 | 1.47956
hpcc: Rand Ring Bandwidth | 1.92562 | 2.02474
hpcc: G-HPL | 53.16620 | 53.16863
hpcc: G-Ffte | 6.23452 | 6.55050
hpcc: EP-DGEMM | 16.85917 | 16.57023
hpcc: G-Rand Access | 0.04998 | 0.04973
onnx: yolov4 - OpenMP CPU | 438 | 428
onnx: bertsquad-10 - OpenMP CPU | 705 | 665
onnx: fcn-resnet101-11 - OpenMP CPU | 99 | 98
onnx: shufflenet-v2-10 - OpenMP CPU | 16013 | 15863
onnx: super-resolution-10 - OpenMP CPU | 7056 | 7324
coremark: CoreMark Size 666 - Iterations Per Second | 829763.126732 | 815726.122263
indigobench: CPU - Bedroom | 4.175 | 4.132
indigobench: CPU - Supercar | 8.803 | 8.669
hpcc: Max Ping Pong Bandwidth | 34205.275 | 34166.839
compress-lz4: 1 - Compression Speed | 11911.06 | 11854.52
compress-lz4: 1 - Decompression Speed | 13410.2 | 13319.6
compress-lz4: 3 - Compression Speed | 68.88 | 69.10
compress-lz4: 3 - Decompression Speed | 13085.1 | 12946.4
compress-lz4: 9 - Compression Speed | 68.57 | 66.95
compress-lz4: 9 - Decompression Speed | 13093.1 | 12949.0
compress-zstd: 3 | 4723.9 | 4738.4
compress-zstd: 19 | 43.4 | 43.3
ior: 2MB - Default Test Directory | 1540.02 | 1388.54
ior: 4MB - Default Test Directory | 1580.25 | 1482.95
ior: 8MB - Default Test Directory | 1601.21 | 1461.18
ior: 256MB - Default Test Directory | 1339.91 | 1251.65
ior: 512MB - Default Test Directory | 1748.21 | 1682.82
libraw: Post-Processing Benchmark | 52.99 | 53.09
etcpak: DXT1 | 1532.411 | 1562.887
etcpak: ETC1 | 382.932 | 387.141
etcpak: ETC2 | 236.507 | 244.998
etcpak: ETC1 + Dithering | 349.700 | 355.439
crafty: Elapsed Time | 11736507 | 11427830
gromacs: Water Benchmark | 1.275 | 1.258
lammps: 20k Atoms | 13.539 | 13.390
lammps: Rhodopsin Protein | 13.104 | 13.113
numpy | 514.13 | 514.08
phpbench: PHP Benchmark Suite | 831939 | 834019
kripke | 72385553 | 72580800
brl-cad: VGR Performance Metric | 265419 | 262742
synthmark: VoiceMark_100 | 958.662 | 957.951
lulesh | 5041.8561 | 5066.7382
namd: ATPase Simulation - 327,506 Atoms | 1.07519 | 1.08736
webp: Quality 100 | 1.741 | 1.729
webp: Quality 100, Lossless | 13.039 | 12.873
webp: Quality 100, Highest Compression | 5.465 | 5.360
webp: Quality 100, Lossless, Highest Compression | 27.225 | 27.522
onednn: IP Shapes 1D - f32 - CPU | 3.98022 | 3.97992
onednn: IP Shapes 3D - f32 - CPU | 9.48452 | 9.51868
onednn: IP Shapes 1D - u8s8f32 - CPU | 0.816147 | 0.832251
onednn: IP Shapes 3D - u8s8f32 - CPU | 0.477138 | 0.484368
onednn: Convolution Batch Shapes Auto - f32 - CPU | 17.2340 | 17.2901
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 2.38690 | 2.43213
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 3.50445 | 3.53182
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 19.1307 | 19.2565
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 2.95002 | 2.99402
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 1.53431 | 1.59818
onednn: Recurrent Neural Network Training - f32 - CPU | 2753.77 | 2749.23
onednn: Recurrent Neural Network Inference - f32 - CPU | 1809.76 | 1823.19
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 2752.88 | 2763.90
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 1783.66 | 1789.23
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 0.625862 | 0.636735
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 2769.09 | 2761.96
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 1818.11 | 1795.37
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 1.45970 | 1.48786
mnn: SqueezeNetV1.0 | 5.232 | 5.153
mnn: resnet-v2-50 | 24.036 | 23.630
mnn: MobileNetV2_224 | 3.402 | 3.325
mnn: mobilenet-v1-1.0 | 2.501 | 2.481
mnn: inception-v3 | 31.005 | 30.332
ncnn: CPU - mobilenet | 12.04 | 11.99
ncnn: CPU-v2-v2 - mobilenet-v2 | 4.38 | 4.46
ncnn: CPU-v3-v3 - mobilenet-v3 | 4.07 | 4.15
ncnn: CPU - shufflenet-v2 | 4.38 | 4.42
ncnn: CPU - mnasnet | 3.90 | 3.96
ncnn: CPU - efficientnet-b0 | 5.29 | 5.37
ncnn: CPU - blazeface | 1.80 | 1.82
ncnn: CPU - googlenet | 12.99 | 13.00
ncnn: CPU - vgg16 | 60.60 | 59.90
ncnn: CPU - resnet18 | 14.51 | 14.57
ncnn: CPU - alexnet | 11.04 | 11.26
ncnn: CPU - resnet50 | 25.05 | 25.00
ncnn: CPU - yolov4-tiny | 21.14 | 21.29
ncnn: CPU - squeezenet_ssd | 14.64 | 14.60
ncnn: CPU - regnety_400m | 17.67 | 17.95
tnn: CPU - MobileNet v2 | 218.293 | 220.329
tnn: CPU - SqueezeNet v1.1 | 211.459 | 212.620
cloverleaf: Lagrangian-Eulerian Hydrodynamics | 136.80 | 134.74
cp2k: Fayalite-FIST Data | 787.495 | 801.638
dolfyn: Computational Fluid Dynamics | 12.898 | 13.292
hmmer: Pfam Database Search | 82.704 | 82.928
openfoam: Motorbike 30M | 104.48 | 97.87
openfoam: Motorbike 60M | 1391.39 | 1380.20
qe: AUSURF112 | 1199.39 | 1221.19
relion: Basic - CPU | 1892.609 | 1875.48
build-godot: Time To Compile | 79.198 | 80.072
build-linux-kernel: Time To Compile | 45.787 | 46.410
build2: Time To Compile | 79.997 | 81.883
build-eigen: Time To Compile | 60.052 | 61.251
deepspeech: CPU | 70.48086 | 70.53470
encode-ape: WAV To APE | 9.805 | 9.851
encode-opus: WAV To Opus Encode | 6.126 | 6.150
espeak: Text-To-Speech Synthesis | 21.560 | 21.692
rnnoise | 15.181 | 15.234
astcenc: Fast | 4.11 | 4.10
astcenc: Medium | 5.35 | 5.43
astcenc: Thorough | 12.47 | 12.67
astcenc: Exhaustive | 99.00 | 100.90
sqlite-speedtest: Timed Time - Size 1,000 | 41.631 | 42.862
encode-wavpack: WAV To WavPack | 10.947 | 11.112
qmcpack: simple-H2O | 22.312 | 22.371
hpcc: Rand Ring Latency | 0.48933 | 0.49204
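The two columns are easier to compare as a relative change per test. The following is a minimal sketch, not part of the Phoronix Test Suite: the helper name `percent_change` is hypothetical, the three sample rows are copied from the overview, and the direction flags (more or fewer is better) are taken from the per-test sections of this export.

```python
def percent_change(baseline: float, candidate: float, higher_is_better: bool = True) -> float:
    """Improvement of candidate over baseline in percent (positive means candidate is better)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    change = (candidate - baseline) / baseline * 100.0
    # For "Fewer Is Better" metrics a smaller value is an improvement, so flip the sign.
    return change if higher_is_better else -change


# (3003 value, 3202 value, higher_is_better), copied from the overview
samples = {
    "xonotic: 3840 x 2160 - Ultimate": (257.4557090, 288.9672535, True),
    "ior: 2MB - Default Test Directory": (1540.02, 1388.54, True),
    "namd: ATPase Simulation - 327,506 Atoms": (1.07519, 1.08736, False),
}

for name, (bios_3003, bios_3202, hib) in samples.items():
    print(f"{name}: {percent_change(bios_3003, bios_3202, hib):+.2f}%")
```

A positive figure means the 3202 BIOS run scored better than the 3003 run for that test; it says nothing about whether the gap exceeds run-to-run noise.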
Algebraic Multi-Grid Benchmark 1.2
  Figure Of Merit, More Is Better
  3003: 207261167 (SE +/- 2159740.57, N = 3)
  3202: 210165200 (SE +/- 569155.55, N = 3)
  1. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -pthread -lmpi
dav1d 0.8.1
  Video Input: Chimera 1080p
  FPS, More Is Better
  3003: 590.32 (SE +/- 0.69, N = 3; MIN: 447.67 / MAX: 749.27)
  3202: 592.56 (SE +/- 0.77, N = 3; MIN: 447.8 / MAX: 754.79)
  1. (CC) gcc options: -pthread
dav1d 0.8.1
  Video Input: Summer Nature 4K
  FPS, More Is Better
  3003: 224.40 (SE +/- 0.49, N = 3; MIN: 172.58 / MAX: 234.67)
  3202: 228.62 (SE +/- 0.36, N = 3; MIN: 172.54 / MAX: 238.82)
  1. (CC) gcc options: -pthread
dav1d 0.8.1
  Video Input: Summer Nature 1080p
  FPS, More Is Better
  3003: 534.95 (SE +/- 4.29, N = 3; MIN: 432.57 / MAX: 611.74)
  3202: 535.52 (SE +/- 1.59, N = 3; MIN: 453.14 / MAX: 589.94)
  1. (CC) gcc options: -pthread
dav1d 0.8.1
  Video Input: Chimera 1080p 10-bit
  FPS, More Is Better
  3003: 96.35 (SE +/- 0.06, N = 3; MIN: 61.49 / MAX: 217.11)
  3202: 96.66 (SE +/- 0.05, N = 3; MIN: 61.56 / MAX: 221.22)
  1. (CC) gcc options: -pthread
ET: Legacy 2.75
  Renderer: Renderer2 - Resolution: 3840 x 2160
  Frames Per Second, More Is Better
  3003: 224.3
Tesseract 2014-05-12
  Resolution: 3840 x 2160
  Frames Per Second, More Is Better
  3003: 356.17 (SE +/- 3.53, N = 15)
  3202: 353.57 (SE +/- 4.21, N = 15)
Warsow 2.5 Beta
  Resolution: 3840 x 2160
  Frames Per Second, More Is Better
  3003: 429.6 (SE +/- 0.72, N = 3)
  3202: 430.5 (SE +/- 0.67, N = 3)
Xonotic 0.8.2
  Resolution: 3840 x 2160 - Effects Quality: Ultimate
  Frames Per Second, More Is Better
  3003: 257.46 (SE +/- 4.83, N = 15; MIN: 55 / MAX: 623)
  3202: 288.97 (SE +/- 3.79, N = 3; MIN: 60 / MAX: 571)
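Xonotic shows the largest gap among the game tests, while Tesseract differs by only a few frames per second. A rough way to judge such gaps is to divide the difference of the means by the combined standard error. The sketch below assumes the two runs are independent and is only a heuristic, not a formal significance test; the helper name `se_ratio` is hypothetical.

```python
import math


def se_ratio(mean_a: float, se_a: float, mean_b: float, se_b: float) -> float:
    """Difference of two means (b minus a) in units of their combined standard error."""
    combined_se = math.hypot(se_a, se_b)  # sqrt(se_a**2 + se_b**2)
    if combined_se == 0:
        raise ValueError("combined standard error is zero")
    return (mean_b - mean_a) / combined_se


# Means and SE values copied from the Tesseract and Xonotic results above (3003 first, 3202 second)
tesseract = se_ratio(356.17, 3.53, 353.57, 4.21)
xonotic = se_ratio(257.46, 4.83, 288.97, 3.79)

print(f"Tesseract: {tesseract:+.2f} combined SE")
print(f"Xonotic:   {xonotic:+.2f} combined SE")
```

A magnitude well below 1 suggests the gap is within run-to-run noise, while a magnitude of several combined standard errors suggests a real difference between the runs. The cause of that difference (BIOS or otherwise) is not something this export can establish.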
yquake2 7.45
  Renderer: OpenGL 3.x - Resolution: 3840 x 2160
  Frames Per Second, More Is Better
  3003: 979.3 (SE +/- 1.03, N = 3)
  3202: 986.5 (SE +/- 1.35, N = 3)
  1. (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
rav1e 0.4
  Speed: 5
  Frames Per Second, More Is Better
  3003: 1.497 (SE +/- 0.005, N = 3)
  3202: 1.505 (SE +/- 0.007, N = 3)
rav1e 0.4
  Speed: 6
  Frames Per Second, More Is Better
  3003: 1.958 (SE +/- 0.005, N = 3)
  3202: 1.965 (SE +/- 0.016, N = 3)
rav1e 0.4
  Speed: 10
  Frames Per Second, More Is Better
  3003: 3.362 (SE +/- 0.037, N = 3)
  3202: 3.446 (SE +/- 0.037, N = 15)
x265 3.4
  Video Input: Bosphorus 4K
  Frames Per Second, More Is Better
  3003: 24.57 (SE +/- 0.35, N = 3)
  3202: 24.18 (SE +/- 0.26, N = 4)
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
x265 3.4
  Video Input: Bosphorus 1080p
  Frames Per Second, More Is Better
  3003: 47.87 (SE +/- 0.08, N = 3)
  3202: 47.62 (SE +/- 0.22, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
HPC Challenge 1.5.0
  Test / Class: G-Ptrans
  GB/s, More Is Better
  3003: 2.45326 (SE +/- 0.00427, N = 3)
  3202: 2.47841 (SE +/- 0.01009, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. OpenBLAS + Open MPI 4.0.3
HPC Challenge 1.5.0
  Test / Class: EP-STREAM Triad
  GB/s, More Is Better
  3003: 1.42468 (SE +/- 0.00070, N = 3)
  3202: 1.47956 (SE +/- 0.00089, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. OpenBLAS + Open MPI 4.0.3
HPC Challenge 1.5.0
  Test / Class: Random Ring Bandwidth
  GB/s, More Is Better
  3003: 1.92562 (SE +/- 0.02644, N = 3)
  3202: 2.02474 (SE +/- 0.02919, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. OpenBLAS + Open MPI 4.0.3
HPC Challenge 1.5.0
  Test / Class: G-HPL
  GFLOPS, More Is Better
  3003: 53.17 (SE +/- 0.08, N = 3)
  3202: 53.17 (SE +/- 0.05, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. OpenBLAS + Open MPI 4.0.3
HPC Challenge 1.5.0
  Test / Class: G-Ffte
  GFLOPS, More Is Better
  3003: 6.23452 (SE +/- 0.10692, N = 3)
  3202: 6.55050 (SE +/- 0.02613, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. OpenBLAS + Open MPI 4.0.3
HPC Challenge 1.5.0
  Test / Class: EP-DGEMM
  GFLOPS, More Is Better
  3003: 16.86 (SE +/- 0.11, N = 3)
  3202: 16.57 (SE +/- 0.17, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. OpenBLAS + Open MPI 4.0.3
HPC Challenge 1.5.0
  Test / Class: G-Random Access
  GUP/s, More Is Better
  3003: 0.04998 (SE +/- 0.00046, N = 3)
  3202: 0.04973 (SE +/- 0.00017, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. OpenBLAS + Open MPI 4.0.3
ONNX Runtime 1.6
  Model: yolov4 - Device: OpenMP CPU
  Inferences Per Minute, More Is Better
  3003: 438 (SE +/- 1.17, N = 3)
  3202: 428 (SE +/- 3.35, N = 3)
  1. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
ONNX Runtime 1.6
  Model: bertsquad-10 - Device: OpenMP CPU
  Inferences Per Minute, More Is Better
  3003: 705 (SE +/- 1.32, N = 3)
  3202: 665 (SE +/- 11.02, N = 12)
  1. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
ONNX Runtime 1.6
  Model: fcn-resnet101-11 - Device: OpenMP CPU
  Inferences Per Minute, More Is Better
  3003: 99
  3202: 98
  SE +/- 0.33, N = 3 (reported for only one of the two results; the export does not indicate which)
  1. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
ONNX Runtime 1.6
  Model: shufflenet-v2-10 - Device: OpenMP CPU
  Inferences Per Minute, More Is Better
  3003: 16013 (SE +/- 20.67, N = 3)
  3202: 15863 (SE +/- 68.50, N = 3)
  1. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
ONNX Runtime 1.6
  Model: super-resolution-10 - Device: OpenMP CPU
  Inferences Per Minute, More Is Better
  3003: 7056 (SE +/- 202.78, N = 12)
  3202: 7324 (SE +/- 175.44, N = 12)
  1. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
Coremark 1.0
  CoreMark Size 666 - Iterations Per Second
  Iterations/Sec, More Is Better
  3003: 829763.13 (SE +/- 1722.61, N = 3)
  3202: 815726.12 (SE +/- 552.83, N = 3)
  1. (CC) gcc options: -O2 -lrt" -lrt
IndigoBench 4.4
  Acceleration: CPU - Scene: Bedroom
  M samples/s, More Is Better
  3003: 4.175 (SE +/- 0.006, N = 3)
  3202: 4.132 (SE +/- 0.017, N = 3)
IndigoBench 4.4
  Acceleration: CPU - Scene: Supercar
  M samples/s, More Is Better
  3003: 8.803 (SE +/- 0.008, N = 3)
  3202: 8.669 (SE +/- 0.011, N = 3)
HPC Challenge 1.5.0
  Test / Class: Max Ping Pong Bandwidth
  MB/s, More Is Better
  3003: 34205.28 (SE +/- 146.47, N = 3)
  3202: 34166.84 (SE +/- 130.59, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. OpenBLAS + Open MPI 4.0.3
LZ4 Compression 1.9.3
  Compression Level: 1 - Compression Speed
  MB/s, More Is Better
  3003: 11911.06 (SE +/- 49.00, N = 3)
  3202: 11854.52 (SE +/- 59.07, N = 3)
  1. (CC) gcc options: -O3
LZ4 Compression 1.9.3
  Compression Level: 1 - Decompression Speed
  MB/s, More Is Better
  3003: 13410.2 (SE +/- 19.80, N = 3)
  3202: 13319.6 (SE +/- 25.73, N = 3)
  1. (CC) gcc options: -O3
LZ4 Compression 1.9.3
  Compression Level: 3 - Compression Speed
  MB/s, More Is Better
  3003: 68.88 (SE +/- 0.92, N = 3)
  3202: 69.10 (SE +/- 0.20, N = 3)
  1. (CC) gcc options: -O3
LZ4 Compression 1.9.3
  Compression Level: 3 - Decompression Speed
  MB/s, More Is Better
  3003: 13085.1 (SE +/- 7.36, N = 3)
  3202: 12946.4 (SE +/- 2.16, N = 3)
  1. (CC) gcc options: -O3
LZ4 Compression 1.9.3
  Compression Level: 9 - Compression Speed
  MB/s, More Is Better
  3003: 68.57 (SE +/- 0.80, N = 12)
  3202: 66.95 (SE +/- 0.20, N = 3)
  1. (CC) gcc options: -O3
LZ4 Compression 1.9.3
  Compression Level: 9 - Decompression Speed
  MB/s, More Is Better
  3003: 13093.1 (SE +/- 11.50, N = 12)
  3202: 12949.0 (SE +/- 11.91, N = 3)
  1. (CC) gcc options: -O3
Zstd Compression 1.4.5
  Compression Level: 3
  MB/s, More Is Better
  3003: 4723.9 (SE +/- 8.02, N = 3)
  3202: 4738.4 (SE +/- 12.71, N = 3)
  1. (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression 1.4.5
  Compression Level: 19
  MB/s, More Is Better
  3003: 43.4 (SE +/- 0.03, N = 3)
  3202: 43.3 (SE +/- 0.06, N = 3)
  1. (CC) gcc options: -O3 -pthread -lz -llzma
IOR 3.3.0
  Block Size: 2MB - Disk Target: Default Test Directory
  MB/s, More Is Better
  3003: 1540.02 (SE +/- 2.40, N = 3; MIN: 1034.78 / MAX: 2149.91)
  3202: 1388.54 (SE +/- 14.36, N = 3; MIN: 890.97 / MAX: 2113.75)
  1. (CC) gcc options: -O2 -lm -pthread -lmpi
IOR 3.3.0
  Block Size: 4MB - Disk Target: Default Test Directory
  MB/s, More Is Better
  3003: 1580.25 (SE +/- 5.96, N = 3; MIN: 1161.4 / MAX: 2244.12)
  3202: 1482.95 (SE +/- 11.11, N = 11; MIN: 955.9 / MAX: 2484.92)
  1. (CC) gcc options: -O2 -lm -pthread -lmpi
IOR 3.3.0
  Block Size: 8MB - Disk Target: Default Test Directory
  MB/s, More Is Better
  3003: 1601.21 (SE +/- 6.28, N = 3; MIN: 1005.82 / MAX: 2534.37)
  3202: 1461.18 (SE +/- 28.69, N = 13; MIN: 491.21 / MAX: 2711.8)
  1. (CC) gcc options: -O2 -lm -pthread -lmpi
IOR 3.3.0
  Block Size: 256MB - Disk Target: Default Test Directory
  MB/s, More Is Better
  3003: 1339.91 (SE +/- 19.35, N = 9; MIN: 282.98 / MAX: 2236.64)
  3202: 1251.65 (SE +/- 14.35, N = 9; MIN: 354.68 / MAX: 2107.13)
  1. (CC) gcc options: -O2 -lm -pthread -lmpi
IOR 3.3.0
  Block Size: 512MB - Disk Target: Default Test Directory
  MB/s, More Is Better
  3003: 1748.21 (SE +/- 11.67, N = 3; MIN: 534.9 / MAX: 2360.08)
  3202: 1682.82 (SE +/- 22.61, N = 9; MIN: 251.69 / MAX: 2253.72)
  1. (CC) gcc options: -O2 -lm -pthread -lmpi
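The five IOR block sizes all move in the same direction, so one summary figure can be useful. The sketch below takes the geometric mean of the 3202/3003 throughput ratios. The variable names are illustrative, and the numbers are copied from the IOR results above. The MIN/MAX ranges reported for these tests are wide, so the summary should be read with caution.

```python
from statistics import geometric_mean

# (block size, MB/s on 3003, MB/s on 3202), copied from the IOR results above
ior_results = [
    ("2MB", 1540.02, 1388.54),
    ("4MB", 1580.25, 1482.95),
    ("8MB", 1601.21, 1461.18),
    ("256MB", 1339.91, 1251.65),
    ("512MB", 1748.21, 1682.82),
]

# Ratio below 1.0 means the 3202 run was slower than the 3003 run.
ratios = {size: bios_3202 / bios_3003 for size, bios_3003, bios_3202 in ior_results}
geomean = geometric_mean(list(ratios.values()))

for size, ratio in ratios.items():
    print(f"IOR {size}: 3202/3003 = {ratio:.3f}")
print(f"Geometric mean of ratios: {geomean:.3f}")
```

The geometric mean is used rather than the arithmetic mean because it treats ratios symmetrically: a halving and a doubling cancel out.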
LibRaw 0.20
  Post-Processing Benchmark
  Mpix/sec, More Is Better
  3003: 52.99 (SE +/- 0.08, N = 3)
  3202: 53.09 (SE +/- 0.24, N = 3)
  1. (CXX) g++ options: -O2 -fopenmp -ljpeg -lz -lm
Etcpak 0.7
  Configuration: DXT1
  Mpx/s, More Is Better
  3003: 1532.41 (SE +/- 2.32, N = 3)
  3202: 1562.89 (SE +/- 21.94, N = 3)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
Etcpak 0.7
  Configuration: ETC1
  Mpx/s, More Is Better
  3003: 382.93 (SE +/- 1.98, N = 3)
  3202: 387.14 (SE +/- 1.60, N = 3)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
Etcpak 0.7
  Configuration: ETC2
  Mpx/s, More Is Better
  3003: 236.51 (SE +/- 0.59, N = 3)
  3202: 245.00 (SE +/- 1.66, N = 3)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
Etcpak 0.7
  Configuration: ETC1 + Dithering
  Mpx/s, More Is Better
  3003: 349.70 (SE +/- 0.36, N = 3)
  3202: 355.44 (SE +/- 3.87, N = 3)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
Crafty 25.2
  Elapsed Time
  Nodes Per Second, More Is Better
  3003: 11736507 (SE +/- 108514.09, N = 3)
  3202: 11427830 (SE +/- 40276.62, N = 3)
  1. (CC) gcc options: -pthread -lstdc++ -fprofile-use -lm
GROMACS 2020.3
  Water Benchmark
  Ns Per Day, More Is Better
  3003: 1.275 (SE +/- 0.001, N = 3)
  3202: 1.258 (SE +/- 0.001, N = 3)
  1. (CXX) g++ options: -O3 -pthread -lrt -lpthread -lm
LAMMPS Molecular Dynamics Simulator 29Oct2020
  Model: 20k Atoms
  ns/day, More Is Better
  3003: 13.54 (SE +/- 0.08, N = 3)
  3202: 13.39 (SE +/- 0.03, N = 3)
  1. (CXX) g++ options: -O3 -pthread -lm
LAMMPS Molecular Dynamics Simulator 29Oct2020
  Model: Rhodopsin Protein
  ns/day, More Is Better
  3003: 13.10 (SE +/- 0.15, N = 15)
  3202: 13.11 (SE +/- 0.13, N = 15)
  1. (CXX) g++ options: -O3 -pthread -lm
Numpy Benchmark
  Score, More Is Better
  3003: 514.13 (SE +/- 0.48, N = 3)
  3202: 514.08 (SE +/- 4.15, N = 3)
PHPBench 0.8.1
  PHP Benchmark Suite
  Score, More Is Better
  3003: 831939 (SE +/- 8175.01, N = 3)
  3202: 834019 (SE +/- 7522.35, N = 3)
Kripke 1.2.4
  Throughput FoM, More Is Better
  3003: 72385553 (SE +/- 1023469.37, N = 3)
  3202: 72580800 (SE +/- 319563.42, N = 3)
  1. (CXX) g++ options: -O3 -fopenmp
BRL-CAD 7.30.8
  VGR Performance Metric
  VGR Performance Metric, More Is Better
  3003: 265419
  3202: 262742
  1. (CXX) g++ options: -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -rdynamic -lSM -lICE -lXi -lGLU -lGL -lGLdispatch -lX11 -lXext -lXrender -lpthread -ldl -luuid -lm
Google SynthMark 20201109
  Test: VoiceMark_100
  Voices, More Is Better
  3003: 958.66 (SE +/- 3.00, N = 3)
  3202: 957.95 (SE +/- 4.96, N = 3)
  1. (CXX) g++ options: -lm -lpthread -std=c++11 -Ofast
LULESH 2.0.3
  z/s, More Is Better
  3003: 5041.86 (SE +/- 59.80, N = 3)
  3202: 5066.74 (SE +/- 66.75, N = 3)
  1. (CXX) g++ options: -O3 -fopenmp -lm -pthread -lmpi_cxx -lmpi
NAMD 2.14
  ATPase Simulation - 327,506 Atoms
  days/ns, Fewer Is Better
  3003: 1.07519 (SE +/- 0.00258, N = 3)
  3202: 1.08736 (SE +/- 0.00500, N = 3)
WebP Image Encode 1.1
  Encode Settings: Quality 100
  Encode Time - Seconds, Fewer Is Better
  3003: 1.741 (SE +/- 0.001, N = 3)
  3202: 1.729 (SE +/- 0.004, N = 3)
  1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
WebP Image Encode 1.1
  Encode Settings: Quality 100, Lossless
  Encode Time - Seconds, Fewer Is Better
  3003: 13.04 (SE +/- 0.01, N = 3)
  3202: 12.87 (SE +/- 0.05, N = 3)
  1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
WebP Image Encode 1.1
  Encode Settings: Quality 100, Highest Compression
  Encode Time - Seconds, Fewer Is Better
  3003: 5.465 (SE +/- 0.059, N = 3)
  3202: 5.360 (SE +/- 0.062, N = 3)
  1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
WebP Image Encode 1.1
  Encode Settings: Quality 100, Lossless, Highest Compression
  Encode Time - Seconds, Fewer Is Better
  3003: 27.23 (SE +/- 0.05, N = 3)
  3202: 27.52 (SE +/- 0.04, N = 3)
  1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
oneDNN 2.0
  Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better
  3003: 3.98022 (SE +/- 0.01644, N = 3; MIN: 3.72)
  3202: 3.97992 (SE +/- 0.00985, N = 3; MIN: 3.76)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0
  Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better
  3003: 9.48452 (SE +/- 0.01431, N = 3; MIN: 9.38)
  3202: 9.51868 (SE +/- 0.00745, N = 3; MIN: 9.44)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0
  Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  3003: 0.816147 (SE +/- 0.001756, N = 3; MIN: 0.74)
  3202: 0.832251 (SE +/- 0.001536, N = 3; MIN: 0.75)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0
  Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  3003: 0.477138 (SE +/- 0.002170, N = 3; MIN: 0.44)
  3202: 0.484368 (SE +/- 0.003401, N = 15; MIN: 0.43)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0
  Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better
  3003: 17.23 (SE +/- 0.01, N = 3; MIN: 16.81)
  3202: 17.29 (SE +/- 0.02, N = 3; MIN: 16.66)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0
  Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better
  3003: 2.38690 (SE +/- 0.00251, N = 3; MIN: 2.28)
  3202: 2.43213 (SE +/- 0.00987, N = 3; MIN: 2.3)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0
  Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better
  3003: 3.50445 (SE +/- 0.00613, N = 3; MIN: 3.38)
  3202: 3.53182 (SE +/- 0.00421, N = 3; MIN: 3.41)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0
  Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  3003: 19.13 (SE +/- 0.02, N = 3; MIN: 18.77)
  3202: 19.26 (SE +/- 0.04, N = 3; MIN: 18.8)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0
  Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  3003: 2.95002 (SE +/- 0.00671, N = 3; MIN: 2.81)
  3202: 2.99402 (SE +/- 0.00911, N = 3; MIN: 2.83)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0
  Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  3003: 1.53431 (SE +/- 0.00153, N = 3; MIN: 1.41)
  3202: 1.59818 (SE +/- 0.00175, N = 3; MIN: 1.48)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0
  Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better
  3003: 2753.77 (SE +/- 13.63, N = 3; MIN: 2727.8)
  3202: 2749.23 (SE +/- 2.89, N = 3; MIN: 2735.56)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0
  Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better
  3003: 1809.76 (SE +/- 15.33, N = 3; MIN: 1773.21)
  3202: 1823.19 (SE +/- 9.60, N = 3; MIN: 1798.09)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0
  Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  3003: 2752.88 (SE +/- 11.23, N = 3; MIN: 2722.06)
  3202: 2763.90 (SE +/- 9.00, N = 3; MIN: 2736.15)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0
  Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  3003: 1783.66 (SE +/- 8.25, N = 3; MIN: 1765.38)
  3202: 1789.23 (SE +/- 17.01, N = 3; MIN: 1761.64)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU 3003 3202 0.1433 0.2866 0.4299 0.5732 0.7165 SE +/- 0.000749, N = 3 SE +/- 0.000300, N = 3 0.625862 0.636735 MIN: 0.6 MIN: 0.6 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU 3003 3202 600 1200 1800 2400 3000 SE +/- 9.46, N = 3 SE +/- 4.41, N = 3 2769.09 2761.96 MIN: 2739.1 MIN: 2740.57 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU 3003 3202 400 800 1200 1600 2000 SE +/- 20.56, N = 3 SE +/- 19.66, N = 4 1818.11 1795.37 MIN: 1764.89 MIN: 1755.37 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU 3003 3202 0.3348 0.6696 1.0044 1.3392 1.674 SE +/- 0.00059, N = 3 SE +/- 0.00214, N = 3 1.45970 1.48786 MIN: 1.39 MIN: 1.39 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Mobile Neural Network 1.1.1 - Model: SqueezeNetV1.0 (ms, Fewer Is Better). 3003: 5.232 (SE +/- 0.062, N = 3, MIN: 5.02 / MAX: 8.4); 3202: 5.153 (SE +/- 0.022, N = 3, MIN: 5.02 / MAX: 14.32). 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.1 - Model: resnet-v2-50 (ms, Fewer Is Better). 3003: 24.04 (SE +/- 0.25, N = 3, MIN: 21.95 / MAX: 33.08); 3202: 23.63 (SE +/- 0.19, N = 3, MIN: 22.31 / MAX: 33.12). 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.1 - Model: MobileNetV2_224 (ms, Fewer Is Better). 3003: 3.402 (SE +/- 0.047, N = 3, MIN: 3.23 / MAX: 5.81); 3202: 3.325 (SE +/- 0.039, N = 3, MIN: 3.16 / MAX: 4.02). 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.1 - Model: mobilenet-v1-1.0 (ms, Fewer Is Better). 3003: 2.501 (SE +/- 0.094, N = 3, MIN: 2.32 / MAX: 4.53); 3202: 2.481 (SE +/- 0.027, N = 3, MIN: 2.42 / MAX: 2.71). 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.1 - Model: inception-v3 (ms, Fewer Is Better). 3003: 31.01 (SE +/- 0.18, N = 3, MIN: 29.92 / MAX: 38.55); 3202: 30.33 (SE +/- 0.26, N = 3, MIN: 29.28 / MAX: 56.48). 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
NCNN 20201218 - Target: CPU - Model: mobilenet (ms, Fewer Is Better). 3003: 12.04 (SE +/- 0.15, N = 3, MIN: 11.72 / MAX: 14.17); 3202: 11.99 (SE +/- 0.01, N = 3, MIN: 11.79 / MAX: 12.19). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better). 3003: 4.38 (SE +/- 0.01, N = 3, MIN: 4.24 / MAX: 7.32); 3202: 4.46 (SE +/- 0.01, N = 3, MIN: 4.31 / MAX: 5.3). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better). 3003: 4.07 (SE +/- 0.01, N = 3, MIN: 4.03 / MAX: 5.2); 3202: 4.15 (SE +/- 0.00, N = 3, MIN: 4.11 / MAX: 5.03). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better). 3003: 4.38 (SE +/- 0.01, N = 3, MIN: 4.33 / MAX: 4.89); 3202: 4.42 (SE +/- 0.01, N = 3, MIN: 4.35 / MAX: 5.28). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: mnasnet (ms, Fewer Is Better). 3003: 3.90 (SE +/- 0.00, N = 3, MIN: 3.78 / MAX: 4.83); 3202: 3.96 (SE +/- 0.01, N = 3, MIN: 3.84 / MAX: 5.26). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better). 3003: 5.29 (SE +/- 0.00, N = 3, MIN: 5.24 / MAX: 5.79); 3202: 5.37 (SE +/- 0.01, N = 3, MIN: 5.31 / MAX: 7.18). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: blazeface (ms, Fewer Is Better). 3003: 1.80 (SE +/- 0.00, N = 3, MIN: 1.78 / MAX: 2.28); 3202: 1.82 (SE +/- 0.00, N = 3, MIN: 1.79 / MAX: 2.65). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: googlenet (ms, Fewer Is Better). 3003: 12.99 (SE +/- 0.00, N = 3, MIN: 12.62 / MAX: 13.57); 3202: 13.00 (SE +/- 0.02, N = 3, MIN: 12.64 / MAX: 20.99). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: vgg16 (ms, Fewer Is Better). 3003: 60.60 (SE +/- 0.13, N = 3, MIN: 59.43 / MAX: 62.32); 3202: 59.90 (SE +/- 0.09, N = 3, MIN: 58.71 / MAX: 61.76). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: resnet18 (ms, Fewer Is Better). 3003: 14.51 (SE +/- 0.01, N = 3, MIN: 14.39 / MAX: 15.06); 3202: 14.57 (SE +/- 0.01, N = 3, MIN: 14.45 / MAX: 23.09). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: alexnet (ms, Fewer Is Better). 3003: 11.04 (SE +/- 0.01, N = 3, MIN: 10.95 / MAX: 11.89); 3202: 11.26 (SE +/- 0.24, N = 3, MIN: 10.91 / MAX: 19.04). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: resnet50 (ms, Fewer Is Better). 3003: 25.05 (SE +/- 0.13, N = 3, MIN: 24.65 / MAX: 26.23); 3202: 25.00 (SE +/- 0.17, N = 3, MIN: 24.58 / MAX: 26.31). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better). 3003: 21.14 (SE +/- 0.11, N = 3, MIN: 20.72 / MAX: 29.47); 3202: 21.29 (SE +/- 0.08, N = 3, MIN: 20.98 / MAX: 21.78). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better). 3003: 14.64 (SE +/- 0.03, N = 3, MIN: 14.31 / MAX: 15.31); 3202: 14.60 (SE +/- 0.04, N = 3, MIN: 14.21 / MAX: 15.07). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: regnety_400m (ms, Fewer Is Better). 3003: 17.67 (SE +/- 0.04, N = 3, MIN: 17.47 / MAX: 18.02); 3202: 17.95 (SE +/- 0.13, N = 3, MIN: 17.66 / MAX: 19.3). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
TNN 0.2.3 - Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better). 3003: 218.29 (SE +/- 0.32, N = 3, MIN: 208.52 / MAX: 289.05); 3202: 220.33 (SE +/- 0.67, N = 3, MIN: 216.95 / MAX: 261.2). 1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
TNN 0.2.3 - Target: CPU - Model: SqueezeNet v1.1 (ms, Fewer Is Better). 3003: 211.46 (SE +/- 0.43, N = 3, MIN: 210.71 / MAX: 212.37); 3202: 212.62 (SE +/- 0.34, N = 3, MIN: 211.89 / MAX: 213.24). 1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
CloverLeaf - Lagrangian-Eulerian Hydrodynamics (Seconds, Fewer Is Better). 3003: 136.80 (SE +/- 0.03, N = 3); 3202: 134.74 (SE +/- 0.24, N = 3). 1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
CP2K Molecular Dynamics 8.1 - Fayalite-FIST Data (Seconds, Fewer Is Better). 3003: 787.50; 3202: 801.64.
Dolfyn 0.527 - Computational Fluid Dynamics (Seconds, Fewer Is Better). 3003: 12.90 (SE +/- 0.10, N = 3); 3202: 13.29 (SE +/- 0.02, N = 3).
Timed HMMer Search 3.3.1 - Pfam Database Search (Seconds, Fewer Is Better). 3003: 82.70 (SE +/- 0.15, N = 3); 3202: 82.93 (SE +/- 0.07, N = 3). 1. (CC) gcc options: -O3 -pthread -lhmmer -leasel -lm
OpenFOAM 8 - Input: Motorbike 30M (Seconds, Fewer Is Better). 3003: 104.48 (SE +/- 0.09, N = 3); 3202: 97.87 (SE +/- 0.15, N = 3). 1. (CXX) g++ options: -std=c++11 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm
OpenFOAM 8 - Input: Motorbike 60M (Seconds, Fewer Is Better). 3003: 1391.39 (SE +/- 0.57, N = 3); 3202: 1380.20 (SE +/- 0.41, N = 3). 1. (CXX) g++ options: -std=c++11 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm
Quantum ESPRESSO 6.7 - Input: AUSURF112 (Seconds, Fewer Is Better). 3003: 1199.39 (SE +/- 0.92, N = 3); 3202: 1221.19 (SE +/- 3.58, N = 3). 1. (F9X) gfortran options: -lopenblas -lFoX_dom -lFoX_sax -lFoX_wxml -lFoX_common -lFoX_utils -lFoX_fsys -lfftw3 -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz
RELION 3.1.1 - Test: Basic - Device: CPU (Seconds, Fewer Is Better). 3003: 1892.61 (SE +/- 3.84, N = 3); 3202: 1875.48 (SE +/- 6.24, N = 3). 1. (CXX) g++ options: -fopenmp -std=c++0x -O3 -rdynamic -ldl -ltiff -lfftw3f -lfftw3 -lpng -pthread -lmpi_cxx -lmpi
Timed Godot Game Engine Compilation 3.2.3 - Time To Compile (Seconds, Fewer Is Better). 3003: 79.20 (SE +/- 0.15, N = 3); 3202: 80.07 (SE +/- 0.31, N = 3).
Timed Linux Kernel Compilation 5.4 - Time To Compile (Seconds, Fewer Is Better). 3003: 45.79 (SE +/- 0.43, N = 3); 3202: 46.41 (SE +/- 0.49, N = 3).
Build2 0.13 - Time To Compile (Seconds, Fewer Is Better). 3003: 80.00 (SE +/- 0.10, N = 3); 3202: 81.88 (SE +/- 0.12, N = 3).
Timed Eigen Compilation 3.3.9 - Time To Compile (Seconds, Fewer Is Better). 3003: 60.05 (SE +/- 0.21, N = 3); 3202: 61.25 (SE +/- 0.01, N = 3).
DeepSpeech 0.6 - Acceleration: CPU (Seconds, Fewer Is Better). 3003: 70.48 (SE +/- 0.10, N = 3); 3202: 70.53 (SE +/- 0.04, N = 3).
Monkey Audio Encoding 3.99.6 - WAV To APE (Seconds, Fewer Is Better). 3003: 9.805 (SE +/- 0.045, N = 5); 3202: 9.851 (SE +/- 0.041, N = 5). 1. (CXX) g++ options: -O3 -pedantic -rdynamic -lrt
Opus Codec Encoding 1.3.1 - WAV To Opus Encode (Seconds, Fewer Is Better). 3003: 6.126 (SE +/- 0.032, N = 5); 3202: 6.150 (SE +/- 0.042, N = 5). 1. (CXX) g++ options: -fvisibility=hidden -logg -lm
eSpeak-NG Speech Engine 20200907 - Text-To-Speech Synthesis (Seconds, Fewer Is Better). 3003: 21.56 (SE +/- 0.08, N = 4); 3202: 21.69 (SE +/- 0.08, N = 4). 1. (CC) gcc options: -O2 -std=c99
RNNoise 2020-06-28 (Seconds, Fewer Is Better). 3003: 15.18 (SE +/- 0.07, N = 3); 3202: 15.23 (SE +/- 0.18, N = 3). 1. (CC) gcc options: -O2 -pedantic -fvisibility=hidden
ASTC Encoder 2.0 - Preset: Fast (Seconds, Fewer Is Better). 3003: 4.11 (SE +/- 0.01, N = 3); 3202: 4.10 (SE +/- 0.03, N = 3). 1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
ASTC Encoder 2.0 - Preset: Medium (Seconds, Fewer Is Better). 3003: 5.35 (SE +/- 0.04, N = 3); 3202: 5.43 (SE +/- 0.02, N = 3). 1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
ASTC Encoder 2.0 - Preset: Thorough (Seconds, Fewer Is Better). 3003: 12.47 (SE +/- 0.03, N = 3); 3202: 12.67 (SE +/- 0.02, N = 3). 1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
ASTC Encoder 2.0 - Preset: Exhaustive (Seconds, Fewer Is Better). 3003: 99.00 (SE +/- 0.10, N = 3); 3202: 100.90 (SE +/- 0.15, N = 3). 1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
SQLite Speedtest 3.30 - Timed Time - Size 1,000 (Seconds, Fewer Is Better). 3003: 41.63 (SE +/- 0.34, N = 3); 3202: 42.86 (SE +/- 0.24, N = 15). 1. (CC) gcc options: -O2 -ldl -lz -lpthread
WavPack Audio Encoding 5.3 - WAV To WavPack (Seconds, Fewer Is Better). 3003: 10.95 (SE +/- 0.06, N = 5); 3202: 11.11 (SE +/- 0.03, N = 5). 1. (CXX) g++ options: -rdynamic
QMCPACK 3.10 - Input: simple-H2O (Total Execution Time - Seconds, Fewer Is Better). 3003: 22.31 (SE +/- 0.21, N = 7); 3202: 22.37 (SE +/- 0.14, N = 3). 1. (CXX) g++ options: -fopenmp -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -march=native -O3 -fomit-frame-pointer -ffast-math -pthread -lm
HPC Challenge 1.5.0 - Test / Class: Random Ring Latency (usecs, Fewer Is Better). 3003: 0.48933 (SE +/- 0.00404, N = 3); 3202: 0.49204 (SE +/- 0.00234, N = 3). 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. OpenBLAS + Open MPI 4.0.3
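Reading the 3003 and 3202 columns side by side, it can help to express each pair as a percentage change and to weigh the gap against the reported standard errors. The sketch below is only an illustration, not part of the Phoronix Test Suite: the function name compare is hypothetical, and treating a gap of more than about two combined standard errors as notable is an assumed rule of thumb that is rough at N = 3. The input values are copied from the OpenFOAM Motorbike 30M and HPC Challenge Random Ring Latency results above.

```python
import math


def compare(old, se_old, new, se_new):
    """Return (percent change, difference in combined-standard-error units).

    Hypothetical helper for illustration. Both metrics here are
    "Fewer Is Better", so a negative percent change means the newer
    BIOS (3202) was faster than the older one (3003).
    """
    pct = (new - old) / old * 100.0
    combined_se = math.hypot(se_old, se_new)  # sqrt(se_old**2 + se_new**2)
    return pct, (new - old) / combined_se


# OpenFOAM 8, Motorbike 30M (Seconds): 3003 vs 3202
openfoam = compare(104.48, 0.09, 97.87, 0.15)

# HPC Challenge 1.5.0, Random Ring Latency (usecs): 3003 vs 3202
latency = compare(0.48933, 0.00404, 0.49204, 0.00234)

for name, (pct, ratio) in (("OpenFOAM 30M", openfoam),
                           ("Random Ring Latency", latency)):
    # Assumed threshold: |ratio| > 2 is treated as outside run-to-run noise.
    verdict = "outside noise" if abs(ratio) > 2 else "within noise"
    print(f"{name}: {pct:+.2f}% ({ratio:+.1f} combined SE, {verdict})")
```

Under these assumptions the OpenFOAM 30M gap is many standard errors wide, while the Random Ring Latency gap is smaller than one combined standard error, so the two BIOS versions are effectively tied on that test.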
Phoronix Test Suite v10.8.5