5950X ASUS ROG CROSSHAIR VIII HERO WiFi BIOS
AMD Ryzen 9 5950X 16-Core testing with an ASUS ROG CROSSHAIR VIII HERO (WI-FI) (3202 BIOS) and AMD Radeon RX 5600 OEM/5600 XT / 5700/5700 8GB on Ubuntu 20.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2101203-HA-5950XASUS43&sro&grs
5950X ASUS ROG CROSSHAIR VIII HERO WiFi BIOS: System Details (configurations: 3003, 3202)
  Processor: AMD Ryzen 9 5950X 16-Core @ 3.40GHz (16 Cores / 32 Threads)
  Motherboard (3003): ASUS ROG CROSSHAIR VIII HERO (WI-FI) (3003 BIOS)
  Motherboard (3202): ASUS ROG CROSSHAIR VIII HERO (WI-FI) (3202 BIOS)
  Chipset: AMD Starship/Matisse
  Memory: 32GB
  Disk: 2000GB Corsair Force MP600 + 2000GB
  Graphics: AMD Radeon RX 5600 OEM/5600 XT / 5700/5700 8GB (2100/875MHz)
  Audio: AMD Navi 10 HDMI Audio
  Monitor: ASUS MG28U
  Network: Realtek RTL8125 2.5GbE + Intel I211 + Intel Wi-Fi 6 AX200
  OS: Ubuntu 20.10
  Kernel: 5.11.0-051100rc2daily20210108-generic (x86_64) 20210107
  Desktop: GNOME Shell 3.38.1
  Display Server: X Server 1.20.9
  Display Driver: amdgpu 19.1.0
  OpenGL: 4.6 Mesa 21.0.0-devel (git-f01bca8 2021-01-08 groovy-oibaf-ppa) (LLVM 11.0.1)
  Vulkan: 1.2.164
  Compiler: GCC 10.2.0
  File-System: ext4
  Screen Resolution: 3840x2160
Compiler Details
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details
  Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
  CPU Microcode: 0xa201009
Graphics Details
  GLAMOR
Python Details
  Python 3.8.6
Security Details
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling
  srbds: Not affected
  tsx_async_abort: Not affected
Disk Details
  3202: NONE / errors=remount-ro,relatime,rw / Block Size: 4096
5950X ASUS ROG CROSSHAIR VIII HERO WiFi BIOS ior: 2MB - Default Test Directory ior: 256MB - Default Test Directory openfoam: Motorbike 30M ior: 4MB - Default Test Directory onnx: bertsquad-10 - OpenMP CPU hpcc: Rand Ring Bandwidth hpcc: G-Ffte onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU ior: 512MB - Default Test Directory hpcc: EP-STREAM Triad etcpak: ETC2 dolfyn: Computational Fluid Dynamics sqlite-speedtest: Timed Time - Size 1,000 crafty: Elapsed Time rav1e: 10 compress-lz4: 9 - Compression Speed build2: Time To Compile onnx: yolov4 - OpenMP CPU mnn: MobileNetV2_224 mnn: inception-v3 build-eigen: Time To Compile ncnn: CPU - alexnet etcpak: DXT1 onednn: IP Shapes 1D - u8s8f32 - CPU ncnn: CPU-v3-v3 - mobilenet-v3 webp: Quality 100, Highest Compression onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU astcenc: Exhaustive onednn: Deconvolution Batch shapes_1d - f32 - CPU dav1d: Summer Nature 4K ncnn: CPU-v2-v2 - mobilenet-v2 qe: AUSURF112 cp2k: Fayalite-FIST Data hpcc: EP-DGEMM onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU coremark: CoreMark Size 666 - Iterations Per Second mnn: resnet-v2-50 etcpak: ETC1 + Dithering x265: Bosphorus 4K astcenc: Thorough ncnn: CPU - regnety_400m indigobench: CPU - Supercar ncnn: CPU - mnasnet mnn: SqueezeNetV1.0 cloverleaf: Lagrangian-Eulerian Hydrodynamics onednn: IP Shapes 3D - u8s8f32 - CPU ncnn: CPU - efficientnet-b0 encode-wavpack: WAV To WavPack astcenc: Medium onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU amg: build-linux-kernel: Time To Compile gromacs: Water Benchmark webp: Quality 100, Lossless onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU ncnn: CPU - vgg16 namd: ATPase Simulation - 327,506 Atoms compress-lz4: 9 - Decompression Speed lammps: 20k Atoms ncnn: CPU - blazeface build-godot: Time To Compile etcpak: ETC1 webp: Quality 100, Lossless, Highest Compression compress-lz4: 3 - Decompression Speed indigobench: CPU - Bedroom hpcc: G-Ptrans onnx: 
fcn-resnet101-11 - OpenMP CPU brl-cad: VGR Performance Metric onnx: shufflenet-v2-10 - OpenMP CPU tnn: CPU - MobileNet v2 relion: Basic - CPU ncnn: CPU - shufflenet-v2 openfoam: Motorbike 60M onednn: Deconvolution Batch shapes_3d - f32 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU tesseract: 3840 x 2160 yquake2: OpenGL 3.x - 3840 x 2160 ncnn: CPU - yolov4-tiny webp: Quality 100 compress-lz4: 1 - Decompression Speed onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU espeak: Text-To-Speech Synthesis hpcc: Rand Ring Latency tnn: CPU - SqueezeNet v1.1 rav1e: 5 x265: Bosphorus 1080p hpcc: G-Rand Access lulesh: compress-lz4: 1 - Compression Speed encode-ape: WAV To APE ncnn: CPU - mobilenet ncnn: CPU - resnet18 onednn: Recurrent Neural Network Training - u8s8f32 - CPU encode-opus: WAV To Opus Encode dav1d: Chimera 1080p onednn: IP Shapes 3D - f32 - CPU rav1e: 6 rnnoise: onednn: Convolution Batch Shapes Auto - f32 - CPU dav1d: Chimera 1080p 10-bit compress-lz4: 3 - Compression Speed onednn: Recurrent Neural Network Inference - u8s8f32 - CPU compress-zstd: 3 ncnn: CPU - squeezenet_ssd hmmer: Pfam Database Search kripke: qmcpack: simple-H2O onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU phpbench: PHP Benchmark Suite astcenc: Fast compress-zstd: 19 warsow: 3840 x 2160 ncnn: CPU - resnet50 libraw: Post-Processing Benchmark onednn: Recurrent Neural Network Training - f32 - CPU hpcc: Max Ping Pong Bandwidth dav1d: Summer Nature 1080p ncnn: CPU - googlenet deepspeech: CPU synthmark: VoiceMark_100 lammps: Rhodopsin Protein numpy: onednn: IP Shapes 1D - f32 - CPU hpcc: G-HPL etlegacy: Renderer2 - 3840 x 2160 ior: 8MB - Default Test Directory onnx: super-resolution-10 - OpenMP CPU mnn: mobilenet-v1-1.0 xonotic: 3840 x 2160 - Ultimate 3003 3202 1540.02 1339.91 104.48 1580.25 705 1.92562 6.23452 1.53431 1748.21 1.42468 236.507 12.898 41.631 11736507 3.362 68.57 79.997 438 3.402 31.005 60.052 11.04 1532.411 0.816147 4.07 5.465 1.45970 99.00 2.38690 
224.40 4.38 1199.39 787.495 16.85917 0.625862 829763.126732 24.036 349.700 24.57 12.47 17.67 8.803 3.90 5.232 136.80 0.477138 5.29 10.947 5.35 2.95002 207261167 45.787 1.275 13.039 1818.11 60.60 1.07519 13093.1 13.539 1.80 79.198 382.932 27.225 13085.1 4.175 2.45326 99 265419 16013 218.293 1892.609 4.38 1391.39 3.50445 1809.76 356.1715 979.3 21.14 1.741 13410.2 19.1307 21.560 0.48933 211.459 1.497 47.87 0.04998 5041.8561 11911.06 9.805 12.04 14.51 2752.88 6.126 590.32 9.48452 1.958 15.181 17.2340 96.35 68.88 1783.66 4723.9 14.64 82.704 72385553 22.312 2769.09 831939 4.11 43.4 429.6 25.05 52.99 2753.77 34205.275 534.95 12.99 70.48086 958.662 13.104 514.13 3.98022 53.16620 224.3 1601.21 7056 2.501 257.4557090 1388.54 1251.65 97.87 1482.95 665 2.02474 6.55050 1.59818 1682.82 1.47956 244.998 13.292 42.862 11427830 3.446 66.95 81.883 428 3.325 30.332 61.251 11.26 1562.887 0.832251 4.15 5.360 1.48786 100.90 2.43213 228.62 4.46 1221.19 801.638 16.57023 0.636735 815726.122263 23.630 355.439 24.18 12.67 17.95 8.669 3.96 5.153 134.74 0.484368 5.37 11.112 5.43 2.99402 210165200 46.410 1.258 12.873 1795.37 59.90 1.08736 12949.0 13.390 1.82 80.072 387.141 27.522 12946.4 4.132 2.47841 98 262742 15863 220.329 1875.48 4.42 1380.20 3.53182 1823.19 353.5707 986.5 21.29 1.729 13319.6 19.2565 21.692 0.49204 212.620 1.505 47.62 0.04973 5066.7382 11854.52 9.851 11.99 14.57 2763.90 6.150 592.56 9.51868 1.965 15.234 17.2901 96.66 69.10 1789.23 4738.4 14.60 82.928 72580800 22.371 2761.96 834019 4.10 43.3 430.5 25.00 53.09 2749.23 34166.839 535.52 13.00 70.53470 957.951 13.113 514.08 3.97992 53.16863 1461.18 7324 2.481 288.9672535 OpenBenchmarking.org
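The side-by-side 3003 and 3202 figures can be turned into a signed percentage, and the reported standard errors give a rough sense of whether a gap is larger than run-to-run noise. The following is a minimal sketch, not part of the Phoronix Test Suite: the function names `percent_change` and `exceeds_noise` and the threshold `k = 2.0` are assumptions made for illustration, and with only a few runs per test (often N = 3) the noise check is a heuristic rather than a formal significance test. The sample numbers are taken from the IOR 2MB, OpenFOAM Motorbike 30M, and eSpeak-NG results in this export.

```python
import math


def percent_change(old, new, higher_is_better=True):
    """Signed percent change from old to new; positive means new is better."""
    change = (new - old) / old * 100.0
    return change if higher_is_better else -change


def exceeds_noise(old, new, se_old, se_new, k=2.0):
    """Heuristic: is |new - old| larger than k times the combined standard error?

    k = 2.0 is an assumed threshold, not something the result file specifies.
    """
    combined_se = math.hypot(se_old, se_new)  # sqrt(se_old**2 + se_new**2)
    return abs(new - old) > k * combined_se


# IOR 2MB (MB/s, more is better): 3003 = 1540.02, 3202 = 1388.54
ior_2mb = percent_change(1540.02, 1388.54, higher_is_better=True)

# OpenFOAM Motorbike 30M (seconds, fewer is better): 3003 = 104.48, 3202 = 97.87
openfoam_30m = percent_change(104.48, 97.87, higher_is_better=False)

# Noise checks using the SE values reported next to each result
ior_2mb_clear = exceeds_noise(1540.02, 1388.54, 2.40, 14.36)
espeak_clear = exceeds_noise(21.56, 21.69, 0.08, 0.08)

print(f"IOR 2MB: {ior_2mb:+.2f}% (beyond noise: {ior_2mb_clear})")
print(f"OpenFOAM 30M: {openfoam_30m:+.2f}%")
print(f"eSpeak-NG beyond noise: {espeak_clear}")
```

Passing `higher_is_better=False` for time-based tests keeps the sign convention uniform, so a positive number always means the 3202 BIOS did better.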
IOR 3.3.0 - Block Size: 2MB - Disk Target: Default Test Directory (MB/s, More Is Better)
  3003: 1540.02 (SE +/- 2.40, N = 3; MIN: 1034.78 / MAX: 2149.91)
  3202: 1388.54 (SE +/- 14.36, N = 3; MIN: 890.97 / MAX: 2113.75)
  1. (CC) gcc options: -O2 -lm -pthread -lmpi
IOR 3.3.0 - Block Size: 256MB - Disk Target: Default Test Directory (MB/s, More Is Better)
  3003: 1339.91 (SE +/- 19.35, N = 9; MIN: 282.98 / MAX: 2236.64)
  3202: 1251.65 (SE +/- 14.35, N = 9; MIN: 354.68 / MAX: 2107.13)
  1. (CC) gcc options: -O2 -lm -pthread -lmpi
OpenFOAM 8 - Input: Motorbike 30M (Seconds, Fewer Is Better)
  3003: 104.48 (SE +/- 0.09, N = 3)
  3202: 97.87 (SE +/- 0.15, N = 3)
  1. (CXX) g++ options: -std=c++11 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm
IOR 3.3.0 - Block Size: 4MB - Disk Target: Default Test Directory (MB/s, More Is Better)
  3003: 1580.25 (SE +/- 5.96, N = 3; MIN: 1161.4 / MAX: 2244.12)
  3202: 1482.95 (SE +/- 11.11, N = 11; MIN: 955.9 / MAX: 2484.92)
  1. (CC) gcc options: -O2 -lm -pthread -lmpi
ONNX Runtime 1.6 - Model: bertsquad-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  3003: 705 (SE +/- 1.32, N = 3)
  3202: 665 (SE +/- 11.02, N = 12)
  1. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
HPC Challenge 1.5.0 - Test / Class: Random Ring Bandwidth (GB/s, More Is Better)
  3003: 1.92562 (SE +/- 0.02644, N = 3)
  3202: 2.02474 (SE +/- 0.02919, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. OpenBLAS + Open MPI 4.0.3
HPC Challenge 1.5.0 - Test / Class: G-Ffte (GFLOPS, More Is Better)
  3003: 6.23452 (SE +/- 0.10692, N = 3)
  3202: 6.55050 (SE +/- 0.02613, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. OpenBLAS + Open MPI 4.0.3
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  3003: 1.53431 (SE +/- 0.00153, N = 3; MIN: 1.41)
  3202: 1.59818 (SE +/- 0.00175, N = 3; MIN: 1.48)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
IOR 3.3.0 - Block Size: 512MB - Disk Target: Default Test Directory (MB/s, More Is Better)
  3003: 1748.21 (SE +/- 11.67, N = 3; MIN: 534.9 / MAX: 2360.08)
  3202: 1682.82 (SE +/- 22.61, N = 9; MIN: 251.69 / MAX: 2253.72)
  1. (CC) gcc options: -O2 -lm -pthread -lmpi
HPC Challenge 1.5.0 - Test / Class: EP-STREAM Triad (GB/s, More Is Better)
  3003: 1.42468 (SE +/- 0.00070, N = 3)
  3202: 1.47956 (SE +/- 0.00089, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. OpenBLAS + Open MPI 4.0.3
Etcpak 0.7 - Configuration: ETC2 (Mpx/s, More Is Better)
  3003: 236.51 (SE +/- 0.59, N = 3)
  3202: 245.00 (SE +/- 1.66, N = 3)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
Dolfyn 0.527 - Computational Fluid Dynamics (Seconds, Fewer Is Better)
  3003: 12.90 (SE +/- 0.10, N = 3)
  3202: 13.29 (SE +/- 0.02, N = 3)
SQLite Speedtest 3.30 - Timed Time - Size 1,000 (Seconds, Fewer Is Better)
  3003: 41.63 (SE +/- 0.34, N = 3)
  3202: 42.86 (SE +/- 0.24, N = 15)
  1. (CC) gcc options: -O2 -ldl -lz -lpthread
Crafty 25.2 - Elapsed Time (Nodes Per Second, More Is Better)
  3003: 11736507 (SE +/- 108514.09, N = 3)
  3202: 11427830 (SE +/- 40276.62, N = 3)
  1. (CC) gcc options: -pthread -lstdc++ -fprofile-use -lm
rav1e 0.4 - Speed: 10 (Frames Per Second, More Is Better)
  3003: 3.362 (SE +/- 0.037, N = 3)
  3202: 3.446 (SE +/- 0.037, N = 15)
LZ4 Compression 1.9.3 - Compression Level: 9 - Compression Speed (MB/s, More Is Better)
  3003: 68.57 (SE +/- 0.80, N = 12)
  3202: 66.95 (SE +/- 0.20, N = 3)
  1. (CC) gcc options: -O3
Build2 0.13 - Time To Compile (Seconds, Fewer Is Better)
  3003: 80.00 (SE +/- 0.10, N = 3)
  3202: 81.88 (SE +/- 0.12, N = 3)
ONNX Runtime 1.6 - Model: yolov4 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  3003: 438 (SE +/- 1.17, N = 3)
  3202: 428 (SE +/- 3.35, N = 3)
  1. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
Mobile Neural Network 1.1.1 - Model: MobileNetV2_224 (ms, Fewer Is Better)
  3003: 3.402 (SE +/- 0.047, N = 3; MIN: 3.23 / MAX: 5.81)
  3202: 3.325 (SE +/- 0.039, N = 3; MIN: 3.16 / MAX: 4.02)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.1 - Model: inception-v3 (ms, Fewer Is Better)
  3003: 31.01 (SE +/- 0.18, N = 3; MIN: 29.92 / MAX: 38.55)
  3202: 30.33 (SE +/- 0.26, N = 3; MIN: 29.28 / MAX: 56.48)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Timed Eigen Compilation 3.3.9 - Time To Compile (Seconds, Fewer Is Better)
  3003: 60.05 (SE +/- 0.21, N = 3)
  3202: 61.25 (SE +/- 0.01, N = 3)
NCNN 20201218 - Target: CPU - Model: alexnet (ms, Fewer Is Better)
  3003: 11.04 (SE +/- 0.01, N = 3; MIN: 10.95 / MAX: 11.89)
  3202: 11.26 (SE +/- 0.24, N = 3; MIN: 10.91 / MAX: 19.04)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Etcpak 0.7 - Configuration: DXT1 (Mpx/s, More Is Better)
  3003: 1532.41 (SE +/- 2.32, N = 3)
  3202: 1562.89 (SE +/- 21.94, N = 3)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  3003: 0.816147 (SE +/- 0.001756, N = 3; MIN: 0.74)
  3202: 0.832251 (SE +/- 0.001536, N = 3; MIN: 0.75)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
NCNN 20201218 - Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  3003: 4.07 (SE +/- 0.01, N = 3; MIN: 4.03 / MAX: 5.2)
  3202: 4.15 (SE +/- 0.00, N = 3; MIN: 4.11 / MAX: 5.03)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
WebP Image Encode 1.1 - Encode Settings: Quality 100, Highest Compression (Encode Time - Seconds, Fewer Is Better)
  3003: 5.465 (SE +/- 0.059, N = 3)
  3202: 5.360 (SE +/- 0.062, N = 3)
  1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  3003: 1.45970 (SE +/- 0.00059, N = 3; MIN: 1.39)
  3202: 1.48786 (SE +/- 0.00214, N = 3; MIN: 1.39)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
ASTC Encoder 2.0 - Preset: Exhaustive (Seconds, Fewer Is Better)
  3003: 99.00 (SE +/- 0.10, N = 3)
  3202: 100.90 (SE +/- 0.15, N = 3)
  1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  3003: 2.38690 (SE +/- 0.00251, N = 3; MIN: 2.28)
  3202: 2.43213 (SE +/- 0.00987, N = 3; MIN: 2.3)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
dav1d 0.8.1 - Video Input: Summer Nature 4K (FPS, More Is Better)
  3003: 224.40 (SE +/- 0.49, N = 3; MIN: 172.58 / MAX: 234.67)
  3202: 228.62 (SE +/- 0.36, N = 3; MIN: 172.54 / MAX: 238.82)
  1. (CC) gcc options: -pthread
NCNN 20201218 - Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  3003: 4.38 (SE +/- 0.01, N = 3; MIN: 4.24 / MAX: 7.32)
  3202: 4.46 (SE +/- 0.01, N = 3; MIN: 4.31 / MAX: 5.3)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Quantum ESPRESSO 6.7 - Input: AUSURF112 (Seconds, Fewer Is Better)
  3003: 1199.39 (SE +/- 0.92, N = 3)
  3202: 1221.19 (SE +/- 3.58, N = 3)
  1. (F9X) gfortran options: -lopenblas -lFoX_dom -lFoX_sax -lFoX_wxml -lFoX_common -lFoX_utils -lFoX_fsys -lfftw3 -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz
CP2K Molecular Dynamics 8.1 - Fayalite-FIST Data (Seconds, Fewer Is Better)
  3003: 787.50
  3202: 801.64
HPC Challenge 1.5.0 - Test / Class: EP-DGEMM (GFLOPS, More Is Better)
  3003: 16.86 (SE +/- 0.11, N = 3)
  3202: 16.57 (SE +/- 0.17, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. OpenBLAS + Open MPI 4.0.3
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  3003: 0.625862 (SE +/- 0.000749, N = 3; MIN: 0.6)
  3202: 0.636735 (SE +/- 0.000300, N = 3; MIN: 0.6)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Coremark 1.0 - CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better)
  3003: 829763.13 (SE +/- 1722.61, N = 3)
  3202: 815726.12 (SE +/- 552.83, N = 3)
  1. (CC) gcc options: -O2 -lrt" -lrt
Mobile Neural Network 1.1.1 - Model: resnet-v2-50 (ms, Fewer Is Better)
  3003: 24.04 (SE +/- 0.25, N = 3; MIN: 21.95 / MAX: 33.08)
  3202: 23.63 (SE +/- 0.19, N = 3; MIN: 22.31 / MAX: 33.12)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Etcpak 0.7 - Configuration: ETC1 + Dithering (Mpx/s, More Is Better)
  3003: 349.70 (SE +/- 0.36, N = 3)
  3202: 355.44 (SE +/- 3.87, N = 3)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
x265 3.4 - Video Input: Bosphorus 4K (Frames Per Second, More Is Better)
  3003: 24.57 (SE +/- 0.35, N = 3)
  3202: 24.18 (SE +/- 0.26, N = 4)
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
ASTC Encoder 2.0 - Preset: Thorough (Seconds, Fewer Is Better)
  3003: 12.47 (SE +/- 0.03, N = 3)
  3202: 12.67 (SE +/- 0.02, N = 3)
  1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
NCNN 20201218 - Target: CPU - Model: regnety_400m (ms, Fewer Is Better)
  3003: 17.67 (SE +/- 0.04, N = 3; MIN: 17.47 / MAX: 18.02)
  3202: 17.95 (SE +/- 0.13, N = 3; MIN: 17.66 / MAX: 19.3)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
IndigoBench 4.4 - Acceleration: CPU - Scene: Supercar (M samples/s, More Is Better)
  3003: 8.803 (SE +/- 0.008, N = 3)
  3202: 8.669 (SE +/- 0.011, N = 3)
NCNN 20201218 - Target: CPU - Model: mnasnet (ms, Fewer Is Better)
  3003: 3.90 (SE +/- 0.00, N = 3; MIN: 3.78 / MAX: 4.83)
  3202: 3.96 (SE +/- 0.01, N = 3; MIN: 3.84 / MAX: 5.26)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Mobile Neural Network 1.1.1 - Model: SqueezeNetV1.0 (ms, Fewer Is Better)
  3003: 5.232 (SE +/- 0.062, N = 3; MIN: 5.02 / MAX: 8.4)
  3202: 5.153 (SE +/- 0.022, N = 3; MIN: 5.02 / MAX: 14.32)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
CloverLeaf - Lagrangian-Eulerian Hydrodynamics (Seconds, Fewer Is Better)
  3003: 136.80 (SE +/- 0.03, N = 3)
  3202: 134.74 (SE +/- 0.24, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  3003: 0.477138 (SE +/- 0.002170, N = 3; MIN: 0.44)
  3202: 0.484368 (SE +/- 0.003401, N = 15; MIN: 0.43)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
NCNN 20201218 - Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  3003: 5.29 (SE +/- 0.00, N = 3; MIN: 5.24 / MAX: 5.79)
  3202: 5.37 (SE +/- 0.01, N = 3; MIN: 5.31 / MAX: 7.18)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
WavPack Audio Encoding 5.3 - WAV To WavPack (Seconds, Fewer Is Better)
  3003: 10.95 (SE +/- 0.06, N = 5)
  3202: 11.11 (SE +/- 0.03, N = 5)
  1. (CXX) g++ options: -rdynamic
ASTC Encoder 2.0 - Preset: Medium (Seconds, Fewer Is Better)
  3003: 5.35 (SE +/- 0.04, N = 3)
  3202: 5.43 (SE +/- 0.02, N = 3)
  1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  3003: 2.95002 (SE +/- 0.00671, N = 3; MIN: 2.81)
  3202: 2.99402 (SE +/- 0.00911, N = 3; MIN: 2.83)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Algebraic Multi-Grid Benchmark 1.2 (Figure Of Merit, More Is Better)
  3003: 207261167 (SE +/- 2159740.57, N = 3)
  3202: 210165200 (SE +/- 569155.55, N = 3)
  1. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -pthread -lmpi
Timed Linux Kernel Compilation 5.4 - Time To Compile (Seconds, Fewer Is Better)
  3003: 45.79 (SE +/- 0.43, N = 3)
  3202: 46.41 (SE +/- 0.49, N = 3)
GROMACS 2020.3 - Water Benchmark (Ns Per Day, More Is Better)
  3003: 1.275 (SE +/- 0.001, N = 3)
  3202: 1.258 (SE +/- 0.001, N = 3)
  1. (CXX) g++ options: -O3 -pthread -lrt -lpthread -lm
WebP Image Encode 1.1 - Encode Settings: Quality 100, Lossless (Encode Time - Seconds, Fewer Is Better)
  3003: 13.04 (SE +/- 0.01, N = 3)
  3202: 12.87 (SE +/- 0.05, N = 3)
  1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  3003: 1818.11 (SE +/- 20.56, N = 3; MIN: 1764.89)
  3202: 1795.37 (SE +/- 19.66, N = 4; MIN: 1755.37)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
NCNN 20201218 - Target: CPU - Model: vgg16 (ms, Fewer Is Better)
  3003: 60.60 (SE +/- 0.13, N = 3; MIN: 59.43 / MAX: 62.32)
  3202: 59.90 (SE +/- 0.09, N = 3; MIN: 58.71 / MAX: 61.76)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NAMD 2.14 - ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better)
  3003: 1.07519 (SE +/- 0.00258, N = 3)
  3202: 1.08736 (SE +/- 0.00500, N = 3)
LZ4 Compression 1.9.3 - Compression Level: 9 - Decompression Speed (MB/s, More Is Better)
  3003: 13093.1 (SE +/- 11.50, N = 12)
  3202: 12949.0 (SE +/- 11.91, N = 3)
  1. (CC) gcc options: -O3
LAMMPS Molecular Dynamics Simulator 29Oct2020 - Model: 20k Atoms (ns/day, More Is Better)
  3003: 13.54 (SE +/- 0.08, N = 3)
  3202: 13.39 (SE +/- 0.03, N = 3)
  1. (CXX) g++ options: -O3 -pthread -lm
NCNN 20201218 - Target: CPU - Model: blazeface (ms, Fewer Is Better)
  3003: 1.80 (SE +/- 0.00, N = 3; MIN: 1.78 / MAX: 2.28)
  3202: 1.82 (SE +/- 0.00, N = 3; MIN: 1.79 / MAX: 2.65)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Timed Godot Game Engine Compilation 3.2.3 - Time To Compile (Seconds, Fewer Is Better)
  3003: 79.20 (SE +/- 0.15, N = 3)
  3202: 80.07 (SE +/- 0.31, N = 3)
Etcpak 0.7 - Configuration: ETC1 (Mpx/s, More Is Better)
  3003: 382.93 (SE +/- 1.98, N = 3)
  3202: 387.14 (SE +/- 1.60, N = 3)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
WebP Image Encode 1.1 - Encode Settings: Quality 100, Lossless, Highest Compression (Encode Time - Seconds, Fewer Is Better)
  3003: 27.23 (SE +/- 0.05, N = 3)
  3202: 27.52 (SE +/- 0.04, N = 3)
  1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
LZ4 Compression 1.9.3 - Compression Level: 3 - Decompression Speed (MB/s, More Is Better)
  3003: 13085.1 (SE +/- 7.36, N = 3)
  3202: 12946.4 (SE +/- 2.16, N = 3)
  1. (CC) gcc options: -O3
IndigoBench 4.4 - Acceleration: CPU - Scene: Bedroom (M samples/s, More Is Better)
  3003: 4.175 (SE +/- 0.006, N = 3)
  3202: 4.132 (SE +/- 0.017, N = 3)
HPC Challenge 1.5.0 - Test / Class: G-Ptrans (GB/s, More Is Better)
  3003: 2.45326 (SE +/- 0.00427, N = 3)
  3202: 2.47841 (SE +/- 0.01009, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. OpenBLAS + Open MPI 4.0.3
ONNX Runtime 1.6 - Model: fcn-resnet101-11 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  3003: 99
  3202: 98
  SE +/- 0.33, N = 3 (listed for only one of the two configurations)
  1. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
BRL-CAD 7.30.8 - VGR Performance Metric (VGR Performance Metric, More Is Better)
  3003: 265419
  3202: 262742
  1. (CXX) g++ options: -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -rdynamic -lSM -lICE -lXi -lGLU -lGL -lGLdispatch -lX11 -lXext -lXrender -lpthread -ldl -luuid -lm
ONNX Runtime 1.6 - Model: shufflenet-v2-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  3003: 16013 (SE +/- 20.67, N = 3)
  3202: 15863 (SE +/- 68.50, N = 3)
  1. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
TNN 0.2.3 - Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better)
  3003: 218.29 (SE +/- 0.32, N = 3; MIN: 208.52 / MAX: 289.05)
  3202: 220.33 (SE +/- 0.67, N = 3; MIN: 216.95 / MAX: 261.2)
  1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
RELION 3.1.1 - Test: Basic - Device: CPU (Seconds, Fewer Is Better)
  3003: 1892.61 (SE +/- 3.84, N = 3)
  3202: 1875.48 (SE +/- 6.24, N = 3)
  1. (CXX) g++ options: -fopenmp -std=c++0x -O3 -rdynamic -ldl -ltiff -lfftw3f -lfftw3 -lpng -pthread -lmpi_cxx -lmpi
NCNN 20201218 - Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  3003: 4.38 (SE +/- 0.01, N = 3; MIN: 4.33 / MAX: 4.89)
  3202: 4.42 (SE +/- 0.01, N = 3; MIN: 4.35 / MAX: 5.28)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenFOAM 8 - Input: Motorbike 60M (Seconds, Fewer Is Better)
  3003: 1391.39 (SE +/- 0.57, N = 3)
  3202: 1380.20 (SE +/- 0.41, N = 3)
  1. (CXX) g++ options: -std=c++11 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  3003: 3.50445 (SE +/- 0.00613, N = 3; MIN: 3.38)
  3202: 3.53182 (SE +/- 0.00421, N = 3; MIN: 3.41)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU 3003 3202 400 800 1200 1600 2000 SE +/- 15.33, N = 3 SE +/- 9.60, N = 3 1809.76 1823.19 MIN: 1773.21 MIN: 1798.09 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Tesseract 2014-05-12, Resolution: 3840 x 2160 (Frames Per Second, More Is Better). 3003: 356.17 (SE +/- 3.53, N = 15). 3202: 353.57 (SE +/- 4.21, N = 15).
yquake2 7.45, Renderer: OpenGL 3.x - Resolution: 3840 x 2160 (Frames Per Second, More Is Better). 3003: 979.3 (SE +/- 1.03, N = 3). 3202: 986.5 (SE +/- 1.35, N = 3). 1. (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
NCNN 20201218, Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better). 3003: 21.14 (SE +/- 0.11, N = 3; MIN: 20.72 / MAX: 29.47). 3202: 21.29 (SE +/- 0.08, N = 3; MIN: 20.98 / MAX: 21.78). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
WebP Image Encode 1.1, Encode Settings: Quality 100 (Encode Time - Seconds, Fewer Is Better). 3003: 1.741 (SE +/- 0.001, N = 3). 3202: 1.729 (SE +/- 0.004, N = 3). 1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
LZ4 Compression 1.9.3, Compression Level: 1 - Decompression Speed (MB/s, More Is Better). 3003: 13410.2 (SE +/- 19.80, N = 3). 3202: 13319.6 (SE +/- 25.73, N = 3). 1. (CC) gcc options: -O3
oneDNN 2.0, Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). 3003: 19.13 (SE +/- 0.02, N = 3; MIN: 18.77). 3202: 19.26 (SE +/- 0.04, N = 3; MIN: 18.8). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
eSpeak-NG Speech Engine 20200907, Text-To-Speech Synthesis (Seconds, Fewer Is Better). 3003: 21.56 (SE +/- 0.08, N = 4). 3202: 21.69 (SE +/- 0.08, N = 4). 1. (CC) gcc options: -O2 -std=c99
HPC Challenge 1.5.0, Test / Class: Random Ring Latency (usecs, Fewer Is Better). 3003: 0.48933 (SE +/- 0.00404, N = 3). 3202: 0.49204 (SE +/- 0.00234, N = 3). 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. OpenBLAS + Open MPI 4.0.3
TNN 0.2.3, Target: CPU - Model: SqueezeNet v1.1 (ms, Fewer Is Better). 3003: 211.46 (SE +/- 0.43, N = 3; MIN: 210.71 / MAX: 212.37). 3202: 212.62 (SE +/- 0.34, N = 3; MIN: 211.89 / MAX: 213.24). 1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
rav1e 0.4, Speed: 5 (Frames Per Second, More Is Better). 3003: 1.497 (SE +/- 0.005, N = 3). 3202: 1.505 (SE +/- 0.007, N = 3).
x265 3.4, Video Input: Bosphorus 1080p (Frames Per Second, More Is Better). 3003: 47.87 (SE +/- 0.08, N = 3). 3202: 47.62 (SE +/- 0.22, N = 3). 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
HPC Challenge 1.5.0, Test / Class: G-Random Access (GUP/s, More Is Better). 3003: 0.04998 (SE +/- 0.00046, N = 3). 3202: 0.04973 (SE +/- 0.00017, N = 3). 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. OpenBLAS + Open MPI 4.0.3
LULESH 2.0.3 (z/s, More Is Better). 3003: 5041.86 (SE +/- 59.80, N = 3). 3202: 5066.74 (SE +/- 66.75, N = 3). 1. (CXX) g++ options: -O3 -fopenmp -lm -pthread -lmpi_cxx -lmpi
LZ4 Compression 1.9.3, Compression Level: 1 - Compression Speed (MB/s, More Is Better). 3003: 11911.06 (SE +/- 49.00, N = 3). 3202: 11854.52 (SE +/- 59.07, N = 3). 1. (CC) gcc options: -O3
Monkey Audio Encoding 3.99.6, WAV To APE (Seconds, Fewer Is Better). 3003: 9.805 (SE +/- 0.045, N = 5). 3202: 9.851 (SE +/- 0.041, N = 5). 1. (CXX) g++ options: -O3 -pedantic -rdynamic -lrt
NCNN 20201218, Target: CPU - Model: mobilenet (ms, Fewer Is Better). 3003: 12.04 (SE +/- 0.15, N = 3; MIN: 11.72 / MAX: 14.17). 3202: 11.99 (SE +/- 0.01, N = 3; MIN: 11.79 / MAX: 12.19). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: CPU - Model: resnet18 (ms, Fewer Is Better). 3003: 14.51 (SE +/- 0.01, N = 3; MIN: 14.39 / MAX: 15.06). 3202: 14.57 (SE +/- 0.01, N = 3; MIN: 14.45 / MAX: 23.09). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). 3003: 2752.88 (SE +/- 11.23, N = 3; MIN: 2722.06). 3202: 2763.90 (SE +/- 9.00, N = 3; MIN: 2736.15). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Opus Codec Encoding 1.3.1, WAV To Opus Encode (Seconds, Fewer Is Better). 3003: 6.126 (SE +/- 0.032, N = 5). 3202: 6.150 (SE +/- 0.042, N = 5). 1. (CXX) g++ options: -fvisibility=hidden -logg -lm
dav1d 0.8.1, Video Input: Chimera 1080p (FPS, More Is Better). 3003: 590.32 (SE +/- 0.69, N = 3; MIN: 447.67 / MAX: 749.27). 3202: 592.56 (SE +/- 0.77, N = 3; MIN: 447.8 / MAX: 754.79). 1. (CC) gcc options: -pthread
oneDNN 2.0, Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). 3003: 9.48452 (SE +/- 0.01431, N = 3; MIN: 9.38). 3202: 9.51868 (SE +/- 0.00745, N = 3; MIN: 9.44). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
rav1e 0.4, Speed: 6 (Frames Per Second, More Is Better). 3003: 1.958 (SE +/- 0.005, N = 3). 3202: 1.965 (SE +/- 0.016, N = 3).
RNNoise 2020-06-28 (Seconds, Fewer Is Better). 3003: 15.18 (SE +/- 0.07, N = 3). 3202: 15.23 (SE +/- 0.18, N = 3). 1. (CC) gcc options: -O2 -pedantic -fvisibility=hidden
oneDNN 2.0, Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). 3003: 17.23 (SE +/- 0.01, N = 3; MIN: 16.81). 3202: 17.29 (SE +/- 0.02, N = 3; MIN: 16.66). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
dav1d 0.8.1, Video Input: Chimera 1080p 10-bit (FPS, More Is Better). 3003: 96.35 (SE +/- 0.06, N = 3; MIN: 61.49 / MAX: 217.11). 3202: 96.66 (SE +/- 0.05, N = 3; MIN: 61.56 / MAX: 221.22). 1. (CC) gcc options: -pthread
LZ4 Compression 1.9.3, Compression Level: 3 - Compression Speed (MB/s, More Is Better). 3003: 68.88 (SE +/- 0.92, N = 3). 3202: 69.10 (SE +/- 0.20, N = 3). 1. (CC) gcc options: -O3
oneDNN 2.0, Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). 3003: 1783.66 (SE +/- 8.25, N = 3; MIN: 1765.38). 3202: 1789.23 (SE +/- 17.01, N = 3; MIN: 1761.64). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Zstd Compression 1.4.5, Compression Level: 3 (MB/s, More Is Better). 3003: 4723.9 (SE +/- 8.02, N = 3). 3202: 4738.4 (SE +/- 12.71, N = 3). 1. (CC) gcc options: -O3 -pthread -lz -llzma
NCNN 20201218, Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better). 3003: 14.64 (SE +/- 0.03, N = 3; MIN: 14.31 / MAX: 15.31). 3202: 14.60 (SE +/- 0.04, N = 3; MIN: 14.21 / MAX: 15.07). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Timed HMMer Search 3.3.1, Pfam Database Search (Seconds, Fewer Is Better). 3003: 82.70 (SE +/- 0.15, N = 3). 3202: 82.93 (SE +/- 0.07, N = 3). 1. (CC) gcc options: -O3 -pthread -lhmmer -leasel -lm
Kripke 1.2.4 (Throughput FoM, More Is Better). 3003: 72385553 (SE +/- 1023469.37, N = 3). 3202: 72580800 (SE +/- 319563.42, N = 3). 1. (CXX) g++ options: -O3 -fopenmp
QMCPACK 3.10, Input: simple-H2O (Total Execution Time - Seconds, Fewer Is Better). 3003: 22.31 (SE +/- 0.21, N = 7). 3202: 22.37 (SE +/- 0.14, N = 3). 1. (CXX) g++ options: -fopenmp -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -march=native -O3 -fomit-frame-pointer -ffast-math -pthread -lm
oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better). 3003: 2769.09 (SE +/- 9.46, N = 3; MIN: 2739.1). 3202: 2761.96 (SE +/- 4.41, N = 3; MIN: 2740.57). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
PHPBench 0.8.1, PHP Benchmark Suite (Score, More Is Better). 3003: 831939 (SE +/- 8175.01, N = 3). 3202: 834019 (SE +/- 7522.35, N = 3).
ASTC Encoder 2.0, Preset: Fast (Seconds, Fewer Is Better). 3003: 4.11 (SE +/- 0.01, N = 3). 3202: 4.10 (SE +/- 0.03, N = 3). 1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
Zstd Compression 1.4.5, Compression Level: 19 (MB/s, More Is Better). 3003: 43.4 (SE +/- 0.03, N = 3). 3202: 43.3 (SE +/- 0.06, N = 3). 1. (CC) gcc options: -O3 -pthread -lz -llzma
Warsow 2.5 Beta, Resolution: 3840 x 2160 (Frames Per Second, More Is Better). 3003: 429.6 (SE +/- 0.72, N = 3). 3202: 430.5 (SE +/- 0.67, N = 3).
NCNN 20201218, Target: CPU - Model: resnet50 (ms, Fewer Is Better). 3003: 25.05 (SE +/- 0.13, N = 3; MIN: 24.65 / MAX: 26.23). 3202: 25.00 (SE +/- 0.17, N = 3; MIN: 24.58 / MAX: 26.31). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
LibRaw 0.20, Post-Processing Benchmark (Mpix/sec, More Is Better). 3003: 52.99 (SE +/- 0.08, N = 3). 3202: 53.09 (SE +/- 0.24, N = 3). 1. (CXX) g++ options: -O2 -fopenmp -ljpeg -lz -lm
oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). 3003: 2753.77 (SE +/- 13.63, N = 3; MIN: 2727.8). 3202: 2749.23 (SE +/- 2.89, N = 3; MIN: 2735.56). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
HPC Challenge 1.5.0, Test / Class: Max Ping Pong Bandwidth (MB/s, More Is Better). 3003: 34205.28 (SE +/- 146.47, N = 3). 3202: 34166.84 (SE +/- 130.59, N = 3). 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. OpenBLAS + Open MPI 4.0.3
dav1d 0.8.1, Video Input: Summer Nature 1080p (FPS, More Is Better). 3003: 534.95 (SE +/- 4.29, N = 3; MIN: 432.57 / MAX: 611.74). 3202: 535.52 (SE +/- 1.59, N = 3; MIN: 453.14 / MAX: 589.94). 1. (CC) gcc options: -pthread
NCNN 20201218, Target: CPU - Model: googlenet (ms, Fewer Is Better). 3003: 12.99 (SE +/- 0.00, N = 3; MIN: 12.62 / MAX: 13.57). 3202: 13.00 (SE +/- 0.02, N = 3; MIN: 12.64 / MAX: 20.99). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
DeepSpeech 0.6, Acceleration: CPU (Seconds, Fewer Is Better). 3003: 70.48 (SE +/- 0.10, N = 3). 3202: 70.53 (SE +/- 0.04, N = 3).
Google SynthMark 20201109, Test: VoiceMark_100 (Voices, More Is Better). 3003: 958.66 (SE +/- 3.00, N = 3). 3202: 957.95 (SE +/- 4.96, N = 3). 1. (CXX) g++ options: -lm -lpthread -std=c++11 -Ofast
LAMMPS Molecular Dynamics Simulator 29Oct2020, Model: Rhodopsin Protein (ns/day, More Is Better). 3003: 13.10 (SE +/- 0.15, N = 15). 3202: 13.11 (SE +/- 0.13, N = 15). 1. (CXX) g++ options: -O3 -pthread -lm
Numpy Benchmark (Score, More Is Better). 3003: 514.13 (SE +/- 0.48, N = 3). 3202: 514.08 (SE +/- 4.15, N = 3).
oneDNN 2.0, Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). 3003: 3.98022 (SE +/- 0.01644, N = 3; MIN: 3.72). 3202: 3.97992 (SE +/- 0.00985, N = 3; MIN: 3.76). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
HPC Challenge 1.5.0, Test / Class: G-HPL (GFLOPS, More Is Better). 3003: 53.17 (SE +/- 0.08, N = 3). 3202: 53.17 (SE +/- 0.05, N = 3). 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. OpenBLAS + Open MPI 4.0.3
ET: Legacy 2.75, Renderer: Renderer2 - Resolution: 3840 x 2160 (Frames Per Second, More Is Better). 3003: 224.3.
IOR 3.3.0, Block Size: 8MB - Disk Target: Default Test Directory (MB/s, More Is Better). 3003: 1601.21 (SE +/- 6.28, N = 3; MIN: 1005.82 / MAX: 2534.37). 3202: 1461.18 (SE +/- 28.69, N = 13; MIN: 491.21 / MAX: 2711.8). 1. (CC) gcc options: -O2 -lm -pthread -lmpi
ONNX Runtime 1.6, Model: super-resolution-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better). 3003: 7056 (SE +/- 202.78, N = 12). 3202: 7324 (SE +/- 175.44, N = 12). 1. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt
Mobile Neural Network 1.1.1, Model: mobilenet-v1-1.0 (ms, Fewer Is Better). 3003: 2.501 (SE +/- 0.094, N = 3; MIN: 2.32 / MAX: 4.53). 3202: 2.481 (SE +/- 0.027, N = 3; MIN: 2.42 / MAX: 2.71). 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Xonotic 0.8.2, Resolution: 3840 x 2160 - Effects Quality: Ultimate (Frames Per Second, More Is Better). 3003: 257.46 (SE +/- 4.83, N = 15; MIN: 55 / MAX: 623). 3202: 288.97 (SE +/- 3.79, N = 3; MIN: 60 / MAX: 571).
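Reading the 3003 vs 3202 results: most of the differences above are small relative to their reported standard errors. The sketch below is one rough way to check that, not part of the Phoronix Test Suite output. It assumes the two values on each line are listed in the order 3003 then 3202, and the helper name compare_bios is hypothetical. The ratio it returns is only a heuristic (difference divided by the combined standard error); it is not a formal significance test, especially where N differs between the two runs.

```python
import math


def compare_bios(value_3003, se_3003, value_3202, se_3202):
    """Return (percent change from 3003 to 3202, |difference| / combined SE).

    Hypothetical helper for illustration only. The second number is a rough
    indicator: values well above about 2 suggest the gap is larger than the
    run-to-run noise reported by the two standard errors.
    """
    diff = value_3202 - value_3003
    percent = 100.0 * diff / value_3003
    combined_se = math.sqrt(se_3003 ** 2 + se_3202 ** 2)
    ratio = abs(diff) / combined_se if combined_se > 0 else float("inf")
    return percent, ratio


# Xonotic 3840 x 2160, Ultimate (FPS): 3003 = 257.46 (SE 4.83), 3202 = 288.97 (SE 3.79)
xonotic_pct, xonotic_ratio = compare_bios(257.46, 4.83, 288.97, 3.79)

# NCNN googlenet (ms): 3003 = 12.99 (SE 0.00), 3202 = 13.00 (SE 0.02)
googlenet_pct, googlenet_ratio = compare_bios(12.99, 0.00, 13.00, 0.02)

print(f"Xonotic: {xonotic_pct:+.1f}% (ratio {xonotic_ratio:.1f})")
print(f"NCNN googlenet: {googlenet_pct:+.2f}% (ratio {googlenet_ratio:.1f})")
```

For "Fewer Is Better" tests a positive percent change means 3202 was slower, for "More Is Better" tests it means 3202 was faster.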
Phoronix Test Suite v10.8.5