5950X ASUS ROG CROSSHAIR VIII HERO WiFi BIOS
AMD Ryzen 9 5950X 16-Core testing with an ASUS ROG CROSSHAIR VIII HERO (WI-FI) (3202 BIOS) and AMD Radeon RX 5600 OEM/5600 XT / 5700/5700 8GB on Ubuntu 20.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2101203-HA-5950XASUS43&grr&rdt .

Runs: 3003, 3202

Processor: AMD Ryzen 9 5950X 16-Core @ 3.40GHz (16 Cores / 32 Threads)
Motherboard (3003): ASUS ROG CROSSHAIR VIII HERO (WI-FI) (3003 BIOS)
Motherboard (3202): ASUS ROG CROSSHAIR VIII HERO (WI-FI) (3202 BIOS)
Chipset: AMD Starship/Matisse
Memory: 32GB
Disk: 2000GB Corsair Force MP600 + 2000GB
Graphics: AMD Radeon RX 5600 OEM/5600 XT / 5700/5700 8GB (2100/875MHz)
Audio: AMD Navi 10 HDMI Audio
Monitor: ASUS MG28U
Network: Realtek RTL8125 2.5GbE + Intel I211 + Intel Wi-Fi 6 AX200
OS: Ubuntu 20.10
Kernel: 5.11.0-051100rc2daily20210108-generic (x86_64) 20210107
Desktop: GNOME Shell 3.38.1
Display Server: X Server 1.20.9
Display Driver: amdgpu 19.1.0
OpenGL: 4.6 Mesa 21.0.0-devel (git-f01bca8 2021-01-08 groovy-oibaf-ppa) (LLVM 11.0.1)
Vulkan: 1.2.164
Compiler: GCC 10.2.0
File-System: ext4
Screen Resolution: 3840x2160

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0xa201009
Graphics Details: GLAMOR
Python Details: Python 3.8.6
Security Details:
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling
  srbds: Not affected
  tsx_async_abort: Not affected
Disk Details: 3202: NONE / errors=remount-ro,relatime,rw / Block Size: 4096

Test | 3003 | 3202
hpcc: G-HPL | 53.16620 | 53.16863
relion: Basic - CPU | 1892.609 | 1875.48
openfoam: Motorbike 60M | 1391.39 | 1380.20
ior: 512MB - Default Test Directory | 1748.21 | 1682.82
qe: AUSURF112 | 1199.39 | 1221.19
ior: 256MB - Default Test Directory | 1339.91 | 1251.65
lammps: 20k Atoms | 13.539 | 13.390
onnx: super-resolution-10 - OpenMP CPU | 7056 | 7324
onnx: bertsquad-10 - OpenMP CPU | 705 | 665
cp2k: Fayalite-FIST Data | 787.495 | 801.638
brl-cad: VGR Performance Metric | 265419 | 262742
numpy | 514.13 | 514.08
tesseract: 3840 x 2160 | 356.1715 | 353.5707
dav1d: Chimera 1080p 10-bit | 96.35 | 96.66
gromacs: Water Benchmark | 1.275 | 1.258
cloverleaf: Lagrangian-Eulerian Hydrodynamics | 136.80 | 134.74
xonotic: 3840 x 2160 - Ultimate | 257.4557090 | 288.9672535
sqlite-speedtest: Timed Time - Size 1,000 | 41.631 | 42.862
compress-lz4: 9 - Decompression Speed | 13093.1 | 12949.0
compress-lz4: 9 - Compression Speed | 68.57 | 66.95
onnx: fcn-resnet101-11 - OpenMP CPU | 99 | 98
onnx: yolov4 - OpenMP CPU | 438 | 428
onnx: shufflenet-v2-10 - OpenMP CPU | 16013 | 15863
openfoam: Motorbike 30M | 104.48 | 97.87
astcenc: Exhaustive | 99.00 | 100.90
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 1818.11 | 1795.37
hmmer: Pfam Database Search | 82.704 | 82.928
warsow: 3840 x 2160 | 429.6 | 430.5
build2: Time To Compile | 79.997 | 81.883
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 2769.09 | 2761.96
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 2752.88 | 2763.90
onednn: Recurrent Neural Network Training - f32 - CPU | 2753.77 | 2749.23
build-godot: Time To Compile | 79.198 | 80.072
rav1e: 10 | 3.362 | 3.446
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 1783.66 | 1789.23
onednn: Recurrent Neural Network Inference - f32 - CPU | 1809.76 | 1823.19
mnn: inception-v3 | 31.005 | 30.332
mnn: mobilenet-v1-1.0 | 2.501 | 2.481
mnn: MobileNetV2_224 | 3.402 | 3.325
mnn: resnet-v2-50 | 24.036 | 23.630
mnn: SqueezeNetV1.0 | 5.232 | 5.153
compress-zstd: 19 | 43.4 | 43.3
indigobench: CPU - Supercar | 8.803 | 8.669
indigobench: CPU - Bedroom | 4.175 | 4.132
build-eigen: Time To Compile | 60.052 | 61.251
ncnn: CPU - regnety_400m | 17.67 | 17.95
ncnn: CPU - squeezenet_ssd | 14.64 | 14.60
ncnn: CPU - yolov4-tiny | 21.14 | 21.29
ncnn: CPU - resnet50 | 25.05 | 25.00
ncnn: CPU - alexnet | 11.04 | 11.26
ncnn: CPU - resnet18 | 14.51 | 14.57
ncnn: CPU - vgg16 | 60.60 | 59.90
ncnn: CPU - googlenet | 12.99 | 13.00
ncnn: CPU - blazeface | 1.80 | 1.82
ncnn: CPU - efficientnet-b0 | 5.29 | 5.37
ncnn: CPU - mnasnet | 3.90 | 3.96
ncnn: CPU - shufflenet-v2 | 4.38 | 4.42
ncnn: CPU-v3-v3 - mobilenet-v3 | 4.07 | 4.15
ncnn: CPU-v2-v2 - mobilenet-v2 | 4.38 | 4.46
ncnn: CPU - mobilenet | 12.04 | 11.99
namd: ATPase Simulation - 327,506 Atoms | 1.07519 | 1.08736
etlegacy: Renderer2 - 3840 x 2160 | 224.3 | (no result)
compress-lz4: 3 - Decompression Speed | 13085.1 | 12946.4
compress-lz4: 3 - Compression Speed | 68.88 | 69.10
deepspeech: CPU | 70.48086 | 70.53470
kripke | 72385553 | 72580800
build-linux-kernel: Time To Compile | 45.787 | 46.410
amg | 207261167 | 210165200
rav1e: 5 | 1.497 | 1.505
qmcpack: simple-H2O | 22.312 | 22.371
compress-lz4: 1 - Decompression Speed | 13410.2 | 13319.6
compress-lz4: 1 - Compression Speed | 11911.06 | 11854.52
compress-zstd: 3 | 4723.9 | 4738.4
rav1e: 6 | 1.958 | 1.965
synthmark: VoiceMark_100 | 958.662 | 957.951
x265: Bosphorus 4K | 24.57 | 24.18
ior: 8MB - Default Test Directory | 1601.21 | 1461.18
espeak: Text-To-Speech Synthesis | 21.560 | 21.692
onednn: IP Shapes 3D - u8s8f32 - CPU | 0.477138 | 0.484368
webp: Quality 100, Lossless, Highest Compression | 27.225 | 27.522
coremark: CoreMark Size 666 - Iterations Per Second | 829763.126732 | 815726.122263
phpbench: PHP Benchmark Suite | 831939 | 834019
libraw: Post-Processing Benchmark | 52.99 | 53.09
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 2.95002 | 2.99402
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 2.38690 | 2.43213
etcpak: ETC2 | 236.507 | 244.998
dav1d: Chimera 1080p | 590.32 | 592.56
encode-wavpack: WAV To WavPack | 10.947 | 11.112
crafty: Elapsed Time | 11736507 | 11427830
dav1d: Summer Nature 4K | 224.40 | 228.62
encode-ape: WAV To APE | 9.805 | 9.851
astcenc: Thorough | 12.47 | 12.67
tnn: CPU - MobileNet v2 | 218.293 | 220.329
rnnoise | 15.181 | 15.234
onednn: IP Shapes 1D - f32 - CPU | 3.98022 | 3.97992
onednn: IP Shapes 1D - u8s8f32 - CPU | 0.816147 | 0.832251
tnn: CPU - SqueezeNet v1.1 | 211.459 | 212.620
etcpak: ETC1 + Dithering | 349.700 | 355.439
webp: Quality 100, Lossless | 13.039 | 12.873
dolfyn: Computational Fluid Dynamics | 12.898 | 13.292
etcpak: ETC1 | 382.932 | 387.141
x265: Bosphorus 1080p | 47.87 | 47.62
ior: 4MB - Default Test Directory | 1580.25 | 1482.95
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 0.625862 | 0.636735
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 1.45970 | 1.48786
lulesh | 5041.8561 | 5066.7382
encode-opus: WAV To Opus Encode | 6.126 | 6.150
lammps: Rhodopsin Protein | 13.104 | 13.113
onednn: IP Shapes 3D - f32 - CPU | 9.48452 | 9.51868
dav1d: Summer Nature 1080p | 534.95 | 535.52
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 19.1307 | 19.2565
onednn: Convolution Batch Shapes Auto - f32 - CPU | 17.2340 | 17.2901
webp: Quality 100, Highest Compression | 5.465 | 5.360
astcenc: Medium | 5.35 | 5.43
astcenc: Fast | 4.11 | 4.10
etcpak: DXT1 | 1532.411 | 1562.887
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 1.53431 | 1.59818
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 3.50445 | 3.53182
ior: 2MB - Default Test Directory | 1540.02 | 1388.54
webp: Quality 100 | 1.741 | 1.729
yquake2: OpenGL 3.x - 3840 x 2160 | 979.3 | 986.5
hpcc: Max Ping Pong Bandwidth | 34205.275 | 34166.839
hpcc: Rand Ring Bandwidth | 1.92562 | 2.02474
hpcc: Rand Ring Latency | 0.48933 | 0.04973
hpcc: G-Rand Access | 0.04998 | 1.47956
hpcc: EP-STREAM Triad | 1.42468 | 2.47841
hpcc: G-Ptrans | 2.45326 | 16.57023
hpcc: EP-DGEMM | 16.85917 | 6.55050
hpcc: G-Ffte | 6.23452 |

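As a reading aid, the relative change between the two BIOS columns can be computed per test. The sketch below uses a handful of values copied from the overview above. The "higher is better" flags are taken from the per-test headings, and the helper name `percent_change` is hypothetical, not part of the Phoronix Test Suite.

```python
# Values copied from the overview: (3003 BIOS, 3202 BIOS, higher_is_better).
# The direction flag comes from each test's "More/Fewer Is Better" heading.
results = {
    "hpcc: G-HPL (GFLOPS)": (53.16620, 53.16863, True),
    "relion: Basic - CPU (seconds)": (1892.609, 1875.48, False),
    "openfoam: Motorbike 30M (seconds)": (104.48, 97.87, False),
    "xonotic: 3840 x 2160 - Ultimate (FPS)": (257.4557090, 288.9672535, True),
    "ior: 8MB - Default Test Directory (MB/s)": (1601.21, 1461.18, True),
}


def percent_change(old, new, higher_is_better):
    """Signed improvement of `new` over `old` in percent.

    Positive means the second run (here: 3202) did better,
    regardless of whether the metric is time-like or rate-like.
    """
    change = (new - old) / old * 100.0
    return change if higher_is_better else -change


for name, (bios_3003, bios_3202, higher_is_better) in results.items():
    delta = percent_change(bios_3003, bios_3202, higher_is_better)
    print(f"{name}: {delta:+.2f}%")
```

A percentage alone does not say whether a gap is meaningful; the per-test standard errors and sample counts listed with each result should be read alongside it.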
HPC Challenge 1.5.0
Test / Class: G-HPL
GFLOPS, More Is Better
  3003: 53.17 (SE +/- 0.08, N = 3)
  3202: 53.17 (SE +/- 0.05, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. OpenBLAS + Open MPI 4.0.3

RELION 3.1.1
Test: Basic - Device: CPU
Seconds, Fewer Is Better
  3003: 1892.61 (SE +/- 3.84, N = 3)
  3202: 1875.48 (SE +/- 6.24, N = 3)
1. (CXX) g++ options: -fopenmp -std=c++0x -O3 -rdynamic -ldl -ltiff -lfftw3f -lfftw3 -lpng -pthread -lmpi_cxx -lmpi

OpenFOAM 8
Input: Motorbike 60M
Seconds, Fewer Is Better
  3003: 1391.39 (SE +/- 0.57, N = 3)
  3202: 1380.20 (SE +/- 0.41, N = 3)
1. (CXX) g++ options: -std=c++11 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm

IOR 3.3.0
Block Size: 512MB - Disk Target: Default Test Directory
MB/s, More Is Better
  3003: 1748.21 (SE +/- 11.67, N = 3; MIN: 534.9 / MAX: 2360.08)
  3202: 1682.82 (SE +/- 22.61, N = 9; MIN: 251.69 / MAX: 2253.72)
1. (CC) gcc options: -O2 -lm -pthread -lmpi

Quantum ESPRESSO 6.7
Input: AUSURF112
Seconds, Fewer Is Better
  3003: 1199.39 (SE +/- 0.92, N = 3)
  3202: 1221.19 (SE +/- 3.58, N = 3)
1. (F9X) gfortran options: -lopenblas -lFoX_dom -lFoX_sax -lFoX_wxml -lFoX_common -lFoX_utils -lFoX_fsys -lfftw3 -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz

IOR 3.3.0
Block Size: 256MB - Disk Target: Default Test Directory
MB/s, More Is Better
  3003: 1339.91 (SE +/- 19.35, N = 9; MIN: 282.98 / MAX: 2236.64)
  3202: 1251.65 (SE +/- 14.35, N = 9; MIN: 354.68 / MAX: 2107.13)
1. (CC) gcc options: -O2 -lm -pthread -lmpi

LAMMPS Molecular Dynamics Simulator 29Oct2020
Model: 20k Atoms
ns/day, More Is Better
  3003: 13.54 (SE +/- 0.08, N = 3)
  3202: 13.39 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -O3 -pthread -lm

ONNX Runtime 1.6
Model: super-resolution-10 - Device: OpenMP CPU
Inferences Per Minute, More Is Better
  3003: 7056 (SE +/- 202.78, N = 12)
  3202: 7324 (SE +/- 175.44, N = 12)
1. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt

ONNX Runtime 1.6
Model: bertsquad-10 - Device: OpenMP CPU
Inferences Per Minute, More Is Better
  3003: 705 (SE +/- 1.32, N = 3)
  3202: 665 (SE +/- 11.02, N = 12)
1. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt

CP2K Molecular Dynamics 8.1
Fayalite-FIST Data
Seconds, Fewer Is Better
  3003: 787.50
  3202: 801.64

BRL-CAD 7.30.8
VGR Performance Metric
VGR Performance Metric, More Is Better
  3003: 265419
  3202: 262742
1. (CXX) g++ options: -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -rdynamic -lSM -lICE -lXi -lGLU -lGL -lGLdispatch -lX11 -lXext -lXrender -lpthread -ldl -luuid -lm

Numpy Benchmark
Score, More Is Better
  3003: 514.13 (SE +/- 0.48, N = 3)
  3202: 514.08 (SE +/- 4.15, N = 3)

Tesseract 2014-05-12
Resolution: 3840 x 2160
Frames Per Second, More Is Better
  3003: 356.17 (SE +/- 3.53, N = 15)
  3202: 353.57 (SE +/- 4.21, N = 15)

dav1d 0.8.1
Video Input: Chimera 1080p 10-bit
FPS, More Is Better
  3003: 96.35 (SE +/- 0.06, N = 3; MIN: 61.49 / MAX: 217.11)
  3202: 96.66 (SE +/- 0.05, N = 3; MIN: 61.56 / MAX: 221.22)
1. (CC) gcc options: -pthread

GROMACS 2020.3
Water Benchmark
Ns Per Day, More Is Better
  3003: 1.275 (SE +/- 0.001, N = 3)
  3202: 1.258 (SE +/- 0.001, N = 3)
1. (CXX) g++ options: -O3 -pthread -lrt -lpthread -lm

CloverLeaf
Lagrangian-Eulerian Hydrodynamics
Seconds, Fewer Is Better
  3003: 136.80 (SE +/- 0.03, N = 3)
  3202: 134.74 (SE +/- 0.24, N = 3)
1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp

Xonotic 0.8.2
Resolution: 3840 x 2160 - Effects Quality: Ultimate
Frames Per Second, More Is Better
  3003: 257.46 (SE +/- 4.83, N = 15; MIN: 55 / MAX: 623)
  3202: 288.97 (SE +/- 3.79, N = 3; MIN: 60 / MAX: 571)

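One rough way to read a gap such as the Xonotic one is to express it in units of the combined standard error of the two runs. The sketch below does that for the Xonotic and Tesseract figures reported above. It is an informal check rather than a formal significance test: the function name is hypothetical, the two runs are assumed to be independent, and the sample counts differ (N = 15 versus N = 3 for Xonotic).

```python
import math


def gap_in_standard_errors(mean_a, se_a, mean_b, se_b):
    """Difference of two means divided by their combined standard error.

    Assumes the two runs are independent, so the errors add in quadrature.
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return (mean_b - mean_a) / combined_se


# (mean, SE) pairs copied from the results above: 3003 first, then 3202.
xonotic = gap_in_standard_errors(257.46, 4.83, 288.97, 3.79)
tesseract = gap_in_standard_errors(356.17, 3.53, 353.57, 4.21)

print(f"Xonotic gap:   {xonotic:+.2f} combined standard errors")
print(f"Tesseract gap: {tesseract:+.2f} combined standard errors")
```

A gap of several combined standard errors is unlikely to be run-to-run noise alone, while a gap below one is indistinguishable from it.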
SQLite Speedtest 3.30
Timed Time - Size 1,000
Seconds, Fewer Is Better
  3003: 41.63 (SE +/- 0.34, N = 3)
  3202: 42.86 (SE +/- 0.24, N = 15)
1. (CC) gcc options: -O2 -ldl -lz -lpthread

LZ4 Compression 1.9.3
Compression Level: 9 - Decompression Speed
MB/s, More Is Better
  3003: 13093.1 (SE +/- 11.50, N = 12)
  3202: 12949.0 (SE +/- 11.91, N = 3)
1. (CC) gcc options: -O3

LZ4 Compression 1.9.3
Compression Level: 9 - Compression Speed
MB/s, More Is Better
  3003: 68.57 (SE +/- 0.80, N = 12)
  3202: 66.95 (SE +/- 0.20, N = 3)
1. (CC) gcc options: -O3

ONNX Runtime 1.6
Model: fcn-resnet101-11 - Device: OpenMP CPU
Inferences Per Minute, More Is Better
  3003: 99
  3202: 98
  SE +/- 0.33, N = 3 (the export lists a single standard error for this test and does not indicate which run it belongs to)
1. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt

ONNX Runtime 1.6
Model: yolov4 - Device: OpenMP CPU
Inferences Per Minute, More Is Better
  3003: 438 (SE +/- 1.17, N = 3)
  3202: 428 (SE +/- 3.35, N = 3)
1. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt

ONNX Runtime 1.6
Model: shufflenet-v2-10 - Device: OpenMP CPU
Inferences Per Minute, More Is Better
  3003: 16013 (SE +/- 20.67, N = 3)
  3202: 15863 (SE +/- 68.50, N = 3)
1. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt

OpenFOAM 8
Input: Motorbike 30M
Seconds, Fewer Is Better
  3003: 104.48 (SE +/- 0.09, N = 3)
  3202: 97.87 (SE +/- 0.15, N = 3)
1. (CXX) g++ options: -std=c++11 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm

ASTC Encoder 2.0
Preset: Exhaustive
Seconds, Fewer Is Better
  3003: 99.00 (SE +/- 0.10, N = 3)
  3202: 100.90 (SE +/- 0.15, N = 3)
1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread

oneDNN 2.0
Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
  3003: 1818.11 (SE +/- 20.56, N = 3; MIN: 1764.89)
  3202: 1795.37 (SE +/- 19.66, N = 4; MIN: 1755.37)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

Timed HMMer Search 3.3.1
Pfam Database Search
Seconds, Fewer Is Better
  3003: 82.70 (SE +/- 0.15, N = 3)
  3202: 82.93 (SE +/- 0.07, N = 3)
1. (CC) gcc options: -O3 -pthread -lhmmer -leasel -lm

Warsow 2.5 Beta
Resolution: 3840 x 2160
Frames Per Second, More Is Better
  3003: 429.6 (SE +/- 0.72, N = 3)
  3202: 430.5 (SE +/- 0.67, N = 3)

Build2 0.13
Time To Compile
Seconds, Fewer Is Better
  3003: 80.00 (SE +/- 0.10, N = 3)
  3202: 81.88 (SE +/- 0.12, N = 3)

oneDNN 2.0
Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
  3003: 2769.09 (SE +/- 9.46, N = 3; MIN: 2739.1)
  3202: 2761.96 (SE +/- 4.41, N = 3; MIN: 2740.57)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  3003: 2752.88 (SE +/- 11.23, N = 3; MIN: 2722.06)
  3202: 2763.90 (SE +/- 9.00, N = 3; MIN: 2736.15)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  3003: 2753.77 (SE +/- 13.63, N = 3; MIN: 2727.8)
  3202: 2749.23 (SE +/- 2.89, N = 3; MIN: 2735.56)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

Timed Godot Game Engine Compilation 3.2.3
Time To Compile
Seconds, Fewer Is Better
  3003: 79.20 (SE +/- 0.15, N = 3)
  3202: 80.07 (SE +/- 0.31, N = 3)

rav1e 0.4
Speed: 10
Frames Per Second, More Is Better
  3003: 3.362 (SE +/- 0.037, N = 3)
  3202: 3.446 (SE +/- 0.037, N = 15)

oneDNN 2.0
Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  3003: 1783.66 (SE +/- 8.25, N = 3; MIN: 1765.38)
  3202: 1789.23 (SE +/- 17.01, N = 3; MIN: 1761.64)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  3003: 1809.76 (SE +/- 15.33, N = 3; MIN: 1773.21)
  3202: 1823.19 (SE +/- 9.60, N = 3; MIN: 1798.09)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

Mobile Neural Network 1.1.1
Model: inception-v3
ms, Fewer Is Better
  3003: 31.01 (SE +/- 0.18, N = 3; MIN: 29.92 / MAX: 38.55)
  3202: 30.33 (SE +/- 0.26, N = 3; MIN: 29.28 / MAX: 56.48)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 1.1.1
Model: mobilenet-v1-1.0
ms, Fewer Is Better
  3003: 2.501 (SE +/- 0.094, N = 3; MIN: 2.32 / MAX: 4.53)
  3202: 2.481 (SE +/- 0.027, N = 3; MIN: 2.42 / MAX: 2.71)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 1.1.1
Model: MobileNetV2_224
ms, Fewer Is Better
  3003: 3.402 (SE +/- 0.047, N = 3; MIN: 3.23 / MAX: 5.81)
  3202: 3.325 (SE +/- 0.039, N = 3; MIN: 3.16 / MAX: 4.02)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 1.1.1
Model: resnet-v2-50
ms, Fewer Is Better
  3003: 24.04 (SE +/- 0.25, N = 3; MIN: 21.95 / MAX: 33.08)
  3202: 23.63 (SE +/- 0.19, N = 3; MIN: 22.31 / MAX: 33.12)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 1.1.1
Model: SqueezeNetV1.0
ms, Fewer Is Better
  3003: 5.232 (SE +/- 0.062, N = 3; MIN: 5.02 / MAX: 8.4)
  3202: 5.153 (SE +/- 0.022, N = 3; MIN: 5.02 / MAX: 14.32)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Zstd Compression 1.4.5
Compression Level: 19
MB/s, More Is Better
  3003: 43.4 (SE +/- 0.03, N = 3)
  3202: 43.3 (SE +/- 0.06, N = 3)
1. (CC) gcc options: -O3 -pthread -lz -llzma

IndigoBench 4.4
Acceleration: CPU - Scene: Supercar
M samples/s, More Is Better
  3003: 8.803 (SE +/- 0.008, N = 3)
  3202: 8.669 (SE +/- 0.011, N = 3)

IndigoBench 4.4
Acceleration: CPU - Scene: Bedroom
M samples/s, More Is Better
  3003: 4.175 (SE +/- 0.006, N = 3)
  3202: 4.132 (SE +/- 0.017, N = 3)

Timed Eigen Compilation 3.3.9
Time To Compile
Seconds, Fewer Is Better
  3003: 60.05 (SE +/- 0.21, N = 3)
  3202: 61.25 (SE +/- 0.01, N = 3)

NCNN 20201218
Target: CPU - Model: regnety_400m
ms, Fewer Is Better
  3003: 17.67 (SE +/- 0.04, N = 3; MIN: 17.47 / MAX: 18.02)
  3202: 17.95 (SE +/- 0.13, N = 3; MIN: 17.66 / MAX: 19.3)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: squeezenet_ssd
ms, Fewer Is Better
  3003: 14.64 (SE +/- 0.03, N = 3; MIN: 14.31 / MAX: 15.31)
  3202: 14.60 (SE +/- 0.04, N = 3; MIN: 14.21 / MAX: 15.07)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: yolov4-tiny
ms, Fewer Is Better
  3003: 21.14 (SE +/- 0.11, N = 3; MIN: 20.72 / MAX: 29.47)
  3202: 21.29 (SE +/- 0.08, N = 3; MIN: 20.98 / MAX: 21.78)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: resnet50
ms, Fewer Is Better
  3003: 25.05 (SE +/- 0.13, N = 3; MIN: 24.65 / MAX: 26.23)
  3202: 25.00 (SE +/- 0.17, N = 3; MIN: 24.58 / MAX: 26.31)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: alexnet
ms, Fewer Is Better
  3003: 11.04 (SE +/- 0.01, N = 3; MIN: 10.95 / MAX: 11.89)
  3202: 11.26 (SE +/- 0.24, N = 3; MIN: 10.91 / MAX: 19.04)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: resnet18
ms, Fewer Is Better
  3003: 14.51 (SE +/- 0.01, N = 3; MIN: 14.39 / MAX: 15.06)
  3202: 14.57 (SE +/- 0.01, N = 3; MIN: 14.45 / MAX: 23.09)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: vgg16
ms, Fewer Is Better
  3003: 60.60 (SE +/- 0.13, N = 3; MIN: 59.43 / MAX: 62.32)
  3202: 59.90 (SE +/- 0.09, N = 3; MIN: 58.71 / MAX: 61.76)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: googlenet
ms, Fewer Is Better
  3003: 12.99 (SE +/- 0.00, N = 3; MIN: 12.62 / MAX: 13.57)
  3202: 13.00 (SE +/- 0.02, N = 3; MIN: 12.64 / MAX: 20.99)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: blazeface
ms, Fewer Is Better
  3003: 1.80 (SE +/- 0.00, N = 3; MIN: 1.78 / MAX: 2.28)
  3202: 1.82 (SE +/- 0.00, N = 3; MIN: 1.79 / MAX: 2.65)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: efficientnet-b0
ms, Fewer Is Better
  3003: 5.29 (SE +/- 0.00, N = 3; MIN: 5.24 / MAX: 5.79)
  3202: 5.37 (SE +/- 0.01, N = 3; MIN: 5.31 / MAX: 7.18)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: mnasnet
ms, Fewer Is Better
  3003: 3.90 (SE +/- 0.00, N = 3; MIN: 3.78 / MAX: 4.83)
  3202: 3.96 (SE +/- 0.01, N = 3; MIN: 3.84 / MAX: 5.26)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: shufflenet-v2
ms, Fewer Is Better
  3003: 4.38 (SE +/- 0.01, N = 3; MIN: 4.33 / MAX: 4.89)
  3202: 4.42 (SE +/- 0.01, N = 3; MIN: 4.35 / MAX: 5.28)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU-v3-v3 - Model: mobilenet-v3
ms, Fewer Is Better
  3003: 4.07 (SE +/- 0.01, N = 3; MIN: 4.03 / MAX: 5.2)
  3202: 4.15 (SE +/- 0.00, N = 3; MIN: 4.11 / MAX: 5.03)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU-v2-v2 - Model: mobilenet-v2
ms, Fewer Is Better
  3003: 4.38 (SE +/- 0.01, N = 3; MIN: 4.24 / MAX: 7.32)
  3202: 4.46 (SE +/- 0.01, N = 3; MIN: 4.31 / MAX: 5.3)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: mobilenet
ms, Fewer Is Better
  3003: 12.04 (SE +/- 0.15, N = 3; MIN: 11.72 / MAX: 14.17)
  3202: 11.99 (SE +/- 0.01, N = 3; MIN: 11.79 / MAX: 12.19)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NAMD 2.14
ATPase Simulation - 327,506 Atoms
days/ns, Fewer Is Better
  3003: 1.07519 (SE +/- 0.00258, N = 3)
  3202: 1.08736 (SE +/- 0.00500, N = 3)

ET: Legacy 2.75
Renderer: Renderer2 - Resolution: 3840 x 2160
Frames Per Second, More Is Better
  3003: 224.3
  3202: no result listed

LZ4 Compression 1.9.3
Compression Level: 3 - Decompression Speed
MB/s, More Is Better
  3003: 13085.1 (SE +/- 7.36, N = 3)
  3202: 12946.4 (SE +/- 2.16, N = 3)
1. (CC) gcc options: -O3

LZ4 Compression 1.9.3
Compression Level: 3 - Compression Speed
MB/s, More Is Better
  3003: 68.88 (SE +/- 0.92, N = 3)
  3202: 69.10 (SE +/- 0.20, N = 3)
1. (CC) gcc options: -O3

DeepSpeech 0.6
Acceleration: CPU
Seconds, Fewer Is Better
  3003: 70.48 (SE +/- 0.10, N = 3)
  3202: 70.53 (SE +/- 0.04, N = 3)

Kripke 1.2.4
Throughput FoM, More Is Better
  3003: 72385553 (SE +/- 1023469.37, N = 3)
  3202: 72580800 (SE +/- 319563.42, N = 3)
1. (CXX) g++ options: -O3 -fopenmp

Timed Linux Kernel Compilation 5.4
Time To Compile
Seconds, Fewer Is Better
  3003: 45.79 (SE +/- 0.43, N = 3)
  3202: 46.41 (SE +/- 0.49, N = 3)

Algebraic Multi-Grid Benchmark 1.2
Figure Of Merit, More Is Better
  3003: 207261167 (SE +/- 2159740.57, N = 3)
  3202: 210165200 (SE +/- 569155.55, N = 3)
1. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -pthread -lmpi

rav1e 0.4
Speed: 5
Frames Per Second, More Is Better
  3003: 1.497 (SE +/- 0.005, N = 3)
  3202: 1.505 (SE +/- 0.007, N = 3)

QMCPACK 3.10
Input: simple-H2O
Total Execution Time - Seconds, Fewer Is Better
  3003: 22.31 (SE +/- 0.21, N = 7)
  3202: 22.37 (SE +/- 0.14, N = 3)
1. (CXX) g++ options: -fopenmp -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -march=native -O3 -fomit-frame-pointer -ffast-math -pthread -lm

LZ4 Compression 1.9.3
Compression Level: 1 - Decompression Speed
MB/s, More Is Better
  3003: 13410.2 (SE +/- 19.80, N = 3)
  3202: 13319.6 (SE +/- 25.73, N = 3)
1. (CC) gcc options: -O3

LZ4 Compression 1.9.3
Compression Level: 1 - Compression Speed
MB/s, More Is Better
  3003: 11911.06 (SE +/- 49.00, N = 3)
  3202: 11854.52 (SE +/- 59.07, N = 3)
1. (CC) gcc options: -O3

Zstd Compression 1.4.5
Compression Level: 3
MB/s, More Is Better
  3003: 4723.9 (SE +/- 8.02, N = 3)
  3202: 4738.4 (SE +/- 12.71, N = 3)
1. (CC) gcc options: -O3 -pthread -lz -llzma

rav1e 0.4
Speed: 6
Frames Per Second, More Is Better
  3003: 1.958 (SE +/- 0.005, N = 3)
  3202: 1.965 (SE +/- 0.016, N = 3)

Google SynthMark 20201109
Test: VoiceMark_100
Voices, More Is Better
  3003: 958.66 (SE +/- 3.00, N = 3)
  3202: 957.95 (SE +/- 4.96, N = 3)
1. (CXX) g++ options: -lm -lpthread -std=c++11 -Ofast

x265 3.4
Video Input: Bosphorus 4K
Frames Per Second, More Is Better
  3003: 24.57 (SE +/- 0.35, N = 3)
  3202: 24.18 (SE +/- 0.26, N = 4)
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

IOR 3.3.0
Block Size: 8MB - Disk Target: Default Test Directory
MB/s, More Is Better
  3003: 1601.21 (SE +/- 6.28, N = 3; MIN: 1005.82 / MAX: 2534.37)
  3202: 1461.18 (SE +/- 28.69, N = 13; MIN: 491.21 / MAX: 2711.8)
1. (CC) gcc options: -O2 -lm -pthread -lmpi

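The IOR runs show noticeably different sample counts and standard errors, which makes the raw SE values hard to compare directly. The sketch below converts each SE back to a sample standard deviation and a coefficient of variation. It assumes the reported SE is the usual standard error of the mean (standard deviation divided by the square root of N); the export itself does not state how the figure is calculated, and the function names are hypothetical.

```python
import math


def std_dev_from_se(standard_error, n):
    """Recover the sample standard deviation, assuming SE = SD / sqrt(N)."""
    return standard_error * math.sqrt(n)


def coefficient_of_variation(mean, standard_error, n):
    """Standard deviation as a percentage of the mean."""
    return std_dev_from_se(standard_error, n) / mean * 100.0


# IOR, block size 8MB (MB/s): (mean, SE, N) copied from the result above.
runs = {
    "3003": (1601.21, 6.28, 3),
    "3202": (1461.18, 28.69, 13),
}

for bios, (mean, se, n) in runs.items():
    sd = std_dev_from_se(se, n)
    cv = coefficient_of_variation(mean, se, n)
    print(f"{bios}: SD ~ {sd:.1f} MB/s, CV ~ {cv:.1f}% over {n} samples")
```

The large spread between the MIN and MAX figures of these disk tests points the same way: individual IOR samples vary widely, so small differences between the two BIOS versions should be read with caution.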
eSpeak-NG Speech Engine 20200907
Text-To-Speech Synthesis
Seconds, Fewer Is Better
  3003: 21.56 (SE +/- 0.08, N = 4)
  3202: 21.69 (SE +/- 0.08, N = 4)
1. (CC) gcc options: -O2 -std=c99

oneDNN 2.0, Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): 3003: 0.477138 (SE +/- 0.002170, N = 3, MIN: 0.44); 3202: 0.484368 (SE +/- 0.003401, N = 15, MIN: 0.43). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
WebP Image Encode 1.1, Encode Settings: Quality 100, Lossless, Highest Compression (Encode Time - Seconds, Fewer Is Better): 3003: 27.23 (SE +/- 0.05, N = 3); 3202: 27.52 (SE +/- 0.04, N = 3). 1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
Coremark 1.0, CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better): 3003: 829763.13 (SE +/- 1722.61, N = 3); 3202: 815726.12 (SE +/- 552.83, N = 3). 1. (CC) gcc options: -O2 -lrt -lrt
PHPBench 0.8.1, PHP Benchmark Suite (Score, More Is Better): 3003: 831939 (SE +/- 8175.01, N = 3); 3202: 834019 (SE +/- 7522.35, N = 3).
LibRaw 0.20, Post-Processing Benchmark (Mpix/sec, More Is Better): 3003: 52.99 (SE +/- 0.08, N = 3); 3202: 53.09 (SE +/- 0.24, N = 3). 1. (CXX) g++ options: -O2 -fopenmp -ljpeg -lz -lm
oneDNN 2.0, Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): 3003: 2.95002 (SE +/- 0.00671, N = 3, MIN: 2.81); 3202: 2.99402 (SE +/- 0.00911, N = 3, MIN: 2.83). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): 3003: 2.38690 (SE +/- 0.00251, N = 3, MIN: 2.28); 3202: 2.43213 (SE +/- 0.00987, N = 3, MIN: 2.3). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Etcpak 0.7, Configuration: ETC2 (Mpx/s, More Is Better): 3003: 236.51 (SE +/- 0.59, N = 3); 3202: 245.00 (SE +/- 1.66, N = 3). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
dav1d 0.8.1, Video Input: Chimera 1080p (FPS, More Is Better): 3003: 590.32 (SE +/- 0.69, N = 3, MIN: 447.67 / MAX: 749.27); 3202: 592.56 (SE +/- 0.77, N = 3, MIN: 447.8 / MAX: 754.79). 1. (CC) gcc options: -pthread
WavPack Audio Encoding 5.3, WAV To WavPack (Seconds, Fewer Is Better): 3003: 10.95 (SE +/- 0.06, N = 5); 3202: 11.11 (SE +/- 0.03, N = 5). 1. (CXX) g++ options: -rdynamic
Crafty 25.2, Elapsed Time (Nodes Per Second, More Is Better): 3003: 11736507 (SE +/- 108514.09, N = 3); 3202: 11427830 (SE +/- 40276.62, N = 3). 1. (CC) gcc options: -pthread -lstdc++ -fprofile-use -lm
dav1d 0.8.1, Video Input: Summer Nature 4K (FPS, More Is Better): 3003: 224.40 (SE +/- 0.49, N = 3, MIN: 172.58 / MAX: 234.67); 3202: 228.62 (SE +/- 0.36, N = 3, MIN: 172.54 / MAX: 238.82). 1. (CC) gcc options: -pthread
Monkey Audio Encoding 3.99.6, WAV To APE (Seconds, Fewer Is Better): 3003: 9.805 (SE +/- 0.045, N = 5); 3202: 9.851 (SE +/- 0.041, N = 5). 1. (CXX) g++ options: -O3 -pedantic -rdynamic -lrt
ASTC Encoder 2.0, Preset: Thorough (Seconds, Fewer Is Better): 3003: 12.47 (SE +/- 0.03, N = 3); 3202: 12.67 (SE +/- 0.02, N = 3). 1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
TNN 0.2.3, Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better): 3003: 218.29 (SE +/- 0.32, N = 3, MIN: 208.52 / MAX: 289.05); 3202: 220.33 (SE +/- 0.67, N = 3, MIN: 216.95 / MAX: 261.2). 1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
RNNoise 2020-06-28 (Seconds, Fewer Is Better): 3003: 15.18 (SE +/- 0.07, N = 3); 3202: 15.23 (SE +/- 0.18, N = 3). 1. (CC) gcc options: -O2 -pedantic -fvisibility=hidden
oneDNN 2.0, Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): 3003: 3.98022 (SE +/- 0.01644, N = 3, MIN: 3.72); 3202: 3.97992 (SE +/- 0.00985, N = 3, MIN: 3.76). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): 3003: 0.816147 (SE +/- 0.001756, N = 3, MIN: 0.74); 3202: 0.832251 (SE +/- 0.001536, N = 3, MIN: 0.75). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
TNN 0.2.3, Target: CPU - Model: SqueezeNet v1.1 (ms, Fewer Is Better): 3003: 211.46 (SE +/- 0.43, N = 3, MIN: 210.71 / MAX: 212.37); 3202: 212.62 (SE +/- 0.34, N = 3, MIN: 211.89 / MAX: 213.24). 1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
Etcpak 0.7, Configuration: ETC1 + Dithering (Mpx/s, More Is Better): 3003: 349.70 (SE +/- 0.36, N = 3); 3202: 355.44 (SE +/- 3.87, N = 3). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
WebP Image Encode 1.1, Encode Settings: Quality 100, Lossless (Encode Time - Seconds, Fewer Is Better): 3003: 13.04 (SE +/- 0.01, N = 3); 3202: 12.87 (SE +/- 0.05, N = 3). 1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
Dolfyn 0.527, Computational Fluid Dynamics (Seconds, Fewer Is Better): 3003: 12.90 (SE +/- 0.10, N = 3); 3202: 13.29 (SE +/- 0.02, N = 3).
Etcpak 0.7, Configuration: ETC1 (Mpx/s, More Is Better): 3003: 382.93 (SE +/- 1.98, N = 3); 3202: 387.14 (SE +/- 1.60, N = 3). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
x265 3.4, Video Input: Bosphorus 1080p (Frames Per Second, More Is Better): 3003: 47.87 (SE +/- 0.08, N = 3); 3202: 47.62 (SE +/- 0.22, N = 3). 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
IOR 3.3.0, Block Size: 4MB - Disk Target: Default Test Directory (MB/s, More Is Better): 3003: 1580.25 (SE +/- 5.96, N = 3, MIN: 1161.4 / MAX: 2244.12); 3202: 1482.95 (SE +/- 11.11, N = 11, MIN: 955.9 / MAX: 2484.92). 1. (CC) gcc options: -O2 -lm -pthread -lmpi
oneDNN 2.0, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): 3003: 0.625862 (SE +/- 0.000749, N = 3, MIN: 0.6); 3202: 0.636735 (SE +/- 0.000300, N = 3, MIN: 0.6). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): 3003: 1.45970 (SE +/- 0.00059, N = 3, MIN: 1.39); 3202: 1.48786 (SE +/- 0.00214, N = 3, MIN: 1.39). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
LULESH 2.0.3 (z/s, More Is Better): 3003: 5041.86 (SE +/- 59.80, N = 3); 3202: 5066.74 (SE +/- 66.75, N = 3). 1. (CXX) g++ options: -O3 -fopenmp -lm -pthread -lmpi_cxx -lmpi
Opus Codec Encoding 1.3.1, WAV To Opus Encode (Seconds, Fewer Is Better): 3003: 6.126 (SE +/- 0.032, N = 5); 3202: 6.150 (SE +/- 0.042, N = 5). 1. (CXX) g++ options: -fvisibility=hidden -logg -lm
LAMMPS Molecular Dynamics Simulator 29Oct2020, Model: Rhodopsin Protein (ns/day, More Is Better): 3003: 13.10 (SE +/- 0.15, N = 15); 3202: 13.11 (SE +/- 0.13, N = 15). 1. (CXX) g++ options: -O3 -pthread -lm
oneDNN 2.0, Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): 3003: 9.48452 (SE +/- 0.01431, N = 3, MIN: 9.38); 3202: 9.51868 (SE +/- 0.00745, N = 3, MIN: 9.44). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
dav1d 0.8.1, Video Input: Summer Nature 1080p (FPS, More Is Better): 3003: 534.95 (SE +/- 4.29, N = 3, MIN: 432.57 / MAX: 611.74); 3202: 535.52 (SE +/- 1.59, N = 3, MIN: 453.14 / MAX: 589.94). 1. (CC) gcc options: -pthread
oneDNN 2.0, Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): 3003: 19.13 (SE +/- 0.02, N = 3, MIN: 18.77); 3202: 19.26 (SE +/- 0.04, N = 3, MIN: 18.8). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): 3003: 17.23 (SE +/- 0.01, N = 3, MIN: 16.81); 3202: 17.29 (SE +/- 0.02, N = 3, MIN: 16.66). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
WebP Image Encode 1.1, Encode Settings: Quality 100, Highest Compression (Encode Time - Seconds, Fewer Is Better): 3003: 5.465 (SE +/- 0.059, N = 3); 3202: 5.360 (SE +/- 0.062, N = 3). 1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
ASTC Encoder 2.0, Preset: Medium (Seconds, Fewer Is Better): 3003: 5.35 (SE +/- 0.04, N = 3); 3202: 5.43 (SE +/- 0.02, N = 3). 1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
ASTC Encoder 2.0, Preset: Fast (Seconds, Fewer Is Better): 3003: 4.11 (SE +/- 0.01, N = 3); 3202: 4.10 (SE +/- 0.03, N = 3). 1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
Etcpak 0.7, Configuration: DXT1 (Mpx/s, More Is Better): 3003: 1532.41 (SE +/- 2.32, N = 3); 3202: 1562.89 (SE +/- 21.94, N = 3). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
oneDNN 2.0, Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): 3003: 1.53431 (SE +/- 0.00153, N = 3, MIN: 1.41); 3202: 1.59818 (SE +/- 0.00175, N = 3, MIN: 1.48). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): 3003: 3.50445 (SE +/- 0.00613, N = 3, MIN: 3.38); 3202: 3.53182 (SE +/- 0.00421, N = 3, MIN: 3.41). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
IOR 3.3.0, Block Size: 2MB - Disk Target: Default Test Directory (MB/s, More Is Better): 3003: 1540.02 (SE +/- 2.40, N = 3, MIN: 1034.78 / MAX: 2149.91); 3202: 1388.54 (SE +/- 14.36, N = 3, MIN: 890.97 / MAX: 2113.75). 1. (CC) gcc options: -O2 -lm -pthread -lmpi
WebP Image Encode 1.1, Encode Settings: Quality 100 (Encode Time - Seconds, Fewer Is Better): 3003: 1.741 (SE +/- 0.001, N = 3); 3202: 1.729 (SE +/- 0.004, N = 3). 1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
yquake2 7.45, Renderer: OpenGL 3.x - Resolution: 3840 x 2160 (Frames Per Second, More Is Better): 3003: 979.3 (SE +/- 1.03, N = 3); 3202: 986.5 (SE +/- 1.35, N = 3). 1. (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
HPC Challenge 1.5.0, Test / Class: Max Ping Pong Bandwidth (MB/s, More Is Better): 3003: 34205.28 (SE +/- 146.47, N = 3); 3202: 34166.84 (SE +/- 130.59, N = 3). 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. OpenBLAS + Open MPI 4.0.3
HPC Challenge 1.5.0, Test / Class: Random Ring Bandwidth (GB/s, More Is Better): 3003: 1.92562 (SE +/- 0.02644, N = 3); 3202: 2.02474 (SE +/- 0.02919, N = 3). 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. OpenBLAS + Open MPI 4.0.3
HPC Challenge 1.5.0, Test / Class: Random Ring Latency (usecs, Fewer Is Better): 3003: 0.48933 (SE +/- 0.00404, N = 3); 3202: 0.49204 (SE +/- 0.00234, N = 3). 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. OpenBLAS + Open MPI 4.0.3
HPC Challenge 1.5.0, Test / Class: G-Random Access (GUP/s, More Is Better): 3003: 0.04998 (SE +/- 0.00046, N = 3); 3202: 0.04973 (SE +/- 0.00017, N = 3). 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. OpenBLAS + Open MPI 4.0.3
HPC Challenge 1.5.0, Test / Class: EP-STREAM Triad (GB/s, More Is Better): 3003: 1.42468 (SE +/- 0.00070, N = 3); 3202: 1.47956 (SE +/- 0.00089, N = 3). 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. OpenBLAS + Open MPI 4.0.3
HPC Challenge 1.5.0, Test / Class: G-Ptrans (GB/s, More Is Better): 3003: 2.45326 (SE +/- 0.00427, N = 3); 3202: 2.47841 (SE +/- 0.01009, N = 3). 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. OpenBLAS + Open MPI 4.0.3
HPC Challenge 1.5.0, Test / Class: EP-DGEMM (GFLOPS, More Is Better): 3003: 16.86 (SE +/- 0.11, N = 3); 3202: 16.57 (SE +/- 0.17, N = 3). 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. OpenBLAS + Open MPI 4.0.3
HPC Challenge 1.5.0, Test / Class: G-Ffte (GFLOPS, More Is Better): 3003: 6.23452 (SE +/- 0.10692, N = 3); 3202: 6.55050 (SE +/- 0.02613, N = 3). 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. OpenBLAS + Open MPI 4.0.3
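Each result above reports a mean and a standard error (SE) per BIOS version, so one rough way to read a 3003 vs 3202 gap is to compare it with the combined standard error of the two means. The sketch below is illustrative only and is not part of the Phoronix Test Suite: the `compare` helper is a hypothetical name, the combined-SE ratio is an assumed rule of thumb rather than a formal test (most runs here have only N = 3), and the inputs are copied from the IOR 8MB and Zstd Compression Level: 3 results.

```python
import math


def compare(baseline, candidate, se_baseline, se_candidate):
    """Return (percent change, difference in units of combined SE).

    Hypothetical helper: treats the two means as independent and combines
    their standard errors in quadrature. This is a rough guide only.
    """
    diff = candidate - baseline
    percent = 100.0 * diff / baseline
    combined_se = math.sqrt(se_baseline ** 2 + se_candidate ** 2)
    return percent, diff / combined_se


# IOR 3.3.0, Block Size: 8MB (MB/s): 3003 = 1601.21, 3202 = 1461.18
ior_pct, ior_ratio = compare(1601.21, 1461.18, 6.28, 28.69)

# Zstd Compression 1.4.5, Level 3 (MB/s): 3003 = 4723.9, 3202 = 4738.4
zstd_pct, zstd_ratio = compare(4723.9, 4738.4, 8.02, 12.71)

print(f"IOR 8MB: {ior_pct:+.2f}% ({ior_ratio:+.2f} combined SE)")
print(f"Zstd L3: {zstd_pct:+.2f}% ({zstd_ratio:+.2f} combined SE)")
```

Under these assumptions the IOR 8MB gap is several combined standard errors wide, while the Zstd gap is under one, which suggests the latter is within run-to-run noise.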
Phoronix Test Suite v10.8.5