Core i7 6800K Ubuntu 20.10

Intel Core i7-6800K testing with an MSI X99A WORKSTATION (MS-7A54) v1.0 (1.10 BIOS) and Zotac NVIDIA NV137 2GB on Ubuntu 20.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2101059-HA-COREI768045&rdt&grs
Core i7 6800K Ubuntu 20.10: System Details (Run 1, Run 2, Run 3)

Processor: Intel Core i7-6800K @ 3.80GHz (6 Cores / 12 Threads)
Motherboard: MSI X99A WORKSTATION (MS-7A54) v1.0 (1.10 BIOS)
Chipset: Intel Xeon E7 v4/Xeon
Memory: 16GB
Disk: 120GB TOSHIBA TR150
Graphics: Zotac NVIDIA GeForce GTX 1050; a second entry reads Zotac NVIDIA NV137 2GB (the per-run assignment is not recoverable from the export)
Audio: Realtek ALC1150
Monitor: G237HL
Network: Intel I218-LM + Intel I210
OS: Ubuntu 20.10
Kernel: 5.8.0-33-generic (x86_64)
Desktop: GNOME Shell 3.38.1
Display Server: X Server 1.20.9
Display Driver: modesetting 1.20.9
Compiler: GCC 10.2.0
File-System: ext4
Screen Resolution: 1920x1080
OpenGL: 4.3 Mesa 20.2.1 (listed alongside the Zotac NVIDIA NV137 2GB entry)

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: intel_cpufreq ondemand - CPU Microcode: 0xb000038
Python Details: Python 3.8.6
Security Details:
  itlb_multihit: KVM: Mitigation of VMX disabled
  l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable
  mds: Mitigation of Clear buffers; SMT vulnerable
  meltdown: Mitigation of PTI
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling
  srbds: Not affected
  tsx_async_abort: Mitigation of Clear buffers; SMT vulnerable
Core i7 6800K Ubuntu 20.10: Result Overview

Test | Run 1 | Run 2 | Run 3
redis: LPOP | 2200466.42 | 1361545.28 | 1354046.58
hpcc: EP-STREAM Triad | 5.34605 | 5.13392 | 5.54450
astcenc: Medium | 8.83 | 8.20 | 8.31
astcenc: Thorough | 55.32 | 52.22 | 52.17
lammps: Rhodopsin Protein | 2.603 | 2.711 | 2.643
redis: SET | 1511980.13 | 1574421.25 | 1558674.52
hpcc: Rand Ring Latency | 0.43412 | 0.44873 | 0.43635
ncnn: CPU - resnet18 | 55.65 | 56.42 | 57.44
redis: GET | 2022289.54 | 1965096.13 | 1960201.46
onednn: IP Shapes 3D - f32 - CPU | 10.05002 | 9.80732 | 10.0958
onednn: IP Shapes 1D - f32 - CPU | 13.0274 | 12.6605 | 12.7106
redis: SADD | 1786977.08 | 1750815.54 | 1746188.19
onednn: IP Shapes 3D - u8s8f32 - CPU | 3.81378 | 3.77356 | 3.73038
ncnn: CPU - resnet50 | 173.24 | 169.88 | 171.13
ncnn: CPU - regnety_400m | 58.07 | 56.97 | 57.70
ncnn: CPU-v2-v2 - mobilenet-v2 | 27.04 | 27.22 | 27.56
hpcc: G-Rand Access | 0.02410 | 0.02397 | 0.02441
basis: UASTC Level 0 | 10.851 | 10.872 | 11.048
onednn: IP Shapes 1D - u8s8f32 - CPU | 9.32830 | 9.16745 | 9.21052
ncnn: CPU - efficientnet-b0 | 35.75 | 35.16 | 35.71
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 9.47929 | 9.51403 | 9.63747
x265: Bosphorus 4K | 8.64 | 8.60 | 8.74
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 15.9999 | 15.8302 | 15.7450
coremark: CoreMark Size 666 - Iterations Per Second | 192969.408009 | 192628.863487 | 195702.021203
asmfish: 1024 Hash Memory, 26 Depth | 15683640 | 15739160 | 15902806
hpcc: EP-DGEMM | 37.23113 | 37.38293 | 36.87293
x265: Bosphorus 1080p | 35.22 | 34.75 | 35.10
rav1e: 5 | 0.828 | 0.818 | 0.829
encode-opus: WAV To Opus Encode | 9.462 | 9.511 | 9.585
node-web-tooling | 8.90 | 8.79 | 8.89
embree: Pathtracer - Asian Dragon | 7.5212 | 7.6137 | 7.5330
mafft: Multiple Sequence Alignment - LSU RNA | 11.416 | 11.554 | 11.472
build-ffmpeg: Time To Compile | 100.248 | 100.009 | 101.161
rnnoise | 25.152 | 24.959 | 25.239
astcenc: Exhaustive | 427.41 | 422.82 | 422.91
rav1e: 1 | 0.277 | 0.280 | 0.278
ncnn: CPU-v3-v3 - mobilenet-v3 | 17.22 | 17.06 | 17.04
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 12.7573 | 12.6702 | 12.6246
deepspeech: CPU | 109.37800 | 109.66230 | 110.4851
embree: Pathtracer ISPC - Asian Dragon | 9.2012 | 9.1127 | 9.1881
basis: UASTC Level 2 + RDO Post-Processing | 957.165 | 966.396 | 962.778
hpcc: Max Ping Pong Bandwidth | 12775.554 | 12753.061 | 12872.700
indigobench: CPU - Supercar | 2.275 | 2.254 | 2.274
redis: LPUSH | 1314301.94 | 1311290.87 | 1323367.41
stockfish: Total Time | 10414229 | 10336902 | 10430427
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 11020.0 | 11062.6 | 11119.0
ncnn: CPU - mobilenet | 75.07 | 74.53 | 75.19
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 25.9310 | 25.7090 | 25.7774
basis: ETC1S | 68.928 | 68.683 | 69.243
embree: Pathtracer ISPC - Crown | 7.3664 | 7.3146 | 7.3115
ncnn: CPU - yolov4-tiny | 79.51 | 79.19 | 78.92
rav1e: 10 | 2.371 | 2.369 | 2.354
encode-wavpack: WAV To WavPack | 16.273 | 16.157 |
gromacs: Water Benchmark | 0.568 | 0.568 | 0.564
embree: Pathtracer - Crown | 6.4106 | 6.4340 | 6.4533
ncnn: CPU - alexnet | 35.40 | 35.20 | 35.43
rav1e: 6 | 1.091 | 1.085 | 1.092
yquake2: Software CPU - 1920 x 1080 | 81.7 | 81.2 | 81.3
indigobench: CPU - Bedroom | 0.979 | 0.977 | 0.983
ai-benchmark: Device Inference Score | 815 | 820 |
build-linux-kernel: Time To Compile | 158.694 | 159.021 | 158.064
build-eigen: Time To Compile | 99.304 | 98.969 | 99.544
ncnn: CPU - googlenet | 69.77 | 70.17 | 69.99
ncnn: CPU - squeezenet_ssd | 59.71 | 60.05 | 59.78
hpcc: G-HPL | 86.18872 | 86.50860 | 86.65972
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 7391.52 | 7430.60 | 7414.46
hmmer: Pfam Database Search | 131.746 | 131.059 | 131.347
hpcc: G-Ptrans | 2.50276 | 2.51283 | 2.50070
onednn: Recurrent Neural Network Training - f32 - CPU | 11058.9 | 11071.2 | 11111.7
onednn: Recurrent Neural Network Inference - f32 - CPU | 7394.07 | 7385.86 | 7418.02
compress-lz4: 1 - Compression Speed | 5400.28 | 5389.98 | 5377.61
phpbench: PHP Benchmark Suite | 622110 | 619528 |
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 19.0877 | 19.0626 | 19.1393
dav1d: Summer Nature 1080p | 322.35 | 323.32 | 323.64
build2: Time To Compile | 264.647 | 265.496 | 265.692
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 17.1744 | 17.2067 | 17.2416
dav1d: Summer Nature 4K | 102.45 | 102.74 | 102.34
kvazaar: Bosphorus 1080p - Slow | 11.00 | 11.04 | 11.01
kvazaar: Bosphorus 4K - Medium | 2.75 | 2.76 | 2.75
ai-benchmark: Device AI Score | 1696 | 1702 |
onednn: Convolution Batch Shapes Auto - f32 - CPU | 16.9453 | 16.9993 | 16.9394
x264: H.264 Video Encoding | 52.05 | 51.97 | 52.15
kvazaar: Bosphorus 1080p - Ultra Fast | 52.74 | 52.92 | 52.78
compress-zstd: 19 | 31.3 | 31.2 | 31.2
crafty: Elapsed Time | 7034860 | 7047214 | 7024969
kvazaar: Bosphorus 4K - Very Fast | 7.71 | 7.73 | 7.71
dav1d: Chimera 1080p | 361.59 | 362.52 | 361.92
brl-cad: VGR Performance Metric | 70291 | 70116 |
unpack-firefox: firefox-84.0.source.tar.xz | 23.662 | 23.604 |
dav1d: Chimera 1080p 10-bit | 67.76 | 67.71 | 67.60
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 7400.87 | 7404.31 | 7415.85
basis: UASTC Level 3 | 113.012 | 112.928 | 113.137
compress-lz4: 9 - Decompression Speed | 6298.2 | 6286.9 | 6295.4
kvazaar: Bosphorus 1080p - Medium | 11.33 | 11.35 | 11.33
compress-zstd: 3 | 2971.6 | 2966.5 | 2969.0
sqlite-speedtest: Timed Time - Size 1,000 | 79.758 | 79.637 | 79.631
ncnn: CPU - vgg16 | 131.95 | 131.74 | 131.81
yafaray: Total Time For Sample Scene | 272.343 | 271.975 | 272.395
encode-ape: WAV To APE | 14.397 | 14.381 | 14.376
kvazaar: Bosphorus 4K - Ultra Fast | 14.04 | 14.04 | 14.02
compress-lz4: 1 - Decompression Speed | 6392.0 | 6384.4 | 6385.2
ai-benchmark: Device Training Score | 881 | 882 |
compress-lz4: 3 - Decompression Speed | 6291.6 | 6285.0 | 6291.5
kvazaar: Bosphorus 1080p - Very Fast | 29.38 | 29.35 | 29.36
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 11071.4 | 11074.2 | 11063.8
basis: UASTC Level 2 | 58.979 | 58.951 | 58.994
compress-lz4: 3 - Compression Speed | 43.21 | 43.21 | 43.24
compress-lz4: 9 - Compression Speed | 42.29 | 42.30 | 42.28
astcenc: Fast | 7.54 | 7.54 | 7.54
kvazaar: Bosphorus 4K - Slow | 2.69 | 2.69 | 2.69
simdjson: DistinctUserID | 0.71 | 0.71 | 0.71
simdjson: PartialTweets | 0.69 | 0.69 | 0.69
simdjson: LargeRand | 0.39 | 0.39 | 0.39
simdjson: Kostya | 0.58 | 0.58 | 0.58
clomp: Static OMP Speedup | 1.2 | 1.2 | 1.2
ncnn: CPU - blazeface | 9.11 | 9.01 | 8.70
ncnn: CPU - mnasnet | 17.03 | 18.14 | 17.23
ncnn: CPU - shufflenet-v2 | 35.12 | 35.79 | 35.26
espeak: Text-To-Speech Synthesis | 39.673 | 40.569 | 40.871
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 8.20994 | 7.18295 | 7.21289
hpcc: Rand Ring Bandwidth | 2.51129 | 2.36057 | 2.44927
hpcc: G-Ffte | 2.07387 | 2.08554 | 2.07874
Redis 6.0.9, Test: LPOP (Requests Per Second, More Is Better)
Run 1: 2200466.42 (SE +/- 25730.97, N = 3)
Run 2: 1361545.28 (SE +/- 14358.36, N = 5)
Run 3: 1354046.58 (SE +/- 6791.19, N = 3)
1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
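The Redis LPOP result above is the one place where the runs disagree sharply: Run 2 and Run 3 come in about 38% below Run 1. A minimal Python sketch for checking whether such a gap is larger than the reported run-to-run noise follows. The helper names and the threshold of three combined standard errors are illustrative assumptions, not part of the Phoronix Test Suite.

```python
# Mean requests/second and standard errors copied from the Redis LPOP result.
runs = {
    "Run 1": (2200466.42, 25730.97),
    "Run 2": (1361545.28, 14358.36),
    "Run 3": (1354046.58, 6791.19),
}


def relative_change(baseline: float, value: float) -> float:
    """Percent change of value relative to baseline."""
    return (value - baseline) / baseline * 100.0


def exceeds_noise(a, b, k: float = 3.0) -> bool:
    """True if two (mean, se) results differ by more than k combined standard errors.

    k = 3 is an arbitrary illustrative threshold (an assumption).
    """
    (mean_a, se_a), (mean_b, se_b) = a, b
    combined_se = (se_a ** 2 + se_b ** 2) ** 0.5
    return abs(mean_a - mean_b) > k * combined_se


baseline = runs["Run 1"]
for name, result in runs.items():
    pct = relative_change(baseline[0], result[0])
    flagged = exceeds_noise(baseline, result)
    print(f"{name}: {pct:+.1f}% vs Run 1, beyond noise: {flagged}")
```

The same two helpers can be applied to any other test in this result file by swapping in its means and SE values.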
HPC Challenge 1.5.0, Test / Class: EP-STREAM Triad (GB/s, More Is Better)
Run 1: 5.34605 (SE +/- 0.07070, N = 3)
Run 2: 5.13392 (SE +/- 0.09991, N = 3)
Run 3: 5.54450 (SE +/- 0.01514, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 4.0.3

ASTC Encoder 2.0, Preset: Medium (Seconds, Fewer Is Better)
Run 1: 8.83 (SE +/- 0.05, N = 3)
Run 2: 8.20 (SE +/- 0.01, N = 3)
Run 3: 8.31 (SE +/- 0.11, N = 3)
1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread

ASTC Encoder 2.0, Preset: Thorough (Seconds, Fewer Is Better)
Run 1: 55.32 (SE +/- 0.19, N = 3)
Run 2: 52.22 (SE +/- 0.01, N = 3)
Run 3: 52.17 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread

LAMMPS Molecular Dynamics Simulator 29Oct2020, Model: Rhodopsin Protein (ns/day, More Is Better)
Run 1: 2.603 (SE +/- 0.027, N = 15)
Run 2: 2.711 (SE +/- 0.040, N = 15)
Run 3: 2.643 (SE +/- 0.005, N = 3)
1. (CXX) g++ options: -O3 -pthread -lm
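The LAMMPS result above reports SE values at different sample counts (N = 15 for two runs, N = 3 for the third), which makes the raw SE figures hard to compare. Assuming "SE" here is the standard error of the mean, that is SE = s / sqrt(N), the implied sample standard deviation s can be recovered as in the sketch below. The function name is hypothetical and the SE definition is an assumption about how the export computes it.

```python
import math


def stddev_from_se(se: float, n: int) -> float:
    """Sample standard deviation implied by a standard error of the mean.

    Assumes se = s / sqrt(n), so s = se * sqrt(n).
    """
    return se * math.sqrt(n)


# (mean ns/day, SE, N) copied from the LAMMPS Rhodopsin Protein result.
lammps = {
    "Run 1": (2.603, 0.027, 15),
    "Run 2": (2.711, 0.040, 15),
    "Run 3": (2.643, 0.005, 3),
}

for name, (mean, se, n) in lammps.items():
    sd = stddev_from_se(se, n)
    print(f"{name}: sd ~ {sd:.3f} ns/day ({sd / mean * 100:.1f}% of mean, N = {n})")
```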
Redis 6.0.9, Test: SET (Requests Per Second, More Is Better)
Run 1: 1511980.13 (SE +/- 8648.91, N = 3)
Run 2: 1574421.25 (SE +/- 16434.34, N = 3)
Run 3: 1558674.52 (SE +/- 14536.11, N = 7)
1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3

HPC Challenge 1.5.0, Test / Class: Random Ring Latency (usecs, Fewer Is Better)
Run 1: 0.43412 (SE +/- 0.01254, N = 3)
Run 2: 0.44873 (SE +/- 0.00111, N = 3)
Run 3: 0.43635 (SE +/- 0.01324, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 4.0.3

NCNN 20201218, Target: CPU - Model: resnet18 (ms, Fewer Is Better)
Run 1: 55.65 (SE +/- 0.53, N = 3; MIN: 29.24 / MAX: 174.71)
Run 2: 56.42 (SE +/- 0.38, N = 3; MIN: 29.6 / MAX: 196.71)
Run 3: 57.44 (SE +/- 0.93, N = 4; MIN: 32.72 / MAX: 234.51)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Redis 6.0.9, Test: GET (Requests Per Second, More Is Better)
Run 1: 2022289.54 (SE +/- 12019.14, N = 3)
Run 2: 1965096.13 (SE +/- 19126.07, N = 3)
Run 3: 1960201.46 (SE +/- 21525.04, N = 3)
1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3

oneDNN 2.0, Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
Run 1: 10.05002 (SE +/- 0.09931, N = 3; MIN: 8.56)
Run 2: 9.80732 (SE +/- 0.03591, N = 3; MIN: 8.52)
Run 3: 10.09580 (SE +/- 0.04430, N = 3; MIN: 8.61)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0, Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
Run 1: 13.03 (SE +/- 0.08, N = 3; MIN: 6.76)
Run 2: 12.66 (SE +/- 0.06, N = 3; MIN: 6.05)
Run 3: 12.71 (SE +/- 0.04, N = 3; MIN: 5.94)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

Redis 6.0.9, Test: SADD (Requests Per Second, More Is Better)
Run 1: 1786977.08 (SE +/- 8709.31, N = 3)
Run 2: 1750815.54 (SE +/- 5919.78, N = 3)
Run 3: 1746188.19 (SE +/- 19777.69, N = 4)
1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3

oneDNN 2.0, Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
Run 1: 3.81378 (SE +/- 0.00333, N = 3; MIN: 3.12)
Run 2: 3.77356 (SE +/- 0.00200, N = 3; MIN: 3.18)
Run 3: 3.73038 (SE +/- 0.01990, N = 3; MIN: 3.21)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
NCNN 20201218, Target: CPU - Model: resnet50 (ms, Fewer Is Better)
Run 1: 173.24 (SE +/- 0.53, N = 3; MIN: 100.86 / MAX: 302.11)
Run 2: 169.88 (SE +/- 2.32, N = 3; MIN: 98.02 / MAX: 304.64)
Run 3: 171.13 (SE +/- 1.71, N = 4; MIN: 94.76 / MAX: 337.24)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: CPU - Model: regnety_400m (ms, Fewer Is Better)
Run 1: 58.07 (SE +/- 0.08, N = 3; MIN: 51.31 / MAX: 195.89)
Run 2: 56.97 (SE +/- 0.25, N = 3; MIN: 51.81 / MAX: 108.77)
Run 3: 57.70 (SE +/- 0.54, N = 4; MIN: 51.85 / MAX: 196.26)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
Run 1: 27.04 (SE +/- 0.47, N = 3; MIN: 15.02 / MAX: 98.6)
Run 2: 27.22 (SE +/- 0.64, N = 3; MIN: 6.36 / MAX: 114.32)
Run 3: 27.56 (SE +/- 0.60, N = 4; MIN: 7.54 / MAX: 160.69)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

HPC Challenge 1.5.0, Test / Class: G-Random Access (GUP/s, More Is Better)
Run 1: 0.02410 (SE +/- 0.00028, N = 3)
Run 2: 0.02397 (SE +/- 0.00072, N = 3)
Run 3: 0.02441 (SE +/- 0.00023, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 4.0.3

Basis Universal 1.12, Settings: UASTC Level 0 (Seconds, Fewer Is Better)
Run 1: 10.85 (SE +/- 0.02, N = 3)
Run 2: 10.87 (SE +/- 0.02, N = 3)
Run 3: 11.05 (SE +/- 0.11, N = 15)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

oneDNN 2.0, Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
Run 1: 9.32830 (SE +/- 0.06357, N = 3; MIN: 3.98)
Run 2: 9.16745 (SE +/- 0.03003, N = 3; MIN: 3.97)
Run 3: 9.21052 (SE +/- 0.04781, N = 3; MIN: 3.97)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

NCNN 20201218, Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better)
Run 1: 35.75 (SE +/- 0.37, N = 3; MIN: 19.97 / MAX: 132.42)
Run 2: 35.16 (SE +/- 0.38, N = 3; MIN: 21.2 / MAX: 117.05)
Run 3: 35.71 (SE +/- 0.39, N = 4; MIN: 21.19 / MAX: 116.55)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

oneDNN 2.0, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
Run 1: 9.47929 (SE +/- 0.03038, N = 3; MIN: 4.26)
Run 2: 9.51403 (SE +/- 0.01988, N = 3; MIN: 4.29)
Run 3: 9.63747 (SE +/- 0.03653, N = 3; MIN: 4.31)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

x265 3.4, Video Input: Bosphorus 4K (Frames Per Second, More Is Better)
Run 1: 8.64 (SE +/- 0.02, N = 3)
Run 2: 8.60 (SE +/- 0.06, N = 3)
Run 3: 8.74 (SE +/- 0.11, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

oneDNN 2.0, Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
Run 1: 16.00 (SE +/- 0.08, N = 3; MIN: 12.18)
Run 2: 15.83 (SE +/- 0.05, N = 3; MIN: 11.82)
Run 3: 15.75 (SE +/- 0.07, N = 3; MIN: 11.87)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Coremark 1.0, CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better)
Run 1: 192969.41 (SE +/- 1296.05, N = 3)
Run 2: 192628.86 (SE +/- 1995.67, N = 4)
Run 3: 195702.02 (SE +/- 441.31, N = 3)
1. (CC) gcc options: -O2 -lrt" -lrt

asmFish 2018-07-23, 1024 Hash Memory, 26 Depth (Nodes/second, More Is Better)
Run 1: 15683640 (SE +/- 154965.11, N = 6)
Run 2: 15739160 (SE +/- 44605.31, N = 3)
Run 3: 15902806 (SE +/- 137366.03, N = 7)

HPC Challenge 1.5.0, Test / Class: EP-DGEMM (GFLOPS, More Is Better)
Run 1: 37.23 (SE +/- 0.60, N = 3)
Run 2: 37.38 (SE +/- 0.40, N = 3)
Run 3: 36.87 (SE +/- 0.49, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 4.0.3

x265 3.4, Video Input: Bosphorus 1080p (Frames Per Second, More Is Better)
Run 1: 35.22 (SE +/- 0.11, N = 3)
Run 2: 34.75 (SE +/- 0.28, N = 3)
Run 3: 35.10 (SE +/- 0.11, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

rav1e 0.4 Alpha, Speed: 5 (Frames Per Second, More Is Better)
Run 1: 0.828 (SE +/- 0.011, N = 3)
Run 2: 0.818 (SE +/- 0.006, N = 3)
Run 3: 0.829 (SE +/- 0.003, N = 3)

Opus Codec Encoding 1.3.1, WAV To Opus Encode (Seconds, Fewer Is Better)
Run 1: 9.462 (SE +/- 0.024, N = 5)
Run 2: 9.511 (SE +/- 0.038, N = 5)
Run 3: 9.585 (SE +/- 0.058, N = 25)
1. (CXX) g++ options: -fvisibility=hidden -logg -lm

Node.js V8 Web Tooling Benchmark (runs/s, More Is Better)
Run 1: 8.90 (SE +/- 0.10, N = 3)
Run 2: 8.79 (SE +/- 0.06, N = 3)
Run 3: 8.89 (SE +/- 0.05, N = 3)
1. Nodejs v12.18.2

Embree 3.9.0, Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, More Is Better)
Run 1: 7.5212 (SE +/- 0.0019, N = 3; MIN: 7.48 / MAX: 7.63)
Run 2: 7.6137 (SE +/- 0.0742, N = 3; MIN: 7.43 / MAX: 7.81)
Run 3: 7.5330 (SE +/- 0.0113, N = 3; MIN: 7.49 / MAX: 7.62)

Timed MAFFT Alignment 7.471, Multiple Sequence Alignment - LSU RNA (Seconds, Fewer Is Better)
Run 1: 11.42 (SE +/- 0.13, N = 3)
Run 2: 11.55 (SE +/- 0.16, N = 3)
Run 3: 11.47 (SE +/- 0.12, N = 3)
1. (CC) gcc options: -std=c99 -O3 -lm -lpthread

Timed FFmpeg Compilation 4.2.2, Time To Compile (Seconds, Fewer Is Better)
Run 1: 100.25 (SE +/- 0.37, N = 3)
Run 2: 100.01 (SE +/- 0.11, N = 3)
Run 3: 101.16 (SE +/- 0.88, N = 3)

RNNoise 2020-06-28 (Seconds, Fewer Is Better)
Run 1: 25.15 (SE +/- 0.21, N = 8)
Run 2: 24.96 (SE +/- 0.10, N = 3)
Run 3: 25.24 (SE +/- 0.28, N = 5)
1. (CC) gcc options: -O2 -pedantic -fvisibility=hidden
ASTC Encoder 2.0, Preset: Exhaustive (Seconds, Fewer Is Better)
Run 1: 427.41 (SE +/- 4.31, N = 6)
Run 2: 422.82 (SE +/- 0.16, N = 3)
Run 3: 422.91 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread

rav1e 0.4 Alpha, Speed: 1 (Frames Per Second, More Is Better)
Run 1: 0.277 (SE +/- 0.004, N = 3)
Run 2: 0.280 (SE +/- 0.001, N = 3)
Run 3: 0.278 (SE +/- 0.002, N = 3)

NCNN 20201218, Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
Run 1: 17.22 (SE +/- 0.10, N = 3; MIN: 9.81 / MAX: 65.85)
Run 2: 17.06 (SE +/- 0.20, N = 3; MIN: 9.88 / MAX: 85.95)
Run 3: 17.04 (SE +/- 0.10, N = 4; MIN: 9.51 / MAX: 72.11)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

oneDNN 2.0, Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
Run 1: 12.76 (SE +/- 0.07, N = 3; MIN: 8.27)
Run 2: 12.67 (SE +/- 0.02, N = 3; MIN: 8.03)
Run 3: 12.62 (SE +/- 0.05, N = 3; MIN: 7.63)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

DeepSpeech 0.6, Acceleration: CPU (Seconds, Fewer Is Better)
Run 1: 109.38 (SE +/- 0.34, N = 3)
Run 2: 109.66 (SE +/- 0.33, N = 3)
Run 3: 110.49 (SE +/- 0.23, N = 3)

Embree 3.9.0, Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better)
Run 1: 9.2012 (SE +/- 0.0541, N = 3; MIN: 9.04 / MAX: 9.42)
Run 2: 9.1127 (SE +/- 0.0329, N = 3; MIN: 9.01 / MAX: 9.31)
Run 3: 9.1881 (SE +/- 0.0422, N = 3; MIN: 9.09 / MAX: 9.41)

Basis Universal 1.12, Settings: UASTC Level 2 + RDO Post-Processing (Seconds, Fewer Is Better)
Run 1: 957.17 (SE +/- 4.07, N = 3)
Run 2: 966.40 (SE +/- 1.57, N = 3)
Run 3: 962.78 (SE +/- 11.07, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

HPC Challenge 1.5.0, Test / Class: Max Ping Pong Bandwidth (MB/s, More Is Better)
Run 1: 12775.55 (SE +/- 41.78, N = 3)
Run 2: 12753.06 (SE +/- 19.51, N = 3)
Run 3: 12872.70 (SE +/- 46.85, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 4.0.3

IndigoBench 4.4, Acceleration: CPU - Scene: Supercar (M samples/s, More Is Better)
Run 1: 2.275 (SE +/- 0.002, N = 3)
Run 2: 2.254 (SE +/- 0.010, N = 3)
Run 3: 2.274 (SE +/- 0.000, N = 3)

Redis 6.0.9, Test: LPUSH (Requests Per Second, More Is Better)
Run 1: 1314301.94 (SE +/- 9622.95, N = 11)
Run 2: 1311290.87 (SE +/- 15165.81, N = 3)
Run 3: 1323367.41 (SE +/- 13263.10, N = 3)
1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3

Stockfish 12, Total Time (Nodes Per Second, More Is Better)
Run 1: 10414229 (SE +/- 125432.49, N = 3)
Run 2: 10336902 (SE +/- 55842.23, N = 3)
Run 3: 10430427 (SE +/- 41101.79, N = 3)
1. (CXX) g++ options: -m64 -lpthread -fno-exceptions -std=c++17 -pedantic -O3 -msse -msse3 -mpopcnt -msse4.1 -mssse3 -msse2 -flto -flto=jobserver
oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
Run 1: 11020.0 (SE +/- 10.72, N = 3; MIN: 10766)
Run 2: 11062.6 (SE +/- 26.91, N = 3; MIN: 10773.5)
Run 3: 11119.0 (SE +/- 17.48, N = 3; MIN: 10848.6)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

NCNN 20201218, Target: CPU - Model: mobilenet (ms, Fewer Is Better)
Run 1: 75.07 (SE +/- 0.43, N = 3; MIN: 52.17 / MAX: 259.67)
Run 2: 74.53 (SE +/- 0.53, N = 3; MIN: 53.79 / MAX: 180.72)
Run 3: 75.19 (SE +/- 0.88, N = 4; MIN: 52.93 / MAX: 204.62)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

oneDNN 2.0, Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
Run 1: 25.93 (SE +/- 0.09, N = 3; MIN: 9.72)
Run 2: 25.71 (SE +/- 0.01, N = 3; MIN: 9.67)
Run 3: 25.78 (SE +/- 0.18, N = 3; MIN: 9.78)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

Basis Universal 1.12, Settings: ETC1S (Seconds, Fewer Is Better)
Run 1: 68.93 (SE +/- 0.29, N = 3)
Run 2: 68.68 (SE +/- 0.38, N = 3)
Run 3: 69.24 (SE +/- 0.50, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Embree 3.9.0, Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better)
Run 1: 7.3664 (SE +/- 0.0045, N = 3; MIN: 7.29 / MAX: 7.5)
Run 2: 7.3146 (SE +/- 0.0251, N = 3; MIN: 7.22 / MAX: 7.48)
Run 3: 7.3115 (SE +/- 0.0111, N = 3; MIN: 7.24 / MAX: 7.45)

NCNN 20201218, Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better)
Run 1: 79.51 (SE +/- 0.35, N = 3; MIN: 57.15 / MAX: 128.6)
Run 2: 79.19 (SE +/- 0.30, N = 3; MIN: 59.35 / MAX: 182.68)
Run 3: 78.92 (SE +/- 0.35, N = 4; MIN: 59.82 / MAX: 157.44)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

rav1e 0.4 Alpha, Speed: 10 (Frames Per Second, More Is Better)
Run 1: 2.371 (SE +/- 0.013, N = 3)
Run 2: 2.369 (SE +/- 0.025, N = 3)
Run 3: 2.354 (SE +/- 0.015, N = 3)

WavPack Audio Encoding 5.3, WAV To WavPack (Seconds, Fewer Is Better)
Run 1: 16.27 (SE +/- 0.13, N = 5)
Run 2: 16.16 (SE +/- 0.00, N = 5)
1. (CXX) g++ options: -rdynamic

GROMACS 2020.3, Water Benchmark (Ns Per Day, More Is Better)
Run 1: 0.568 (SE +/- 0.005, N = 3)
Run 2: 0.568 (SE +/- 0.006, N = 3)
Run 3: 0.564 (SE +/- 0.004, N = 3)
1. (CXX) g++ options: -O3 -pthread -lrt -lpthread -lm

Embree 3.9.0, Binary: Pathtracer - Model: Crown (Frames Per Second, More Is Better)
Run 1: 6.4106 (SE +/- 0.0067, N = 3; MIN: 6.35 / MAX: 6.51)
Run 2: 6.4340 (SE +/- 0.0076, N = 3; MIN: 6.37 / MAX: 6.53)
Run 3: 6.4533 (SE +/- 0.0142, N = 3; MIN: 6.39 / MAX: 6.57)

NCNN 20201218, Target: CPU - Model: alexnet (ms, Fewer Is Better)
Run 1: 35.40 (SE +/- 0.39, N = 3; MIN: 24.51 / MAX: 94.32)
Run 2: 35.20 (SE +/- 0.17, N = 3; MIN: 25.77 / MAX: 83.7)
Run 3: 35.43 (SE +/- 0.33, N = 4; MIN: 24.78 / MAX: 113.74)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

rav1e 0.4 Alpha, Speed: 6 (Frames Per Second, More Is Better)
Run 1: 1.091 (SE +/- 0.002, N = 3)
Run 2: 1.085 (SE +/- 0.010, N = 3)
Run 3: 1.092 (SE +/- 0.007, N = 3)
yquake2 7.45, Renderer: Software CPU - Resolution: 1920 x 1080 (Frames Per Second, More Is Better)
Run 1: 81.7 (SE +/- 0.78, N = 3)
Run 2: 81.2 (SE +/- 0.93, N = 3)
Run 3: 81.3 (SE +/- 0.26, N = 3)
1. (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC

IndigoBench 4.4, Acceleration: CPU - Scene: Bedroom (M samples/s, More Is Better)
Run 1: 0.979 (SE +/- 0.003, N = 3)
Run 2: 0.977 (SE +/- 0.002, N = 3)
Run 3: 0.983 (SE +/- 0.001, N = 3)

AI Benchmark Alpha 0.1.2, Device Inference Score (Score, More Is Better)
Run 1: 815
Run 2: 820

Timed Linux Kernel Compilation 5.4, Time To Compile (Seconds, Fewer Is Better)
Run 1: 158.69 (SE +/- 1.34, N = 3)
Run 2: 159.02 (SE +/- 1.17, N = 3)
Run 3: 158.06 (SE +/- 0.84, N = 3)

Timed Eigen Compilation 3.3.9, Time To Compile (Seconds, Fewer Is Better)
Run 1: 99.30 (SE +/- 0.43, N = 3)
Run 2: 98.97 (SE +/- 0.16, N = 3)
Run 3: 99.54 (SE +/- 0.48, N = 3)

NCNN 20201218, Target: CPU - Model: googlenet (ms, Fewer Is Better)
Run 1: 69.77 (SE +/- 0.42, N = 3; MIN: 40.57 / MAX: 187.95)
Run 2: 70.17 (SE +/- 0.96, N = 3; MIN: 40.33 / MAX: 223.67)
Run 3: 69.99 (SE +/- 0.44, N = 4; MIN: 40.48 / MAX: 183.36)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better)
Run 1: 59.71 (SE +/- 0.45, N = 3; MIN: 43.82 / MAX: 136.53)
Run 2: 60.05 (SE +/- 0.40, N = 3; MIN: 43.01 / MAX: 117.51)
Run 3: 59.78 (SE +/- 0.35, N = 4; MIN: 44.03 / MAX: 108.19)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

HPC Challenge 1.5.0, Test / Class: G-HPL (GFLOPS, More Is Better)
Run 1: 86.19 (SE +/- 0.94, N = 9)
Run 2: 86.51 (SE +/- 1.20, N = 3)
Run 3: 86.66 (SE +/- 0.89, N = 9)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 4.0.3

oneDNN 2.0, Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
Run 1: 7391.52 (SE +/- 7.28, N = 3; MIN: 7132.96)
Run 2: 7430.60 (SE +/- 42.16, N = 3; MIN: 7131.37)
Run 3: 7414.46 (SE +/- 14.92, N = 3; MIN: 7129.34)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

Timed HMMer Search 3.3.1, Pfam Database Search (Seconds, Fewer Is Better)
Run 1: 131.75 (SE +/- 0.62, N = 3)
Run 2: 131.06 (SE +/- 0.21, N = 3)
Run 3: 131.35 (SE +/- 0.12, N = 3)
1. (CC) gcc options: -O3 -pthread -lhmmer -leasel -lm

HPC Challenge 1.5.0, Test / Class: G-Ptrans (GB/s, More Is Better)
Run 1: 2.50276 (SE +/- 0.00379, N = 3)
Run 2: 2.51283 (SE +/- 0.00766, N = 3)
Run 3: 2.50070 (SE +/- 0.00646, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 4.0.3
oneDNN Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU Run 1 Run 2 Run 3 2K 4K 6K 8K 10K SE +/- 10.33, N = 3 SE +/- 4.29, N = 3 SE +/- 7.07, N = 3 11058.9 11071.2 11111.7 MIN: 10774.1 MIN: 10804.1 MIN: 10816.9 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU Run 1 Run 2 Run 3 1600 3200 4800 6400 8000 SE +/- 20.69, N = 3 SE +/- 11.50, N = 3 SE +/- 6.04, N = 3 7394.07 7385.86 7418.02 MIN: 7102.56 MIN: 7090.33 MIN: 7150.1 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
LZ4 Compression 1.9.3 - Compression Level: 1 - Compression Speed (MB/s, More Is Better)
  Run 1: 5400.28 (SE +/- 2.58, N = 3)
  Run 2: 5389.98 (SE +/- 1.99, N = 3)
  Run 3: 5377.61 (SE +/- 0.16, N = 3)
  1. (CC) gcc options: -O3
PHPBench 0.8.1 - PHP Benchmark Suite (Score, More Is Better)
  Run 1: 622110 (SE +/- 357.11, N = 3)
  Run 2: 619528 (SE +/- 1039.69, N = 3)
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 19.09 (SE +/- 0.16, N = 3; MIN: 8.36)
  Run 2: 19.06 (SE +/- 0.06, N = 3; MIN: 8.04)
  Run 3: 19.14 (SE +/- 0.05, N = 3; MIN: 8.13)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
dav1d 0.8.1 - Video Input: Summer Nature 1080p (FPS, More Is Better)
  Run 1: 322.35 (SE +/- 0.61, N = 3; MIN: 273.49 / MAX: 351.84)
  Run 2: 323.32 (SE +/- 0.18, N = 3; MIN: 283.06 / MAX: 351.54)
  Run 3: 323.64 (SE +/- 0.82, N = 3; MIN: 272 / MAX: 354.89)
  1. (CC) gcc options: -pthread
Build2 0.13 - Time To Compile (Seconds, Fewer Is Better)
  Run 1: 264.65 (SE +/- 0.63, N = 3)
  Run 2: 265.50 (SE +/- 0.57, N = 3)
  Run 3: 265.69 (SE +/- 0.98, N = 3)
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 17.17 (SE +/- 0.04, N = 3; MIN: 12.71)
  Run 2: 17.21 (SE +/- 0.07, N = 3; MIN: 13.27)
  Run 3: 17.24 (SE +/- 0.03, N = 3; MIN: 12.66)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
dav1d 0.8.1 - Video Input: Summer Nature 4K (FPS, More Is Better)
  Run 1: 102.45 (SE +/- 0.07, N = 3; MIN: 96.59 / MAX: 115.32)
  Run 2: 102.74 (SE +/- 0.10, N = 3; MIN: 96.51 / MAX: 116.33)
  Run 3: 102.34 (SE +/- 0.15, N = 3; MIN: 96.39 / MAX: 115.07)
  1. (CC) gcc options: -pthread
Kvazaar 2.0 - Video Input: Bosphorus 1080p - Video Preset: Slow (Frames Per Second, More Is Better)
  Run 1: 11.00 (SE +/- 0.00, N = 3)
  Run 2: 11.04 (SE +/- 0.01, N = 3)
  Run 3: 11.01 (SE +/- 0.01, N = 3)
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0 - Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better)
  Run 1: 2.75 (SE +/- 0.00, N = 3)
  Run 2: 2.76 (SE +/- 0.00, N = 3)
  Run 3: 2.75 (SE +/- 0.00, N = 3)
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
AI Benchmark Alpha 0.1.2 - Device AI Score (Score, More Is Better)
  Run 1: 1696
  Run 2: 1702
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 16.95 (SE +/- 0.03, N = 3; MIN: 14.04)
  Run 2: 17.00 (SE +/- 0.02, N = 3; MIN: 13.62)
  Run 3: 16.94 (SE +/- 0.04, N = 3; MIN: 13.98)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
x264 2019-12-17 - H.264 Video Encoding (Frames Per Second, More Is Better)
  Run 1: 52.05 (SE +/- 0.41, N = 9)
  Run 2: 51.97 (SE +/- 0.43, N = 8)
  Run 3: 52.15 (SE +/- 0.42, N = 9)
  1. (CC) gcc options: -ldl -lavformat -lavcodec -lavutil -lswscale -m64 -lm -lpthread -O3 -ffast-math -std=gnu99 -fPIC -fomit-frame-pointer -fno-tree-vectorize
Kvazaar 2.0 - Video Input: Bosphorus 1080p - Video Preset: Ultra Fast (Frames Per Second, More Is Better)
  Run 1: 52.74 (SE +/- 0.01, N = 3)
  Run 2: 52.92 (SE +/- 0.05, N = 3)
  Run 3: 52.78 (SE +/- 0.08, N = 3)
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Zstd Compression 1.4.5 - Compression Level: 19 (MB/s, More Is Better)
  Run 1: 31.3 (SE +/- 0.06, N = 3)
  Run 2: 31.2 (SE +/- 0.09, N = 3)
  Run 3: 31.2 (SE +/- 0.06, N = 3)
  1. (CC) gcc options: -O3 -pthread -lz -llzma
Crafty 25.2 - Elapsed Time (Nodes Per Second, More Is Better)
  Run 1: 7034860 (SE +/- 16316.81, N = 3)
  Run 2: 7047214 (SE +/- 5337.82, N = 3)
  Run 3: 7024969 (SE +/- 13562.60, N = 3)
  1. (CC) gcc options: -pthread -lstdc++ -fprofile-use -lm
Kvazaar 2.0 - Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better)
  Run 1: 7.71 (SE +/- 0.01, N = 3)
  Run 2: 7.73 (SE +/- 0.01, N = 3)
  Run 3: 7.71 (SE +/- 0.01, N = 3)
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
dav1d 0.8.1 - Video Input: Chimera 1080p (FPS, More Is Better)
  Run 1: 361.59 (SE +/- 0.56, N = 3; MIN: 267 / MAX: 561.79)
  Run 2: 362.52 (SE +/- 1.27, N = 3; MIN: 267.49 / MAX: 564.49)
  Run 3: 361.92 (SE +/- 1.19, N = 3; MIN: 266.52 / MAX: 567.25)
  1. (CC) gcc options: -pthread
BRL-CAD 7.30.8 - VGR Performance Metric (VGR Performance Metric, More Is Better)
  Run 1: 70291
  Run 2: 70116
  1. (CXX) g++ options: -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -rdynamic -lSM -lICE -lXi -lGLU -lGL -lGLdispatch -lX11 -lXext -lXrender -lpthread -ldl -luuid -lm
Unpacking Firefox 84.0 - Extracting: firefox-84.0.source.tar.xz (Seconds, Fewer Is Better)
  Run 1: 23.66 (SE +/- 0.22, N = 20)
  Run 2: 23.60 (SE +/- 0.20, N = 20)
dav1d 0.8.1 - Video Input: Chimera 1080p 10-bit (FPS, More Is Better)
  Run 1: 67.76 (SE +/- 0.07, N = 3; MIN: 44.17 / MAX: 170.52)
  Run 2: 67.71 (SE +/- 0.13, N = 3; MIN: 44.16 / MAX: 166.83)
  Run 3: 67.60 (SE +/- 0.01, N = 3; MIN: 44.14 / MAX: 165.37)
  1. (CC) gcc options: -pthread
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 7400.87 (SE +/- 12.71, N = 3; MIN: 7138.98)
  Run 2: 7404.31 (SE +/- 0.26, N = 3; MIN: 7147.81)
  Run 3: 7415.85 (SE +/- 4.87, N = 3; MIN: 7149.4)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Basis Universal 1.12 - Settings: UASTC Level 3 (Seconds, Fewer Is Better)
  Run 1: 113.01 (SE +/- 0.07, N = 3)
  Run 2: 112.93 (SE +/- 0.03, N = 3)
  Run 3: 113.14 (SE +/- 0.06, N = 3)
  1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
LZ4 Compression 1.9.3 - Compression Level: 9 - Decompression Speed (MB/s, More Is Better)
  Run 1: 6298.2 (SE +/- 2.81, N = 3)
  Run 2: 6286.9 (SE +/- 4.84, N = 3)
  Run 3: 6295.4 (SE +/- 3.26, N = 3)
  1. (CC) gcc options: -O3
Kvazaar 2.0 - Video Input: Bosphorus 1080p - Video Preset: Medium (Frames Per Second, More Is Better)
  Run 1: 11.33 (SE +/- 0.02, N = 3)
  Run 2: 11.35 (SE +/- 0.01, N = 3)
  Run 3: 11.33 (SE +/- 0.00, N = 3)
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Zstd Compression 1.4.5 - Compression Level: 3 (MB/s, More Is Better)
  Run 1: 2971.6 (SE +/- 3.47, N = 3)
  Run 2: 2966.5 (SE +/- 4.29, N = 3)
  Run 3: 2969.0 (SE +/- 7.85, N = 3)
  1. (CC) gcc options: -O3 -pthread -lz -llzma
SQLite Speedtest 3.30 - Timed Time - Size 1,000 (Seconds, Fewer Is Better)
  Run 1: 79.76 (SE +/- 0.26, N = 3)
  Run 2: 79.64 (SE +/- 0.02, N = 3)
  Run 3: 79.63 (SE +/- 0.12, N = 3)
  1. (CC) gcc options: -O2 -ldl -lz -lpthread
NCNN 20201218 - Target: CPU - Model: vgg16 (ms, Fewer Is Better)
  Run 1: 131.95 (SE +/- 0.58, N = 3; MIN: 105.72 / MAX: 192.42)
  Run 2: 131.74 (SE +/- 0.47, N = 3; MIN: 104.27 / MAX: 191.62)
  Run 3: 131.81 (SE +/- 0.72, N = 4; MIN: 98.07 / MAX: 183.14)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
YafaRay 3.4.1 - Total Time For Sample Scene (Seconds, Fewer Is Better)
  Run 1: 272.34 (SE +/- 0.45, N = 3)
  Run 2: 271.98 (SE +/- 0.37, N = 3)
  Run 3: 272.40 (SE +/- 0.28, N = 3)
  1. (CXX) g++ options: -std=c++11 -O3 -ffast-math -rdynamic -ldl -lImath -lIlmImf -lIex -lHalf -lz -lIlmThread -lxml2 -lfreetype -lpthread
Monkey Audio Encoding 3.99.6 - WAV To APE (Seconds, Fewer Is Better)
  Run 1: 14.40 (SE +/- 0.07, N = 5)
  Run 2: 14.38 (SE +/- 0.06, N = 24)
  Run 3: 14.38 (SE +/- 0.08, N = 5)
  1. (CXX) g++ options: -O3 -pedantic -rdynamic -lrt
Kvazaar 2.0 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, More Is Better)
  Run 1: 14.04 (SE +/- 0.02, N = 3)
  Run 2: 14.04 (SE +/- 0.02, N = 3)
  Run 3: 14.02 (SE +/- 0.02, N = 3)
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
LZ4 Compression 1.9.3 - Compression Level: 1 - Decompression Speed (MB/s, More Is Better)
  Run 1: 6392.0 (SE +/- 1.39, N = 3)
  Run 2: 6384.4 (SE +/- 3.86, N = 3)
  Run 3: 6385.2 (SE +/- 0.96, N = 3)
  1. (CC) gcc options: -O3
AI Benchmark Alpha 0.1.2 - Device Training Score (Score, More Is Better)
  Run 1: 881
  Run 2: 882
LZ4 Compression 1.9.3 - Compression Level: 3 - Decompression Speed (MB/s, More Is Better)
  Run 1: 6291.6 (SE +/- 1.24, N = 3)
  Run 2: 6285.0 (SE +/- 2.97, N = 3)
  Run 3: 6291.5 (SE +/- 3.87, N = 3)
  1. (CC) gcc options: -O3
Kvazaar 2.0 - Video Input: Bosphorus 1080p - Video Preset: Very Fast (Frames Per Second, More Is Better)
  Run 1: 29.38 (SE +/- 0.06, N = 3)
  Run 2: 29.35 (SE +/- 0.02, N = 3)
  Run 3: 29.36 (SE +/- 0.09, N = 3)
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 11071.4 (SE +/- 8.16, N = 3; MIN: 10803.3)
  Run 2: 11074.2 (SE +/- 10.87, N = 3; MIN: 10796.5)
  Run 3: 11063.8 (SE +/- 11.56, N = 3; MIN: 10764.6)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Basis Universal 1.12 - Settings: UASTC Level 2 (Seconds, Fewer Is Better)
  Run 1: 58.98 (SE +/- 0.03, N = 3)
  Run 2: 58.95 (SE +/- 0.01, N = 3)
  Run 3: 58.99 (SE +/- 0.01, N = 3)
  1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
LZ4 Compression 1.9.3 - Compression Level: 3 - Compression Speed (MB/s, More Is Better)
  Run 1: 43.21 (SE +/- 0.04, N = 3)
  Run 2: 43.21 (SE +/- 0.05, N = 3)
  Run 3: 43.24 (SE +/- 0.01, N = 3)
  1. (CC) gcc options: -O3
LZ4 Compression 1.9.3 - Compression Level: 9 - Compression Speed (MB/s, More Is Better)
  Run 1: 42.29 (SE +/- 0.01, N = 3)
  Run 2: 42.30 (SE +/- 0.00, N = 3)
  Run 3: 42.28 (SE +/- 0.01, N = 3)
  1. (CC) gcc options: -O3
ASTC Encoder 2.0 - Preset: Fast (Seconds, Fewer Is Better)
  Run 1: 7.54 (SE +/- 0.01, N = 3)
  Run 2: 7.54 (SE +/- 0.02, N = 3)
  Run 3: 7.54 (SE +/- 0.01, N = 3)
  1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
Kvazaar 2.0 - Video Input: Bosphorus 4K - Video Preset: Slow (Frames Per Second, More Is Better)
  Run 1: 2.69 (SE +/- 0.00, N = 3)
  Run 2: 2.69 (SE +/- 0.00, N = 3)
  Run 3: 2.69 (SE +/- 0.00, N = 3)
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
simdjson 0.7.1 - Throughput Test: DistinctUserID (GB/s, More Is Better)
  Run 1: 0.71 (SE +/- 0.00, N = 3)
  Run 2: 0.71 (SE +/- 0.00, N = 3)
  Run 3: 0.71 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -pthread
simdjson 0.7.1 - Throughput Test: PartialTweets (GB/s, More Is Better)
  Run 1: 0.69 (SE +/- 0.00, N = 3)
  Run 2: 0.69 (SE +/- 0.00, N = 3)
  Run 3: 0.69 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -pthread
simdjson 0.7.1 - Throughput Test: LargeRandom (GB/s, More Is Better)
  Run 1: 0.39 (SE +/- 0.00, N = 3)
  Run 2: 0.39 (SE +/- 0.00, N = 3)
  Run 3: 0.39 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -pthread
simdjson 0.7.1 - Throughput Test: Kostya (GB/s, More Is Better)
  Run 1: 0.58 (SE +/- 0.00, N = 3)
  Run 2: 0.58 (SE +/- 0.00, N = 3)
  Run 3: 0.58 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -pthread
CLOMP 1.2 - Static OMP Speedup (Speedup, More Is Better)
  Run 1: 1.2 (SE +/- 0.01, N = 15)
  Run 2: 1.2 (SE +/- 0.00, N = 3)
  Run 3: 1.2 (SE +/- 0.01, N = 15)
  1. (CC) gcc options: -fopenmp -O3 -lm
NCNN 20201218 - Target: CPU - Model: blazeface (ms, Fewer Is Better)
  Run 1: 9.11 (SE +/- 0.48, N = 3; MIN: 2.68 / MAX: 132.97)
  Run 2: 9.01 (SE +/- 0.72, N = 3; MIN: 2.68 / MAX: 92.41)
  Run 3: 8.70 (SE +/- 0.31, N = 4; MIN: 2.68 / MAX: 149.11)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: mnasnet (ms, Fewer Is Better)
  Run 1: 17.03 (SE +/- 0.07, N = 3; MIN: 13.56 / MAX: 48.96)
  Run 2: 18.14 (SE +/- 0.75, N = 3; MIN: 5.59 / MAX: 157.6)
  Run 3: 17.23 (SE +/- 0.24, N = 4; MIN: 13.21 / MAX: 64.98)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  Run 1: 35.12 (SE +/- 0.29, N = 3; MIN: 20.41 / MAX: 103.78)
  Run 2: 35.79 (SE +/- 1.24, N = 3; MIN: 20.38 / MAX: 148.98)
  Run 3: 35.26 (SE +/- 0.25, N = 4; MIN: 17.61 / MAX: 123.06)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
eSpeak-NG Speech Engine 20200907 - Text-To-Speech Synthesis (Seconds, Fewer Is Better)
  Run 1: 39.67 (SE +/- 0.53, N = 20)
  Run 2: 40.57 (SE +/- 0.92, N = 20)
  Run 3: 40.87 (SE +/- 0.84, N = 20)
  1. (CC) gcc options: -O2 -std=c99
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 8.20994 (SE +/- 0.48864, N = 15; MIN: 3.29)
  Run 2: 7.18295 (SE +/- 0.02084, N = 3; MIN: 3.37)
  Run 3: 7.21289 (SE +/- 0.02417, N = 3; MIN: 3.38)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
HPC Challenge 1.5.0 - Test / Class: Random Ring Bandwidth (GB/s, More Is Better)
  Run 1: 2.51129 (SE +/- 0.09513, N = 3)
  Run 2: 2.36057 (SE +/- 0.01108, N = 3)
  Run 3: 2.44927 (SE +/- 0.10589, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. ATLAS + Open MPI 4.0.3
HPC Challenge 1.5.0 - Test / Class: G-Ffte (GFLOPS, More Is Better)
  Run 1: 2.07387 (SE +/- 0.13748, N = 3)
  Run 2: 2.08554 (SE +/- 0.05507, N = 3)
  Run 3: 2.07874 (SE +/- 0.05394, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. ATLAS + Open MPI 4.0.3
Phoronix Test Suite v10.8.4