Core i7 5775C EOY 2020
Intel Core i7-5775C testing with an MSI Z97-G45 GAMING (MS-7821) v1.0 (V2.9 BIOS) and MSI Intel Iris Pro 6200 3GB on Ubuntu 18.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2012292-HA-COREI757721&grs&rdt&export=pdf .
System configuration (results 1, 2 and 3):
  Processor: Intel Core i7-5775C @ 3.70GHz (4 Cores / 8 Threads)
  Motherboard: MSI Z97-G45 GAMING (MS-7821) v1.0 (V2.9 BIOS)
  Chipset: Intel Broadwell-U DMI
  Memory: 16GB
  Disk: 120GB CT120BX100SSD1
  Graphics: MSI Intel Iris Pro 6200 3GB (1150MHz)
  Audio: Intel Broadwell-U Audio
  Monitor: VA2431
  Network: Qualcomm Atheros Killer E220x
  OS: Ubuntu 18.10
  Kernel: 5.0.0-999-generic (x86_64) 20190223
  Desktop: GNOME Shell 3.30.2
  Display Server: X Server 1.20.1
  Display Driver: modesetting 1.20.1
  OpenGL: 4.5 Mesa 19.2.0-devel (git-2631fd3 2019-07-24 cosmic-oibaf-ppa)
  Vulkan: 1.1.102
  Compiler: GCC 8.3.0
  File-System: ext4
  Screen Resolution: 1920x1080
Compiler Details:
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v
Processor Details:
  Scaling Governor: intel_pstate powersave
  CPU Microcode: 0x20
  Thermald 1.7
Java Details:
  OpenJDK Runtime Environment (build 11.0.3+7-Ubuntu-1ubuntu218.10.1)
Python Details:
  Python 2.7.16 + Python 3.6.8
Security Details:
  l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable
  meltdown: Mitigation of PTI
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of __user pointer sanitization
  spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling
Result overview
Test | 1 | 2 | 3
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 3.50897 | 4.27331 | 3.71011
onednn: IP Shapes 3D - u8s8f32 - CPU | 3.12754 | 3.24606 | 2.67834
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 17.9927 | 16.9585 | 16.0907
onednn: IP Shapes 3D - f32 - CPU | 13.1433 | 13.9508 | 14.1075
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 12.4843 | 11.7263 | 11.6886
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 21.2868 | 20.6052 | 21.9084
onednn: IP Shapes 1D - f32 - CPU | 9.00601 | 8.59358 | 8.49555
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 6.70028 | 6.33749 | 6.32260
mlpack: scikit_qda | 99.00 | 98.58 | 95.58
yquake2: OpenGL 1.x - 1920 x 1080 | 518.3 | 510.3 | 527.6
mlpack: scikit_ica | 61.60 | 60.42 | 59.83
onednn: IP Shapes 1D - u8s8f32 - CPU | 5.77578 | 5.61439 | 5.62232
redis: LPUSH | 1421632.92 | 1444723.54 | 1461476.54
unpack-firefox: firefox-84.0.source.tar.xz | 20.723 | 21.023 | 21.303
mlpack: scikit_linearridgeregression | 5.25 | 5.39 | 5.28
sunflow: Global Illumination + Image Synthesis | 2.505 | 2.487 | 2.445
hpcc: G-HPL | 82.76190 | 81.58543 | 80.82890
yquake2: OpenGL 3.x - 1920 x 1080 | 553.8 | 547.1 | 542.5
embree: Pathtracer ISPC - Crown | 5.4516 | 5.3988 | 5.3503
build2: Time To Compile | 242.554 | 240.876 | 238.064
onednn: Convolution Batch Shapes Auto - f32 - CPU | 20.7908 | 20.8008 | 21.1590
caffe: AlexNet - CPU - 100 | 67035 | 66877 | 65933
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 13.0131 | 12.8064 | 12.8897
espeak: Text-To-Speech Synthesis | 40.479 | 41.123 | 40.939
ncnn: CPU-v2-v2 - mobilenet-v2 | 7.08 | 7.07 | 6.97
compress-lz4: 9 - Compression Speed | 42.62 | 42.68 | 42.05
numpy: | 279.09 | 275.03 | 277.38
asmfish: 1024 Hash Memory, 26 Depth | 11144382 | 11229182 | 11302838
ffte: N=256, 3D Complex FFT Routine | 19803.756124216 | 19947.561999863 | 19694.290326945
compress-lz4: 3 - Decompression Speed | 5408.7 | 5450.2 | 5477.8
ncnn: CPU - mnasnet | 6.68 | 6.60 | 6.62
hpcc: EP-DGEMM | 36.01717 | 35.86837 | 36.29173
basis: ETC1S | 79.260 | 78.339 | 78.757
lammps: Rhodopsin Protein | 2.878 | 2.849 | 2.865
ncnn: CPU - blazeface | 3.09 | 3.09 | 3.06
build-ffmpeg: Time To Compile | 121.027 | 122.199 | 121.208
dolfyn: Computational Fluid Dynamics | 23.148 | 23.227 | 23.020
crafty: Elapsed Time | 6824944 | 6860639 | 6885507
ncnn: CPU - shufflenet-v2 | 11.39 | 11.29 | 11.31
embree: Pathtracer ISPC - Asian Dragon | 6.8017 | 6.8597 | 6.7998
ncnn: CPU - mobilenet | 26.45 | 26.68 | 26.68
hpcc: Max Ping Pong Bandwidth | 17277.738 | 17131.180 | 17216.067
x265: Bosphorus 4K | 6.11 | 6.14 | 6.09
hpcc: G-Rand Access | 0.03293 | 0.03319 | 0.03312
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 3898.31 | 3884.54 | 3869.51
stockfish: Total Time | 7419488 | 7444518 | 7392870
ncnn: CPU - efficientnet-b0 | 10.20 | 10.20 | 10.13
ncnn: CPU - yolov4-tiny | 37.64 | 37.65 | 37.90
mafft: Multiple Sequence Alignment - LSU RNA | 12.524 | 12.610 | 12.543
ncnn: CPU - squeezenet_ssd | 31.07 | 31.15 | 30.94
redis: SET | 1660500.87 | 1671609.50 | 1668121.37
ncnn: CPU - googlenet | 19.60 | 19.49 | 19.47
rav1e: 10 | 2.636 | 2.650 | 2.653
ncnn: CPU - regnety_400m | 17.30 | 17.38 | 17.27
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 3897.98 | 3882.87 | 3873.60
keydb: | 411366.57 | 411227.16 | 408847.89
x265: Bosphorus 1080p | 28.06 | 27.90 | 27.94
compress-lz4: 1 - Decompression Speed | 5489.2 | 5497.5 | 5519.6
compress-lz4: 3 - Compression Speed | 43.28 | 43.51 | 43.38
hpcc: G-Ffte | 2.79950 | 2.79007 | 2.78478
yquake2: Software CPU - 1920 x 1080 | 114.3 | 114.1 | 114.7
onednn: Recurrent Neural Network Inference - f32 - CPU | 3879.97 | 3895.42 | 3876.36
sqlite-speedtest: Timed Time - Size 1,000 | 72.274 | 72.628 | 72.533
kvazaar: Bosphorus 1080p - Slow | 8.75 | 8.77 | 8.79
coremark: CoreMark Size 666 - Iterations Per Second | 140877.184823 | 141109.571045 | 141520.579544
rav1e: 5 | 0.878 | 0.882 | 0.882
hpcc: G-Ptrans | 1.23725 | 1.23969 | 1.23435
ncnn: CPU - resnet18 | 23.82 | 23.82 | 23.72
hpcc: Rand Ring Latency | 0.21109 | 0.21092 | 0.21180
compress-lz4: 9 - Decompression Speed | 5451.8 | 5458.7 | 5474.5
gromacs: Water Benchmark | 0.482 | 0.484 | 0.484
encode-ape: WAV To APE | 14.503 | 14.476 | 14.447
compress-lz4: 1 - Compression Speed | 4787.47 | 4805.59 | 4792.61
ncnn: CPU - alexnet | 21.96 | 22.01 | 22.04
brl-cad: VGR Performance Metric | 45294 | 45275 | 45131
hint: FLOAT | 381176866.34838 | 381158594.08639 | 382533367.81553
kvazaar: Bosphorus 4K - Very Fast | 5.62 | 5.62 | 5.64
kvazaar: Bosphorus 1080p - Very Fast | 22.68 | 22.72 | 22.76
rav1e: 1 | 0.288 | 0.289 | 0.288
caffe: GoogleNet - CPU - 100 | 138193 | 138669 | 138249
ncnn: CPU-v3-v3 - mobilenet-v3 | 5.99 | 6.00 | 5.98
kvazaar: Bosphorus 1080p - Medium | 9.02 | 9.04 | 9.05
embree: Pathtracer - Crown | 4.7126 | 4.7276 | 4.7247
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 7399.77 | 7396.63 | 7419.20
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 11.2286 | 11.1951 | 11.2136
byte: Dhrystone 2 | 37136699.5 | 37026672.1 | 37053570.5
kvazaar: Bosphorus 4K - Ultra Fast | 10.18 | 10.16 | 10.19
indigobench: CPU - Bedroom | 0.721 | 0.723 | 0.723
ncnn: CPU - resnet50 | 46.99 | 46.98 | 47.11
kvazaar: Bosphorus 1080p - Ultra Fast | 40.39 | 40.45 | 40.49
indigobench: CPU - Supercar | 1.654 | 1.655 | 1.658
build-eigen: Time To Compile | 82.306 | 82.482 | 82.334
hpcc: EP-STREAM Triad | 4.08144 | 4.07380 | 4.07377
ncnn: CPU - vgg16 | 81.26 | 81.40 | 81.33
embree: Pathtracer - Asian Dragon | 5.6657 | 5.6753 | 5.6695
rav1e: 6 | 1.183 | 1.185 | 1.185
mlpack: scikit_svm | 13.35 | 13.34 | 13.36
astcenc: Fast | 8.42 | 8.43 | 8.43
basis: UASTC Level 0 | 12.002 | 11.989 | 11.998
onednn: Recurrent Neural Network Training - f32 - CPU | 7404.84 | 7396.94 | 7398.38
encode-opus: WAV To Opus Encode | 9.412 | 9.422 | 9.416
phpbench: PHP Benchmark Suite | 574643 | 575213 | 574799
encode-ogg: WAV To Ogg | 24.080 | 24.079 | 24.102
hmmer: Pfam Database Search | 114.001 | 114.069 | 114.093
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 7400.51 | 7394.59 | 7399.59
encode-wavpack: WAV To WavPack | 16.308 | 16.298 | 16.303
rnnoise: | 25.670 | 25.657 | 25.659
astcenc: Thorough | 72.63 | 72.60 | 72.62
basis: UASTC Level 2 | 81.244 | 81.226 | 81.253
astcenc: Exhaustive | 588.79 | 588.80 | 588.64
basis: UASTC Level 2 + RDO Post-Processing | 1001.729 | 1001.520 | 1001.727
basis: UASTC Level 3 | 158.998 | 158.995 | 158.980
astcenc: Medium | 10.93 | 10.93 | 10.93
kvazaar: Bosphorus 4K - Medium | 2.06 | 2.06 | 2.06
kvazaar: Bosphorus 4K - Slow | 2.01 | 2.01 | 2.01
clomp: Static OMP Speedup | 3.1 | 3.1 | 3.1
redis: GET | 2215813.5 | 1939149.38 | 2041822.96
redis: SADD | 1661804.39 | 1761393.52 | 1687783.91
redis: LPOP | 2368415.09 | 1438792.17 | 2024710.52
hpcc: Rand Ring Bandwidth | 5.46352 | 5.81487 | 5.36042
betsy: ETC2 RGB - Highest | 16.127 | 15.314 | 15.716
betsy: ETC1 - Highest | 15.397 | 14.698 | 15.956
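The three result columns appear to be repeated runs on one system, so a quick way to read the overview is to look at how far the columns drift apart for each test. The following is a minimal sketch in Python, not part of the Phoronix Test Suite: `spread_percent` is a hypothetical helper, and the sample values are copied by hand from the overview above.

```python
from statistics import mean


def spread_percent(values):
    """Return (max - min) as a percentage of the mean of the values."""
    avg = mean(values)
    return (max(values) - min(values)) / avg * 100.0


# Values copied from the overview (columns 1, 2, 3); the selection is illustrative.
samples = {
    "onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU": [3.50897, 4.27331, 3.71011],
    "redis: LPOP": [2368415.09, 1438792.17, 2024710.52],
    "kvazaar: Bosphorus 4K - Slow": [2.01, 2.01, 2.01],
}

for name, runs in samples.items():
    print(f"{name}: {spread_percent(runs):.1f}% spread")
```

A large spread (as for the Redis tests) suggests run-to-run noise rather than a configuration difference, since all three columns share the same hardware and software.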
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better
  Run 1: 3.50897 (SE +/- 0.03532, N = 3; MIN: 3.36)
  Run 2: 4.27331 (SE +/- 0.00958, N = 3; MIN: 4.17)
  Run 3: 3.71011 (SE +/- 0.06500, N = 3; MIN: 3.47)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
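Each result carries an "SE +/-" figure and a sample count N. Assuming SE is the standard error of the mean and the runs are independent (both are assumptions, not stated in the export), the sketch below shows one way to judge whether two results differ by more than their noise. The helper names are hypothetical; the inputs are the run 1 and run 2 figures for this oneDNN test.

```python
import math


def sample_stddev(se, n):
    """Recover the sample standard deviation from a standard error of the mean."""
    return se * math.sqrt(n)


def difference_in_se_units(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means divided by their combined standard error."""
    combined = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined


# Runs 1 and 2 of the oneDNN Matrix Multiply f32 result.
z = difference_in_se_units(3.50897, 0.03532, 4.27331, 0.00958)
print(f"implied stddev of run 1: {sample_stddev(0.03532, 3):.4f} ms")
print(f"runs 1 and 2 differ by {z:.1f} combined standard errors")
```

With N = 3 the standard error itself is only a rough estimate, so a ratio like this is a sanity check rather than a formal significance test.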
oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  Run 1: 3.12754 (SE +/- 0.01583, N = 3; MIN: 3.06)
  Run 2: 3.24606 (SE +/- 0.03533, N = 3; MIN: 3.13)
  Run 3: 2.67834 (SE +/- 0.01863, N = 3; MIN: 2.6)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better
  Run 1: 17.99 (SE +/- 0.21, N = 3; MIN: 17.6)
  Run 2: 16.96 (SE +/- 0.07, N = 3; MIN: 16.49)
  Run 3: 16.09 (SE +/- 0.03, N = 3; MIN: 15.86)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better
  Run 1: 13.14 (SE +/- 0.10, N = 3; MIN: 12.91)
  Run 2: 13.95 (SE +/- 0.07, N = 3; MIN: 13.68)
  Run 3: 14.11 (SE +/- 0.01, N = 3; MIN: 13.89)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better
  Run 1: 12.48 (SE +/- 0.11, N = 3; MIN: 12.21)
  Run 2: 11.73 (SE +/- 0.03, N = 3; MIN: 11.59)
  Run 3: 11.69 (SE +/- 0.03, N = 3; MIN: 11.58)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  Run 1: 21.29 (SE +/- 0.05, N = 3; MIN: 21)
  Run 2: 20.61 (SE +/- 0.03, N = 3; MIN: 20.41)
  Run 3: 21.91 (SE +/- 0.02, N = 3; MIN: 21.58)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better
  Run 1: 9.00601 (SE +/- 0.07657, N = 3; MIN: 8.67)
  Run 2: 8.59358 (SE +/- 0.03663, N = 3; MIN: 8.22)
  Run 3: 8.49555 (SE +/- 0.00624, N = 3; MIN: 8.07)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  Run 1: 6.70028 (SE +/- 0.01488, N = 3; MIN: 6.39)
  Run 2: 6.33749 (SE +/- 0.00312, N = 3; MIN: 6.17)
  Run 3: 6.32260 (SE +/- 0.01493, N = 3; MIN: 6.08)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Mlpack Benchmark - Benchmark: scikit_qda
  Seconds, Fewer Is Better
  Run 1: 99.00 (SE +/- 0.67, N = 3)
  Run 2: 98.58 (SE +/- 1.16, N = 3)
  Run 3: 95.58 (SE +/- 0.02, N = 3)
yquake2 7.45 - Renderer: OpenGL 1.x - Resolution: 1920 x 1080
  Frames Per Second, More Is Better
  Run 1: 518.3 (SE +/- 9.01, N = 4)
  Run 2: 510.3 (SE +/- 5.25, N = 15)
  Run 3: 527.6 (SE +/- 3.86, N = 3)
  1. (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
Mlpack Benchmark - Benchmark: scikit_ica
  Seconds, Fewer Is Better
  Run 1: 61.60 (SE +/- 0.75, N = 3)
  Run 2: 60.42 (SE +/- 0.28, N = 3)
  Run 3: 59.83 (SE +/- 0.08, N = 3)
oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  Run 1: 5.77578 (SE +/- 0.00787, N = 3; MIN: 5.72)
  Run 2: 5.61439 (SE +/- 0.00125, N = 3; MIN: 5.59)
  Run 3: 5.62232 (SE +/- 0.00754, N = 3; MIN: 5.59)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Redis 6.0.9 - Test: LPUSH
  Requests Per Second, More Is Better
  Run 1: 1421632.92 (SE +/- 23763.10, N = 3)
  Run 2: 1444723.54 (SE +/- 10846.13, N = 3)
  Run 3: 1461476.54 (SE +/- 3122.80, N = 3)
  1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Unpacking Firefox 84.0 - Extracting: firefox-84.0.source.tar.xz
  Seconds, Fewer Is Better
  Run 1: 20.72 (SE +/- 0.19, N = 4)
  Run 2: 21.02 (SE +/- 0.16, N = 4)
  Run 3: 21.30 (SE +/- 0.22, N = 11)
Mlpack Benchmark - Benchmark: scikit_linearridgeregression
  Seconds, Fewer Is Better
  Run 1: 5.25 (SE +/- 0.02, N = 3)
  Run 2: 5.39 (SE +/- 0.08, N = 4)
  Run 3: 5.28 (SE +/- 0.06, N = 3)
Sunflow Rendering System 0.07.2 - Global Illumination + Image Synthesis
  Seconds, Fewer Is Better
  Run 1: 2.505 (SE +/- 0.008, N = 3; MIN: 2.38 / MAX: 3.2)
  Run 2: 2.487 (SE +/- 0.020, N = 3; MIN: 2.36 / MAX: 3.31)
  Run 3: 2.445 (SE +/- 0.009, N = 3; MIN: 2.34 / MAX: 3.15)
HPC Challenge 1.5.0 - Test / Class: G-HPL
  GFLOPS, More Is Better
  Run 1: 82.76 (SE +/- 1.43, N = 3)
  Run 2: 81.59 (SE +/- 0.40, N = 3)
  Run 3: 80.83 (SE +/- 0.62, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. ATLAS + Open MPI 3.1.2
yquake2 7.45 - Renderer: OpenGL 3.x - Resolution: 1920 x 1080
  Frames Per Second, More Is Better
  Run 1: 553.8 (SE +/- 2.95, N = 3)
  Run 2: 547.1 (SE +/- 4.68, N = 3)
  Run 3: 542.5 (SE +/- 3.26, N = 3)
  1. (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
Embree 3.9.0 - Binary: Pathtracer ISPC - Model: Crown
  Frames Per Second, More Is Better
  Run 1: 5.4516 (SE +/- 0.0048, N = 3; MIN: 5.43 / MAX: 5.51)
  Run 2: 5.3988 (SE +/- 0.0109, N = 3; MIN: 5.36 / MAX: 5.47)
  Run 3: 5.3503 (SE +/- 0.0687, N = 3; MIN: 5.16 / MAX: 5.47)
Build2 0.13 - Time To Compile
  Seconds, Fewer Is Better
  Run 1: 242.55 (SE +/- 3.92, N = 3)
  Run 2: 240.88 (SE +/- 0.21, N = 3)
  Run 3: 238.06 (SE +/- 0.19, N = 3)
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better
  Run 1: 20.79 (SE +/- 0.27, N = 3; MIN: 19.99)
  Run 2: 20.80 (SE +/- 0.06, N = 3; MIN: 20.59)
  Run 3: 21.16 (SE +/- 0.03, N = 3; MIN: 20.85)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Caffe 2020-02-13 - Model: AlexNet - Acceleration: CPU - Iterations: 100
  Milli-Seconds, Fewer Is Better
  Run 1: 67035 (SE +/- 296.71, N = 3)
  Run 2: 66877 (SE +/- 202.49, N = 3)
  Run 3: 65933 (SE +/- 171.69, N = 3)
  1. (CXX) g++ options: -fPIC -O3 -rdynamic -lboost_system -lboost_thread -lboost_filesystem -lboost_chrono -lboost_date_time -lboost_atomic -lpthread -lglog -lgflags -lsz -lz -ldl -lm -lprotobuf -llmdb -lopenblas
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  Run 1: 13.01 (SE +/- 0.14, N = 3; MIN: 12.66)
  Run 2: 12.81 (SE +/- 0.02, N = 3; MIN: 12.64)
  Run 3: 12.89 (SE +/- 0.04, N = 3; MIN: 12.74)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
eSpeak-NG Speech Engine 20200907 - Text-To-Speech Synthesis
  Seconds, Fewer Is Better
  Run 1: 40.48 (SE +/- 0.35, N = 16)
  Run 2: 41.12 (SE +/- 0.48, N = 4)
  Run 3: 40.94 (SE +/- 0.56, N = 6)
  1. (CC) gcc options: -O2 -std=c99
NCNN 20201218 - Target: CPU-v2-v2 - Model: mobilenet-v2
  ms, Fewer Is Better
  Run 1: 7.08 (SE +/- 0.10, N = 3; MIN: 6.91 / MAX: 8.17)
  Run 2: 7.07 (SE +/- 0.02, N = 3; MIN: 6.97 / MAX: 8.06)
  Run 3: 6.97 (SE +/- 0.13, N = 3; MIN: 6.76 / MAX: 8.01)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
LZ4 Compression 1.9.3 - Compression Level: 9 - Compression Speed
  MB/s, More Is Better
  Run 1: 42.62 (SE +/- 0.07, N = 3)
  Run 2: 42.68 (SE +/- 0.02, N = 3)
  Run 3: 42.05 (SE +/- 0.42, N = 3)
  1. (CC) gcc options: -O3
Numpy Benchmark
  Score, More Is Better
  Run 1: 279.09 (SE +/- 1.04, N = 3)
  Run 2: 275.03 (SE +/- 0.54, N = 3)
  Run 3: 277.38 (SE +/- 0.46, N = 3)
asmFish 2018-07-23 - 1024 Hash Memory, 26 Depth
  Nodes/second, More Is Better
  Run 1: 11144382 (SE +/- 69549.32, N = 3)
  Run 2: 11229182 (SE +/- 81955.72, N = 3)
  Run 3: 11302838 (SE +/- 85928.55, N = 3)
FFTE 7.0 - N=256, 3D Complex FFT Routine
  MFLOPS, More Is Better
  Run 1: 19803.76 (SE +/- 49.39, N = 3)
  Run 2: 19947.56 (SE +/- 34.77, N = 3)
  Run 3: 19694.29 (SE +/- 59.28, N = 3)
  1. (F9X) gfortran options: -O3 -fomit-frame-pointer -fopenmp
LZ4 Compression 1.9.3 - Compression Level: 3 - Decompression Speed
  MB/s, More Is Better
  Run 1: 5408.7 (SE +/- 4.57, N = 3)
  Run 2: 5450.2 (SE +/- 4.97, N = 3)
  Run 3: 5477.8 (SE +/- 9.93, N = 3)
  1. (CC) gcc options: -O3
NCNN 20201218 - Target: CPU - Model: mnasnet
  ms, Fewer Is Better
  Run 1: 6.68 (SE +/- 0.08, N = 3; MIN: 6.56 / MAX: 6.86)
  Run 2: 6.60 (SE +/- 0.03, N = 3; MIN: 6.55 / MAX: 6.74)
  Run 3: 6.62 (SE +/- 0.04, N = 3; MIN: 6.5 / MAX: 6.81)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
HPC Challenge 1.5.0 - Test / Class: EP-DGEMM
  GFLOPS, More Is Better
  Run 1: 36.02 (SE +/- 0.20, N = 3)
  Run 2: 35.87 (SE +/- 0.08, N = 3)
  Run 3: 36.29 (SE +/- 0.40, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. ATLAS + Open MPI 3.1.2
Basis Universal 1.12 - Settings: ETC1S
  Seconds, Fewer Is Better
  Run 1: 79.26 (SE +/- 0.52, N = 3)
  Run 2: 78.34 (SE +/- 0.06, N = 3)
  Run 3: 78.76 (SE +/- 0.11, N = 3)
  1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
LAMMPS Molecular Dynamics Simulator 29Oct2020 - Model: Rhodopsin Protein
  ns/day, More Is Better
  Run 1: 2.878 (SE +/- 0.006, N = 3)
  Run 2: 2.849 (SE +/- 0.007, N = 3)
  Run 3: 2.865 (SE +/- 0.010, N = 3)
  1. (CXX) g++ options: -O3 -pthread -lm
NCNN 20201218 - Target: CPU - Model: blazeface
  ms, Fewer Is Better
  Run 1: 3.09 (SE +/- 0.03, N = 3; MIN: 3.03 / MAX: 3.29)
  Run 2: 3.09 (SE +/- 0.04, N = 3; MIN: 3.03 / MAX: 7.35)
  Run 3: 3.06 (SE +/- 0.01, N = 3; MIN: 3.03 / MAX: 3.14)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Timed FFmpeg Compilation 4.2.2 - Time To Compile
  Seconds, Fewer Is Better
  Run 1: 121.03 (SE +/- 0.08, N = 3)
  Run 2: 122.20 (SE +/- 0.47, N = 3)
  Run 3: 121.21 (SE +/- 0.17, N = 3)
Dolfyn 0.527 - Computational Fluid Dynamics
  Seconds, Fewer Is Better
  Run 1: 23.15 (SE +/- 0.02, N = 3)
  Run 2: 23.23 (SE +/- 0.07, N = 3)
  Run 3: 23.02 (SE +/- 0.01, N = 3)
Crafty 25.2 - Elapsed Time
  Nodes Per Second, More Is Better
  Run 1: 6824944 (SE +/- 29965.63, N = 3)
  Run 2: 6860639 (SE +/- 16848.33, N = 3)
  Run 3: 6885507 (SE +/- 4260.63, N = 3)
  1. (CC) gcc options: -pthread -lstdc++ -fprofile-use -lm
NCNN 20201218 - Target: CPU - Model: shufflenet-v2
  ms, Fewer Is Better
  Run 1: 11.39 (SE +/- 0.02, N = 3; MIN: 11.31 / MAX: 12.43)
  Run 2: 11.29 (SE +/- 0.01, N = 3; MIN: 11.24 / MAX: 12.08)
  Run 3: 11.31 (SE +/- 0.01, N = 3; MIN: 11.22 / MAX: 24.44)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Embree 3.9.0 - Binary: Pathtracer ISPC - Model: Asian Dragon
  Frames Per Second, More Is Better
  Run 1: 6.8017 (SE +/- 0.0262, N = 3; MIN: 6.73 / MAX: 6.95)
  Run 2: 6.8597 (SE +/- 0.0255, N = 3; MIN: 6.77 / MAX: 6.98)
  Run 3: 6.7998 (SE +/- 0.0223, N = 3; MIN: 6.73 / MAX: 6.95)
NCNN 20201218 - Target: CPU - Model: mobilenet
  ms, Fewer Is Better
  Run 1: 26.45 (SE +/- 0.12, N = 3; MIN: 26.18 / MAX: 39.6)
  Run 2: 26.68 (SE +/- 0.30, N = 3; MIN: 26.23 / MAX: 28.16)
  Run 3: 26.68 (SE +/- 0.17, N = 3; MIN: 26.16 / MAX: 37.3)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
HPC Challenge 1.5.0 - Test / Class: Max Ping Pong Bandwidth
  MB/s, More Is Better
  Run 1: 17277.74 (SE +/- 43.07, N = 3)
  Run 2: 17131.18 (SE +/- 272.70, N = 3)
  Run 3: 17216.07 (SE +/- 105.67, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. ATLAS + Open MPI 3.1.2
x265 3.4 - Video Input: Bosphorus 4K
  Frames Per Second, More Is Better
  Run 1: 6.11 (SE +/- 0.01, N = 3)
  Run 2: 6.14 (SE +/- 0.03, N = 3)
  Run 3: 6.09 (SE +/- 0.02, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
HPC Challenge 1.5.0 - Test / Class: G-Random Access
  GUP/s, More Is Better
  Run 1: 0.03293 (SE +/- 0.00010, N = 3)
  Run 2: 0.03319 (SE +/- 0.00018, N = 3)
  Run 3: 0.03312 (SE +/- 0.00022, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. ATLAS + Open MPI 3.1.2
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  Run 1: 3898.31 (SE +/- 11.68, N = 3; MIN: 3877.01)
  Run 2: 3884.54 (SE +/- 3.15, N = 3; MIN: 3876.92)
  Run 3: 3869.51 (SE +/- 6.34, N = 3; MIN: 3857.36)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Stockfish 12 - Total Time
  Nodes Per Second, More Is Better
  Run 1: 7419488 (SE +/- 44996.61, N = 3)
  Run 2: 7444518 (SE +/- 60261.05, N = 3)
  Run 3: 7392870 (SE +/- 74914.67, N = 3)
  1. (CXX) g++ options: -m64 -lpthread -fno-exceptions -std=c++17 -pedantic -O3 -msse -msse3 -mpopcnt -msse4.1 -mssse3 -msse2 -flto -flto=jobserver
NCNN 20201218 - Target: CPU - Model: efficientnet-b0
  ms, Fewer Is Better
  Run 1: 10.20 (SE +/- 0.02, N = 3; MIN: 10.13 / MAX: 10.32)
  Run 2: 10.20 (SE +/- 0.03, N = 3; MIN: 9.86 / MAX: 21.1)
  Run 3: 10.13 (SE +/- 0.11, N = 3; MIN: 9.89 / MAX: 11.32)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: yolov4-tiny
  ms, Fewer Is Better
  Run 1: 37.64 (SE +/- 0.11, N = 3; MIN: 37.09 / MAX: 39.96)
  Run 2: 37.65 (SE +/- 0.18, N = 3; MIN: 37.19 / MAX: 39.03)
  Run 3: 37.90 (SE +/- 0.04, N = 3; MIN: 37.49 / MAX: 40.47)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Timed MAFFT Alignment 7.471 - Multiple Sequence Alignment - LSU RNA
  Seconds, Fewer Is Better
  Run 1: 12.52 (SE +/- 0.02, N = 3)
  Run 2: 12.61 (SE +/- 0.08, N = 3)
  Run 3: 12.54 (SE +/- 0.06, N = 3)
  1. (CC) gcc options: -std=c99 -O3 -lm -lpthread
NCNN 20201218 - Target: CPU - Model: squeezenet_ssd
  ms, Fewer Is Better
  Run 1: 31.07 (SE +/- 0.13, N = 3; MIN: 30.77 / MAX: 40.23)
  Run 2: 31.15 (SE +/- 0.05, N = 3; MIN: 30.98 / MAX: 33.13)
  Run 3: 30.94 (SE +/- 0.10, N = 3; MIN: 30.7 / MAX: 32.44)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Redis 6.0.9 - Test: SET
  Requests Per Second, More Is Better
  Run 1: 1660500.87 (SE +/- 30456.48, N = 3)
  Run 2: 1671609.50 (SE +/- 10885.22, N = 3)
  Run 3: 1668121.37 (SE +/- 19626.30, N = 3)
  1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
NCNN 20201218 - Target: CPU - Model: googlenet
  ms, Fewer Is Better
  Run 1: 19.60 (SE +/- 0.02, N = 3; MIN: 19.49 / MAX: 19.99)
  Run 2: 19.49 (SE +/- 0.14, N = 3; MIN: 19.14 / MAX: 19.91)
  Run 3: 19.47 (SE +/- 0.09, N = 3; MIN: 19.19 / MAX: 20.46)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
rav1e 0.4 Alpha - Speed: 10
  Frames Per Second, More Is Better
  Run 1: 2.636 (SE +/- 0.002, N = 3)
  Run 2: 2.650 (SE +/- 0.009, N = 3)
  Run 3: 2.653 (SE +/- 0.008, N = 3)
NCNN 20201218 - Target: CPU - Model: regnety_400m
  ms, Fewer Is Better
  Run 1: 17.30 (SE +/- 0.03, N = 3; MIN: 17.18 / MAX: 17.63)
  Run 2: 17.38 (SE +/- 0.22, N = 3; MIN: 17.04 / MAX: 98.97)
  Run 3: 17.27 (SE +/- 0.05, N = 3; MIN: 17.12 / MAX: 18.45)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
  ms, Fewer Is Better
  Run 1: 3897.98 (SE +/- 11.47, N = 3; MIN: 3871.77)
  Run 2: 3882.87 (SE +/- 0.98, N = 3; MIN: 3878.39)
  Run 3: 3873.60 (SE +/- 8.25, N = 3; MIN: 3858.23)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
KeyDB 6.0.16
  Ops/sec, More Is Better
  Run 1: 411366.57 (SE +/- 1505.49, N = 3)
  Run 2: 411227.16 (SE +/- 1060.79, N = 3)
  Run 3: 408847.89 (SE +/- 57.06, N = 3)
  1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
x265 3.4 - Video Input: Bosphorus 1080p
  Frames Per Second, More Is Better
  Run 1: 28.06 (SE +/- 0.09, N = 3)
  Run 2: 27.90 (SE +/- 0.15, N = 3)
  Run 3: 27.94 (SE +/- 0.02, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
LZ4 Compression 1.9.3 - Compression Level: 1 - Decompression Speed
  MB/s, More Is Better
  Run 1: 5489.2 (SE +/- 33.51, N = 3)
  Run 2: 5497.5 (SE +/- 14.56, N = 3)
  Run 3: 5519.6 (SE +/- 19.26, N = 3)
  1. (CC) gcc options: -O3
LZ4 Compression 1.9.3 - Compression Level: 3 - Compression Speed
  MB/s, More Is Better
  Run 1: 43.28 (SE +/- 0.02, N = 3)
  Run 2: 43.51 (SE +/- 0.00, N = 3)
  Run 3: 43.38 (SE +/- 0.07, N = 3)
  1. (CC) gcc options: -O3
HPC Challenge 1.5.0 - Test / Class: G-Ffte
  GFLOPS, More Is Better
  Run 1: 2.79950 (SE +/- 0.00194, N = 3)
  Run 2: 2.79007 (SE +/- 0.00441, N = 3)
  Run 3: 2.78478 (SE +/- 0.00289, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. ATLAS + Open MPI 3.1.2
yquake2 7.45 - Renderer: Software CPU - Resolution: 1920 x 1080
  Frames Per Second, More Is Better
  Run 1: 114.3 (SE +/- 0.15, N = 3)
  Run 2: 114.1 (SE +/- 0.68, N = 3)
  Run 3: 114.7 (SE +/- 0.24, N = 3)
  1. (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better
  Run 1: 3879.97 (SE +/- 13.83, N = 3; MIN: 3849.25)
  Run 2: 3895.42 (SE +/- 3.39, N = 3; MIN: 3886.76)
  Run 3: 3876.36 (SE +/- 11.46, N = 3; MIN: 3849.85)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
SQLite Speedtest 3.30 - Timed Time - Size 1,000
  Seconds, Fewer Is Better
  Run 1: 72.27 (SE +/- 0.13, N = 3)
  Run 2: 72.63 (SE +/- 0.06, N = 3)
  Run 3: 72.53 (SE +/- 0.19, N = 3)
  1. (CC) gcc options: -O2 -ldl -lz -lpthread
Kvazaar 2.0 - Video Input: Bosphorus 1080p - Video Preset: Slow
  Frames Per Second, More Is Better
  Run 1: 8.75 (SE +/- 0.01, N = 3)
  Run 2: 8.77 (SE +/- 0.00, N = 3)
  Run 3: 8.79 (SE +/- 0.01, N = 3)
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Coremark 1.0 - CoreMark Size 666 - Iterations Per Second
  Iterations/Sec, More Is Better
  Run 1: 140877.18 (SE +/- 280.09, N = 3)
  Run 2: 141109.57 (SE +/- 334.19, N = 3)
  Run 3: 141520.58 (SE +/- 224.91, N = 3)
  1. (CC) gcc options: -O2 -lrt" -lrt
rav1e 0.4 Alpha - Speed: 5
  Frames Per Second, More Is Better
  Run 1: 0.878 (SE +/- 0.002, N = 3)
  Run 2: 0.882 (SE +/- 0.001, N = 3)
  Run 3: 0.882 (SE +/- 0.002, N = 3)
HPC Challenge 1.5.0 - Test / Class: G-Ptrans
  GB/s, More Is Better
  Run 1: 1.23725 (SE +/- 0.01118, N = 3)
  Run 2: 1.23969 (SE +/- 0.00266, N = 3)
  Run 3: 1.23435 (SE +/- 0.00515, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. ATLAS + Open MPI 3.1.2
NCNN 20201218 - Target: CPU - Model: resnet18
  ms, Fewer Is Better
  Run 1: 23.82 (SE +/- 0.05, N = 3; MIN: 23.6 / MAX: 36.25)
  Run 2: 23.82 (SE +/- 0.08, N = 3; MIN: 23.47 / MAX: 25.66)
  Run 3: 23.72 (SE +/- 0.06, N = 3; MIN: 23.51 / MAX: 36.61)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
HPC Challenge 1.5.0 - Test / Class: Random Ring Latency
  usecs, Fewer Is Better
  Run 1: 0.21109 (SE +/- 0.00015, N = 3)
  Run 2: 0.21092 (SE +/- 0.00022, N = 3)
  Run 3: 0.21180 (SE +/- 0.00054, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. ATLAS + Open MPI 3.1.2
LZ4 Compression 1.9.3 - Compression Level: 9 - Decompression Speed
  MB/s, More Is Better
  Run 1: 5451.8 (SE +/- 6.01, N = 3)
  Run 2: 5458.7 (SE +/- 11.99, N = 3)
  Run 3: 5474.5 (SE +/- 5.41, N = 3)
  1. (CC) gcc options: -O3
GROMACS 2020.3, Water Benchmark (Ns Per Day, More Is Better). 1: 0.482 (SE +/- 0.002, N = 3); 2: 0.484 (SE +/- 0.002, N = 3); 3: 0.484 (SE +/- 0.001, N = 3). Notes: 1. (CXX) g++ options: -O3 -pthread -lrt -lpthread -lm
Monkey Audio Encoding 3.99.6, WAV To APE (Seconds, Fewer Is Better). 1: 14.50 (SE +/- 0.11, N = 5); 2: 14.48 (SE +/- 0.06, N = 5); 3: 14.45 (SE +/- 0.06, N = 5). Notes: 1. (CXX) g++ options: -O3 -pedantic -rdynamic -lrt
LZ4 Compression 1.9.3, Compression Level: 1 - Compression Speed (MB/s, More Is Better). 1: 4787.47 (SE +/- 14.35, N = 3); 2: 4805.59 (SE +/- 18.70, N = 3); 3: 4792.61 (SE +/- 5.40, N = 3). Notes: 1. (CC) gcc options: -O3
NCNN 20201218, Target: CPU - Model: alexnet (ms, Fewer Is Better). 1: 21.96 (SE +/- 0.09, N = 3; MIN: 21.61 / MAX: 23.2); 2: 22.01 (SE +/- 0.02, N = 3; MIN: 21.84 / MAX: 22.4); 3: 22.04 (SE +/- 0.02, N = 3; MIN: 21.86 / MAX: 22.47). Notes: 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
BRL-CAD 7.30.8, VGR Performance Metric (VGR Performance Metric, More Is Better). 1: 45294; 2: 45275; 3: 45131. Notes: 1. (CXX) g++ options: -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -rdynamic -lSM -lICE -lXi -lGLU -lGL -lGLdispatch -lX11 -lXext -lXrender -lpthread -ldl -luuid -lm
Hierarchical INTegration 1.0, Test: FLOAT (QUIPs, More Is Better). 1: 381176866.35 (SE +/- 977507.07, N = 3); 2: 381158594.09 (SE +/- 786027.10, N = 3); 3: 382533367.82 (SE +/- 82458.49, N = 3). Notes: 1. (CC) gcc options: -O3 -march=native -lm
Kvazaar 2.0, Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better). 1: 5.62 (SE +/- 0.00, N = 3); 2: 5.62 (SE +/- 0.01, N = 3); 3: 5.64 (SE +/- 0.00, N = 3). Notes: 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0, Video Input: Bosphorus 1080p - Video Preset: Very Fast (Frames Per Second, More Is Better). 1: 22.68 (SE +/- 0.02, N = 3); 2: 22.72 (SE +/- 0.02, N = 3); 3: 22.76 (SE +/- 0.02, N = 3). Notes: 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
rav1e 0.4 Alpha, Speed: 1 (Frames Per Second, More Is Better). 1: 0.288 (SE +/- 0.000, N = 3); 2: 0.289 (SE +/- 0.000, N = 3); 3: 0.288 (SE +/- 0.000, N = 3).
Caffe 2020-02-13, Model: GoogleNet - Acceleration: CPU - Iterations: 100 (Milli-Seconds, Fewer Is Better). 1: 138193 (SE +/- 306.29, N = 3); 2: 138669 (SE +/- 50.29, N = 3); 3: 138249 (SE +/- 303.12, N = 3). Notes: 1. (CXX) g++ options: -fPIC -O3 -rdynamic -lboost_system -lboost_thread -lboost_filesystem -lboost_chrono -lboost_date_time -lboost_atomic -lpthread -lglog -lgflags -lsz -lz -ldl -lm -lprotobuf -llmdb -lopenblas
NCNN 20201218, Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better). 1: 5.99 (SE +/- 0.02, N = 3; MIN: 5.93 / MAX: 6.86); 2: 6.00 (SE +/- 0.01, N = 3; MIN: 5.95 / MAX: 7.09); 3: 5.98 (SE +/- 0.03, N = 3; MIN: 5.93 / MAX: 6.94). Notes: 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Kvazaar 2.0, Video Input: Bosphorus 1080p - Video Preset: Medium (Frames Per Second, More Is Better). 1: 9.02 (SE +/- 0.01, N = 3); 2: 9.04 (SE +/- 0.00, N = 3); 3: 9.05 (SE +/- 0.00, N = 3). Notes: 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Embree 3.9.0, Binary: Pathtracer - Model: Crown (Frames Per Second, More Is Better). 1: 4.7126 (SE +/- 0.0107, N = 3; MIN: 4.68 / MAX: 4.76); 2: 4.7276 (SE +/- 0.0150, N = 3; MIN: 4.69 / MAX: 4.79); 3: 4.7247 (SE +/- 0.0010, N = 3; MIN: 4.71 / MAX: 4.76).
oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better). 1: 7399.77 (SE +/- 5.89, N = 3; MIN: 7384.45); 2: 7396.63 (SE +/- 3.99, N = 3; MIN: 7381.31); 3: 7419.20 (SE +/- 5.09, N = 3; MIN: 7404.38). Notes: 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). 1: 11.23 (SE +/- 0.01, N = 3; MIN: 11.19); 2: 11.20 (SE +/- 0.00, N = 3; MIN: 11.17); 3: 11.21 (SE +/- 0.02, N = 3; MIN: 11.16). Notes: 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
BYTE Unix Benchmark 3.6, Computational Test: Dhrystone 2 (LPS, More Is Better). 1: 37136699.5 (SE +/- 250280.80, N = 3); 2: 37026672.1 (SE +/- 317702.14, N = 3); 3: 37053570.5 (SE +/- 365197.62, N = 3).
Kvazaar 2.0, Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, More Is Better). 1: 10.18 (SE +/- 0.01, N = 3); 2: 10.16 (SE +/- 0.01, N = 3); 3: 10.19 (SE +/- 0.01, N = 3). Notes: 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
IndigoBench 4.4, Acceleration: CPU - Scene: Bedroom (M samples/s, More Is Better). 1: 0.721 (SE +/- 0.001, N = 3); 2: 0.723 (SE +/- 0.001, N = 3); 3: 0.723 (SE +/- 0.002, N = 3).
NCNN 20201218, Target: CPU - Model: resnet50 (ms, Fewer Is Better). 1: 46.99 (SE +/- 0.06, N = 3; MIN: 46.57 / MAX: 49.08); 2: 46.98 (SE +/- 0.17, N = 3; MIN: 46.59 / MAX: 60.2); 3: 47.11 (SE +/- 0.21, N = 3; MIN: 46.64 / MAX: 50.76). Notes: 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Kvazaar 2.0, Video Input: Bosphorus 1080p - Video Preset: Ultra Fast (Frames Per Second, More Is Better). 1: 40.39 (SE +/- 0.01, N = 3); 2: 40.45 (SE +/- 0.02, N = 3); 3: 40.49 (SE +/- 0.01, N = 3). Notes: 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
IndigoBench 4.4, Acceleration: CPU - Scene: Supercar (M samples/s, More Is Better). 1: 1.654 (SE +/- 0.005, N = 3); 2: 1.655 (SE +/- 0.006, N = 3); 3: 1.658 (SE +/- 0.002, N = 3).
Timed Eigen Compilation 3.3.9, Time To Compile (Seconds, Fewer Is Better). 1: 82.31 (SE +/- 0.06, N = 3); 2: 82.48 (SE +/- 0.11, N = 3); 3: 82.33 (SE +/- 0.02, N = 3).
HPC Challenge 1.5.0, Test / Class: EP-STREAM Triad (GB/s, More Is Better). 1: 4.08144 (SE +/- 0.00157, N = 3); 2: 4.07380 (SE +/- 0.00200, N = 3); 3: 4.07377 (SE +/- 0.00055, N = 3). Notes: 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. ATLAS + Open MPI 3.1.2
NCNN 20201218, Target: CPU - Model: vgg16 (ms, Fewer Is Better). 1: 81.26 (SE +/- 0.03, N = 3; MIN: 80.69 / MAX: 92.8); 2: 81.40 (SE +/- 0.09, N = 3; MIN: 80.9 / MAX: 92.48); 3: 81.33 (SE +/- 0.13, N = 3; MIN: 80.91 / MAX: 85.21). Notes: 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Embree 3.9.0, Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, More Is Better). 1: 5.6657 (SE +/- 0.0264, N = 3; MIN: 5.59 / MAX: 5.79); 2: 5.6753 (SE +/- 0.0277, N = 3; MIN: 5.59 / MAX: 5.79); 3: 5.6695 (SE +/- 0.0224, N = 3; MIN: 5.62 / MAX: 5.79).
rav1e 0.4 Alpha, Speed: 6 (Frames Per Second, More Is Better). 1: 1.183 (SE +/- 0.004, N = 3); 2: 1.185 (SE +/- 0.004, N = 3); 3: 1.185 (SE +/- 0.001, N = 3).
Mlpack Benchmark, Benchmark: scikit_svm (Seconds, Fewer Is Better). 1: 13.35 (SE +/- 0.00, N = 3); 2: 13.34 (SE +/- 0.00, N = 3); 3: 13.36 (SE +/- 0.01, N = 3).
ASTC Encoder 2.0, Preset: Fast (Seconds, Fewer Is Better). 1: 8.42 (SE +/- 0.00, N = 3); 2: 8.43 (SE +/- 0.01, N = 3); 3: 8.43 (SE +/- 0.01, N = 3). Notes: 1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
Basis Universal 1.12, Settings: UASTC Level 0 (Seconds, Fewer Is Better). 1: 12.00 (SE +/- 0.00, N = 3); 2: 11.99 (SE +/- 0.00, N = 3); 3: 12.00 (SE +/- 0.00, N = 3). Notes: 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). 1: 7404.84 (SE +/- 2.74, N = 3; MIN: 7395.03); 2: 7396.94 (SE +/- 4.24, N = 3; MIN: 7384.58); 3: 7398.38 (SE +/- 5.94, N = 3; MIN: 7383.67). Notes: 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Opus Codec Encoding 1.3.1, WAV To Opus Encode (Seconds, Fewer Is Better). 1: 9.412 (SE +/- 0.019, N = 5); 2: 9.422 (SE +/- 0.014, N = 5); 3: 9.416 (SE +/- 0.011, N = 5). Notes: 1. (CXX) g++ options: -fvisibility=hidden -logg -lm
PHPBench 0.8.1, PHP Benchmark Suite (Score, More Is Better). 1: 574643 (SE +/- 503.51, N = 3); 2: 575213 (SE +/- 1210.22, N = 3); 3: 574799 (SE +/- 1507.55, N = 3).
Ogg Audio Encoding 1.3.4, WAV To Ogg (Seconds, Fewer Is Better). 1: 24.08 (SE +/- 0.02, N = 3); 2: 24.08 (SE +/- 0.03, N = 3); 3: 24.10 (SE +/- 0.07, N = 3). Notes: 1. (CC) gcc options: -O2 -ffast-math -fsigned-char
Timed HMMer Search 3.3.1, Pfam Database Search (Seconds, Fewer Is Better). 1: 114.00 (SE +/- 0.02, N = 3); 2: 114.07 (SE +/- 0.05, N = 3); 3: 114.09 (SE +/- 0.00, N = 3). Notes: 1. (CC) gcc options: -O3 -pthread -lhmmer -leasel -lm
oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). 1: 7400.51 (SE +/- 4.52, N = 3; MIN: 7385.65); 2: 7394.59 (SE +/- 3.54, N = 3; MIN: 7380.26); 3: 7399.59 (SE +/- 7.47, N = 3; MIN: 7382.13). Notes: 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
WavPack Audio Encoding 5.3, WAV To WavPack (Seconds, Fewer Is Better). 1: 16.31 (SE +/- 0.01, N = 5); 2: 16.30 (SE +/- 0.00, N = 5); 3: 16.30 (SE +/- 0.01, N = 5). Notes: 1. (CXX) g++ options: -rdynamic
RNNoise 2020-06-28 (Seconds, Fewer Is Better). 1: 25.67 (SE +/- 0.22, N = 3); 2: 25.66 (SE +/- 0.22, N = 3); 3: 25.66 (SE +/- 0.20, N = 3). Notes: 1. (CC) gcc options: -O2 -pedantic -fvisibility=hidden
ASTC Encoder 2.0, Preset: Thorough (Seconds, Fewer Is Better). 1: 72.63 (SE +/- 0.01, N = 3); 2: 72.60 (SE +/- 0.01, N = 3); 3: 72.62 (SE +/- 0.02, N = 3). Notes: 1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
Basis Universal 1.12, Settings: UASTC Level 2 (Seconds, Fewer Is Better). 1: 81.24 (SE +/- 0.01, N = 3); 2: 81.23 (SE +/- 0.01, N = 3); 3: 81.25 (SE +/- 0.01, N = 3). Notes: 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
ASTC Encoder 2.0, Preset: Exhaustive (Seconds, Fewer Is Better). 1: 588.79 (SE +/- 0.01, N = 3); 2: 588.80 (SE +/- 0.04, N = 3); 3: 588.64 (SE +/- 0.08, N = 3). Notes: 1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
Basis Universal 1.12, Settings: UASTC Level 2 + RDO Post-Processing (Seconds, Fewer Is Better). 1: 1001.73 (SE +/- 0.26, N = 3); 2: 1001.52 (SE +/- 0.24, N = 3); 3: 1001.73 (SE +/- 0.27, N = 3). Notes: 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Basis Universal 1.12, Settings: UASTC Level 3 (Seconds, Fewer Is Better). 1: 159.00 (SE +/- 0.03, N = 3); 2: 159.00 (SE +/- 0.02, N = 3); 3: 158.98 (SE +/- 0.01, N = 3). Notes: 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
ASTC Encoder 2.0, Preset: Medium (Seconds, Fewer Is Better). 1: 10.93 (SE +/- 0.00, N = 3); 2: 10.93 (SE +/- 0.00, N = 3); 3: 10.93 (SE +/- 0.00, N = 3). Notes: 1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
Kvazaar 2.0, Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better). 1: 2.06 (SE +/- 0.00, N = 3); 2: 2.06 (SE +/- 0.00, N = 3); 3: 2.06 (SE +/- 0.00, N = 3). Notes: 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0, Video Input: Bosphorus 4K - Video Preset: Slow (Frames Per Second, More Is Better). 1: 2.01 (SE +/- 0.00, N = 3); 2: 2.01 (SE +/- 0.00, N = 3); 3: 2.01 (SE +/- 0.00, N = 3). Notes: 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
CLOMP 1.2, Static OMP Speedup (Speedup, More Is Better). 1: 3.1 (SE +/- 0.03, N = 12); 2: 3.1 (SE +/- 0.03, N = 3); 3: 3.1 (SE +/- 0.03, N = 15). Notes: 1. (CC) gcc options: -fopenmp -O3 -lm
Redis 6.0.9, Test: GET (Requests Per Second, More Is Better). 1: 2215813.50 (SE +/- 41608.51, N = 3); 2: 1939149.38 (SE +/- 51226.02, N = 15); 3: 2041822.96 (SE +/- 26746.34, N = 3). Notes: 1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9, Test: SADD (Requests Per Second, More Is Better). 1: 1661804.39 (SE +/- 57333.88, N = 15); 2: 1761393.52 (SE +/- 50997.94, N = 15); 3: 1687783.91 (SE +/- 56308.56, N = 15). Notes: 1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9, Test: LPOP (Requests Per Second, More Is Better). 1: 2368415.09 (SE +/- 28970.62, N = 8); 2: 1438792.17 (SE +/- 20132.66, N = 3); 3: 2024710.52 (SE +/- 116386.47, N = 12). Notes: 1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
HPC Challenge 1.5.0, Test / Class: Random Ring Bandwidth (GB/s, More Is Better). 1: 5.46352 (SE +/- 0.18268, N = 3); 2: 5.81487 (SE +/- 0.02203, N = 3); 3: 5.36042 (SE +/- 0.18898, N = 3). Notes: 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. ATLAS + Open MPI 3.1.2
Betsy GPU Compressor 1.1 Beta, Codec: ETC2 RGB - Quality: Highest (Seconds, Fewer Is Better). 1: 16.13 (SE +/- 0.51, N = 13); 2: 15.31 (SE +/- 0.32, N = 15); 3: 15.72 (SE +/- 0.19, N = 15). Notes: 1. (CXX) g++ options: -O3 -O2 -lpthread -ldl
Betsy GPU Compressor 1.1 Beta, Codec: ETC1 - Quality: Highest (Seconds, Fewer Is Better). 1: 15.40 (SE +/- 0.55, N = 15); 2: 14.70 (SE +/- 0.45, N = 14); 3: 15.96 (SE +/- 0.04, N = 3). Notes: 1. (CXX) g++ options: -O3 -O2 -lpthread -ldl
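Most results above agree across runs 1, 2 and 3 to within a fraction of a percent, while the Redis tests vary far more. One rough way to quantify that is sketched below in Python. It is only an illustration: the helper name `spread_percent` and the choice of (max - min) / min as the measure are assumptions made here, not something the Phoronix Test Suite reports. The input values are copied from the Redis GET, Redis LPOP and Kvazaar Bosphorus 4K Slow results above.

```python
def spread_percent(values):
    """Return the run-to-run spread, (max - min) / min, as a percentage.

    Hypothetical helper for illustration; assumes all values are positive.
    """
    lowest, highest = min(values), max(values)
    return (highest - lowest) / lowest * 100.0


# Results for runs 1, 2, 3, copied from the report above.
results = {
    "Redis GET (Requests Per Second)": [2215813.50, 1939149.38, 2041822.96],
    "Redis LPOP (Requests Per Second)": [2368415.09, 1438792.17, 2024710.52],
    "Kvazaar 4K Slow (Frames Per Second)": [2.01, 2.01, 2.01],
}

for name, values in results.items():
    print(f"{name}: {spread_percent(values):.1f}% spread across runs")
```

A spread much larger than the reported standard errors, as with Redis LPOP, suggests the difference between runs may reflect run-to-run noise or changing system state rather than a stable performance difference.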
Phoronix Test Suite v10.8.5