xeon-platinum-8380-2p-smoke-run
2 x Intel Xeon Platinum 8380 testing with an Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS) and ASPEED on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2105012-IB-XEONPLATI04&grr&sro&rro .
System Details (runs r1, r1a, r2, r2a, r2b, r3, r4, r5)
  Processor: 2 x Intel Xeon Platinum 8380 @ 3.40GHz (80 Cores / 160 Threads)
  Motherboard: Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS)
  Chipset: Intel Device 0998
  Memory: 16 x 32 GB DDR4-3200MT/s Hynix HMA84GR7CJR4N-XN
  Disk: 2 x 7682GB INTEL SSDPF2KX076TZ + 2 x 800GB INTEL SSDPF21Q800GB + 3841GB Micron_9300_MTFDHAL3T8TDP + 960GB INTEL SSDSC2KG96
  Graphics: ASPEED
  Monitor: VE228
  Network: 2 x Intel X710 for 10GBASE-T + 2 x Intel E810-C for QSFP
  OS: Ubuntu 20.04
  Kernel: 5.11.0-051100-generic (x86_64)
  Desktop: GNOME Shell 3.36.4
  Display Server: X Server 1.20.8
  Compiler: GCC 9.3.0
  File-System: ext4
  Screen Resolution: 1920x1080 / 1024x768
Kernel Details
  Transparent Huge Pages: madvise
Compiler Details
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details
  r1: Scaling Governor: intel_pstate performance - CPU Microcode: 0xd000270
  r1a: Scaling Governor: intel_pstate performance - CPU Microcode: 0xd000270
  r2: Scaling Governor: intel_pstate performance - CPU Microcode: 0xd000270
  r2a: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xd000270
  r2b: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xd000270
  r3: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xd000270
  r4: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xd000270
  r5: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xd000270
Python Details
  Python 2.7.18 + Python 3.8.5
Security Details
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling
  srbds: Not affected
  tsx_async_abort: Not affected
xeon-platinum-8380-2p-smoke-run hammerdb-mariadb: 128 - 250 hammerdb-mariadb: 128 - 250 hammerdb-mariadb: 128 - 500 hammerdb-mariadb: 128 - 500 hammerdb-mariadb: 64 - 250 hammerdb-mariadb: 64 - 250 hammerdb-mariadb: 32 - 250 hammerdb-mariadb: 32 - 250 hammerdb-mariadb: 16 - 500 hammerdb-mariadb: 16 - 500 hammerdb-mariadb: 32 - 500 hammerdb-mariadb: 32 - 500 mysqlslap: 256 mysqlslap: 512 mysqlslap: 128 hammerdb-mariadb: 64 - 500 hammerdb-mariadb: 64 - 500 incompact3d: X3D-benchmarking input.i3d gnuradio: Hilbert Transform gnuradio: FM Deemphasis Filter gnuradio: IIR Filter gnuradio: FIR Filter gnuradio: Signal Source (Cosine) gnuradio: Five Back to Back FIR Filters mysqlslap: 64 aom-av1: Speed 4 Two-Pass - Bosphorus 4K cp2k: Fayalite-FIST hammerdb-mariadb: 16 - 250 hammerdb-mariadb: 16 - 250 luaradio: Complex Phase luaradio: Hilbert Transform luaradio: FM Deemphasis Filter luaradio: Five Back to Back FIR Filters hammerdb-mariadb: 8 - 250 hammerdb-mariadb: 8 - 250 hammerdb-mariadb: 8 - 500 hammerdb-mariadb: 8 - 500 aom-av1: Speed 6 Two-Pass - Bosphorus 4K mnn: inception-v3 mnn: mobilenet-v1-1.0 mnn: MobileNetV2_224 mnn: resnet-v2-50 mnn: SqueezeNetV1.0 mysqlslap: 1 securemark: SecureMark-TLS aom-av1: Speed 0 Two-Pass - Bosphorus 4K mysqlslap: 32 build-llvm: Unix Makefiles aom-av1: Speed 4 Two-Pass - Bosphorus 1080p luxcorerender: Orange Juice - CPU mysqlslap: 16 build-erlang: Time To Compile luxcorerender: DLSC - CPU intel-mlc: Max Bandwidth - Stream-Triad Like intel-mlc: Max Bandwidth - 1:1 Reads-Writes intel-mlc: Max Bandwidth - 2:1 Reads-Writes intel-mlc: Max Bandwidth - 3:1 Reads-Writes intel-mlc: Max Bandwidth - All Reads mysqlslap: 8 build-llvm: Ninja aom-av1: Speed 6 Realtime - Bosphorus 4K onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU mysqlslap: 4 aom-av1: Speed 8 Realtime - Bosphorus 4K gmpbench: Total Time blender: Barbershop - CPU-Only build-nodejs: Time To Compile viennacl: CPU BLAS - dGEMM-TT viennacl: CPU BLAS - dGEMM-TN viennacl: CPU 
BLAS - dGEMM-NT viennacl: CPU BLAS - dGEMM-NN viennacl: CPU BLAS - dGEMV-T viennacl: CPU BLAS - dGEMV-N viennacl: CPU BLAS - dDOT viennacl: CPU BLAS - dAXPY viennacl: CPU BLAS - dCOPY viennacl: CPU BLAS - sDOT viennacl: CPU BLAS - sAXPY viennacl: CPU BLAS - sCOPY xmrig: Monero - 1M aom-av1: Speed 6 Two-Pass - Bosphorus 1080p build-linux-kernel: Time To Compile sysbench: CPU blender: Pabellon Barcelona - CPU-Only build-wasmer: Time To Compile onednn: Recurrent Neural Network Training - f32 - CPU onednn: Recurrent Neural Network Training - u8s8f32 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Inference - u8s8f32 - CPU aom-av1: Speed 9 Realtime - Bosphorus 4K blender: Classroom - CPU-Only toktx: UASTC 4 + Zstd Compression 19 onednn: Deconvolution Batch shapes_1d - f32 - CPU luxcorerender: Danish Mood - CPU luxcorerender: LuxCore Benchmark - CPU avifenc: 0 aom-av1: Speed 0 Two-Pass - Bosphorus 1080p luxcorerender: Rainbow Colors and Prism - CPU aom-av1: Speed 6 Realtime - Bosphorus 1080p blender: Fishy Cat - CPU-Only onednn: IP Shapes 1D - bf16bf16bf16 - CPU vosk: stockfish: Total Time avifenc: 6, Lossless srslte: PHY_DL_Test srslte: PHY_DL_Test avifenc: 6 srslte: OFDM_Test sysbench: RAM / Memory avifenc: 2 botan: AES-256 - Decrypt botan: AES-256 basis: ETC1S basis: UASTC Level 0 avifenc: 10, Lossless aom-av1: Speed 9 Realtime - Bosphorus 1080p blender: BMW27 - CPU-Only botan: ChaCha20Poly1305 - Decrypt botan: ChaCha20Poly1305 botan: Blowfish - Decrypt botan: Blowfish botan: Twofish - Decrypt botan: Twofish botan: CAST-256 - Decrypt botan: CAST-256 botan: KASUMI - Decrypt botan: KASUMI toybrot: TBB xmrig: Wownero - 1M intel-mlc: Peak Injection Bandwidth - Stream-Triad Like intel-mlc: Peak Injection Bandwidth - 1:1 Reads-Writes intel-mlc: Peak Injection Bandwidth - 2:1 Reads-Writes intel-mlc: Peak Injection Bandwidth - 3:1 
Reads-Writes intel-mlc: Peak Injection Bandwidth - All Reads onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU helsing: 14 digit tjbench: Decompression Throughput astcenc: Thorough astcenc: Exhaustive avifenc: 10 astcenc: Medium build-mesa: Time To Compile onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU onednn: IP Shapes 3D - u8s8f32 - CPU svt-hevc: 1 - Bosphorus 1080p aom-av1: Speed 8 Realtime - Bosphorus 1080p liquid-dsp: 160 - 256 - 57 liquid-dsp: 128 - 256 - 57 liquid-dsp: 64 - 256 - 57 liquid-dsp: 32 - 256 - 57 liquid-dsp: 16 - 256 - 57 liquid-dsp: 8 - 256 - 57 liquid-dsp: 4 - 256 - 57 liquid-dsp: 2 - 256 - 57 liquid-dsp: 1 - 256 - 57 toktx: Zstd Compression 19 basis: UASTC Level 3 onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU svt-vp9: VMAF Optimized - Bosphorus 1080p toktx: UASTC 3 onednn: IP Shapes 3D - f32 - CPU intel-mlc: Idle Latency onednn: IP Shapes 1D - f32 - CPU basis: UASTC Level 2 incompact3d: input.i3d 193 Cells Per Direction toktx: UASTC 3 + Zstd Compression 19 onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU incompact3d: input.i3d 129 Cells Per Direction toktx: Zstd Compression 9 onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU onednn: Deconvolution Batch shapes_3d - f32 - CPU toybrot: C++ Tasks draco: Church Facade onednn: IP Shapes 3D - bf16bf16bf16 - CPU draco: Lion toybrot: OpenMP toybrot: C++ Threads svt-vp9: Visual Quality Optimized - Bosphorus 1080p svt-vp9: PSNR/SSIM Optimized - Bosphorus 1080p onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU svt-hevc: 10 - Bosphorus 1080p svt-hevc: 7 - Bosphorus 1080p r1 r1a r2 r2a r2b r3 r4 r5 167809 55415 173288 57190 191397 63279 209254 69054 195258 64477 208419 68818 194684 64298 313.920451 459.3 734.0 610.6 
603.0 2183.5 1024.3 192913 63757 546.8 80.3 410.0 1094.8 290082 95768 285984 94379 7.37 225412 216.323 14.36 114.550 9.70 325766.94 439496.74 459455.38 426148.96 357285.28 145.717 15.09 804.392 29.20 4642.1 101.101 76.3 76.0 75.6 73.5 719 72.3 720 1058 843 620 1003 1834 19299.5 24.382 62.160 801.409 792.831 1.21594 447.971 445.144 445.519 33.07 7.49467 7.42 7.84 57.975 17.04 2.96135 35.918 181644819 32.113 76.9 183.4 13.247 120300000 31.539 5663.055 5669.700 8.852 619.458 623.494 363.255 363.038 292.736 289.126 116.074 115.972 74.320 77.287 6850 48051.5 324377.2 442422.3 459038.6 425933.7 356476.2 0.338327 0.215115 77.872 161.634619 5.477 20.952 3.53026 0.398282 36.91 3144800000 3415933333 3267133333 1735100000 885320000 441953333 217643333 110713333 57792000 0.239989 386.29 1.24809 35.1 0.918568 11.3586022 0.210919 0.593042 2.74370996 3.57247 0.864164 7879 1.80046 7318 7018 327.87 401.29 1.10991 0.877815 2.07944 499.23 290.67 173228 57242 188761 62311 311.960785 459.1 727.4 609.5 604.8 2175.3 1015.2 4.17 548.2 80.3 409.6 1094.5 7.55 225366 0.19 215.760 6.89 14.26 113.800 9.61 325184.58 441408.09 456629.89 424612.62 358364.56 145.550 15.19 793.363 28.99 4642.8 100.446 77.2 77.4 76.8 72.3 319 63.6 371 392 335 277 370 504 19452.0 21.25 24.360 61.930 804.323 791.927 1.22278 447.308 446.936 447.436 32.51 7.50059 7.55 8.04 57.710 0.51 13.34 28.66 2.96857 35.009 186263552 31.624 77.3 184.2 13.328 120133333 31.479 5663.612 5670.809 8.812 125.25 619.538 623.198 363.326 363.615 292.374 288.852 116.069 115.970 74.288 77.310 6964 50166.1 323924.2 442843.2 456260.3 424096.6 358385.5 0.341663 0.213643 78.159 156.969016 5.505 20.379 3.54367 0.395588 37.34 103.92 3162066667 3352733333 3263700000 1736800000 890273333 0.240122 393.46 1.25267 33.0 0.912279 11.2727114 0.210728 0.595661 2.73859096 3.57662 0.863214 7724 1.79881 7308 6980 329.53 408.24 1.12224 0.879137 2.08532 493.51 288.99 67.5 1374.663 325260.41 442460.05 456545.88 424818.83 358456.09 323826.9 442144.2 456408.6 
424077.3 358269.7 32.5 160 166 192 307.622108 357.4 645.8 498.2 470.0 1684.4 111.2 403 2.01 458.7 78.2 370.1 804.5 3.22 53.073 3.213 4.078 48.732 7.174 3336 225343 0.14 885 226.440 3.30 14.28 1264 191.746 9.27 325409.99 441732.77 459226.53 425997.22 357774.43 1413 148.484 5.97 791.695 1614 12.03 4524.5 110.02 110.930 54.7 62.3 59.8 61.9 389.9 62.3 447.65 507.1 422.2 349 474 691 19311.1 7.45 27.997 214210.83 88.57 71.928 808.289 789.836 1.23796 446.389 447.287 447.701 14.30 71.78 56.660 28.4023 5.73 5.84 64.971 0.32 13.42 10.39 46.38 3.00464 36.424 181554218 38.395 75.0 181.6 16.065 120733333 12510.56 38.372 5662.763 5606.967 34.237 11.251 10.282 43.26 29.56 612.438 615.806 363.196 362.926 292.396 288.562 116.080 114.663 74.275 76.286 6984 49908.3 324209.8 440454.7 459309.8 425925.6 357742.9 0.341893 0.216806 78.33 160.262559 9.2907 16.3621 6.656 7.1887 21.575 3.53121 0.403409 27.80 36.20 3131866667 3400066667 3227433333 1699333333 862890000 428100000 213203333 110173333 56230333 19.781 17.163 0.243026 182.26 5.664 1.25313 0.943624 13.979 11.5617158 10.011 0.210324 0.602122 3.02281992 3.470 3.64232 0.874080 8050 7001 1.81774 6126 7412 7149 164.32 182.17 1.11874 0.869978 2.11712 234.51 158.16 189 386.390001 408.0 621.0 487.4 502.0 1723.9 580.5 404 2.05 458.2 78.2 370.3 662.8 3.20 3458 225291 0.15 887 226.199 3.36 13.89 1262 192.245 9.24 325218.50 440939.22 457141.24 424925.84 358268.00 1420 147.163 5.97 793.916 1580 11.94 4504.5 111.790 61.7 66.9 68.9 66.4 647 64.3 713.47 1024.2 913 532 862 1135 20652.9 7.38 28.018 71.130 796.689 793.080 1.24508 450.648 447.144 446.917 14.06 28.1815 5.65 5.92 65.960 0.33 16.47 10.39 3.00929 35.581 189214499 38.590 76.1 181.6 16.615 120833333 38.313 5662.342 5593.366 10.088 43.42 612.149 616.501 363.314 359.452 292.827 286.180 115.723 114.517 74.309 76.407 7003 49813.4 324227.4 449554.1 457190.5 424904.5 358463.7 0.341955 0.216586 78.079 159.187038 6.597 21.369 3.56224 0.406877 28.22 36.06 3143300000 3411000000 3232700000 1704500000 
865410000 432170000 215343333 111510000 57197667 0.243308 185.53 1.24176 67.6 0.936941 14.5982965 0.218349 0.602314 3.56592774 3.64033 0.874968 8048 1.84339 7439 7203 164.51 181.52 1.14578 0.901823 2.10841 234.39 157.83 389.698280 373.8 622.0 487.7 515.6 1619.2 487.9 2.10 452.7 78.4 368.0 706.1 3.23 52.227 3.362 4.100 48.041 7.170 222747 0.14 224.290 3.36 13.94 193.839 9.25 325314.62 440315.41 458790.96 425848.09 357925.98 146.909 6.00 811.941 12.10 4525.7 109.96 111.673 63.7 67.6 72.4 70.8 647 70.2 765 1158 936 535 855 1167 20574.6 7.43 28.094 214241.34 88.68 70.758 792.296 792.049 1.24116 446.536 448.906 447.958 14.73 72.29 56.770 28.4613 5.68 5.87 65.888 0.33 14.79 10.54 46.73 3.00907 35.503 186013261 38.507 78.3 183.7 16.211 120666667 12553.44 37.796 5650.139 5611.995 34.420 11.226 10.208 42.37 29.69 615.975 619.638 363.279 359.573 292.610 286.004 116.070 114.646 74.292 76.403 7016 49937.3 324112.8 446396.0 458941.9 425822.1 358110.5 0.340243 0.215085 78.539 159.237752 9.3091 16.3729 6.746 7.1472 21.313 3.54783 0.402919 28.01 36.35 3140266667 3398800000 3245666667 1697500000 860046667 432013333 216773333 109430000 55251667 20.082 17.185 0.242450 184.07 5.562 1.24222 67.8 0.940714 14.159 14.6577489 10.029 0.217941 0.602038 3.57278153 3.697 3.64319 0.876227 8037 7082 1.81913 6170 7429 7141 162.21 179.13 1.11811 0.875421 2.10837 233.96 156.26 325312.30 440205.22 458756.46 425467.51 357550.82 324234.5 448800.1 458830.6 425508.1 357722.7 68.1 OpenBenchmarking.org
HammerDB - MariaDB 10.5.9 - Virtual Users: 128 - Warehouses: 250 (Transactions Per Minute, More Is Better)
  r1: 167809 (SE +/- 2616.54, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 128 - Warehouses: 250 (New Orders Per Minute, More Is Better)
  r1: 55415 (SE +/- 857.30, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 128 - Warehouses: 500 (Transactions Per Minute, More Is Better)
  r1a: 173228 (SE +/- 1389.03, N = 9)
  r1: 173288 (SE +/- 2691.06, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 128 - Warehouses: 500 (New Orders Per Minute, More Is Better)
  r1a: 57242 (SE +/- 484.29, N = 9)
  r1: 57190 (SE +/- 891.59, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 64 - Warehouses: 250 (Transactions Per Minute, More Is Better)
  r1: 191397 (SE +/- 2831.11, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 64 - Warehouses: 250 (New Orders Per Minute, More Is Better)
  r1: 63279 (SE +/- 937.55, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 32 - Warehouses: 250 (Transactions Per Minute, More Is Better)
  r1: 209254 (SE +/- 3390.81, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 32 - Warehouses: 250 (New Orders Per Minute, More Is Better)
  r1: 69054 (SE +/- 1078.76, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 16 - Warehouses: 500 (Transactions Per Minute, More Is Better)
  r1: 195258 (SE +/- 3159.46, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 16 - Warehouses: 500 (New Orders Per Minute, More Is Better)
  r1: 64477 (SE +/- 1031.07, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 32 - Warehouses: 500 (Transactions Per Minute, More Is Better)
  r1: 208419 (SE +/- 2885.40, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 32 - Warehouses: 500 (New Orders Per Minute, More Is Better)
  r1: 68818 (SE +/- 921.11, N = 9)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
MariaDB 10.5.2 - Clients: 256 (Queries Per Second, More Is Better)
  r2b: 160 (SE +/- 0.22, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
MariaDB 10.5.2 - Clients: 512 (Queries Per Second, More Is Better)
  r2b: 166 (SE +/- 0.87, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
MariaDB 10.5.2 - Clients: 128 (Queries Per Second, More Is Better)
  r3: 189 (SE +/- 0.35, N = 3)
  r2b: 192 (SE +/- 0.65, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 64 - Warehouses: 500 (Transactions Per Minute, More Is Better)
  r1a: 188761 (SE +/- 2084.32, N = 9)
  r1: 194684 (SE +/- 2149.33, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 64 - Warehouses: 500 (New Orders Per Minute, More Is Better)
  r1a: 62311 (SE +/- 730.55, N = 9)
  r1: 64298 (SE +/- 620.04, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
Xcompact3d Incompact3d 2021-03-11 - Input: X3D-benchmarking input.i3d (Seconds, Fewer Is Better)
  r4: 389.70 (SE +/- 3.91, N = 9)
  r3: 386.39 (SE +/- 4.39, N = 9)
  r2b: 307.62 (SE +/- 2.73, N = 9)
  r1a: 311.96 (SE +/- 0.12, N = 3)
  r1: 313.92 (SE +/- 0.46, N = 3)
  1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
GNU Radio - Test: Hilbert Transform (MiB/s, More Is Better)
  r4: 373.8 (SE +/- 24.71, N = 9)
  r3: 408.0 (SE +/- 17.46, N = 9)
  r2b: 357.4 (SE +/- 47.90, N = 3)
  r1a: 459.1 (SE +/- 1.66, N = 3)
  r1: 459.3 (SE +/- 2.02, N = 3)
  1. 3.8.1.0
GNU Radio - Test: FM Deemphasis Filter (MiB/s, More Is Better)
  r4: 622.0 (SE +/- 32.02, N = 9)
  r3: 621.0 (SE +/- 31.57, N = 9)
  r2b: 645.8 (SE +/- 53.33, N = 3)
  r1a: 727.4 (SE +/- 1.04, N = 3)
  r1: 734.0 (SE +/- 1.94, N = 3)
  1. 3.8.1.0
GNU Radio - Test: IIR Filter (MiB/s, More Is Better)
  r4: 487.7 (SE +/- 25.67, N = 9)
  r3: 487.4 (SE +/- 26.49, N = 9)
  r2b: 498.2 (SE +/- 45.07, N = 3)
  r1a: 609.5 (SE +/- 0.46, N = 3)
  r1: 610.6 (SE +/- 0.38, N = 3)
  1. 3.8.1.0
GNU Radio - Test: FIR Filter (MiB/s, More Is Better)
  r4: 515.6 (SE +/- 11.25, N = 9)
  r3: 502.0 (SE +/- 16.19, N = 9)
  r2b: 470.0 (SE +/- 44.41, N = 3)
  r1a: 604.8 (SE +/- 0.20, N = 3)
  r1: 603.0 (SE +/- 1.45, N = 3)
  1. 3.8.1.0
GNU Radio - Test: Signal Source (Cosine) (MiB/s, More Is Better)
  r4: 1619.2 (SE +/- 82.03, N = 9)
  r3: 1723.9 (SE +/- 72.44, N = 9)
  r2b: 1684.4 (SE +/- 168.17, N = 3)
  r1a: 2175.3 (SE +/- 2.24, N = 3)
  r1: 2183.5 (SE +/- 0.93, N = 3)
  1. 3.8.1.0
GNU Radio - Test: Five Back to Back FIR Filters (MiB/s, More Is Better)
  r4: 487.9 (SE +/- 48.36, N = 9)
  r3: 580.5 (SE +/- 39.63, N = 9)
  r2b: 111.2 (SE +/- 1.12, N = 3)
  r1a: 1015.2 (SE +/- 2.30, N = 3)
  r1: 1024.3 (SE +/- 2.54, N = 3)
  1. 3.8.1.0
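The SE figures in the GNU Radio results are easier to compare across runs when expressed relative to each mean. The sketch below does that for the Five Back to Back FIR Filters result, assuming "SE" denotes the standard error of the mean. The function name relative_se is illustrative and is not part of the Phoronix Test Suite.

```python
# Reported means (MiB/s) and standard errors for GNU Radio
# "Five Back to Back FIR Filters", copied from the result above.
results = {
    "r4": (487.9, 48.36),
    "r3": (580.5, 39.63),
    "r2b": (111.2, 1.12),
    "r1a": (1015.2, 2.30),
    "r1": (1024.3, 2.54),
}


def relative_se(mean, se):
    """Return the standard error as a percentage of the mean."""
    return 100.0 * se / mean


for run, (mean, se) in results.items():
    print(f"{run}: SE is {relative_se(mean, se):.2f}% of the mean")
```

A large relative SE (several percent) suggests the run-to-run noise is big enough that small differences between runs should be read with care.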
MariaDB 10.5.2 - Clients: 64 (Queries Per Second, More Is Better)
  r3: 404 (SE +/- 0.16, N = 3)
  r2b: 403 (SE +/- 0.62, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
AOM AV1 3.0 - Encoder Mode: Speed 4 Two-Pass - Input: Bosphorus 4K (Frames Per Second, More Is Better)
  r4: 2.10 (SE +/- 0.01, N = 3)
  r3: 2.05 (SE +/- 0.02, N = 9)
  r2b: 2.01 (SE +/- 0.03, N = 3)
  r1a: 4.17 (SE +/- 0.03, N = 3)
  1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
CP2K Molecular Dynamics 8.1 - Input: Fayalite-FIST (Seconds, Fewer Is Better)
  r2a: 1374.66
HammerDB - MariaDB 10.5.9 - Virtual Users: 16 - Warehouses: 250 (Transactions Per Minute, More Is Better)
  r1: 192913 (SE +/- 2649.02, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 16 - Warehouses: 250 (New Orders Per Minute, More Is Better)
  r1: 63757 (SE +/- 880.35, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
LuaRadio 0.9.1 - Test: Complex Phase (MiB/s, More Is Better)
  r4: 452.7 (SE +/- 4.50, N = 6)
  r3: 458.2 (SE +/- 4.31, N = 6)
  r2b: 458.7 (SE +/- 3.61, N = 9)
  r1a: 548.2 (SE +/- 0.71, N = 3)
  r1: 546.8 (SE +/- 0.25, N = 3)
LuaRadio 0.9.1 - Test: Hilbert Transform (MiB/s, More Is Better)
  r4: 78.4 (SE +/- 0.61, N = 6)
  r3: 78.2 (SE +/- 0.47, N = 6)
  r2b: 78.2 (SE +/- 0.41, N = 9)
  r1a: 80.3 (SE +/- 0.00, N = 3)
  r1: 80.3 (SE +/- 0.00, N = 3)
LuaRadio 0.9.1 - Test: FM Deemphasis Filter (MiB/s, More Is Better)
  r4: 368.0 (SE +/- 1.19, N = 6)
  r3: 370.3 (SE +/- 4.83, N = 6)
  r2b: 370.1 (SE +/- 5.30, N = 9)
  r1a: 409.6 (SE +/- 1.40, N = 3)
  r1: 410.0 (SE +/- 0.21, N = 3)
LuaRadio 0.9.1 - Test: Five Back to Back FIR Filters (MiB/s, More Is Better)
  r4: 706.1 (SE +/- 73.21, N = 6)
  r3: 662.8 (SE +/- 74.31, N = 6)
  r2b: 804.5 (SE +/- 22.87, N = 9)
  r1a: 1094.5 (SE +/- 0.62, N = 3)
  r1: 1094.8 (SE +/- 2.24, N = 3)
HammerDB - MariaDB 10.5.9 - Virtual Users: 8 - Warehouses: 250 (Transactions Per Minute, More Is Better)
  r1: 290082 (SE +/- 2006.72, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 8 - Warehouses: 250 (New Orders Per Minute, More Is Better)
  r1: 95768 (SE +/- 675.05, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 8 - Warehouses: 500 (Transactions Per Minute, More Is Better)
  r1: 285984 (SE +/- 2338.98, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
HammerDB - MariaDB 10.5.9 - Virtual Users: 8 - Warehouses: 500 (New Orders Per Minute, More Is Better)
  r1: 94379 (SE +/- 693.36, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lbz2 -lsnappy -ldl -lz -lrt
AOM AV1 3.0 - Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 4K (Frames Per Second, More Is Better)
  r4: 3.23 (SE +/- 0.03, N = 5)
  r3: 3.20 (SE +/- 0.04, N = 3)
  r2b: 3.22 (SE +/- 0.03, N = 9)
  r1a: 7.55 (SE +/- 0.06, N = 3)
  r1: 7.37 (SE +/- 0.09, N = 15)
  1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Mobile Neural Network 1.1.3 - Model: inception-v3 (ms, Fewer Is Better)
  r4: 52.23 (SE +/- 0.75, N = 12; MIN: 47.47 / MAX: 94.69)
  r2b: 53.07 (SE +/- 1.54, N = 3; MIN: 49.59 / MAX: 69.62)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.3 - Model: mobilenet-v1-1.0 (ms, Fewer Is Better)
  r4: 3.362 (SE +/- 0.021, N = 12; MIN: 2.98 / MAX: 6.66)
  r2b: 3.213 (SE +/- 0.089, N = 3; MIN: 2.8 / MAX: 6.7)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.3 - Model: MobileNetV2_224 (ms, Fewer Is Better)
  r4: 4.100 (SE +/- 0.135, N = 12; MIN: 2.97 / MAX: 12.98)
  r2b: 4.078 (SE +/- 0.333, N = 3; MIN: 2.9 / MAX: 13.17)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.3 - Model: resnet-v2-50 (ms, Fewer Is Better)
  r4: 48.04 (SE +/- 1.07, N = 12; MIN: 42.13 / MAX: 145.2)
  r2b: 48.73 (SE +/- 2.59, N = 3; MIN: 43.19 / MAX: 69.59)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.3 - Model: SqueezeNetV1.0 (ms, Fewer Is Better)
  r4: 7.170 (SE +/- 0.078, N = 12; MIN: 6.38 / MAX: 9.97)
  r2b: 7.174 (SE +/- 0.002, N = 3; MIN: 6.95 / MAX: 7.88)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
MariaDB 10.5.2 - Clients: 1 (Queries Per Second, More Is Better)
  r3: 3458 (SE +/- 61.33, N = 12)
  r2b: 3336 (SE +/- 73.97, N = 15)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
SecureMark 1.0.4 - Benchmark: SecureMark-TLS (marks, More Is Better)
  r4: 222747 (SE +/- 2769.20, N = 3)
  r3: 225291 (SE +/- 267.95, N = 3)
  r2b: 225343 (SE +/- 84.15, N = 3)
  r1a: 225366 (SE +/- 236.12, N = 3)
  r1: 225412 (SE +/- 234.37, N = 3)
  1. (CC) gcc options: -pedantic -O3
AOM AV1 3.0 - Encoder Mode: Speed 0 Two-Pass - Input: Bosphorus 4K (Frames Per Second, More Is Better)
  r4: 0.14 (SE +/- 0.00, N = 3)
  r3: 0.15 (SE +/- 0.00, N = 3)
  r2b: 0.14 (SE +/- 0.00, N = 12)
  r1a: 0.19 (SE +/- 0.00, N = 5)
  1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
MariaDB 10.5.2 - Clients: 32 (Queries Per Second, More Is Better)
  r3: 887 (SE +/- 1.83, N = 3)
  r2b: 885 (SE +/- 0.26, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
Timed LLVM Compilation 12.0 - Build System: Unix Makefiles (Seconds, Fewer Is Better)
  r4: 224.29 (SE +/- 0.43, N = 3)
  r3: 226.20 (SE +/- 1.24, N = 3)
  r2b: 226.44 (SE +/- 0.77, N = 3)
  r1a: 215.76 (SE +/- 0.80, N = 3)
  r1: 216.32 (SE +/- 0.91, N = 3)
AOM AV1 3.0 - Encoder Mode: Speed 4 Two-Pass - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
  r4: 3.36 (SE +/- 0.01, N = 3)
  r3: 3.36 (SE +/- 0.04, N = 5)
  r2b: 3.30 (SE +/- 0.03, N = 3)
  r1a: 6.89 (SE +/- 0.02, N = 3)
  1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
LuxCoreRender 2.5 - Scene: Orange Juice - Acceleration: CPU (M samples/sec, More Is Better)
  r4: 13.94 (SE +/- 0.13, N = 15; MIN: 11.06 / MAX: 17.84)
  r3: 13.89 (SE +/- 0.12, N = 15; MIN: 11.08 / MAX: 17.77)
  r2b: 14.28 (SE +/- 0.18, N = 3; MIN: 11.93 / MAX: 17.73)
  r1a: 14.26 (SE +/- 0.21, N = 3; MIN: 11.6 / MAX: 19.3)
  r1: 14.36 (SE +/- 0.13, N = 3; MIN: 11.58 / MAX: 19.44)
MariaDB 10.5.2 - Clients: 16 (Queries Per Second, More Is Better)
  r3: 1262 (SE +/- 3.49, N = 3)
  r2b: 1264 (SE +/- 1.85, N = 3)
  1. (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
Timed Erlang/OTP Compilation 23.2 - Time To Compile (Seconds, Fewer Is Better)
  r4: 193.84 (SE +/- 1.56, N = 3)
  r3: 192.25 (SE +/- 0.31, N = 3)
  r2b: 191.75 (SE +/- 1.08, N = 3)
  r1a: 113.80 (SE +/- 0.37, N = 3)
  r1: 114.55 (SE +/- 0.18, N = 3)
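According to the Processor Details, r1 and r1a ran with the intel_pstate performance governor, while r2b, r3, and r4 ran with powersave. The sketch below shows one way to quantify the gap between the two groups for the Timed Erlang/OTP Compilation result, using only the times reported for that test. Averaging each group and the helper name group_mean are choices made for this illustration, and the ratio by itself does not establish the governor as the only cause.

```python
from statistics import mean

# Timed Erlang/OTP Compilation results in seconds (fewer is better),
# copied from the result above.
times = {"r4": 193.84, "r3": 192.25, "r2b": 191.75, "r1a": 113.80, "r1": 114.55}

# Scaling governor per run, as listed under Processor Details.
governor = {
    "r1": "performance",
    "r1a": "performance",
    "r2b": "powersave",
    "r3": "powersave",
    "r4": "powersave",
}


def group_mean(name):
    """Average the compile time of all runs that used the given governor."""
    return mean(t for run, t in times.items() if governor[run] == name)


slowdown = group_mean("powersave") / group_mean("performance")
print(f"powersave / performance compile time: {slowdown:.2f}x")
```

The same grouping can be applied to any other result that lists runs from both governor settings.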
LuxCoreRender 2.5 - Scene: DLSC - Acceleration: CPU (M samples/sec, More Is Better)
  r4: 9.25 (SE +/- 0.09, N = 3; MIN: 8.59 / MAX: 11.4)
  r3: 9.24 (SE +/- 0.10, N = 3; MIN: 8.74 / MAX: 11.37)
  r2b: 9.27 (SE +/- 0.08, N = 15; MIN: 8.31 / MAX: 11.98)
  r1a: 9.61 (SE +/- 0.09, N = 15; MIN: 8 / MAX: 12.27)
  r1: 9.70 (SE +/- 0.09, N = 3; MIN: 8.98 / MAX: 12.22)
Intel Memory Latency Checker - Test: Max Bandwidth - Stream-Triad Like (MB/s, More Is Better): r5: 325312.30 (SE +/- 22.58, N = 3); r4: 325314.62 (SE +/- 7.71, N = 3); r3: 325218.50 (SE +/- 50.80, N = 3); r2b: 325409.99 (SE +/- 50.20, N = 3); r2a: 325260.41 (SE +/- 53.08, N = 3); r1a: 325184.58 (SE +/- 11.61, N = 3); r1: 325766.94 (SE +/- 25.05, N = 3).
Intel Memory Latency Checker - Test: Max Bandwidth - 1:1 Reads-Writes (MB/s, More Is Better): r5: 440205.22 (SE +/- 1051.98, N = 3); r4: 440315.41 (SE +/- 2322.32, N = 3); r3: 440939.22 (SE +/- 276.68, N = 3); r2b: 441732.77 (SE +/- 3117.58, N = 3); r2a: 442460.05 (SE +/- 1844.14, N = 3); r1a: 441408.09 (SE +/- 1093.30, N = 3); r1: 439496.74 (SE +/- 821.19, N = 3).
Intel Memory Latency Checker - Test: Max Bandwidth - 2:1 Reads-Writes (MB/s, More Is Better): r5: 458756.46 (SE +/- 53.22, N = 3); r4: 458790.96 (SE +/- 8.60, N = 3); r3: 457141.24 (SE +/- 89.89, N = 3); r2b: 459226.53 (SE +/- 51.02, N = 3); r2a: 456545.88 (SE +/- 54.98, N = 3); r1a: 456629.89 (SE +/- 129.26, N = 3); r1: 459455.38 (SE +/- 33.49, N = 3).
Intel Memory Latency Checker - Test: Max Bandwidth - 3:1 Reads-Writes (MB/s, More Is Better): r5: 425467.51 (SE +/- 133.64, N = 3); r4: 425848.09 (SE +/- 67.02, N = 3); r3: 424925.84 (SE +/- 109.66, N = 3); r2b: 425997.22 (SE +/- 71.38, N = 3); r2a: 424818.83 (SE +/- 392.90, N = 3); r1a: 424612.62 (SE +/- 465.24, N = 3); r1: 426148.96 (SE +/- 105.41, N = 3).
Intel Memory Latency Checker - Test: Max Bandwidth - All Reads (MB/s, More Is Better): r5: 357550.82 (SE +/- 46.23, N = 3); r4: 357925.98 (SE +/- 83.70, N = 3); r3: 358268.00 (SE +/- 59.61, N = 3); r2b: 357774.43 (SE +/- 83.63, N = 3); r2a: 358456.09 (SE +/- 107.35, N = 3); r1a: 358364.56 (SE +/- 142.76, N = 3); r1: 357285.28 (SE +/- 67.01, N = 3).
MariaDB 10.5.2 - Clients: 8 (Queries Per Second, More Is Better): r3: 1420 (SE +/- 3.56, N = 3); r2b: 1413 (SE +/- 10.97, N = 3). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
Timed LLVM Compilation 12.0 - Build System: Ninja (Seconds, Fewer Is Better): r4: 146.91 (SE +/- 0.56, N = 3); r3: 147.16 (SE +/- 0.32, N = 3); r2b: 148.48 (SE +/- 1.12, N = 3); r1a: 145.55 (SE +/- 0.75, N = 3); r1: 145.72 (SE +/- 0.52, N = 3).
AOM AV1 3.0 - Encoder Mode: Speed 6 Realtime - Input: Bosphorus 4K (Frames Per Second, More Is Better): r4: 6.00 (SE +/- 0.01, N = 3); r3: 5.97 (SE +/- 0.07, N = 12); r2b: 5.97 (SE +/- 0.06, N = 3); r1a: 15.19 (SE +/- 0.03, N = 3); r1: 15.09 (SE +/- 0.05, N = 3). (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
oneDNN 2.1.2 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): r4: 811.94 (SE +/- 16.86, N = 14, MIN: 761.61); r3: 793.92 (SE +/- 0.83, N = 3, MIN: 769); r2b: 791.70 (SE +/- 0.61, N = 3, MIN: 769.61); r1a: 793.36 (SE +/- 1.56, N = 3, MIN: 765.14); r1: 804.39 (SE +/- 7.01, N = 3, MIN: 763.49). (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
MariaDB 10.5.2 - Clients: 4 (Queries Per Second, More Is Better): r3: 1580 (SE +/- 7.20, N = 3); r2b: 1614 (SE +/- 16.07, N = 3). (CXX) g++ options: -fPIC -pie -fstack-protector -O2 -shared -lpthread -lsnappy -ldl -lz -lrt
AOM AV1 3.0 - Encoder Mode: Speed 8 Realtime - Input: Bosphorus 4K (Frames Per Second, More Is Better): r4: 12.10 (SE +/- 0.17, N = 3); r3: 11.94 (SE +/- 0.12, N = 15); r2b: 12.03 (SE +/- 0.08, N = 15); r1a: 28.99 (SE +/- 0.29, N = 5); r1: 29.20 (SE +/- 0.19, N = 3). (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
GNU GMP GMPbench 6.2.1 - Total Time (GMPbench Score, More Is Better): r4: 4525.7; r3: 4504.5; r2b: 4524.5; r1a: 4642.8; r1: 4642.1. (CC) gcc options: -O3 -fomit-frame-pointer -lm
Blender 2.92 - Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better): r4: 109.96 (SE +/- 0.59, N = 3); r2b: 110.02 (SE +/- 0.18, N = 3).
Timed Node.js Compilation 15.11 - Time To Compile (Seconds, Fewer Is Better): r4: 111.67 (SE +/- 0.78, N = 3); r3: 111.79 (SE +/- 0.68, N = 3); r2b: 110.93 (SE +/- 0.50, N = 3); r1a: 100.45 (SE +/- 0.29, N = 3); r1: 101.10 (SE +/- 0.27, N = 3).
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-TT (GFLOPs/s, More Is Better): r4: 63.7 (SE +/- 2.94, N = 15); r3: 61.7 (SE +/- 2.33, N = 15); r2b: 54.7 (SE +/- 1.75, N = 15); r1a: 77.2 (SE +/- 0.90, N = 3); r1: 76.3 (SE +/- 1.45, N = 13). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-TN (GFLOPs/s, More Is Better): r4: 67.6 (SE +/- 2.43, N = 14); r3: 66.9 (SE +/- 1.88, N = 15); r2b: 62.3 (SE +/- 2.02, N = 15); r1a: 77.4 (SE +/- 0.69, N = 3); r1: 76.0 (SE +/- 1.67, N = 13). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-NT (GFLOPs/s, More Is Better): r4: 72.4 (SE +/- 1.98, N = 15); r3: 68.9 (SE +/- 1.99, N = 15); r2b: 59.8 (SE +/- 1.14, N = 15); r1a: 76.8 (SE +/- 1.01, N = 3); r1: 75.6 (SE +/- 1.88, N = 13). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-NN (GFLOPs/s, More Is Better): r4: 70.8 (SE +/- 1.95, N = 15); r3: 66.4 (SE +/- 2.18, N = 15); r2b: 61.9 (SE +/- 2.06, N = 15); r1a: 72.3 (SE +/- 3.11, N = 3); r1: 73.5 (SE +/- 1.42, N = 14). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMV-T (GB/s, More Is Better): r4: 647.0 (SE +/- 3.20, N = 15); r3: 647.0 (SE +/- 2.02, N = 15); r2b: 389.9 (SE +/- 27.49, N = 15); r1a: 319.0 (SE +/- 5.04, N = 3); r1: 719.0 (SE +/- 2.46, N = 13). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMV-N (GB/s, More Is Better): r4: 70.2 (SE +/- 0.25, N = 15); r3: 64.3 (SE +/- 3.93, N = 15); r2b: 62.3 (SE +/- 3.75, N = 15); r1a: 63.6 (SE +/- 2.90, N = 3); r1: 72.3 (SE +/- 0.36, N = 14). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - dDOT (GB/s, More Is Better): r4: 765.00 (SE +/- 2.76, N = 15); r3: 713.47 (SE +/- 50.57, N = 15); r2b: 447.65 (SE +/- 34.40, N = 14); r1a: 371.00 (SE +/- 34.44, N = 3); r1: 720.00 (SE +/- 6.43, N = 14). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - dAXPY (GB/s, More Is Better): r4: 1158.0 (SE +/- 5.62, N = 15); r3: 1024.2 (SE +/- 82.34, N = 15); r2b: 507.1 (SE +/- 40.80, N = 15); r1a: 392.0 (SE +/- 23.02, N = 3); r1: 1058.0 (SE +/- 20.63, N = 14). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - dCOPY (GB/s, More Is Better): r4: 936.0 (SE +/- 9.73, N = 15); r3: 913.0 (SE +/- 26.97, N = 15); r2b: 422.2 (SE +/- 35.11, N = 15); r1a: 335.0 (SE +/- 29.90, N = 3); r1: 843.0 (SE +/- 25.47, N = 14). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - sDOT (GB/s, More Is Better): r4: 535 (SE +/- 2.45, N = 15); r3: 532 (SE +/- 2.55, N = 15); r2b: 349 (SE +/- 5.60, N = 15); r1a: 277 (SE +/- 11.67, N = 3); r1: 620 (SE +/- 2.34, N = 14). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - sAXPY (GB/s, More Is Better): r4: 855 (SE +/- 11.35, N = 15); r3: 862 (SE +/- 8.11, N = 15); r2b: 474 (SE +/- 10.36, N = 15); r1a: 370 (SE +/- 15.25, N = 3); r1: 1003 (SE +/- 6.62, N = 14). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - sCOPY (GB/s, More Is Better): r4: 1167 (SE +/- 54.62, N = 15); r3: 1135 (SE +/- 51.32, N = 15); r2b: 691 (SE +/- 22.07, N = 15); r1a: 504 (SE +/- 4.10, N = 3); r1: 1834 (SE +/- 16.63, N = 14). (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
Xmrig 6.12.1 - Variant: Monero - Hash Count: 1M (H/s, More Is Better): r4: 20574.6 (SE +/- 243.31, N = 15); r3: 20652.9 (SE +/- 245.77, N = 3); r2b: 19311.1 (SE +/- 151.73, N = 3); r1a: 19452.0 (SE +/- 20.55, N = 3); r1: 19299.5 (SE +/- 23.28, N = 3). (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
AOM AV1 3.0 - Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 1080p (Frames Per Second, More Is Better): r4: 7.43 (SE +/- 0.05, N = 3); r3: 7.38 (SE +/- 0.06, N = 3); r2b: 7.45 (SE +/- 0.01, N = 3); r1a: 21.25 (SE +/- 0.17, N = 3). (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Timed Linux Kernel Compilation 5.10.20 - Time To Compile (Seconds, Fewer Is Better): r4: 28.09 (SE +/- 0.37, N = 14); r3: 28.02 (SE +/- 0.41, N = 14); r2b: 28.00 (SE +/- 0.32, N = 14); r1a: 24.36 (SE +/- 0.28, N = 4); r1: 24.38 (SE +/- 0.30, N = 4).
Sysbench 1.0.20 - Test: CPU (Events Per Second, More Is Better): r4: 214241.34 (SE +/- 269.51, N = 3); r2b: 214210.83 (SE +/- 247.29, N = 3). (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm
Blender 2.92 - Blend File: Pabellon Barcelona - Compute: CPU-Only (Seconds, Fewer Is Better): r4: 88.68 (SE +/- 0.28, N = 3); r2b: 88.57 (SE +/- 0.08, N = 3).
Timed Wasmer Compilation 1.0.2 - Time To Compile (Seconds, Fewer Is Better): r4: 70.76 (SE +/- 0.51, N = 3); r3: 71.13 (SE +/- 0.66, N = 7); r2b: 71.93 (SE +/- 0.42, N = 3); r1a: 61.93 (SE +/- 0.62, N = 3); r1: 62.16 (SE +/- 0.22, N = 3). (CC) gcc options: -m64 -pie -nodefaultlibs -ldl -lrt -lpthread -lgcc_s -lc -lm -lutil
oneDNN 2.1.2 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r4: 792.30 (SE +/- 2.67, N = 3, MIN: 763.96); r3: 796.69 (SE +/- 1.09, N = 3, MIN: 771.28); r2b: 808.29 (SE +/- 9.76, N = 3, MIN: 767.97); r1a: 804.32 (SE +/- 4.49, N = 3, MIN: 765.37); r1: 801.41 (SE +/- 7.46, N = 3, MIN: 767.38). (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r4: 792.05 (SE +/- 1.96, N = 3, MIN: 765.9); r3: 793.08 (SE +/- 2.18, N = 3, MIN: 768.2); r2b: 789.84 (SE +/- 1.48, N = 3, MIN: 767.03); r1a: 791.93 (SE +/- 3.65, N = 3, MIN: 765.01); r1: 792.83 (SE +/- 2.07, N = 3, MIN: 763.76). (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r4: 1.24116 (SE +/- 0.00891, N = 15, MIN: 0.85); r3: 1.24508 (SE +/- 0.01066, N = 15, MIN: 0.89); r2b: 1.23796 (SE +/- 0.01174, N = 15, MIN: 0.87); r1a: 1.22278 (SE +/- 0.01126, N = 15, MIN: 0.85); r1: 1.21594 (SE +/- 0.01080, N = 15, MIN: 0.84). (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r4: 446.54 (SE +/- 1.10, N = 3, MIN: 429.71); r3: 450.65 (SE +/- 2.40, N = 3, MIN: 432.96); r2b: 446.39 (SE +/- 0.78, N = 3, MIN: 432.04); r1a: 447.31 (SE +/- 0.90, N = 3, MIN: 432.33); r1: 447.97 (SE +/- 0.58, N = 3, MIN: 433.22). (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): r4: 448.91 (SE +/- 3.51, N = 3, MIN: 431.33); r3: 447.14 (SE +/- 1.24, N = 3, MIN: 432.42); r2b: 447.29 (SE +/- 0.65, N = 3, MIN: 433.06); r1a: 446.94 (SE +/- 1.79, N = 3, MIN: 430.47); r1: 445.14 (SE +/- 0.58, N = 3, MIN: 431.52). (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r4: 447.96 (SE +/- 2.63, N = 3, MIN: 429.99); r3: 446.92 (SE +/- 0.04, N = 3, MIN: 433.64); r2b: 447.70 (SE +/- 1.13, N = 3, MIN: 433.04); r1a: 447.44 (SE +/- 2.18, N = 3, MIN: 429.4); r1: 445.52 (SE +/- 0.85, N = 3, MIN: 431.18). (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
AOM AV1 3.0 - Encoder Mode: Speed 9 Realtime - Input: Bosphorus 4K (Frames Per Second, More Is Better): r4: 14.73 (SE +/- 0.08, N = 3); r3: 14.06 (SE +/- 0.18, N = 4); r2b: 14.30 (SE +/- 0.15, N = 15); r1a: 32.51 (SE +/- 0.28, N = 3); r1: 33.07 (SE +/- 0.28, N = 3). (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Blender 2.92 - Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better): r4: 72.29 (SE +/- 0.13, N = 3); r2b: 71.78 (SE +/- 0.08, N = 3).
KTX-Software toktx 4.0 - Settings: UASTC 4 + Zstd Compression 19 (Seconds, Fewer Is Better): r4: 56.77 (SE +/- 0.74, N = 3); r2b: 56.66 (SE +/- 0.68, N = 4).
oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r4: 28.46130 (SE +/- 0.38629, N = 12, MIN: 14.76); r3: 28.18150 (SE +/- 0.30585, N = 15, MIN: 14.34); r2b: 28.40230 (SE +/- 0.31773, N = 13, MIN: 14.66); r1a: 7.50059 (SE +/- 0.01835, N = 3, MIN: 6.91); r1: 7.49467 (SE +/- 0.02080, N = 3, MIN: 6.98). (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
LuxCoreRender 2.5 - Scene: Danish Mood - Acceleration: CPU (M samples/sec, More Is Better): r4: 5.68 (SE +/- 0.04, N = 3, MIN: 1.26 / MAX: 7.6); r3: 5.65 (SE +/- 0.07, N = 3, MIN: 1.24 / MAX: 7.63); r2b: 5.73 (SE +/- 0.04, N = 3, MIN: 1.3 / MAX: 7.65); r1a: 7.55 (SE +/- 0.10, N = 3, MIN: 3.28 / MAX: 8.86); r1: 7.42 (SE +/- 0.08, N = 3, MIN: 3.2 / MAX: 8.74).
LuxCoreRender 2.5 - Scene: LuxCore Benchmark - Acceleration: CPU (M samples/sec, More Is Better): r4: 5.87 (SE +/- 0.03, N = 3, MIN: 1.15 / MAX: 7.95); r3: 5.92 (SE +/- 0.01, N = 3, MIN: 1.15 / MAX: 7.98); r2b: 5.84 (SE +/- 0.02, N = 3, MIN: 1.16 / MAX: 7.97); r1a: 8.04 (SE +/- 0.01, N = 3, MIN: 3.51 / MAX: 9.33); r1: 7.84 (SE +/- 0.05, N = 3, MIN: 3.44 / MAX: 9.2).
libavif avifenc 0.9.0 - Encoder Speed: 0 (Seconds, Fewer Is Better): r4: 65.89 (SE +/- 0.68, N = 3); r3: 65.96 (SE +/- 0.20, N = 3); r2b: 64.97 (SE +/- 0.22, N = 3); r1a: 57.71 (SE +/- 0.24, N = 3); r1: 57.98 (SE +/- 0.21, N = 3). (CXX) g++ options: -O3 -fPIC -lm
AOM AV1 3.0 - Encoder Mode: Speed 0 Two-Pass - Input: Bosphorus 1080p (Frames Per Second, More Is Better): r4: 0.33 (SE +/- 0.00, N = 3); r3: 0.33 (SE +/- 0.00, N = 3); r2b: 0.32 (SE +/- 0.00, N = 3); r1a: 0.51 (SE +/- 0.00, N = 3). (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
LuxCoreRender 2.5 - Scene: Rainbow Colors and Prism - Acceleration: CPU (M samples/sec, More Is Better): r4: 14.79 (SE +/- 0.79, N = 12, MIN: 9.85 / MAX: 20.95); r3: 16.47 (SE +/- 1.13, N = 12, MIN: 10.39 / MAX: 21.43); r2b: 13.42 (SE +/- 0.87, N = 13, MIN: 8.28 / MAX: 21.15); r1a: 13.34 (SE +/- 0.47, N = 15, MIN: 10.32 / MAX: 17.45); r1: 17.04 (SE +/- 1.05, N = 15, MIN: 11.27 / MAX: 22.05).
AOM AV1 3.0 - Encoder Mode: Speed 6 Realtime - Input: Bosphorus 1080p (Frames Per Second, More Is Better): r4: 10.54 (SE +/- 0.05, N = 3); r3: 10.39 (SE +/- 0.01, N = 3); r2b: 10.39 (SE +/- 0.03, N = 3); r1a: 28.66 (SE +/- 0.06, N = 3). (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Blender 2.92 - Blend File: Fishy Cat - Compute: CPU-Only (Seconds, Fewer Is Better): r4: 46.73 (SE +/- 0.25, N = 3); r2b: 46.38 (SE +/- 0.15, N = 3).
oneDNN 2.1.2 - Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): r4: 3.00907 (SE +/- 0.02449, N = 14, MIN: 2.84); r3: 3.00929 (SE +/- 0.02478, N = 14, MIN: 2.84); r2b: 3.00464 (SE +/- 0.02287, N = 13, MIN: 2.84); r1a: 2.96857 (SE +/- 0.00276, N = 3, MIN: 2.84); r1: 2.96135 (SE +/- 0.00128, N = 3, MIN: 2.84). (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
VOSK Speech Recognition Toolkit 0.3.21 (Seconds, Fewer Is Better): r4: 35.50 (SE +/- 0.32, N = 3); r3: 35.58 (SE +/- 0.43, N = 3); r2b: 36.42 (SE +/- 0.43, N = 3); r1a: 35.01 (SE +/- 0.29, N = 8); r1: 35.92 (SE +/- 0.32, N = 3).
Stockfish 13 - Total Time (Nodes Per Second, More Is Better): r4: 186013261 (SE +/- 2183262.34, N = 4); r3: 189214499 (SE +/- 1924842.52, N = 3); r2b: 181554218 (SE +/- 1982639.48, N = 3); r1a: 186263552 (SE +/- 2404481.41, N = 3); r1: 181644819 (SE +/- 1585265.68, N = 15). (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fprofile-use -fno-peel-loops -fno-tracer -pedantic -O3 -msse -msse3 -mpopcnt -mavx2 -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto=jobserver
libavif avifenc 0.9.0 - Encoder Speed: 6, Lossless (Seconds, Fewer Is Better): r4: 38.51 (SE +/- 0.36, N = 6); r3: 38.59 (SE +/- 0.35, N = 3); r2b: 38.40 (SE +/- 0.24, N = 3); r1a: 31.62 (SE +/- 0.09, N = 3); r1: 32.11 (SE +/- 0.04, N = 3). (CXX) g++ options: -O3 -fPIC -lm
srsLTE 20.10.1 - Test: PHY_DL_Test (UE Mb/s, More Is Better): r4: 78.3 (SE +/- 0.62, N = 3); r3: 76.1 (SE +/- 1.14, N = 3); r2b: 75.0 (SE +/- 0.38, N = 3); r1a: 77.3 (SE +/- 1.16, N = 3); r1: 76.9 (SE +/- 0.76, N = 3). (CXX) g++ options: -std=c++11 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -rdynamic -lpthread -lmbedcrypto -lconfig++ -lsctp -lbladeRF -lm -lfftw3f
srsLTE 20.10.1 - Test: PHY_DL_Test (eNb Mb/s, More Is Better): r4: 183.7 (SE +/- 0.58, N = 3); r3: 181.6 (SE +/- 2.42, N = 3); r2b: 181.6 (SE +/- 1.23, N = 3); r1a: 184.2 (SE +/- 0.36, N = 3); r1: 183.4 (SE +/- 1.15, N = 3). (CXX) g++ options: -std=c++11 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -rdynamic -lpthread -lmbedcrypto -lconfig++ -lsctp -lbladeRF -lm -lfftw3f
libavif avifenc 0.9.0 - Encoder Speed: 6 (Seconds, Fewer Is Better): r4: 16.21 (SE +/- 0.12, N = 15); r3: 16.62 (SE +/- 0.13, N = 15); r2b: 16.07 (SE +/- 0.23, N = 3); r1a: 13.33 (SE +/- 0.08, N = 3); r1: 13.25 (SE +/- 0.05, N = 3). (CXX) g++ options: -O3 -fPIC -lm
srsLTE 20.10.1 - Test: OFDM_Test (Samples / Second, More Is Better): r4: 120666667 (SE +/- 233333.33, N = 3); r3: 120833333 (SE +/- 600925.21, N = 3); r2b: 120733333 (SE +/- 366666.67, N = 3); r1a: 120133333 (SE +/- 240370.09, N = 3); r1: 120300000 (SE +/- 611010.09, N = 3). (CXX) g++ options: -std=c++11 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -rdynamic -lpthread -lmbedcrypto -lconfig++ -lsctp -lbladeRF -lm -lfftw3f
Sysbench 1.0.20 - Test: RAM / Memory (MiB/sec, More Is Better): r4: 12553.44 (SE +/- 118.72, N = 15); r2b: 12510.56 (SE +/- 125.16, N = 15). (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm
libavif avifenc 0.9.0 - Encoder Speed: 2 (Seconds, Fewer Is Better): r4: 37.80 (SE +/- 0.08, N = 3); r3: 38.31 (SE +/- 0.20, N = 3); r2b: 38.37 (SE +/- 0.40, N = 3); r1a: 31.48 (SE +/- 0.04, N = 3); r1: 31.54 (SE +/- 0.10, N = 3). (CXX) g++ options: -O3 -fPIC -lm
Botan 2.17.3 - Test: AES-256 - Decrypt (MiB/s, More Is Better): r4: 5650.14 (SE +/- 12.66, N = 3); r3: 5662.34 (SE +/- 1.10, N = 3); r2b: 5662.76 (SE +/- 0.94, N = 3); r1a: 5663.61 (SE +/- 0.12, N = 3); r1: 5663.06 (SE +/- 1.20, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3 - Test: AES-256 (MiB/s, More Is Better): r4: 5612.00 (SE +/- 51.03, N = 3); r3: 5593.37 (SE +/- 42.23, N = 3); r2b: 5606.97 (SE +/- 55.60, N = 3); r1a: 5670.81 (SE +/- 0.28, N = 3); r1: 5669.70 (SE +/- 0.92, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Basis Universal 1.13 - Settings: ETC1S (Seconds, Fewer Is Better): r4: 34.42 (SE +/- 0.42, N = 3); r2b: 34.24 (SE +/- 0.21, N = 3). (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Basis Universal 1.13 - Settings: UASTC Level 0 (Seconds, Fewer Is Better): r4: 11.23 (SE +/- 0.08, N = 3); r2b: 11.25 (SE +/- 0.08, N = 15). (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
libavif avifenc 0.9.0 - Encoder Speed: 10, Lossless (Seconds, Fewer Is Better): r4: 10.208 (SE +/- 0.157, N = 15); r3: 10.088 (SE +/- 0.130, N = 15); r2b: 10.282 (SE +/- 0.154, N = 15); r1a: 8.812 (SE +/- 0.016, N = 3); r1: 8.852 (SE +/- 0.036, N = 3). (CXX) g++ options: -O3 -fPIC -lm
AOM AV1 3.0 - Encoder Mode: Speed 9 Realtime - Input: Bosphorus 1080p (Frames Per Second, More Is Better): r4: 42.37 (SE +/- 0.28, N = 3); r3: 43.42 (SE +/- 0.31, N = 15); r2b: 43.26 (SE +/- 0.49, N = 3); r1a: 125.25 (SE +/- 0.82, N = 15). (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Blender 2.92 - Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better): r4: 29.69 (SE +/- 0.32, N = 3); r2b: 29.56 (SE +/- 0.08, N = 3).
Botan 2.17.3 - Test: ChaCha20Poly1305 - Decrypt (MiB/s, More Is Better): r4: 615.98 (SE +/- 2.81, N = 3); r3: 612.15 (SE +/- 3.74, N = 3); r2b: 612.44 (SE +/- 3.49, N = 3); r1a: 619.54 (SE +/- 0.57, N = 3); r1: 619.46 (SE +/- 0.40, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3 - Test: ChaCha20Poly1305 (MiB/s, More Is Better): r4: 619.64 (SE +/- 2.98, N = 3); r3: 616.50 (SE +/- 3.19, N = 3); r2b: 615.81 (SE +/- 3.48, N = 3); r1a: 623.20 (SE +/- 0.17, N = 3); r1: 623.49 (SE +/- 0.03, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3 - Test: Blowfish - Decrypt (MiB/s, More Is Better): r4: 363.28 (SE +/- 0.07, N = 3); r3: 363.31 (SE +/- 0.03, N = 3); r2b: 363.20 (SE +/- 0.04, N = 3); r1a: 363.33 (SE +/- 0.06, N = 3); r1: 363.26 (SE +/- 0.05, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3 - Test: Blowfish (MiB/s, More Is Better): r4: 359.57 (SE +/- 3.51, N = 3); r3: 359.45 (SE +/- 3.73, N = 3); r2b: 362.93 (SE +/- 0.11, N = 3); r1a: 363.62 (SE +/- 0.05, N = 3); r1: 363.04 (SE +/- 0.56, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3 - Test: Twofish - Decrypt (MiB/s, More Is Better): r4: 292.61 (SE +/- 0.04, N = 3); r3: 292.83 (SE +/- 0.06, N = 3); r2b: 292.40 (SE +/- 0.12, N = 3); r1a: 292.37 (SE +/- 0.11, N = 3); r1: 292.74 (SE +/- 0.14, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3 - Test: Twofish (MiB/s, More Is Better): r4: 286.00 (SE +/- 2.83, N = 3); r3: 286.18 (SE +/- 2.66, N = 3); r2b: 288.56 (SE +/- 0.11, N = 3); r1a: 288.85 (SE +/- 0.14, N = 3); r1: 289.13 (SE +/- 0.14, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3 - Test: CAST-256 - Decrypt (MiB/s, More Is Better): r4: 116.07 (SE +/- 0.01, N = 3); r3: 115.72 (SE +/- 0.35, N = 3); r2b: 116.08 (SE +/- 0.01, N = 3); r1a: 116.07 (SE +/- 0.01, N = 3); r1: 116.07 (SE +/- 0.01, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3 - Test: CAST-256 (MiB/s, More Is Better): r4: 114.65 (SE +/- 1.17, N = 3); r3: 114.52 (SE +/- 1.33, N = 3); r2b: 114.66 (SE +/- 1.15, N = 3); r1a: 115.97 (SE +/- 0.01, N = 3); r1: 115.97 (SE +/- 0.01, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3 - Test: KASUMI - Decrypt (MiB/s, More Is Better): r4: 74.29 (SE +/- 0.02, N = 3); r3: 74.31 (SE +/- 0.01, N = 3); r2b: 74.28 (SE +/- 0.03, N = 3); r1a: 74.29 (SE +/- 0.01, N = 3); r1: 74.32 (SE +/- 0.01, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3 - Test: KASUMI (MiB/s, More Is Better): r4: 76.40 (SE +/- 0.87, N = 3); r3: 76.41 (SE +/- 0.77, N = 3); r2b: 76.29 (SE +/- 1.01, N = 3); r1a: 77.31 (SE +/- 0.04, N = 3); r1: 77.29 (SE +/- 0.02, N = 3). (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
toyBrot Fractal Generator 2020-11-18 - Implementation: TBB (ms, Fewer Is Better): r4: 7016 (SE +/- 81.70, N = 15); r3: 7003 (SE +/- 69.20, N = 15); r2b: 6984 (SE +/- 73.83, N = 15); r1a: 6964 (SE +/- 80.68, N = 3); r1: 6850 (SE +/- 59.06, N = 15). (CXX) g++ options: -O3 -lpthread -lm -lgcc -lgcc_s -lc
Xmrig 6.12.1 - Variant: Wownero - Hash Count: 1M (H/s, More Is Better): r4: 49937.3 (SE +/- 235.04, N = 3); r3: 49813.4 (SE +/- 358.18, N = 3); r2b: 49908.3 (SE +/- 238.38, N = 3); r1a: 50166.1 (SE +/- 588.34, N = 3); r1: 48051.5 (SE +/- 425.40, N = 7). (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
Intel Memory Latency Checker - Test: Peak Injection Bandwidth - Stream-Triad Like (MB/s, More Is Better): r5: 324234.5 (SE +/- 55.81, N = 3); r4: 324112.8 (SE +/- 60.42, N = 3); r3: 324227.4 (SE +/- 32.03, N = 3); r2b: 324209.8 (SE +/- 12.95, N = 3); r2a: 323826.9 (SE +/- 34.05, N = 3); r1a: 323924.2 (SE +/- 38.10, N = 3); r1: 324377.2 (SE +/- 177.93, N = 3).
Intel Memory Latency Checker - Test: Peak Injection Bandwidth - 1:1 Reads-Writes (MB/s, More Is Better): r5: 448800.1 (SE +/- 847.23, N = 3); r4: 446396.0 (SE +/- 1601.80, N = 3); r3: 449554.1 (SE +/- 138.13, N = 3); r2b: 440454.7 (SE +/- 314.54, N = 3); r2a: 442144.2 (SE +/- 212.40, N = 3); r1a: 442843.2 (SE +/- 148.63, N = 3); r1: 442422.3 (SE +/- 1187.16, N = 3).
Intel Memory Latency Checker - Test: Peak Injection Bandwidth - 2:1 Reads-Writes (MB/s, More Is Better): r5: 458830.6 (SE +/- 12.06, N = 3); r4: 458941.9 (SE +/- 36.24, N = 3); r3: 457190.5 (SE +/- 73.04, N = 3); r2b: 459309.8 (SE +/- 64.32, N = 3); r2a: 456408.6 (SE +/- 115.55, N = 3); r1a: 456260.3 (SE +/- 130.28, N = 3); r1: 459038.6 (SE +/- 274.15, N = 3).
Intel Memory Latency Checker - Test: Peak Injection Bandwidth - 3:1 Reads-Writes (MB/s, More Is Better): r5: 425508.1 (SE +/- 23.30, N = 3); r4: 425822.1 (SE +/- 23.30, N = 3); r3: 424904.5 (SE +/- 88.34, N = 3); r2b: 425925.6 (SE +/- 25.04, N = 3); r2a: 424077.3 (SE +/- 236.99, N = 3); r1a: 424096.6 (SE +/- 94.95, N = 3); r1: 425933.7 (SE +/- 163.24, N = 3).
Intel Memory Latency Checker - Test: Peak Injection Bandwidth - All Reads (MB/s, More Is Better): r5: 357722.7 (SE +/- 23.85, N = 3); r4: 358110.5 (SE +/- 26.62, N = 3); r3: 358463.7 (SE +/- 24.95, N = 3); r2b: 357742.9 (SE +/- 14.54, N = 3); r2a: 358269.7 (SE +/- 37.47, N = 3); r1a: 358385.5 (SE +/- 14.58, N = 3); r1: 356476.2 (SE +/- 709.43, N = 3).
oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r4: 0.340243 (SE +/- 0.004121, N = 3, MIN: 0.3); r3: 0.341955 (SE +/- 0.003372, N = 6, MIN: 0.31); r2b: 0.341893 (SE +/- 0.003448, N = 5, MIN: 0.3); r1a: 0.341663 (SE +/- 0.002562, N = 3, MIN: 0.31); r1: 0.338327 (SE +/- 0.000853, N = 3, MIN: 0.3). (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 0.215085, 0.216586, 0.216806, 0.213643, 0.215115
  SE: +/- 0.001544 (N = 12), +/- 0.002019 (N = 7), +/- 0.001893 (N = 8), +/- 0.000781 (N = 3), +/- 0.000867 (N = 3)
  MIN: 0.19, 0.19, 0.19, 0.19, 0.19
  Compiler: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Helsing 1.0-beta Digit Range: 14 digit
  Unit: Seconds, Fewer Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 78.54, 78.08, 78.33, 78.16, 77.87
  Compiler: (CC) gcc options: -O2 -pthread -lcrypto
libjpeg-turbo tjbench 2.1.0 Test: Decompression Throughput
  Unit: Megapixels/sec, More Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 159.24, 159.19, 160.26, 156.97, 161.63
  SE: +/- 0.47 (N = 3), +/- 1.04 (N = 3), +/- 0.07 (N = 3), +/- 0.39 (N = 3), +/- 0.15 (N = 3)
  Compiler: (CC) gcc options: -O3 -rdynamic
ASTC Encoder 2.4 Preset: Thorough
  Unit: Seconds, Fewer Is Better
  Runs: r4, r2b
  Results: 9.3091, 9.2907
  SE: +/- 0.0879 (N = 7), +/- 0.0796 (N = 8)
  Compiler: (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder 2.4 Preset: Exhaustive
  Unit: Seconds, Fewer Is Better
  Runs: r4, r2b
  Results: 16.37, 16.36
  SE: +/- 0.02 (N = 3), +/- 0.00 (N = 3)
  Compiler: (CXX) g++ options: -O3 -flto -pthread
libavif avifenc 0.9.0 Encoder Speed: 10
  Unit: Seconds, Fewer Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 6.746, 6.597, 6.656, 5.505, 5.477
  SE: +/- 0.130 (N = 15), +/- 0.145 (N = 15), +/- 0.116 (N = 15), +/- 0.014 (N = 3), +/- 0.038 (N = 3)
  Compiler: (CXX) g++ options: -O3 -fPIC -lm
ASTC Encoder 2.4 Preset: Medium
  Unit: Seconds, Fewer Is Better
  Runs: r4, r2b
  Results: 7.1472, 7.1887
  SE: +/- 0.0290 (N = 3), +/- 0.0906 (N = 15)
  Compiler: (CXX) g++ options: -O3 -flto -pthread
Timed Mesa Compilation 21.0 Time To Compile
  Unit: Seconds, Fewer Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 21.31, 21.37, 21.58, 20.38, 20.95
  SE: +/- 0.11 (N = 3), +/- 0.15 (N = 3), +/- 0.04 (N = 3), +/- 0.12 (N = 3), +/- 0.02 (N = 3)
oneDNN 2.1.2 Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU
  Unit: ms, Fewer Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 3.54783, 3.56224, 3.53121, 3.54367, 3.53026
  SE: +/- 0.00650 (N = 3), +/- 0.01280 (N = 3), +/- 0.00854 (N = 3), +/- 0.00732 (N = 3), +/- 0.00193 (N = 3)
  MIN: 3.37, 3.39, 3.37, 3.38, 3.38
  Compiler: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 0.402919, 0.406877, 0.403409, 0.395588, 0.398282
  SE: +/- 0.002415 (N = 14), +/- 0.003204 (N = 10), +/- 0.004259 (N = 4), +/- 0.001124 (N = 3), +/- 0.001135 (N = 3)
  MIN: 0.36, 0.37, 0.36, 0.36, 0.37
  Compiler: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
SVT-HEVC 1.5.0 Tuning: 1 - Input: Bosphorus 1080p
  Unit: Frames Per Second, More Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 28.01, 28.22, 27.80, 37.34, 36.91
  SE: +/- 0.31 (N = 3), +/- 0.14 (N = 3), +/- 0.09 (N = 3), +/- 0.24 (N = 3), +/- 0.29 (N = 3)
  Compiler: (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt
AOM AV1 3.0 Encoder Mode: Speed 8 Realtime - Input: Bosphorus 1080p
  Unit: Frames Per Second, More Is Better
  Runs: r4, r3, r2b, r1a
  Results: 36.35, 36.06, 36.20, 103.92
  SE: +/- 0.27 (N = 3), +/- 0.26 (N = 3), +/- 0.19 (N = 3), +/- 1.01 (N = 15)
  Compiler: (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Liquid-DSP 2021.01.31 Threads: 160 - Buffer Length: 256 - Filter Length: 57
  Unit: samples/s, More Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 3140266667, 3143300000, 3131866667, 3162066667, 3144800000
  SE: +/- 16411005.79 (N = 3), +/- 14901789.60 (N = 3), +/- 14685858.66 (N = 3), +/- 2062630.47 (N = 3), +/- 17047384.94 (N = 3)
  Compiler: (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 Threads: 128 - Buffer Length: 256 - Filter Length: 57
  Unit: samples/s, More Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 3398800000, 3411000000, 3400066667, 3352733333, 3415933333
  SE: +/- 16537936.19 (N = 3), +/- 6896617.53 (N = 3), +/- 14312737.14 (N = 3), +/- 38975091.76 (N = 3), +/- 8088331.79 (N = 3)
  Compiler: (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 Threads: 64 - Buffer Length: 256 - Filter Length: 57
  Unit: samples/s, More Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 3245666667, 3232700000, 3227433333, 3263700000, 3267133333
  SE: +/- 12876378.03 (N = 3), +/- 14893734.70 (N = 3), +/- 17049079.48 (N = 3), +/- 2150193.79 (N = 3), +/- 5206513.02 (N = 3)
  Compiler: (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 Threads: 32 - Buffer Length: 256 - Filter Length: 57
  Unit: samples/s, More Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 1697500000, 1704500000, 1699333333, 1736800000, 1735100000
  SE (four values reported for five runs): +/- 6582552.70 (N = 3), +/- 10121648.97 (N = 3), +/- 2515949.13 (N = 3), +/- 3951371.07 (N = 3)
  Compiler: (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 Threads: 16 - Buffer Length: 256 - Filter Length: 57
  Unit: samples/s, More Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 860046667, 865410000, 862890000, 890273333, 885320000
  SE: +/- 10609570.10 (N = 3), +/- 859903.10 (N = 3), +/- 3620722.76 (N = 3), +/- 669162.00 (N = 3), +/- 691953.76 (N = 3)
  Compiler: (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 Threads: 8 - Buffer Length: 256 - Filter Length: 57
  Unit: samples/s, More Is Better
  Runs: r4, r3, r2b, r1
  Results: 432013333, 432170000, 428100000, 441953333
  SE: +/- 2739929.03 (N = 3), +/- 1240739.03 (N = 3), +/- 2458908.97 (N = 3), +/- 422150.58 (N = 3)
  Compiler: (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 Threads: 4 - Buffer Length: 256 - Filter Length: 57
  Unit: samples/s, More Is Better
  Runs: r4, r3, r2b, r1
  Results: 216773333, 215343333, 213203333, 217643333
  SE: +/- 1956802.95 (N = 3), +/- 1663583.82 (N = 3), +/- 824809.74 (N = 3), +/- 1090112.12 (N = 3)
  Compiler: (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 Threads: 2 - Buffer Length: 256 - Filter Length: 57
  Unit: samples/s, More Is Better
  Runs: r4, r3, r2b, r1
  Results: 109430000, 111510000, 110173333, 110713333
  SE: +/- 132035.35 (N = 3), +/- 430348.70 (N = 3), +/- 907677.13 (N = 3), +/- 729984.78 (N = 3)
  Compiler: (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 Threads: 1 - Buffer Length: 256 - Filter Length: 57
  Unit: samples/s, More Is Better
  Runs: r4, r3, r2b, r1
  Results: 55251667, 57197667, 56230333, 57792000
  SE: +/- 534784.17 (N = 3), +/- 550708.74 (N = 3), +/- 613156.95 (N = 3), +/- 173700.89 (N = 3)
  Compiler: (CC) gcc options: -O3 -pthread -lm -lc -lliquid
KTX-Software toktx 4.0 Settings: Zstd Compression 19
  Unit: Seconds, Fewer Is Better
  Runs: r4, r2b
  Results: 20.08, 19.78
  SE: +/- 0.20 (N = 3), +/- 0.22 (N = 3)
Basis Universal 1.13 Settings: UASTC Level 3
  Unit: Seconds, Fewer Is Better
  Runs: r4, r2b
  Results: 17.19, 17.16
  SE: +/- 0.01 (N = 3), +/- 0.02 (N = 3)
  Compiler: (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
oneDNN 2.1.2 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 0.242450, 0.243308, 0.243026, 0.240122, 0.239989
  SE: +/- 0.002245 (N = 7), +/- 0.002507 (N = 5), +/- 0.003187 (N = 3), +/- 0.000662 (N = 3), +/- 0.000856 (N = 3)
  MIN: 0.22, 0.22, 0.22, 0.23, 0.22
  Compiler: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
SVT-VP9 0.3 Tuning: VMAF Optimized - Input: Bosphorus 1080p
  Unit: Frames Per Second, More Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 184.07, 185.53, 182.26, 393.46, 386.29
  SE: +/- 0.65 (N = 3), +/- 1.57 (N = 3), +/- 4.05 (N = 12), +/- 16.03 (N = 12), +/- 15.40 (N = 12)
  Compiler: (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
KTX-Software toktx 4.0 Settings: UASTC 3
  Unit: Seconds, Fewer Is Better
  Runs: r4, r2b
  Results: 5.562, 5.664
  SE: +/- 0.008 (N = 3), +/- 0.053 (N = 15)
oneDNN 2.1.2 Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 1.24222, 1.24176, 1.25313, 1.25267, 1.24809
  SE: +/- 0.01282 (N = 3), +/- 0.01211 (N = 3), +/- 0.00964 (N = 3), +/- 0.01592 (N = 15), +/- 0.00180 (N = 3)
  MIN: 1.19, 1.18, 1.2, 1.19, 1.2
  Compiler: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Intel Memory Latency Checker Test: Idle Latency
  Unit: ns, Fewer Is Better
  Runs: r5, r4, r3, r2a, r2, r1a, r1
  Results: 68.1, 67.8, 67.6, 32.5, 67.5, 33.0, 35.1
  SE: +/- 0.09 (N = 3), +/- 0.12 (N = 3), +/- 0.12 (N = 3), +/- 0.28 (N = 8), +/- 0.09 (N = 3), +/- 0.39 (N = 3), +/- 0.10 (N = 3)
oneDNN 2.1.2 Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 0.940714, 0.936941, 0.943624, 0.912279, 0.918568
  SE: +/- 0.008450 (N = 3), +/- 0.007264 (N = 3), +/- 0.011253 (N = 3), +/- 0.002111 (N = 3), +/- 0.002101 (N = 3)
  MIN: 0.86, 0.85, 0.86, 0.86, 0.85
  Compiler: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Basis Universal 1.13 Settings: UASTC Level 2
  Unit: Seconds, Fewer Is Better
  Runs: r4, r2b
  Results: 14.16, 13.98
  SE: +/- 0.15 (N = 3), +/- 0.18 (N = 3)
  Compiler: (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Xcompact3d Incompact3d 2021-03-11 Input: input.i3d 193 Cells Per Direction
  Unit: Seconds, Fewer Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 14.66, 14.60, 11.56, 11.27, 11.36
  SE: +/- 0.02 (N = 3), +/- 0.03 (N = 3), +/- 0.04 (N = 3), +/- 0.03 (N = 3), +/- 0.02 (N = 3)
  Compiler: (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
KTX-Software toktx 4.0 Settings: UASTC 3 + Zstd Compression 19
  Unit: Seconds, Fewer Is Better
  Runs: r4, r2b
  Results: 10.03, 10.01
  SE: +/- 0.11 (N = 5), +/- 0.06 (N = 3)
oneDNN 2.1.2 Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 0.217941, 0.218349, 0.210324, 0.210728, 0.210919
  SE: +/- 0.004970 (N = 15), +/- 0.003384 (N = 15), +/- 0.004449 (N = 15), +/- 0.001109 (N = 3), +/- 0.002205 (N = 15)
  MIN: 0.19, 0.19, 0.18, 0.2, 0.19
  Compiler: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU
  Unit: ms, Fewer Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 0.602038, 0.602314, 0.602122, 0.595661, 0.593042
  SE: +/- 0.003648 (N = 3), +/- 0.004400 (N = 3), +/- 0.004180 (N = 3), +/- 0.000780 (N = 3), +/- 0.001703 (N = 3)
  MIN: 0.56, 0.56, 0.56, 0.56, 0.56
  Compiler: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Xcompact3d Incompact3d 2021-03-11 Input: input.i3d 129 Cells Per Direction
  Unit: Seconds, Fewer Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 3.57278153, 3.56592774, 3.02281992, 2.73859096, 2.74370996
  SE: +/- 0.02850005 (N = 15), +/- 0.03072276 (N = 15), +/- 0.02799890 (N = 3), +/- 0.01532048 (N = 3), +/- 0.00774937 (N = 3)
  Compiler: (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
KTX-Software toktx 4.0 Settings: Zstd Compression 9
  Unit: Seconds, Fewer Is Better
  Runs: r4, r2b
  Results: 3.697, 3.470
  SE: +/- 0.064 (N = 15), +/- 0.003 (N = 3)
oneDNN 2.1.2 Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU
  Unit: ms, Fewer Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 3.64319, 3.64033, 3.64232, 3.57662, 3.57247
  SE: +/- 0.05617 (N = 14), +/- 0.05675 (N = 14), +/- 0.05421 (N = 14), +/- 0.00795 (N = 3), +/- 0.00924 (N = 3)
  MIN: 3.5, 3.47, 3.51, 3.5, 3.53
  Compiler: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 0.876227, 0.874968, 0.874080, 0.863214, 0.864164
  SE: +/- 0.007461 (N = 14), +/- 0.007890 (N = 14), +/- 0.008361 (N = 14), +/- 0.002055 (N = 3), +/- 0.002419 (N = 3)
  MIN: 0.84, 0.84, 0.83, 0.84, 0.84
  Compiler: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
toyBrot Fractal Generator 2020-11-18 Implementation: C++ Tasks
  Unit: ms, Fewer Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 8037, 8048, 8050, 7724, 7879
  SE: +/- 85.46 (N = 4), +/- 93.55 (N = 4), +/- 102.03 (N = 3), +/- 80.44 (N = 4), +/- 43.45 (N = 3)
  Compiler: (CXX) g++ options: -O3 -lpthread -lm -lgcc -lgcc_s -lc
Google Draco 1.4.1 Model: Church Facade
  Unit: ms, Fewer Is Better
  Runs: r4, r2b
  Results: 7082, 7001
  SE: +/- 3.33 (N = 3), +/- 20.01 (N = 3)
  Compiler: (CXX) g++ options: -O3
oneDNN 2.1.2 Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU
  Unit: ms, Fewer Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 1.81913, 1.84339, 1.81774, 1.79881, 1.80046
  SE: +/- 0.00968 (N = 3), +/- 0.02043 (N = 3), +/- 0.01382 (N = 3), +/- 0.00121 (N = 3), +/- 0.00580 (N = 3)
  MIN: 1.68, 1.67, 1.69, 1.69, 1.68
  Compiler: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Google Draco 1.4.1 Model: Lion
  Unit: ms, Fewer Is Better
  Runs: r4, r2b
  Results: 6170, 6126
  SE: +/- 21.15 (N = 3), +/- 25.21 (N = 3)
  Compiler: (CXX) g++ options: -O3
toyBrot Fractal Generator 2020-11-18 Implementation: OpenMP
  Unit: ms, Fewer Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 7429, 7439, 7412, 7308, 7318
  SE: +/- 91.12 (N = 4), +/- 85.45 (N = 4), +/- 101.59 (N = 3), +/- 0.88 (N = 3), +/- 5.13 (N = 3)
  Compiler: (CXX) g++ options: -O3 -lpthread -lm -lgcc -lgcc_s -lc
toyBrot Fractal Generator 2020-11-18 Implementation: C++ Threads
  Unit: ms, Fewer Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 7141, 7203, 7149, 6980, 7018
  SE: +/- 76.94 (N = 4), +/- 98.76 (N = 3), +/- 89.67 (N = 3), +/- 29.96 (N = 3), +/- 49.12 (N = 3)
  Compiler: (CXX) g++ options: -O3 -lpthread -lm -lgcc -lgcc_s -lc
SVT-VP9 0.3 Tuning: Visual Quality Optimized - Input: Bosphorus 1080p
  Unit: Frames Per Second, More Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 162.21, 164.51, 164.32, 329.53, 327.87
  SE: +/- 1.59 (N = 3), +/- 1.63 (N = 3), +/- 1.13 (N = 3), +/- 1.10 (N = 3), +/- 1.20 (N = 3)
  Compiler: (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
SVT-VP9 0.3 Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p
  Unit: Frames Per Second, More Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 179.13, 181.52, 182.17, 408.24, 401.29
  SE: +/- 0.47 (N = 3), +/- 2.25 (N = 3), +/- 0.90 (N = 3), +/- 0.66 (N = 3), +/- 1.44 (N = 3)
  Compiler: (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
oneDNN 2.1.2 Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 1.11811, 1.14578, 1.11874, 1.12224, 1.10991
  SE: +/- 0.01182 (N = 3), +/- 0.00975 (N = 3), +/- 0.00330 (N = 3), +/- 0.00124 (N = 3), +/- 0.00274 (N = 3)
  MIN: 1.02, 1.04, 1.02, 1.02, 1.02
  Compiler: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
  Unit: ms, Fewer Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 0.875421, 0.901823, 0.869978, 0.879137, 0.877815
  SE: +/- 0.005244 (N = 3), +/- 0.006631 (N = 3), +/- 0.004902 (N = 3), +/- 0.003986 (N = 3), +/- 0.006225 (N = 3)
  MIN: 0.82, 0.84, 0.82, 0.83, 0.82
  Compiler: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU
  Unit: ms, Fewer Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 2.10837, 2.10841, 2.11712, 2.08532, 2.07944
  SE: +/- 0.01801 (N = 3), +/- 0.01943 (N = 3), +/- 0.01980 (N = 3), +/- 0.00168 (N = 3), +/- 0.00138 (N = 3)
  MIN: 2.03, 2.03, 2.03, 2.03, 2.03
  Compiler: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
SVT-HEVC 1.5.0 Tuning: 10 - Input: Bosphorus 1080p
  Unit: Frames Per Second, More Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 233.96, 234.39, 234.51, 493.51, 499.23
  SE: +/- 1.14 (N = 3), +/- 1.80 (N = 10), +/- 2.64 (N = 4), +/- 4.78 (N = 3), +/- 3.80 (N = 3)
  Compiler: (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt
SVT-HEVC 1.5.0 Tuning: 7 - Input: Bosphorus 1080p
  Unit: Frames Per Second, More Is Better
  Runs: r4, r3, r2b, r1a, r1
  Results: 156.26, 157.83, 158.16, 288.99, 290.67
  SE: +/- 1.22 (N = 3), +/- 1.64 (N = 3), +/- 1.76 (N = 5), +/- 1.37 (N = 3), +/- 1.68 (N = 3)
  Compiler: (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt
Phoronix Test Suite v10.8.5