haswell may
Intel Xeon E5-2687W v3 testing with an MSI X99S SLI PLUS (MS-7885) v1.0 (1.E0 BIOS) and NVIDIA GeForce GTX 770 on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2105102-IB-HASWELLMA37&grr
haswell may: system details (runs 1, 2, 3)
Processor: Intel Xeon E5-2687W v3 @ 3.50GHz (10 Cores / 20 Threads)
Motherboard: MSI X99S SLI PLUS (MS-7885) v1.0 (1.E0 BIOS)
Chipset: Intel Xeon E7 v3/Xeon
Memory: 32GB
Disk: 80GB INTEL SSDSCKGW08
Graphics: NVIDIA GeForce GTX 770
Audio: Realtek ALC892
Monitor: LG Ultra HD
Network: Intel I218-V
OS: Ubuntu 20.04
Kernel: 5.9.0-050900rc7daily20200928-generic (x86_64) 20200927
Desktop: GNOME Shell 3.36.4
Display Server: X Server 1.20.9
Compiler: GCC 9.3.0
File-System: ext4
Screen Resolution: 3840x2160
Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details - Scaling Governor: intel_cpufreq ondemand - CPU Microcode: 0x44
Python Details - Python 3.8.5
Security Details -
itlb_multihit: KVM: Mitigation of VMX disabled
l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable
mds: Mitigation of Clear buffers; SMT vulnerable
meltdown: Mitigation of PTI
spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling
srbds: Not affected
tsx_async_abort: Not affected
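The Transparent Huge Pages mode and scaling governor listed above can be checked on a comparable Linux system through sysfs. The following is a minimal sketch, assuming the standard sysfs paths are present; the helper names `active_thp_mode` and `read_or_none` are illustrative and not part of the Phoronix Test Suite.

```python
import re
from pathlib import Path

# Standard Linux sysfs locations (assumed present; they may be missing on
# some kernels or inside containers).
THP_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")
GOVERNOR_PATH = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor")


def active_thp_mode(text):
    """Return the bracketed (active) mode from e.g. 'always [madvise] never'."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def read_or_none(path):
    """Read a sysfs file, returning None if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    thp_raw = read_or_none(THP_PATH)
    print("THP mode:", active_thp_mode(thp_raw) if thp_raw else "unavailable")
    print("Governor:", read_or_none(GOVERNOR_PATH) or "unavailable")
```

On the system described here, the first value would be expected to match the `madvise` setting recorded under Kernel Details.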
haswell may build-llvm: Unix Makefiles build-llvm: Ninja blender: Barbershop - CPU-Only aom-av1: Speed 0 Two-Pass - Bosphorus 4K blender: Pabellon Barcelona - CPU-Only helsing: 14 digit build-nodejs: Time To Compile blender: Classroom - CPU-Only toktx: UASTC 4 + Zstd Compression 19 gnuradio: Hilbert Transform gnuradio: FM Deemphasis Filter gnuradio: IIR Filter gnuradio: FIR Filter gnuradio: Signal Source (Cosine) gnuradio: Five Back to Back FIR Filters xmrig: Monero - 1M aom-av1: Speed 4 Two-Pass - Bosphorus 4K libgav1: Chimera 1080p 10-bit securemark: SecureMark-TLS blender: Fishy Cat - CPU-Only gmpbench: Total Time luaradio: Complex Phase luaradio: Hilbert Transform luaradio: FM Deemphasis Filter luaradio: Five Back to Back FIR Filters xmrig: Wownero - 1M blender: BMW27 - CPU-Only hmmer: Pfam Database Search mrbayes: Primate Phylogeny Analysis aom-av1: Speed 6 Two-Pass - Bosphorus 4K build-erlang: Time To Compile aom-av1: Speed 4 Two-Pass - Bosphorus 1080p dav1d: Chimera 1080p 10-bit astcenc: Exhaustive mnn: inception-v3 mnn: mobilenet-v1-1.0 mnn: MobileNetV2_224 mnn: resnet-v2-50 mnn: SqueezeNetV1.0 incompact3d: input.i3d 193 Cells Per Direction build-linux-kernel: Time To Compile svt-av1: Preset 4 - Bosphorus 4K svt-hevc: 1 - Bosphorus 1080p avifenc: 0 libgav1: Chimera 1080p sysbench: CPU onednn: Recurrent Neural Network Training - f32 - CPU onednn: Recurrent Neural Network Training - u8s8f32 - CPU onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU libgav1: Summer Nature 4K vosk: avifenc: 6, Lossless basis: UASTC Level 3 aom-av1: Speed 0 Two-Pass - Bosphorus 1080p embree: Pathtracer - Asian Dragon Obj onednn: Recurrent Neural Network Inference - u8s8f32 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU build-mesa: Time To Compile qmcpack: simple-H2O embree: Pathtracer ISPC - Asian Dragon Obj srslte: OFDM_Test aom-av1: Speed 6 Realtime - Bosphorus 4K toybrot: C++ Tasks 
toybrot: OpenMP toybrot: C++ Threads pjsip: INVITE aom-av1: Speed 6 Two-Pass - Bosphorus 1080p pjsip: OPTIONS, Stateful srslte: PHY_DL_Test srslte: PHY_DL_Test embree: Pathtracer - Crown pjsip: OPTIONS, Stateless embree: Pathtracer ISPC - Crown compress-zstd: 19, Long Mode - Decompression Speed compress-zstd: 19, Long Mode - Compression Speed embree: Pathtracer - Asian Dragon compress-zstd: 19 - Decompression Speed compress-zstd: 19 - Compression Speed avifenc: 2 stockfish: Total Time embree: Pathtracer ISPC - Asian Dragon chia-vdf: Square Plain C++ basis: UASTC Level 2 chia-vdf: Square Assembly Optimized aom-av1: Speed 6 Realtime - Bosphorus 1080p compress-zstd: 8, Long Mode - Decompression Speed compress-zstd: 8, Long Mode - Compression Speed compress-zstd: 3 - Compression Speed compress-zstd: 8 - Decompression Speed compress-zstd: 8 - Compression Speed compress-zstd: 3, Long Mode - Decompression Speed compress-zstd: 3, Long Mode - Compression Speed svt-av1: Preset 8 - Bosphorus 4K basis: ETC1S botan: AES-256 - Decrypt botan: AES-256 incompact3d: input.i3d 129 Cells Per Direction svt-av1: Preset 4 - Bosphorus 1080p botan: ChaCha20Poly1305 - Decrypt botan: ChaCha20Poly1305 botan: Twofish - Decrypt botan: Twofish botan: Blowfish - Decrypt botan: Blowfish botan: CAST-256 - Decrypt botan: CAST-256 botan: KASUMI - Decrypt botan: KASUMI libgav1: Summer Nature 1080p viennacl: CPU BLAS - dGEMM-NT viennacl: CPU BLAS - dGEMM-NN viennacl: CPU BLAS - dDOT viennacl: CPU BLAS - dAXPY viennacl: CPU BLAS - sAXPY dav1d: Summer Nature 4K dav1d: Chimera 1080p tjbench: Decompression Throughput astcenc: Thorough viennacl: CPU BLAS - dGEMM-TT viennacl: CPU BLAS - sDOT viennacl: CPU BLAS - dGEMV-N aom-av1: Speed 8 Realtime - Bosphorus 4K toktx: UASTC 3 + Zstd Compression 19 toktx: Zstd Compression 19 viennacl: CPU BLAS - dGEMM-TN viennacl: CPU BLAS - dGEMV-T viennacl: CPU BLAS - sCOPY onednn: Deconvolution Batch shapes_1d - f32 - CPU onednn: Deconvolution Batch shapes_1d - u8s8f32 - 
CPU liquid-dsp: 20 - 256 - 57 liquid-dsp: 16 - 256 - 57 liquid-dsp: 8 - 256 - 57 liquid-dsp: 4 - 256 - 57 liquid-dsp: 2 - 256 - 57 liquid-dsp: 1 - 256 - 57 avifenc: 6 aom-av1: Speed 9 Realtime - Bosphorus 4K toktx: UASTC 3 viennacl: CPU BLAS - dCOPY onednn: IP Shapes 1D - f32 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU svt-vp9: VMAF Optimized - Bosphorus 1080p draco: Church Facade onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU basis: UASTC Level 0 aom-av1: Speed 8 Realtime - Bosphorus 1080p svt-av1: Preset 8 - Bosphorus 1080p onednn: IP Shapes 3D - f32 - CPU dav1d: Summer Nature 1080p draco: Lion onednn: IP Shapes 3D - u8s8f32 - CPU avifenc: 10, Lossless aom-av1: Speed 9 Realtime - Bosphorus 1080p toktx: Zstd Compression 9 astcenc: Medium svt-hevc: 7 - Bosphorus 1080p onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU helsing: 12 digit sysbench: RAM / Memory svt-vp9: Visual Quality Optimized - Bosphorus 1080p avifenc: 10 svt-vp9: PSNR/SSIM Optimized - Bosphorus 1080p svt-hevc: 10 - Bosphorus 1080p onednn: Deconvolution Batch shapes_3d - f32 - CPU onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU 1 2 3 964.590 925.475 775.57 0.09 652.69 599.723 595.295 589.06 444.730 264.7 517.1 411.9 474.6 1855.6 497.7 3039.3 2.37 31.64 191107 267.61 3657.3 417.4 77.9 328.2 678.6 4762.3 196.01 187.834 187.452 4.55 168.766 4.09 88.68 141.7632 57.177 6.133 5.116 53.833 8.740 127.615463 121.584 0.9 5.72 101.116 89.69 15805.17 4053.66 4049.70 4051.75 43.76 26.641 81.115 80.198 0.26 10.5853 2227.68 2215.41 2213.82 75.847 52.540 11.8105 72300000 8.98 66632 66314 66253 3321 12.39 5680 70.3 173.6 9.8735 36173 10.9213 2477.9 29.6 11.4729 2469.6 32.1 53.703 21549515 13.1137 108100 43.769 119633 15.19 2860.7 446.9 1958.4 2688.1 355.1 2799.0 609.1 9.987 34.099 3103.88 3116.537 31.8818099 3.065 572.534 578.937 286.169 288.183 345.055 346.405 109.251 109.393 
71.946 75.999 126.28 26.9 27.6 38.8 36.2 41.3 145.15 472.73 158.053214 18.8220 27.9 43.7 45.7 27.51 23.681 23.367 29.4 49.2 27.5 6.55061 3.55190 452546667 441220000 343806667 168710000 93297000 49054667 18.928 37.40 17.165 4.12581 2.78165 131.62 10529 3.35073 3.48130 11.079 60.92 37.198 7.46460 394.33 7235 2.44319 9.080 72.30 4.174 7.9701 82.79 13.7665 12.7023 6.450 16674.21 108.08 5.375 137.41 173.18 8.39311 5.66569 964.077 925.736 778.21 0.09 659.72 601.194 593.248 586.2 444.538 264.3 517.9 413.5 472.1 1844.4 493.3 3012.8 2.36 31.66 191158 268.21 3645.3 417.3 78.3 328.2 662.4 4805.3 195.56 187.917 184.004 4.63 166.64 4.06 88.67 141.9129 56.812 6.106 5.178 53.942 8.692 127.763268 122.873 0.904 5.73 102.93 89.52 15806.12 4050.46 4053.76 4052.37 43.76 29.277 80.642 80.076 0.26 10.5473 2217.15 2217.72 2213.73 75.824 52.487 11.7505 63100000 9.17 66341 66391 66336 3320 12.41 5666 70.8 175.3 9.5853 36043 10.9347 2476.3 29.1 11.2554 2477.7 32.4 52.725 20756949 13.1354 108100 43.677 117600 15.37 2843.5 442.5 1970.8 2693.4 353.7 2819.5 602.4 9.943 34.278 3100.643 3125.518 31.7161522 3.018 573.601 577.483 286.613 288.542 344.851 346.287 109.567 109.742 71.885 75.92 125.69 26.6 28 43.8 41.4 41.2 144.78 475.88 155.494139 18.8103 27.7 43.6 45.6 27.45 23.467 23.264 29.2 48.9 27.5 6.56384 3.54188 452770000 444300000 343480000 167210000 93119000 49056000 19.232 37.23 17.263 27.5 4.13077 2.76679 112.13 10468 3.36355 3.4953 10.867 61.77 36.775 7.50073 395.79 7316 2.44752 9.289 73.91 4.102 8.121 82.63 13.7675 12.6855 6.446 16808.94 108.03 5.41 137.02 173.01 8.39791 5.65039 951.556 927.025 779.06 0.09 656.27 627.965 593.916 585.15 444.664 264.6 519.1 413.1 471 1843 499.6 3032.8 2.36 31.55 191151 268.63 3653.9 418.7 77.7 330.9 682.5 4844.9 195.45 187.558 187.497 4.64 167.352 4.09 88.65 141.7768 56.799 6.128 5.081 54.614 8.701 129.590805 123.825 0.902 5.73 101.731 90.08 15805.7 4051.36 4058.11 4051.76 43.55 29.207 80.774 80.104 0.25 10.5545 2216.7 2216.51 2216.1 76.334 50.184 11.7876 
72700000 9.11 66331 66299 66224 3320 12.41 5689 69.8 172.6 10.0219 36169 10.989 2510.4 29.1 11.2491 2471.8 32.3 53.514 20973440 13.1041 108100 43.689 116500 15.37 2832.8 443.4 1996.5 2693.4 350.4 2816.2 614.4 9.95 34.387 3096.616 3106.226 31.4801865 2.968 574.418 578.173 286.502 288.535 344.919 346.248 109.563 109.754 71.948 75.936 126.39 26.7 27.7 43.8 41.4 41.3 144.26 476.55 156.529497 18.8248 27.7 43.7 45.4 27.43 23.801 23.404 29.4 48.9 27.4 6.52527 3.53866 452640000 437390000 342790000 168540000 94009000 49052000 18.908 37.3 17.279 27.5 4.11242 2.76696 113.45 10590 3.35822 3.55843 11.115 60.85 36.846 7.5126 392.01 7335 2.43904 9.065 72.17 4.435 8.1096 82.37 13.7742 12.6892 6.447 16459.64 108 5.368 137.66 173.06 8.40666 5.66558 OpenBenchmarking.org
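One way to condense an overview table like the one above into a single figure per run is a geometric mean of results normalized to a baseline run. The sketch below uses four rows copied from the table, with the direction of each test (more or fewer is better) taken from the per-test results. The choice of rows and of run 1 as baseline is an assumption made for illustration, not something the export itself computes.

```python
import math

# (direction, run 1, run 2, run 3); values copied from the overview table.
# "lower" means fewer is better, "higher" means more is better.
RESULTS = {
    "build-llvm: Ninja": ("lower", 925.475, 925.736, 927.025),
    "sysbench: CPU": ("higher", 15805.17, 15806.12, 15805.7),
    "vosk": ("lower", 26.641, 29.277, 29.207),
    "srslte: OFDM_Test": ("higher", 72300000, 63100000, 72700000),
}


def relative_scores(results, baseline=0):
    """Geometric mean of per-test ratios against the baseline run.

    A score above 1.0 means the run was faster than the baseline overall.
    """
    n_runs = len(next(iter(results.values()))) - 1
    log_sums = [0.0] * n_runs
    for direction, *values in results.values():
        base = values[baseline]
        for i, value in enumerate(values):
            # Invert the ratio for fewer-is-better tests so higher is always better.
            ratio = value / base if direction == "higher" else base / value
            log_sums[i] += math.log(ratio)
    return [math.exp(total / len(results)) for total in log_sums]


scores = relative_scores(RESULTS)
for run, score in enumerate(scores, start=1):
    print(f"run {run}: {score:.4f}")
```

With only four rows, a single outlier such as the run 2 OFDM_Test result dominates the score, so a fuller comparison would include every test in the table.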
Timed LLVM Compilation 12.0 - Build System: Unix Makefiles (Seconds, Fewer Is Better). Run 1: 964.59; Run 2: 964.08; Run 3: 951.56. SE +/- 10.37, N = 3.
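The "SE +/-" figure reported with each result is a standard error over N trials. The export does not list the individual trial times, so the samples below are hypothetical. The use of the sample (N - 1) standard deviation is also an assumption; this is a sketch and not the Phoronix Test Suite's own code.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical trial times in seconds, NOT taken from the export.
trials = [950.0, 960.0, 985.0]
print(f"mean = {statistics.mean(trials):.2f}, SE +/- {standard_error(trials):.2f}, N = {len(trials)}")
```

A large standard error relative to the gap between runs, as with the Unix Makefiles build, suggests the run-to-run difference may be within noise.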
Timed LLVM Compilation 12.0 - Build System: Ninja (Seconds, Fewer Is Better). Run 1: 925.48; Run 2: 925.74; Run 3: 927.03. SE +/- 0.12, N = 3.
Blender 2.92 - Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better). Run 1: 775.57; Run 2: 778.21; Run 3: 779.06. SE +/- 0.90, N = 3.
AOM AV1 3.1 - Encoder Mode: Speed 0 Two-Pass - Input: Bosphorus 4K (Frames Per Second, More Is Better). Run 1: 0.09; Run 2: 0.09; Run 3: 0.09. SE +/- 0.00, N = 12. Note: (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Blender 2.92 - Blend File: Pabellon Barcelona - Compute: CPU-Only (Seconds, Fewer Is Better). Run 1: 652.69; Run 2: 659.72; Run 3: 656.27. SE +/- 0.85, N = 3.
Helsing 1.0-beta - Digit Range: 14 digit (Seconds, Fewer Is Better). Run 1: 599.72; Run 2: 601.19; Run 3: 627.97. SE +/- 0.57, N = 3. Note: (CC) gcc options: -O2 -pthread
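The Helsing result stands out because the third run sits well outside the reported standard error. A minimal sketch of how such a run can be flagged, using each run's percent deviation from the median of the three values; the 2% threshold is an arbitrary assumption chosen for illustration.

```python
import statistics


def percent_deviation_from_median(values):
    """Percent deviation of each value from the median of all values."""
    median = statistics.median(values)
    return [(value - median) / median * 100.0 for value in values]


# Helsing 14 digit results in seconds for runs 1, 2, 3 (from the line above).
helsing = [599.72, 601.19, 627.97]
THRESHOLD_PERCENT = 2.0  # illustrative cutoff, not a Phoronix Test Suite setting

deviations = percent_deviation_from_median(helsing)
for run, deviation in enumerate(deviations, start=1):
    flag = " <- outlier" if abs(deviation) > THRESHOLD_PERCENT else ""
    print(f"run {run}: {deviation:+.2f}%{flag}")
```

The same check applied to other tests in this result would also flag VOSK (run 1) and srsLTE OFDM_Test (run 2).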
Timed Node.js Compilation 15.11 - Time To Compile (Seconds, Fewer Is Better). Run 1: 595.30; Run 2: 593.25; Run 3: 593.92. SE +/- 0.41, N = 3.
Blender 2.92 - Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better). Run 1: 589.06; Run 2: 586.20; Run 3: 585.15. SE +/- 2.36, N = 3.
KTX-Software toktx 4.0 - Settings: UASTC 4 + Zstd Compression 19 (Seconds, Fewer Is Better). Run 1: 444.73; Run 2: 444.54; Run 3: 444.66. SE +/- 0.05, N = 3.
GNU Radio - Test: Hilbert Transform (MiB/s, More Is Better). Run 1: 264.7; Run 2: 264.3; Run 3: 264.6. SE +/- 3.05, N = 3. Note: 3.8.1.0
GNU Radio - Test: FM Deemphasis Filter (MiB/s, More Is Better). Run 1: 517.1; Run 2: 517.9; Run 3: 519.1. SE +/- 1.66, N = 3. Note: 3.8.1.0
GNU Radio - Test: IIR Filter (MiB/s, More Is Better). Run 1: 411.9; Run 2: 413.5; Run 3: 413.1. SE +/- 1.05, N = 3. Note: 3.8.1.0
GNU Radio - Test: FIR Filter (MiB/s, More Is Better). Run 1: 474.6; Run 2: 472.1; Run 3: 471.0. SE +/- 0.32, N = 3. Note: 3.8.1.0
GNU Radio - Test: Signal Source (Cosine) (MiB/s, More Is Better). Run 1: 1855.6; Run 2: 1844.4; Run 3: 1843.0. SE +/- 12.37, N = 3. Note: 3.8.1.0
GNU Radio - Test: Five Back to Back FIR Filters (MiB/s, More Is Better). Run 1: 497.7; Run 2: 493.3; Run 3: 499.6. SE +/- 1.44, N = 3. Note: 3.8.1.0
Xmrig 6.12.1 - Variant: Monero - Hash Count: 1M (H/s, More Is Better). Run 1: 3039.3; Run 2: 3012.8; Run 3: 3032.8. SE +/- 13.28, N = 3. Note: (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
AOM AV1 3.1 - Encoder Mode: Speed 4 Two-Pass - Input: Bosphorus 4K (Frames Per Second, More Is Better). Run 1: 2.37; Run 2: 2.36; Run 3: 2.36. SE +/- 0.00, N = 3. Note: (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
libgav1 0.16.3 - Video Input: Chimera 1080p 10-bit (FPS, More Is Better). Run 1: 31.64; Run 2: 31.66; Run 3: 31.55. SE +/- 0.03, N = 3. Note: (CXX) g++ options: -O3 -lpthread -lrt
SecureMark 1.0.4 - Benchmark: SecureMark-TLS (marks, More Is Better). Run 1: 191107; Run 2: 191158; Run 3: 191151. SE +/- 56.19, N = 3. Note: (CC) gcc options: -pedantic -O3
Blender 2.92 - Blend File: Fishy Cat - Compute: CPU-Only (Seconds, Fewer Is Better). Run 1: 267.61; Run 2: 268.21; Run 3: 268.63. SE +/- 0.20, N = 3.
GNU GMP GMPbench 6.2.1 - Total Time (GMPbench Score, More Is Better). Run 1: 3657.3; Run 2: 3645.3; Run 3: 3653.9. Note: (CC) gcc options: -O3 -fomit-frame-pointer -lm
LuaRadio 0.9.1 - Test: Complex Phase (MiB/s, More Is Better). Run 1: 417.4; Run 2: 417.3; Run 3: 418.7. SE +/- 0.09, N = 3.
LuaRadio 0.9.1 - Test: Hilbert Transform (MiB/s, More Is Better). Run 1: 77.9; Run 2: 78.3; Run 3: 77.7. SE +/- 0.22, N = 3.
LuaRadio 0.9.1 - Test: FM Deemphasis Filter (MiB/s, More Is Better). Run 1: 328.2; Run 2: 328.2; Run 3: 330.9. SE +/- 1.17, N = 3.
LuaRadio 0.9.1 - Test: Five Back to Back FIR Filters (MiB/s, More Is Better). Run 1: 678.6; Run 2: 662.4; Run 3: 682.5. SE +/- 6.78, N = 3.
Xmrig 6.12.1 - Variant: Wownero - Hash Count: 1M (H/s, More Is Better). Run 1: 4762.3; Run 2: 4805.3; Run 3: 4844.9. SE +/- 28.81, N = 3. Note: (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
Blender 2.92 - Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better). Run 1: 196.01; Run 2: 195.56; Run 3: 195.45. SE +/- 0.36, N = 3.
Timed HMMer Search 3.3.2 - Pfam Database Search (Seconds, Fewer Is Better). Run 1: 187.83; Run 2: 187.92; Run 3: 187.56. SE +/- 0.26, N = 3. Note: (CC) gcc options: -O3 -pthread -lhmmer -leasel -lm -lmpi
Timed MrBayes Analysis 3.2.7 - Primate Phylogeny Analysis (Seconds, Fewer Is Better). Run 1: 187.45; Run 2: 184.00; Run 3: 187.50. SE +/- 2.70, N = 3. Note: (CC) gcc options: -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -maes -mavx -mfma -mavx2 -mrdrnd -mbmi -mbmi2 -mabm -O3 -std=c99 -pedantic -lm
AOM AV1 3.1 - Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 4K (Frames Per Second, More Is Better). Run 1: 4.55; Run 2: 4.63; Run 3: 4.64. SE +/- 0.01, N = 3. Note: (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Timed Erlang/OTP Compilation 23.2 - Time To Compile (Seconds, Fewer Is Better). Run 1: 168.77; Run 2: 166.64; Run 3: 167.35. SE +/- 0.95, N = 3.
AOM AV1 3.1 - Encoder Mode: Speed 4 Two-Pass - Input: Bosphorus 1080p (Frames Per Second, More Is Better). Run 1: 4.09; Run 2: 4.06; Run 3: 4.09. SE +/- 0.01, N = 3. Note: (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
dav1d 0.8.2 - Video Input: Chimera 1080p 10-bit (FPS, More Is Better). Run 1: 88.68 (MIN: 56.07 / MAX: 221.22); Run 2: 88.67 (MIN: 56.08 / MAX: 223.41); Run 3: 88.65 (MIN: 56.17 / MAX: 216.38). SE +/- 0.01, N = 3. Note: (CC) gcc options: -pthread -lm
ASTC Encoder 2.4 - Preset: Exhaustive (Seconds, Fewer Is Better). Run 1: 141.76; Run 2: 141.91; Run 3: 141.78. SE +/- 0.01, N = 3. Note: (CXX) g++ options: -O3 -flto -pthread
Mobile Neural Network 1.1.3 - Model: inception-v3 (ms, Fewer Is Better). Run 1: 57.18 (MIN: 56.86 / MAX: 126.61); Run 2: 56.81 (MIN: 56.64 / MAX: 59.14); Run 3: 56.80 (MIN: 56.68 / MAX: 60.57). SE +/- 0.12, N = 3. Note: (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.3 - Model: mobilenet-v1-1.0 (ms, Fewer Is Better). Run 1: 6.133 (MIN: 6.04 / MAX: 6.26); Run 2: 6.106 (MIN: 6.05 / MAX: 6.81); Run 3: 6.128 (MIN: 6.07 / MAX: 6.87). SE +/- 0.034, N = 3. Note: (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.3 - Model: MobileNetV2_224 (ms, Fewer Is Better). Run 1: 5.116 (MIN: 5.04 / MAX: 6.71); Run 2: 5.178 (MIN: 5.08 / MAX: 5.99); Run 3: 5.081 (MIN: 5.04 / MAX: 5.98). SE +/- 0.026, N = 3. Note: (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.3 - Model: resnet-v2-50 (ms, Fewer Is Better). Run 1: 53.83 (MIN: 53.59 / MAX: 57.32); Run 2: 53.94 (MIN: 53.82 / MAX: 54.84); Run 3: 54.61 (MIN: 54.3 / MAX: 55.77). SE +/- 0.09, N = 3. Note: (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.1.3 - Model: SqueezeNetV1.0 (ms, Fewer Is Better). Run 1: 8.740 (MIN: 8.62 / MAX: 9.61); Run 2: 8.692 (MIN: 8.58 / MAX: 10.15); Run 3: 8.701 (MIN: 8.61 / MAX: 9.65). SE +/- 0.012, N = 3. Note: (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 193 Cells Per Direction (Seconds, Fewer Is Better). Run 1: 127.62; Run 2: 127.76; Run 3: 129.59. SE +/- 1.48, N = 3. Note: (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
Timed Linux Kernel Compilation 5.10.20 - Time To Compile (Seconds, Fewer Is Better). Run 1: 121.58; Run 2: 122.87; Run 3: 123.83. SE +/- 1.30, N = 3.
SVT-AV1 0.8.7 - Encoder Mode: Preset 4 - Input: Bosphorus 4K (Frames Per Second, More Is Better). Run 1: 0.900; Run 2: 0.904; Run 3: 0.902. Note: (CXX) g++ options: -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie
SVT-HEVC 1.5.0 - Tuning: 1 - Input: Bosphorus 1080p (Frames Per Second, More Is Better). Run 1: 5.72; Run 2: 5.73; Run 3: 5.73. SE +/- 0.00, N = 3. Note: (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt
libavif avifenc 0.9.0 - Encoder Speed: 0 (Seconds, Fewer Is Better). Run 1: 101.12; Run 2: 102.93; Run 3: 101.73. SE +/- 0.79, N = 3. Note: (CXX) g++ options: -O3 -fPIC -lm
libgav1 0.16.3 - Video Input: Chimera 1080p (FPS, More Is Better). Run 1: 89.69; Run 2: 89.52; Run 3: 90.08. SE +/- 0.07, N = 3. Note: (CXX) g++ options: -O3 -lpthread -lrt
Sysbench 1.0.20 - Test: CPU (Events Per Second, More Is Better). Run 1: 15805.17; Run 2: 15806.12; Run 3: 15805.70. SE +/- 0.16, N = 3. Note: (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm
oneDNN 2.1.2 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). Run 1: 4053.66 (MIN: 4038.97); Run 2: 4050.46 (MIN: 4046.74); Run 3: 4051.36 (MIN: 4047.46). SE +/- 5.40, N = 3. Note: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). Run 1: 4049.70 (MIN: 4043.71); Run 2: 4053.76 (MIN: 4050.14); Run 3: 4058.11 (MIN: 4053.28). SE +/- 1.22, N = 3. Note: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better). Run 1: 4051.75 (MIN: 4043.32); Run 2: 4052.37 (MIN: 4048.89); Run 3: 4051.76 (MIN: 4048.8). SE +/- 2.36, N = 3. Note: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
libgav1 0.16.3 - Video Input: Summer Nature 4K (FPS, More Is Better). Run 1: 43.76; Run 2: 43.76; Run 3: 43.55. SE +/- 0.13, N = 3. Note: (CXX) g++ options: -O3 -lpthread -lrt
VOSK Speech Recognition Toolkit 0.3.21 (Seconds, Fewer Is Better). Run 1: 26.64; Run 2: 29.28; Run 3: 29.21. SE +/- 0.22, N = 13.
libavif avifenc 0.9.0 - Encoder Speed: 6, Lossless (Seconds, Fewer Is Better). Run 1: 81.12; Run 2: 80.64; Run 3: 80.77. SE +/- 0.35, N = 3. Note: (CXX) g++ options: -O3 -fPIC -lm
Basis Universal 1.13 - Settings: UASTC Level 3 (Seconds, Fewer Is Better). Run 1: 80.20; Run 2: 80.08; Run 3: 80.10. SE +/- 0.01, N = 3. Note: (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
AOM AV1 3.1 - Encoder Mode: Speed 0 Two-Pass - Input: Bosphorus 1080p (Frames Per Second, More Is Better). Run 1: 0.26; Run 2: 0.26; Run 3: 0.25. SE +/- 0.00, N = 3. Note: (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Embree 3.13 - Binary: Pathtracer - Model: Asian Dragon Obj (Frames Per Second, More Is Better). Run 1: 10.59 (MIN: 10.51 / MAX: 10.72); Run 2: 10.55 (MIN: 10.51 / MAX: 10.68); Run 3: 10.55 (MIN: 10.51 / MAX: 10.69). SE +/- 0.02, N = 3.
oneDNN 2.1.2 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). Run 1: 2227.68 (MIN: 2203.44); Run 2: 2217.15 (MIN: 2215.26); Run 3: 2216.70 (MIN: 2214.59). SE +/- 16.51, N = 3. Note: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). Run 1: 2215.41 (MIN: 2210.71); Run 2: 2217.72 (MIN: 2215.61); Run 3: 2216.51 (MIN: 2214.59). SE +/- 1.49, N = 3. Note: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better). Run 1: 2213.82 (MIN: 2209.54); Run 2: 2213.73 (MIN: 2211.69); Run 3: 2216.10 (MIN: 2214.02). SE +/- 1.12, N = 3. Note: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Timed Mesa Compilation 21.0 - Time To Compile (Seconds, Fewer Is Better). Run 1: 75.85; Run 2: 75.82; Run 3: 76.33. SE +/- 0.19, N = 3.
QMCPACK 3.11 - Input: simple-H2O (Total Execution Time - Seconds, Fewer Is Better). Run 1: 52.54; Run 2: 52.49; Run 3: 50.18. SE +/- 0.70, N = 5. Note: (CXX) g++ options: -fopenmp -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -march=native -O3 -fomit-frame-pointer -ffast-math -pthread -lm
Embree 3.13 - Binary: Pathtracer ISPC - Model: Asian Dragon Obj (Frames Per Second, More Is Better). Run 1: 11.81 (MIN: 11.75 / MAX: 11.97); Run 2: 11.75 (MIN: 11.7 / MAX: 11.89); Run 3: 11.79 (MIN: 11.74 / MAX: 11.94). SE +/- 0.01, N = 3.
srsLTE 20.10.1 - Test: OFDM_Test (Samples / Second, More Is Better). Run 1: 72300000; Run 2: 63100000; Run 3: 72700000. Note: (CXX) g++ options: -std=c++11 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -rdynamic -lpthread -lmbedcrypto -lconfig++ -lsctp -lbladeRF -lm -lfftw3f
AOM AV1 3.1 - Encoder Mode: Speed 6 Realtime - Input: Bosphorus 4K (Frames Per Second, More Is Better). Run 1: 8.98; Run 2: 9.17; Run 3: 9.11. SE +/- 0.11, N = 3. Note: (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
toyBrot Fractal Generator 2020-11-18 - Implementation: C++ Tasks (ms, Fewer Is Better). Run 1: 66632; Run 2: 66341; Run 3: 66331. SE +/- 177.25, N = 3. Note: (CXX) g++ options: -O3 -lpthread
toyBrot Fractal Generator 2020-11-18 - Implementation: OpenMP (ms, Fewer Is Better). Run 1: 66314; Run 2: 66391; Run 3: 66299. SE +/- 20.85, N = 3. Note: (CXX) g++ options: -O3 -lpthread
toyBrot Fractal Generator 2020-11-18 - Implementation: C++ Threads (ms, Fewer Is Better). Run 1: 66253; Run 2: 66336; Run 3: 66224. SE +/- 26.27, N = 3. Note: (CXX) g++ options: -O3 -lpthread
PJSIP 2.11 - Method: INVITE (Responses Per Second, More Is Better). Run 1: 3321; Run 2: 3320; Run 3: 3320. SE +/- 3.71, N = 3. Note: (CC) gcc options: -lSDL2 -lavformat -lavcodec -lswscale -lavutil -lstdc++ -lssl -lcrypto -luuid -lm -lrt -lpthread -lasound -O2
AOM AV1 3.1 - Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 1080p (Frames Per Second, More Is Better). Run 1: 12.39; Run 2: 12.41; Run 3: 12.41. SE +/- 0.02, N = 3. Note: (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
PJSIP 2.11 - Method: OPTIONS, Stateful (Responses Per Second, More Is Better). Run 1: 5680; Run 2: 5666; Run 3: 5689. SE +/- 1.86, N = 3. Note: (CC) gcc options: -lSDL2 -lavformat -lavcodec -lswscale -lavutil -lstdc++ -lssl -lcrypto -luuid -lm -lrt -lpthread -lasound -O2
srsLTE 20.10.1 - Test: PHY_DL_Test (UE Mb/s, More Is Better). Run 1: 70.3; Run 2: 70.8; Run 3: 69.8. SE +/- 0.09, N = 3. Note: (CXX) g++ options: -std=c++11 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -rdynamic -lpthread -lmbedcrypto -lconfig++ -lsctp -lbladeRF -lm -lfftw3f
srsLTE 20.10.1 - Test: PHY_DL_Test (eNb Mb/s, More Is Better). Run 1: 173.6; Run 2: 175.3; Run 3: 172.6. SE +/- 0.25, N = 3. Note: (CXX) g++ options: -std=c++11 -fno-strict-aliasing -march=native -mfpmath=sse -mavx2 -fvisibility=hidden -O3 -fno-trapping-math -fno-math-errno -rdynamic -lpthread -lmbedcrypto -lconfig++ -lsctp -lbladeRF -lm -lfftw3f
Embree 3.13 - Binary: Pathtracer - Model: Crown (Frames Per Second, More Is Better). Run 1: 9.8735 (MIN: 9.81 / MAX: 10.02); Run 2: 9.5853 (MIN: 9.53 / MAX: 9.7); Run 3: 10.0219 (MIN: 9.96 / MAX: 10.15). SE +/- 0.0077, N = 3.
PJSIP 2.11 - Method: OPTIONS, Stateless (Responses Per Second, More Is Better). Run 1: 36173; Run 2: 36043; Run 3: 36169. SE +/- 173.85, N = 3. Note: (CC) gcc options: -lSDL2 -lavformat -lavcodec -lswscale -lavutil -lstdc++ -lssl -lcrypto -luuid -lm -lrt -lpthread -lasound -O2
Embree 3.13 - Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better). Run 1: 10.92 (MIN: 10.84 / MAX: 11.11); Run 2: 10.93 (MIN: 10.86 / MAX: 11.08); Run 3: 10.99 (MIN: 10.92 / MAX: 11.15). SE +/- 0.01, N = 3.
Zstd Compression 1.4.9 - Compression Level: 19, Long Mode - Decompression Speed (MB/s, More Is Better). Results: 1 = 2477.9, 2 = 2476.3, 3 = 2510.4. SE +/- 17.64, N = 3. (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression 1.4.9 - Compression Level: 19, Long Mode - Compression Speed (MB/s, More Is Better). Results: 1 = 29.6, 2 = 29.1, 3 = 29.1. SE +/- 0.46, N = 3. (CC) gcc options: -O3 -pthread -lz -llzma
Embree 3.13 - Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, More Is Better). Results: 1 = 11.47 (MIN: 11.35 / MAX: 11.67), 2 = 11.26 (MIN: 11.22 / MAX: 11.35), 3 = 11.25 (MIN: 11.21 / MAX: 11.38). SE +/- 0.05, N = 3.
Zstd Compression 1.4.9 - Compression Level: 19 - Decompression Speed (MB/s, More Is Better). Results: 1 = 2469.6, 2 = 2477.7, 3 = 2471.8. SE +/- 22.95, N = 3. (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression 1.4.9 - Compression Level: 19 - Compression Speed (MB/s, More Is Better). Results: 1 = 32.1, 2 = 32.4, 3 = 32.3. SE +/- 0.17, N = 3. (CC) gcc options: -O3 -pthread -lz -llzma
libavif avifenc 0.9.0 - Encoder Speed: 2 (Seconds, Fewer Is Better). Results: 1 = 53.70, 2 = 52.73, 3 = 53.51. SE +/- 0.36, N = 3. (CXX) g++ options: -O3 -fPIC -lm
Stockfish 13 - Total Time (Nodes Per Second, More Is Better). Results: 1 = 21549515, 2 = 20756949, 3 = 20973440. SE +/- 245397.16, N = 3. (CXX) g++ options: -lgcov -m64 -lpthread -fno-exceptions -std=c++17 -fprofile-use -fno-peel-loops -fno-tracer -pedantic -O3 -msse -msse3 -mpopcnt -mavx2 -msse4.1 -mssse3 -msse2 -mbmi2 -flto -flto=jobserver
Embree 3.13 - Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better). Results: 1 = 13.11 (MIN: 13 / MAX: 13.35), 2 = 13.14 (MIN: 13.08 / MAX: 13.27), 3 = 13.10 (MIN: 13.05 / MAX: 13.24). SE +/- 0.05, N = 3.
Chia Blockchain VDF 1.0.1 - Test: Square Plain C++ (IPS, More Is Better). Results: 1 = 108100, 2 = 108100, 3 = 108100. (CXX) g++ options: -flto -no-pie -lgmpxx -lgmp -lboost_system -pthread
Basis Universal 1.13 - Settings: UASTC Level 2 (Seconds, Fewer Is Better). Results: 1 = 43.77, 2 = 43.68, 3 = 43.69. SE +/- 0.02, N = 3. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Chia Blockchain VDF 1.0.1 - Test: Square Assembly Optimized (IPS, More Is Better). Results: 1 = 119633, 2 = 117600, 3 = 116500. SE +/- 384.42, N = 3. (CXX) g++ options: -flto -no-pie -lgmpxx -lgmp -lboost_system -pthread
AOM AV1 3.1 - Encoder Mode: Speed 6 Realtime - Input: Bosphorus 1080p (Frames Per Second, More Is Better). Results: 1 = 15.19, 2 = 15.37, 3 = 15.37. SE +/- 0.08, N = 3. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Zstd Compression 1.4.9 - Compression Level: 8, Long Mode - Decompression Speed (MB/s, More Is Better). Results: 1 = 2860.7, 2 = 2843.5, 3 = 2832.8. SE +/- 2.20, N = 3. (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression 1.4.9 - Compression Level: 8, Long Mode - Compression Speed (MB/s, More Is Better). Results: 1 = 446.9, 2 = 442.5, 3 = 443.4. SE +/- 1.64, N = 3. (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression 1.4.9 - Compression Level: 3 - Compression Speed (MB/s, More Is Better). Results: 1 = 1958.4, 2 = 1970.8, 3 = 1996.5. SE +/- 16.60, N = 3. (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression 1.4.9 - Compression Level: 8 - Decompression Speed (MB/s, More Is Better). Results: 1 = 2688.1, 2 = 2693.4, 3 = 2693.4. SE +/- 5.69, N = 3. (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression 1.4.9 - Compression Level: 8 - Compression Speed (MB/s, More Is Better). Results: 1 = 355.1, 2 = 353.7, 3 = 350.4. SE +/- 0.92, N = 3. (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression 1.4.9 - Compression Level: 3, Long Mode - Decompression Speed (MB/s, More Is Better). Results: 1 = 2799.0, 2 = 2819.5, 3 = 2816.2. SE +/- 9.64, N = 3. (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression 1.4.9 - Compression Level: 3, Long Mode - Compression Speed (MB/s, More Is Better). Results: 1 = 609.1, 2 = 602.4, 3 = 614.4. SE +/- 1.16, N = 3. (CC) gcc options: -O3 -pthread -lz -llzma
SVT-AV1 0.8.7 - Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better). Results: 1 = 9.987, 2 = 9.943, 3 = 9.950. (CXX) g++ options: -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie
Basis Universal 1.13 - Settings: ETC1S (Seconds, Fewer Is Better). Results: 1 = 34.10, 2 = 34.28, 3 = 34.39. SE +/- 0.20, N = 3. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Botan 2.17.3 - Test: AES-256 - Decrypt (MiB/s, More Is Better). Results: 1 = 3103.88, 2 = 3100.64, 3 = 3096.62. SE +/- 5.86, N = 3. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3 - Test: AES-256 (MiB/s, More Is Better). Results: 1 = 3116.54, 2 = 3125.52, 3 = 3106.23. SE +/- 7.97, N = 3. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 129 Cells Per Direction (Seconds, Fewer Is Better). Results: 1 = 31.88, 2 = 31.72, 3 = 31.48. SE +/- 0.24, N = 3. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
SVT-AV1 0.8.7 - Encoder Mode: Preset 4 - Input: Bosphorus 1080p (Frames Per Second, More Is Better). Results: 1 = 3.065, 2 = 3.018, 3 = 2.968. (CXX) g++ options: -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie
Botan 2.17.3 - Test: ChaCha20Poly1305 - Decrypt (MiB/s, More Is Better). Results: 1 = 572.53, 2 = 573.60, 3 = 574.42. SE +/- 0.35, N = 3. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3 - Test: ChaCha20Poly1305 (MiB/s, More Is Better). Results: 1 = 578.94, 2 = 577.48, 3 = 578.17. SE +/- 0.68, N = 3. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3 - Test: Twofish - Decrypt (MiB/s, More Is Better). Results: 1 = 286.17, 2 = 286.61, 3 = 286.50. SE +/- 0.19, N = 3. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3 - Test: Twofish (MiB/s, More Is Better). Results: 1 = 288.18, 2 = 288.54, 3 = 288.54. SE +/- 0.14, N = 3. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3 - Test: Blowfish - Decrypt (MiB/s, More Is Better). Results: 1 = 345.06, 2 = 344.85, 3 = 344.92. SE +/- 0.00, N = 3. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3 - Test: Blowfish (MiB/s, More Is Better). Results: 1 = 346.41, 2 = 346.29, 3 = 346.25. SE +/- 0.06, N = 3. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3 - Test: CAST-256 - Decrypt (MiB/s, More Is Better). Results: 1 = 109.25, 2 = 109.57, 3 = 109.56. SE +/- 0.40, N = 3. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3 - Test: CAST-256 (MiB/s, More Is Better). Results: 1 = 109.39, 2 = 109.74, 3 = 109.75. SE +/- 0.44, N = 3. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3 - Test: KASUMI - Decrypt (MiB/s, More Is Better). Results: 1 = 71.95, 2 = 71.89, 3 = 71.95. SE +/- 0.03, N = 3. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan 2.17.3 - Test: KASUMI (MiB/s, More Is Better). Results: 1 = 76.00, 2 = 75.92, 3 = 75.94. SE +/- 0.04, N = 3. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
libgav1 0.16.3 - Video Input: Summer Nature 1080p (FPS, More Is Better). Results: 1 = 126.28, 2 = 125.69, 3 = 126.39. SE +/- 0.29, N = 3. (CXX) g++ options: -O3 -lpthread -lrt
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-NT (GFLOPs/s, More Is Better). Results: 1 = 26.9, 2 = 26.6, 3 = 26.7. SE +/- 0.20, N = 3. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-NN (GFLOPs/s, More Is Better). Results: 1 = 27.6, 2 = 28.0, 3 = 27.7. SE +/- 0.09, N = 3. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - dDOT (GB/s, More Is Better). Results: 1 = 38.8, 2 = 43.8, 3 = 43.8. SE +/- 4.97, N = 3. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - dAXPY (GB/s, More Is Better). Results: 1 = 36.2, 2 = 41.4, 3 = 41.4. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - sAXPY (GB/s, More Is Better). Results: 1 = 41.3, 2 = 41.2, 3 = 41.3. SE +/- 0.03, N = 3. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
dav1d 0.8.2 - Video Input: Summer Nature 4K (FPS, More Is Better). Results: 1 = 145.15 (MIN: 134.65 / MAX: 164.89), 2 = 144.78 (MIN: 133.45 / MAX: 163.18), 3 = 144.26 (MIN: 129.68 / MAX: 162.64). SE +/- 0.38, N = 3. (CC) gcc options: -pthread -lm
dav1d 0.8.2 - Video Input: Chimera 1080p (FPS, More Is Better). Results: 1 = 472.73 (MIN: 350.42 / MAX: 614.21), 2 = 475.88 (MIN: 357.72 / MAX: 609.66), 3 = 476.55 (MIN: 358.56 / MAX: 609.47). SE +/- 2.47, N = 3. (CC) gcc options: -pthread -lm
libjpeg-turbo tjbench 2.1.0 - Test: Decompression Throughput (Megapixels/sec, More Is Better). Results: 1 = 158.05, 2 = 155.49, 3 = 156.53. SE +/- 0.22, N = 3. (CC) gcc options: -O3 -rdynamic
ASTC Encoder 2.4 - Preset: Thorough (Seconds, Fewer Is Better). Results: 1 = 18.82, 2 = 18.81, 3 = 18.82. SE +/- 0.01, N = 3. (CXX) g++ options: -O3 -flto -pthread
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-TT (GFLOPs/s, More Is Better). Results: 1 = 27.9, 2 = 27.7, 3 = 27.8. SE +/- 0.13, N = 3. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - sDOT (GB/s, More Is Better). Results: 1 = 43.7, 2 = 43.7, 3 = 43.7. SE +/- 0.00, N = 3. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMV-N (GB/s, More Is Better). Results: 1 = 45.7, 2 = 45.6, 3 = 45.3. SE +/- 0.03, N = 3. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
AOM AV1 3.1 - Encoder Mode: Speed 8 Realtime - Input: Bosphorus 4K (Frames Per Second, More Is Better). Results: 1 = 27.51, 2 = 27.45, 3 = 27.43. SE +/- 0.03, N = 3. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
KTX-Software toktx 4.0 - Settings: UASTC 3 + Zstd Compression 19 (Seconds, Fewer Is Better). Results: 1 = 23.68, 2 = 23.47, 3 = 23.80. SE +/- 0.05, N = 3.
KTX-Software toktx 4.0 - Settings: Zstd Compression 19 (Seconds, Fewer Is Better). Results: 1 = 23.37, 2 = 23.26, 3 = 23.40. SE +/- 0.07, N = 3.
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-TN (GFLOPs/s, More Is Better). Results: 1 = 29.4, 2 = 30.0, 3 = 29.5. SE +/- 0.00, N = 3. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMV-T (GB/s, More Is Better). Results: 1 = 49.2, 2 = 49.1, 3 = 49.1. SE +/- 0.03, N = 3. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - sCOPY (GB/s, More Is Better). Results: 1 = 27.5, 2 = 27.4, 3 = 27.5. SE +/- 0.00, N = 3. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). Results: 1 = 6.55061 (MIN: 6.33), 2 = 6.56384 (MIN: 6.37), 3 = 6.52527 (MIN: 6.33). SE +/- 0.00898, N = 3. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). Results: 1 = 3.55190 (MIN: 3.52), 2 = 3.54188 (MIN: 3.52), 3 = 3.53866 (MIN: 3.52). SE +/- 0.01251, N = 3. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Liquid-DSP 2021.01.31 - Threads: 20 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better). Results: 1 = 452546667, 2 = 452770000, 3 = 452640000. SE +/- 79652.02, N = 3. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 - Threads: 16 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better). Results: 1 = 441220000, 2 = 444300000, 3 = 437390000. SE +/- 3448975.69, N = 3. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 - Threads: 8 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better). Results: 1 = 343806667, 2 = 343480000, 3 = 342790000. SE +/- 99554.56, N = 3. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 - Threads: 4 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better). Results: 1 = 168710000, 2 = 167210000, 3 = 168540000. SE +/- 5773.50, N = 3. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 - Threads: 2 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better). Results: 1 = 93297000, 2 = 93119000, 3 = 94009000. SE +/- 357366.76, N = 3. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 - Threads: 1 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better). Results: 1 = 49054667, 2 = 49056000, 3 = 49052000. SE +/- 3179.80, N = 3. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
libavif avifenc 0.9.0 - Encoder Speed: 6 (Seconds, Fewer Is Better). Results: 1 = 18.93, 2 = 19.23, 3 = 18.91. SE +/- 0.06, N = 3. (CXX) g++ options: -O3 -fPIC -lm
AOM AV1 3.1 - Encoder Mode: Speed 9 Realtime - Input: Bosphorus 4K (Frames Per Second, More Is Better). Results: 1 = 37.40, 2 = 37.23, 3 = 37.30. SE +/- 0.06, N = 3. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
KTX-Software toktx 4.0 - Settings: UASTC 3 (Seconds, Fewer Is Better). Results: 1 = 17.17, 2 = 17.26, 3 = 17.28. SE +/- 0.13, N = 3.
ViennaCL 1.7.1 - Test: CPU BLAS - dCOPY (GB/s, More Is Better). Results: 2 = 27.5, 3 = 27.5. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
oneDNN 2.1.2 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). Results: 1 = 4.12581 (MIN: 4.05), 2 = 4.13077 (MIN: 4.04), 3 = 4.11242 (MIN: 4.03). SE +/- 0.01500, N = 3. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). Results: 1 = 2.78165 (MIN: 2.75), 2 = 2.76679 (MIN: 2.75), 3 = 2.76696 (MIN: 2.74). SE +/- 0.00273, N = 3. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
SVT-VP9 0.3 - Tuning: VMAF Optimized - Input: Bosphorus 1080p (Frames Per Second, More Is Better). Results: 1 = 131.62, 2 = 112.13, 3 = 113.45. SE +/- 1.64, N = 12. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
Google Draco 1.4.1 - Model: Church Facade (ms, Fewer Is Better). Results: 1 = 10529, 2 = 10468, 3 = 10590. SE +/- 29.72, N = 3. (CXX) g++ options: -O3
oneDNN 2.1.2 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). Results: 1 = 3.35073 (MIN: 3.26), 2 = 3.36355 (MIN: 3.3), 3 = 3.35822 (MIN: 3.28). SE +/- 0.00231, N = 3. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). Results: 1 = 3.48130 (MIN: 3.4), 2 = 3.49530 (MIN: 3.43), 3 = 3.55843 (MIN: 3.47). SE +/- 0.00516, N = 3. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Basis Universal 1.13 - Settings: UASTC Level 0 (Seconds, Fewer Is Better). Results: 1 = 11.08, 2 = 10.87, 3 = 11.12. SE +/- 0.05, N = 3. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
AOM AV1 3.1 - Encoder Mode: Speed 8 Realtime - Input: Bosphorus 1080p (Frames Per Second, More Is Better). Results: 1 = 60.92, 2 = 61.77, 3 = 60.85. SE +/- 0.40, N = 3. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
SVT-AV1 0.8.7 - Encoder Mode: Preset 8 - Input: Bosphorus 1080p (Frames Per Second, More Is Better). Results: 1 = 37.20, 2 = 36.78, 3 = 36.85. (CXX) g++ options: -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie
oneDNN 2.1.2 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). Results: 1 = 7.46460 (MIN: 7.41), 2 = 7.50073 (MIN: 7.46), 3 = 7.51260 (MIN: 7.47). SE +/- 0.00495, N = 3. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
dav1d 0.8.2 - Video Input: Summer Nature 1080p (FPS, More Is Better). Results: 1 = 394.33 (MIN: 310.75 / MAX: 435.18), 2 = 395.79 (MIN: 309.28 / MAX: 431.68), 3 = 392.01 (MIN: 304.75 / MAX: 426.38). SE +/- 1.59, N = 3. (CC) gcc options: -pthread -lm
Google Draco 1.4.1 - Model: Lion (ms, Fewer Is Better). Results: 1 = 7235, 2 = 7316, 3 = 7335. SE +/- 67.87, N = 3. (CXX) g++ options: -O3
oneDNN 2.1.2 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). Results: 1 = 2.44319 (MIN: 2.41), 2 = 2.44752 (MIN: 2.42), 3 = 2.43904 (MIN: 2.41). SE +/- 0.00083, N = 3. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
libavif avifenc 0.9.0 - Encoder Speed: 10, Lossless (Seconds, Fewer Is Better). Results: 1 = 9.080, 2 = 9.289, 3 = 9.065. SE +/- 0.010, N = 3. (CXX) g++ options: -O3 -fPIC -lm
AOM AV1 3.1 - Encoder Mode: Speed 9 Realtime - Input: Bosphorus 1080p (Frames Per Second, More Is Better). Results: 1 = 72.30, 2 = 73.91, 3 = 72.17. SE +/- 0.31, N = 3. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
KTX-Software toktx 4.0 - Settings: Zstd Compression 9 (Seconds, Fewer Is Better). Results: 1 = 4.174, 2 = 4.102, 3 = 4.435. SE +/- 0.043, N = 8.
ASTC Encoder 2.4 - Preset: Medium (Seconds, Fewer Is Better). Results: 1 = 7.9701, 2 = 8.1210, 3 = 8.1096. SE +/- 0.0531, N = 3. (CXX) g++ options: -O3 -flto -pthread
SVT-HEVC 1.5.0 - Tuning: 7 - Input: Bosphorus 1080p (Frames Per Second, More Is Better). Results: 1 = 82.79, 2 = 82.63, 3 = 82.37. SE +/- 0.06, N = 3. (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt
oneDNN 2.1.2 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). Results: 1 = 13.77 (MIN: 13.7), 2 = 13.77 (MIN: 13.7), 3 = 13.77 (MIN: 13.71). SE +/- 0.00, N = 3. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). Results: 1 = 12.70 (MIN: 12.47), 2 = 12.69 (MIN: 12.57), 3 = 12.69 (MIN: 12.64). SE +/- 0.00, N = 3. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Helsing 1.0-beta - Digit Range: 12 digit (Seconds, Fewer Is Better). Results: 1 = 6.450, 2 = 6.446, 3 = 6.447. SE +/- 0.004, N = 3. (CC) gcc options: -O2 -pthread
Sysbench 1.0.20 - Test: RAM / Memory (MiB/sec, More Is Better). Results: 1 = 16674.21, 2 = 16808.94, 3 = 16459.64. SE +/- 67.40, N = 3. (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm
SVT-VP9 0.3 - Tuning: Visual Quality Optimized - Input: Bosphorus 1080p (Frames Per Second, More Is Better). Results: 1 = 108.08, 2 = 108.03, 3 = 108.00. SE +/- 0.18, N = 3. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
libavif avifenc 0.9.0 - Encoder Speed: 10 (Seconds, Fewer Is Better). Results: 1 = 5.375, 2 = 5.410, 3 = 5.368. SE +/- 0.015, N = 3. (CXX) g++ options: -O3 -fPIC -lm
SVT-VP9 0.3 - Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p (Frames Per Second, More Is Better). Results: 1 = 137.41, 2 = 137.02, 3 = 137.66. SE +/- 0.08, N = 3. (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
SVT-HEVC 1.5.0 - Tuning: 10 - Input: Bosphorus 1080p (Frames Per Second, More Is Better). Results: 1 = 173.18, 2 = 173.01, 3 = 173.06. SE +/- 0.32, N = 3. (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt
oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). Results: 1 = 8.39311 (MIN: 8.34), 2 = 8.39791 (MIN: 8.36), 3 = 8.40666 (MIN: 8.37). SE +/- 0.01285, N = 3. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). Results: 1 = 5.66569 (MIN: 5.57), 2 = 5.65039 (MIN: 5.56), 3 = 5.66558 (MIN: 5.58). SE +/- 0.01512, N = 3. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Phoronix Test Suite v10.8.4