Intel Core i9-10980XE GCC compiler benchmarking by Michael Larabel for a future article.
Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2107032-IB-10980XECO53
HTML result view exported from: https://openbenchmarking.org/result/2107032-IB-10980XECO53&export=pdf&sor&grs .
Intel 10980XE GCC Compiler Benchmarks

System configuration (shared by all five result identifiers; only the compiler differs):

Processor: Intel Core i9-10980XE @ 4.80GHz (18 Cores / 36 Threads)
Motherboard: ASRock X299 Steel Legend (P1.30 BIOS)
Chipset: Intel Sky Lake-E DMI3 Registers
Memory: 32GB
Disk: Samsung SSD 970 PRO 512GB
Graphics: NVIDIA NV132 11GB
Audio: Realtek ALC1220
Monitor: ASUS VP28U
Network: Intel I219-V + Intel I211
OS: Ubuntu 21.04
Kernel: 5.11.0-22-generic (x86_64)
Desktop: GNOME Shell 3.38.4
Display Server: X Server + Wayland
Display Driver: nouveau
OpenGL: 4.3 Mesa 21.0.1
Vulkan: 1.0.2
File-System: ext4
Screen Resolution: 2560x1600

Compiler, by result identifier:
GCC 8.5: GCC 8.5.0
GCC 9.4: GCC 9.4.0
GCC 10.3: GCC 10.3.0
GCC 11.1: GCC 11.1.0
GCC 12.0.0 20210701: GCC 12.0.0 20210701

Kernel Details
- Transparent Huge Pages: madvise

Environment Details
- CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"

Compiler Details
- --disable-multilib --enable-checking=release --enable-languages=c,c++

Processor Details
- Scaling Governor: intel_cpufreq schedutil
- CPU Microcode: 0x5003102

Python Details
- Python 3.9.5

Security Details
- itlb_multihit: KVM: Mitigation of VMX disabled
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling
- srbds: Not affected
- tsx_async_abort: Mitigation of TSX disabled

Intel 10980XE GCC Compiler Benchmarks graphics-magick: Sharpen financebench: Bonds OpenMP compress-zstd: 8, Long Mode - Compression Speed espeak: Text-To-Speech Synthesis botan: ChaCha20Poly1305 botan: ChaCha20Poly1305 - Decrypt financebench: Repo OpenMP graphics-magick: Swirl onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU smallpt: Global Illumination Renderer; 128 Samples mnn: resnet-v2-50 viennacl: CPU BLAS - dGEMM-NN botan: Twofish viennacl: CPU BLAS - dGEMM-TN viennacl: CPU BLAS - dGEMM-NT viennacl: CPU BLAS - dGEMM-TT botan: Twofish - Decrypt graphics-magick: Resizing tnn: CPU - MobileNet v2 graphics-magick: Rotate botan: Blowfish mrbayes: Primate Phylogeny Analysis onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU botan: Blowfish - Decrypt quantlib: ncnn: CPU-v2-v2 - mobilenet-v2 coremark: CoreMark Size 666 - Iterations Per Second compress-zstd: 19, Long Mode - Decompression Speed botan: CAST-256 - Decrypt botan: CAST-256 gcrypt: mnn: inception-v3 ncnn: CPU - googlenet tnn: CPU - SqueezeNet v2 ncnn: CPU - efficientnet-b0 compress-zstd: 19 - Decompression Speed graphics-magick: HWB Color Space ncnn: CPU - resnet18 ncnn: CPU - mnasnet vpxenc: Speed 5 - Bosphorus 4K mnn: mobilenetV3 webp: Quality 100, Highest Compression cryptopp: Unkeyed Algorithms onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Training - u8s8f32 - CPU onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU compress-zstd: 8 - Decompression Speed etcpak: DXT1 stockfish: Total Time onednn: IP Shapes 3D - u8s8f32 - CPU mnn: MobileNetV2_224 compress-zstd: 8, Long Mode - Decompression Speed svt-vp9: VMAF Optimized - Bosphorus 1080p ncnn: CPU - resnet50 graphics-magick: Noise-Gaussian liquid-dsp: 36 - 256 - 57 ncnn: CPU-v3-v3 - mobilenet-v3 mnn: SqueezeNetV1.0 cryptopp: Keyed Algorithms webp: Quality 100, Lossless, Highest Compression dav1d: Chimera 1080p 10-bit ncnn: CPU - squeezenet_ssd x265: Bosphorus 4K dav1d: Summer Nature 
4K vpxenc: Speed 0 - Bosphorus 4K viennacl: CPU BLAS - sAXPY tnn: CPU - SqueezeNet v1.1 aom-av1: Speed 6 Two-Pass - Bosphorus 4K aom-av1: Speed 9 Realtime - Bosphorus 4K ncnn: CPU - regnety_400m etcpak: ETC1 + Dithering viennacl: CPU BLAS - dGEMV-N encode-opus: WAV To Opus Encode compress-zstd: 8 - Compression Speed onednn: IP Shapes 3D - bf16bf16bf16 - CPU tachyon: Total Time liquid-dsp: 32 - 256 - 57 svt-vp9: PSNR/SSIM Optimized - Bosphorus 1080p onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU ncnn: CPU - blazeface ncnn: CPU - mobilenet onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU webp: Quality 100, Lossless viennacl: CPU BLAS - sCOPY onednn: Recurrent Neural Network Inference - u8s8f32 - CPU hmmer: Pfam Database Search botan: KASUMI ngspice: C7552 encode-mp3: WAV To MP3 compress-7zip: Compress Speed Test compress-zstd: 19, Long Mode - Compression Speed onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU etcpak: ETC2 pjsip: OPTIONS, Stateless ncnn: CPU - alexnet pjsip: INVITE c-ray: Total Time - 4K, 16 Rays Per Pixel vosk: aom-av1: Speed 8 Realtime - Bosphorus 4K botan: KASUMI - Decrypt himeno: Poisson Pressure Solver ncnn: CPU - vgg16 securemark: SecureMark-TLS graphics-magick: Enhanced blosc: blosclz mnn: mobilenet-v1-1.0 aom-av1: Speed 6 Realtime - Bosphorus 4K cryptopp: Integer + Elliptic Curve Public Key Algorithms svt-vp9: Visual Quality Optimized - Bosphorus 1080p kvazaar: Bosphorus 4K - Very Fast ncnn: CPU - shufflenet-v2 sqlite-speedtest: Timed Time - Size 1,000 encode-flac: WAV To FLAC compress-zstd: 19 - Compression Speed tjbench: Decompression Throughput svt-av1: Preset 4 - Bosphorus 4K svt-av1: Preset 8 - Bosphorus 4K viennacl: CPU BLAS - sDOT kvazaar: Bosphorus 4K - Ultra Fast pjsip: OPTIONS, Stateful svt-hevc: 1 - Bosphorus 1080p ngspice: C2670 svt-hevc: 10 - Bosphorus 1080p onednn: IP Shapes 1D - bf16bf16bf16 - CPU onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU svt-hevc: 7 - Bosphorus 1080p 
viennacl: CPU BLAS - dGEMV-T botan: AES-256 gnupg: 2.7GB Sample File Encryption tnn: CPU - DenseNet viennacl: CPU BLAS - dAXPY viennacl: CPU BLAS - dCOPY onednn: IP Shapes 1D - u8s8f32 - CPU onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU encode-wavpack: WAV To WavPack botan: AES-256 - Decrypt libgav1: Chimera 1080p 10-bit libgav1: Summer Nature 4K ncnn: CPU - yolov4-tiny mnn: squeezenetv1.1 viennacl: CPU BLAS - dDOT onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU GCC 8.5 GCC 9.4 GCC 10.3 GCC 11.1 GCC 12.0.0 20210701 197 74820.187500 335.5 26.836 984.019 977.165 42354.898437 792 0.679482 5.286 27.297 55.5 416.184 56.0 54.7 55.2 420.366 1444 321.387 809 491.138 146.096 0.441500 482.780 2586.3 5.36 618050.830009 2819.2 152.429 152.235 194.121 30.154 12.73 70.124 6.69 2676.8 903 10.54 4.82 8.22 2.403 6.642 377.780902 1573.18 1573.03 9.82858 3361.7 1445.235 51140013 1.23239 3.704 3547.4 306.08 18.33 390 916790000 4.72 5.594 714.487669 37.695 215.16 15.63 21.53 197.46 4.75 68.8 289.963 4.46 27.18 13.77 329.128 69.5 8.283 425.7 2.94903 48.1611 924630000 310.79 939.923 2.56 14.16 1.76308 17.202 45.9 937.607 126.204 99.716 126.200 8.716 99711 43.3 9.34766 198.109 135302 9.05 3307 29.818 20.721 19.40 97.644 4522.731170 36.20 259869 424 11800.7 2.433 7.51 5519.144212 247.45 21.35 5.05 57.606 8.436 60.3 218.813581 1.340 11.925 77.0 40.13 5801 12.88 134.346 374.62 5.54062 11.0180 190.03 79.9 3998.627 64.239 3505.931 57.1 38.1 0.528416 0.461459 13.355 3991.201 20.84 4.564 63.6 7.89032 265 76452.778646 375.6 28.107 951.152 945.449 42795.053385 752 0.828866 6.045 31.452 56.1 414.445 56.2 54.1 54.9 411.573 1617 347.436 776 486.315 157.332 0.467594 475.408 2568.8 5.10 650499.474888 3017.6 150.879 150.585 193.973 32.412 12.44 73.938 6.57 2802.7 864 10.58 4.73 8.61 2.297 6.560 375.231492 1639.78 1636.96 9.79185 3436.8 1450.560 50622552 1.27991 3.594 3628.4 295.07 17.59 387 921670000 4.57 5.540 719.762484 36.869 219.69 15.08 21.64 199.89 4.87 71.2 296.142 4.44 27.24 
13.87 336.106 71.8 8.241 429.4 2.86008 47.8849 930236667 302.25 961.383 2.54 13.94 1.79150 16.823 47.1 960.128 126.506 100.411 127.688 8.525 98298 43.5 9.55455 199.097 135903 9.15 3252 30.035 20.993 19.34 98.557 4609.398563 36.01 263457 424 11802.6 2.455 7.43 5538.547039 243.48 21.20 4.97 57.034 8.500 60.2 220.617787 1.355 12.021 78.0 40.16 5729 12.91 134.222 376.89 5.59533 10.9596 190.62 79.7 3987.080 64.228 3527.682 57.4 38.1 0.526798 0.459456 13.375 3993.890 21.35 4.420 63.7 8.10038 317 49799.115885 370.1 32.672 788.573 780.123 35458.662761 761 0.680522 6.130 28.255 58.7 404.157 58.9 56.4 57.1 411.522 1585 311.604 765 486.288 154.629 0.426902 474.804 2529.8 5.12 630485.588510 2876.9 151.248 151.184 193.411 30.804 13.06 69.297 6.62 2701.9 864 11.16 4.80 8.59 2.341 6.866 372.583041 1566.75 1565.41 9.39062 3285.5 1484.230 49942276 1.22703 3.669 3479.0 299.97 17.87 403 940660000 4.64 5.612 717.165559 37.869 222.27 15.26 21.84 195.15 4.84 70.3 286.215 4.48 27.86 13.63 329.486 71.6 8.283 424.0 2.93978 47.9045 939716667 306.89 938.220 2.61 13.81 1.75520 17.251 46.4 935.962 126.687 98.200 126.877 8.730 97426 43.9 9.35095 197.629 138222 9.18 3281 30.430 20.753 19.28 96.879 4538.661961 36.35 263472 427 11713.4 2.467 7.38 5503.078606 246.75 21.09 5.02 57.303 8.369 60.1 219.542799 1.359 12.062 77.4 40.64 5744 13.04 133.912 372.69 5.53334 10.9199 189.88 79.6 3999.248 64.424 3508.290 57.3 38.3 0.526014 0.460482 13.340 3998.479 21.26 4.271 54.7 7.90163 319 48802.755208 469.4 35.086 779.376 774.645 34558.223958 924 0.697080 6.201 28.445 51.0 367.677 51.9 49.8 50.5 373.562 1571 314.561 852 442.670 142.934 0.430676 439.555 2749.1 5.06 597455.160812 2775.7 140.850 141.031 208.343 31.275 12.84 69.802 6.56 2642.2 916 10.97 4.72 8.63 2.412 6.780 360.414192 1566.25 1563.05 9.41346 3351.9 1468.968 52206963 1.23012 3.754 3553.2 297.32 17.77 403 954536667 4.66 5.723 692.986399 36.471 223.06 15.23 21.16 192.95 4.87 70.7 288.910 4.33 28.07 13.93 327.329 71.7 8.456 419.5 2.93428 47.8659 
951170000 305.72 936.880 2.55 13.79 1.76264 16.803 46.7 938.043 126.616 100.697 129.396 8.732 98149 44.3 9.35550 194.759 136444 9.08 3240 29.960 20.894 19.64 97.001 4592.947401 36.56 259565 432 11926.8 2.477 7.51 5593.130439 244.05 21.01 5.05 57.411 8.411 61.0 218.618162 1.347 12.090 77.6 40.31 5769 12.96 134.711 375.65 5.54544 10.9199 191.02 79.3 3985.266 64.204 3508.390 57.4 38.3 0.528133 0.459956 13.353 3995.232 21.37 4.609 63.7 7.90477 319 48317.579427 385.5 27.281 781.383 775.488 34223.303386 903 0.678626 5.991 28.272 51.2 366.648 51.9 49.8 51.5 374.707 1607 318.056 794 442.265 145.174 0.425675 442.410 2773.6 4.89 601830.263292 2782.7 140.480 140.309 196.222 30.797 12.21 69.231 6.27 2773.9 884 10.93 4.58 8.65 2.372 6.548 374.640795 1564.71 1567.41 9.59452 3332.7 1419.365 50734571 1.22496 3.729 3531.4 293.71 17.74 403 937553333 4.54 5.506 714.205269 37.161 221.94 15.28 21.08 194.06 4.92 70.1 287.490 4.40 28.10 13.48 325.278 71.4 8.186 432.7 2.94670 49.3237 944433333 303.54 935.378 2.54 13.81 1.74471 16.892 46.4 936.491 129.454 100.609 128.320 8.599 98493 43.9 9.34206 197.128 137966 8.99 3304 29.973 20.574 19.67 96.610 4580.188073 36.70 264514 429 11889.4 2.477 7.49 5532.467940 244.56 21.15 4.98 56.715 8.379 60.8 217.425816 1.348 11.966 77.1 40.20 5763 12.94 135.428 375.32 5.53672 10.9330 191.33 79.5 3972.020 64.613 3524.746 57.2 38.2 0.527939 0.459660 13.331 3993.173 21.32 28.24 22.83 4.283 63.7 7.91258 OpenBenchmarking.org
GraphicsMagick 1.3.33
Operation: Sharpen
Iterations Per Minute, More Is Better
  GCC 12.0.0 20210701: 319
  GCC 11.1: 319
  GCC 10.3: 317
  GCC 9.4: 265
  GCC 8.5: 197
SE +/- 0.33, N = 3 (the export lists a single standard error for this chart and does not say which result it belongs to)
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -lz -lm -lpthread

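The raw figures above are easier to compare as percentages relative to one baseline. The following is a minimal sketch, not part of the Phoronix Test Suite: the function name relative_gain and the choice of GCC 8.5 as the baseline are assumptions made for illustration. The higher_is_better flag covers charts marked "Fewer Is Better" as well.

```python
def relative_gain(baseline, candidate, higher_is_better=True):
    """Return the candidate's improvement over the baseline, in percent.

    For "More Is Better" results the ratio is candidate / baseline;
    for "Fewer Is Better" results it is inverted so that a positive
    number always means the candidate did better.
    """
    if higher_is_better:
        ratio = candidate / baseline
    else:
        ratio = baseline / candidate
    return (ratio - 1.0) * 100.0


# GraphicsMagick Sharpen results (Iterations Per Minute) from the chart above.
sharpen = {
    "GCC 8.5": 197,
    "GCC 9.4": 265,
    "GCC 10.3": 317,
    "GCC 11.1": 319,
    "GCC 12.0.0 20210701": 319,
}

baseline = sharpen["GCC 8.5"]  # assumed baseline: the oldest compiler tested
for name, value in sharpen.items():
    print(f"{name}: {relative_gain(baseline, value):+.1f}% vs GCC 8.5")
```

For a timing chart such as FinanceBench, pass higher_is_better=False so that a shorter run time is reported as a positive gain.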
FinanceBench 2016-07-25
Benchmark: Bonds OpenMP
ms, Fewer Is Better
  GCC 12.0.0 20210701: 48317.58 (SE +/- 48.36, N = 3)
  GCC 11.1: 48802.76 (SE +/- 36.89, N = 3)
  GCC 10.3: 49799.12 (SE +/- 11.94, N = 3)
  GCC 8.5: 74820.19 (SE +/- 5.53, N = 3)
  GCC 9.4: 76452.78 (SE +/- 1061.39, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp

Zstd Compression 1.5.0
Compression Level: 8, Long Mode - Compression Speed
MB/s, More Is Better
  GCC 11.1: 469.4 (SE +/- 3.45, N = 3)
  GCC 12.0.0 20210701: 385.5 (SE +/- 3.88, N = 15)
  GCC 9.4: 375.6 (SE +/- 2.56, N = 13)
  GCC 10.3: 370.1 (SE +/- 3.28, N = 15)
  GCC 8.5: 335.5 (SE +/- 3.63, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz

eSpeak-NG Speech Engine 20200907
Text-To-Speech Synthesis
Seconds, Fewer Is Better
  GCC 8.5: 26.84 (SE +/- 0.19, N = 4)
  GCC 12.0.0 20210701: 27.28 (SE +/- 0.18, N = 4)
  GCC 9.4: 28.11 (SE +/- 0.23, N = 4)
  GCC 10.3: 32.67 (SE +/- 0.09, N = 4)
  GCC 11.1: 35.09 (SE +/- 0.16, N = 4)
1. (CC) gcc options: -O3 -march=native -std=c99

Botan 2.17.3
Test: ChaCha20Poly1305
MiB/s, More Is Better
  GCC 8.5: 984.02 (SE +/- 2.79, N = 3)
  GCC 9.4: 951.15 (SE +/- 1.36, N = 3)
  GCC 10.3: 788.57 (SE +/- 0.15, N = 3)
  GCC 12.0.0 20210701: 781.38 (SE +/- 0.73, N = 3)
  GCC 11.1: 779.38 (SE +/- 1.05, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan 2.17.3
Test: ChaCha20Poly1305 - Decrypt
MiB/s, More Is Better
  GCC 8.5: 977.17 (SE +/- 2.38, N = 3)
  GCC 9.4: 945.45 (SE +/- 0.08, N = 3)
  GCC 10.3: 780.12 (SE +/- 0.79, N = 3)
  GCC 12.0.0 20210701: 775.49 (SE +/- 0.47, N = 3)
  GCC 11.1: 774.65 (SE +/- 0.89, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

FinanceBench 2016-07-25
Benchmark: Repo OpenMP
ms, Fewer Is Better
  GCC 12.0.0 20210701: 34223.30 (SE +/- 22.03, N = 3)
  GCC 11.1: 34558.22 (SE +/- 43.71, N = 3)
  GCC 10.3: 35458.66 (SE +/- 120.35, N = 3)
  GCC 8.5: 42354.90 (SE +/- 34.41, N = 3)
  GCC 9.4: 42795.05 (SE +/- 10.60, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp

GraphicsMagick 1.3.33
Operation: Swirl
Iterations Per Minute, More Is Better
  GCC 11.1: 924
  GCC 12.0.0 20210701: 903
  GCC 8.5: 792
  GCC 10.3: 761
  GCC 9.4: 752
SE +/- 1.20, N = 3; SE +/- 0.33, N = 3; SE +/- 0.33, N = 3 (the export lists three standard errors for five results and does not say which results they belong to)
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -lz -lm -lpthread

oneDNN 2.1.2
Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  GCC 12.0.0 20210701: 0.678626 (SE +/- 0.008934, N = 3; MIN: 0.65)
  GCC 8.5: 0.679482 (SE +/- 0.003298, N = 3; MIN: 0.66)
  GCC 10.3: 0.680522 (SE +/- 0.005954, N = 8; MIN: 0.63)
  GCC 11.1: 0.697080 (SE +/- 0.005568, N = 3; MIN: 0.67)
  GCC 9.4: 0.828866 (SE +/- 0.008442, N = 15; MIN: 0.74)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

Smallpt 1.0
Global Illumination Renderer; 128 Samples
Seconds, Fewer Is Better
  GCC 8.5: 5.286 (SE +/- 0.007, N = 3)
  GCC 12.0.0 20210701: 5.991 (SE +/- 0.017, N = 3)
  GCC 9.4: 6.045 (SE +/- 0.011, N = 3)
  GCC 10.3: 6.130 (SE +/- 0.033, N = 3)
  GCC 11.1: 6.201 (SE +/- 0.004, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -march=native

Mobile Neural Network 1.2
Model: resnet-v2-50
ms, Fewer Is Better
  GCC 8.5: 27.30 (SE +/- 0.10, N = 3; MIN: 26.86 / MAX: 27.93)
  GCC 10.3: 28.26 (SE +/- 0.24, N = 3; MIN: 27.6 / MAX: 28.76)
  GCC 12.0.0 20210701: 28.27 (SE +/- 0.19, N = 3; MIN: 27.7 / MAX: 28.91)
  GCC 11.1: 28.45 (SE +/- 0.06, N = 3; MIN: 27.77 / MAX: 28.83)
  GCC 9.4: 31.45 (SE +/- 0.47, N = 15; MIN: 24.41 / MAX: 36.24)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NN
GFLOPs/s, More Is Better
  GCC 10.3: 58.7 (SE +/- 0.63, N = 3)
  GCC 9.4: 56.1 (SE +/- 0.13, N = 3)
  GCC 8.5: 55.5 (SE +/- 0.75, N = 3)
  GCC 12.0.0 20210701: 51.2 (SE +/- 0.44, N = 3)
  GCC 11.1: 51.0 (SE +/- 0.15, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp -rdynamic -lOpenCL

Botan 2.17.3
Test: Twofish
MiB/s, More Is Better
  GCC 8.5: 416.18 (SE +/- 0.33, N = 3)
  GCC 9.4: 414.45 (SE +/- 0.12, N = 3)
  GCC 10.3: 404.16 (SE +/- 0.23, N = 3)
  GCC 11.1: 367.68 (SE +/- 1.13, N = 3)
  GCC 12.0.0 20210701: 366.65 (SE +/- 0.30, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TN
GFLOPs/s, More Is Better
  GCC 10.3: 58.9 (SE +/- 1.17, N = 3)
  GCC 9.4: 56.2 (SE +/- 0.36, N = 3)
  GCC 8.5: 56.0 (SE +/- 1.55, N = 2)
  GCC 12.0.0 20210701: 51.9 (SE +/- 0.10, N = 3)
  GCC 11.1: 51.9 (SE +/- 0.10, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NT
GFLOPs/s, More Is Better
  GCC 10.3: 56.4 (SE +/- 0.48, N = 3)
  GCC 8.5: 54.7 (SE +/- 0.25, N = 3)
  GCC 9.4: 54.1 (SE +/- 0.27, N = 3)
  GCC 12.0.0 20210701: 49.8 (SE +/- 0.37, N = 3)
  GCC 11.1: 49.8 (SE +/- 0.15, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TT
GFLOPs/s, More Is Better
  GCC 10.3: 57.1 (SE +/- 0.70, N = 2)
  GCC 8.5: 55.2 (SE +/- 0.32, N = 3)
  GCC 9.4: 54.9 (SE +/- 0.07, N = 3)
  GCC 12.0.0 20210701: 51.5 (SE +/- 0.09, N = 3)
  GCC 11.1: 50.5 (SE +/- 0.20, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp -rdynamic -lOpenCL

Botan 2.17.3
Test: Twofish - Decrypt
MiB/s, More Is Better
  GCC 8.5: 420.37 (SE +/- 0.07, N = 3)
  GCC 9.4: 411.57 (SE +/- 0.05, N = 3)
  GCC 10.3: 411.52 (SE +/- 0.26, N = 3)
  GCC 12.0.0 20210701: 374.71 (SE +/- 0.17, N = 3)
  GCC 11.1: 373.56 (SE +/- 0.56, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

GraphicsMagick 1.3.33
Operation: Resizing
Iterations Per Minute, More Is Better
  GCC 9.4: 1617 (SE +/- 4.93, N = 3)
  GCC 12.0.0 20210701: 1607 (SE +/- 7.21, N = 3)
  GCC 10.3: 1585 (SE +/- 8.67, N = 3)
  GCC 11.1: 1571 (SE +/- 3.51, N = 3)
  GCC 8.5: 1444 (SE +/- 2.91, N = 3)
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -lz -lm -lpthread

TNN 0.3
Target: CPU - Model: MobileNet v2
ms, Fewer Is Better
  GCC 10.3: 311.60 (SE +/- 0.31, N = 3; MIN: 309.73 / MAX: 322.67)
  GCC 11.1: 314.56 (SE +/- 0.19, N = 3; MIN: 312.66 / MAX: 328.44)
  GCC 12.0.0 20210701: 318.06 (SE +/- 0.17, N = 3; MIN: 316.44 / MAX: 326.16)
  GCC 8.5: 321.39 (SE +/- 0.29, N = 3; MIN: 319.29 / MAX: 341.28)
  GCC 9.4: 347.44 (SE +/- 0.34, N = 3; MIN: 345.68 / MAX: 356.59)
1. (CXX) g++ options: -O3 -march=native -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

GraphicsMagick 1.3.33
Operation: Rotate
Iterations Per Minute, More Is Better
  GCC 11.1: 852 (SE +/- 2.52, N = 3)
  GCC 8.5: 809 (SE +/- 7.07, N = 15)
  GCC 12.0.0 20210701: 794 (SE +/- 7.25, N = 15)
  GCC 9.4: 776 (SE +/- 3.51, N = 3)
  GCC 10.3: 765 (SE +/- 5.24, N = 3)
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -lz -lm -lpthread

Botan 2.17.3
Test: Blowfish
MiB/s, More Is Better
  GCC 8.5: 491.14 (SE +/- 0.01, N = 3)
  GCC 9.4: 486.32 (SE +/- 0.29, N = 3)
  GCC 10.3: 486.29 (SE +/- 0.22, N = 3)
  GCC 11.1: 442.67 (SE +/- 0.12, N = 3)
  GCC 12.0.0 20210701: 442.27 (SE +/- 0.07, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Timed MrBayes Analysis 3.2.7
Primate Phylogeny Analysis
Seconds, Fewer Is Better
  GCC 11.1: 142.93 (SE +/- 0.32, N = 3)
  GCC 12.0.0 20210701: 145.17 (SE +/- 1.02, N = 3)
  GCC 8.5: 146.10 (SE +/- 0.45, N = 3)
  GCC 10.3: 154.63 (SE +/- 0.58, N = 3)
  GCC 9.4: 157.33 (SE +/- 1.65, N = 12)
1. (CC) gcc options: -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -maes -mavx -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mrdrnd -mbmi -mbmi2 -madx -mmpx -mabm -O3 -std=c99 -pedantic -march=native -lm

oneDNN 2.1.2
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  GCC 12.0.0 20210701: 0.425675 (SE +/- 0.000391, N = 3; MIN: 0.41)
  GCC 10.3: 0.426902 (SE +/- 0.004576, N = 4; MIN: 0.4)
  GCC 11.1: 0.430676 (SE +/- 0.004665, N = 3; MIN: 0.41)
  GCC 8.5: 0.441500 (SE +/- 0.003487, N = 3; MIN: 0.41)
  GCC 9.4: 0.467594 (SE +/- 0.002588, N = 3; MIN: 0.44)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

Botan 2.17.3
Test: Blowfish - Decrypt
MiB/s, More Is Better
  GCC 8.5: 482.78 (SE +/- 0.04, N = 3)
  GCC 9.4: 475.41 (SE +/- 0.21, N = 3)
  GCC 10.3: 474.80 (SE +/- 0.30, N = 3)
  GCC 12.0.0 20210701: 442.41 (SE +/- 0.07, N = 3)
  GCC 11.1: 439.56 (SE +/- 0.13, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

QuantLib 1.21
MFLOPS, More Is Better
  GCC 12.0.0 20210701: 2773.6 (SE +/- 0.85, N = 3)
  GCC 11.1: 2749.1 (SE +/- 33.86, N = 4)
  GCC 8.5: 2586.3 (SE +/- 19.67, N = 3)
  GCC 9.4: 2568.8 (SE +/- 35.29, N = 3)
  GCC 10.3: 2529.8 (SE +/- 18.68, N = 3)
1. (CXX) g++ options: -O3 -march=native -rdynamic

NCNN 20210525
Target: CPU-v2-v2 - Model: mobilenet-v2
ms, Fewer Is Better
  GCC 12.0.0 20210701: 4.89 (SE +/- 0.04, N = 3; MIN: 4.72 / MAX: 10.07)
  GCC 11.1: 5.06 (SE +/- 0.10, N = 3; MIN: 4.72 / MAX: 10.01)
  GCC 9.4: 5.10 (SE +/- 0.03, N = 3; MIN: 4.73 / MAX: 10.4)
  GCC 10.3: 5.12 (SE +/- 0.01, N = 3; MIN: 4.76 / MAX: 8.89)
  GCC 8.5: 5.36 (SE +/- 0.03, N = 3; MIN: 4.96 / MAX: 8.4)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread -pthread

Coremark 1.0
CoreMark Size 666 - Iterations Per Second
Iterations/Sec, More Is Better
  GCC 9.4: 650499.47 (SE +/- 1621.16, N = 3)
  GCC 10.3: 630485.59 (SE +/- 2003.80, N = 3)
  GCC 8.5: 618050.83 (SE +/- 3624.80, N = 3)
  GCC 12.0.0 20210701: 601830.26 (SE +/- 2406.19, N = 3)
  GCC 11.1: 597455.16 (SE +/- 1267.23, N = 3)
1. (CC) gcc options: -O2 -O3 -march=native -lrt" -lrt

Zstd Compression 1.5.0
Compression Level: 19, Long Mode - Decompression Speed
MB/s, More Is Better
  GCC 9.4: 3017.6 (SE +/- 4.29, N = 3)
  GCC 10.3: 2876.9 (SE +/- 2.75, N = 3)
  GCC 8.5: 2819.2 (SE +/- 10.70, N = 3)
  GCC 12.0.0 20210701: 2782.7 (SE +/- 2.25, N = 3)
  GCC 11.1: 2775.7 (SE +/- 14.84, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz

Botan 2.17.3
Test: CAST-256 - Decrypt
MiB/s, More Is Better
  GCC 8.5: 152.43 (SE +/- 0.01, N = 3)
  GCC 10.3: 151.25 (SE +/- 0.06, N = 3)
  GCC 9.4: 150.88 (SE +/- 0.03, N = 3)
  GCC 11.1: 140.85 (SE +/- 0.50, N = 3)
  GCC 12.0.0 20210701: 140.48 (SE +/- 0.28, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan 2.17.3
Test: CAST-256
MiB/s, More Is Better
  GCC 8.5: 152.24 (SE +/- 0.01, N = 3)
  GCC 10.3: 151.18 (SE +/- 0.04, N = 3)
  GCC 9.4: 150.59 (SE +/- 0.03, N = 3)
  GCC 11.1: 141.03 (SE +/- 0.02, N = 3)
  GCC 12.0.0 20210701: 140.31 (SE +/- 0.30, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Gcrypt Library 1.9
Seconds, Fewer Is Better
  GCC 10.3: 193.41 (SE +/- 0.44, N = 3)
  GCC 9.4: 193.97 (SE +/- 0.21, N = 3)
  GCC 8.5: 194.12 (SE +/- 0.27, N = 3)
  GCC 12.0.0 20210701: 196.22 (SE +/- 0.33, N = 3)
  GCC 11.1: 208.34 (SE +/- 0.18, N = 3)
1. (CC) gcc options: -O3 -march=native -fvisibility=hidden -lgpg-error

Mobile Neural Network 1.2
Model: inception-v3
ms, Fewer Is Better
  GCC 8.5: 30.15 (SE +/- 0.10, N = 3; MIN: 29.77 / MAX: 30.53)
  GCC 12.0.0 20210701: 30.80 (SE +/- 0.43, N = 3; MIN: 30.14 / MAX: 31.87)
  GCC 10.3: 30.80 (SE +/- 0.42, N = 3; MIN: 30.12 / MAX: 31.88)
  GCC 11.1: 31.28 (SE +/- 0.42, N = 3; MIN: 30.28 / MAX: 31.97)
  GCC 9.4: 32.41 (SE +/- 0.28, N = 15; MIN: 29.17 / MAX: 33.89)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

NCNN 20210525
Target: CPU - Model: googlenet
ms, Fewer Is Better
  GCC 12.0.0 20210701: 12.21 (SE +/- 0.31, N = 3; MIN: 11.77 / MAX: 13.02)
  GCC 9.4: 12.44 (SE +/- 0.28, N = 3; MIN: 12 / MAX: 13.22)
  GCC 8.5: 12.73 (SE +/- 0.23, N = 3; MIN: 12.1 / MAX: 19.86)
  GCC 11.1: 12.84 (SE +/- 0.02, N = 3; MIN: 12.68 / MAX: 16.73)
  GCC 10.3: 13.06 (SE +/- 0.08, N = 3; MIN: 12.84 / MAX: 14.29)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread -pthread

TNN 0.3
Target: CPU - Model: SqueezeNet v2
ms, Fewer Is Better
  GCC 12.0.0 20210701: 69.23 (SE +/- 0.03, N = 3; MIN: 68.59 / MAX: 70.34)
  GCC 10.3: 69.30 (SE +/- 0.07, N = 3; MIN: 68.65 / MAX: 70.61)
  GCC 11.1: 69.80 (SE +/- 0.02, N = 3; MIN: 69.16 / MAX: 70.99)
  GCC 8.5: 70.12 (SE +/- 0.03, N = 3; MIN: 69.44 / MAX: 71.63)
  GCC 9.4: 73.94 (SE +/- 1.05, N = 3; MIN: 72.33 / MAX: 77.08)
1. (CXX) g++ options: -O3 -march=native -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

NCNN 20210525
Target: CPU - Model: efficientnet-b0
ms, Fewer Is Better
  GCC 12.0.0 20210701: 6.27 (SE +/- 0.09, N = 3; MIN: 6.05 / MAX: 14.37)
  GCC 11.1: 6.56 (SE +/- 0.02, N = 3; MIN: 6.27 / MAX: 11.76)
  GCC 9.4: 6.57 (SE +/- 0.01, N = 3; MIN: 6.28 / MAX: 14.64)
  GCC 10.3: 6.62 (SE +/- 0.11, N = 3; MIN: 6.24 / MAX: 24.41)
  GCC 8.5: 6.69 (SE +/- 0.03, N = 3; MIN: 6.33 / MAX: 10.85)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread -pthread

Zstd Compression 1.5.0
Compression Level: 19 - Decompression Speed
MB/s, More Is Better
  GCC 9.4: 2802.7 (SE +/- 14.28, N = 3)
  GCC 12.0.0 20210701: 2773.9 (SE +/- 16.86, N = 3)
  GCC 10.3: 2701.9 (SE +/- 9.76, N = 3)
  GCC 8.5: 2676.8 (SE +/- 3.24, N = 3)
  GCC 11.1: 2642.2 (SE +/- 7.69, N = 9)
1. (CC) gcc options: -O3 -march=native -pthread -lz

GraphicsMagick 1.3.33
Operation: HWB Color Space
Iterations Per Minute, More Is Better
  GCC 11.1: 916
  GCC 8.5: 903
  GCC 12.0.0 20210701: 884
  GCC 10.3: 864
  GCC 9.4: 864
SE +/- 1.15, N = 3; SE +/- 0.88, N = 3; SE +/- 1.33, N = 3 (the export lists three standard errors for five results and does not say which results they belong to)
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -lz -lm -lpthread

NCNN 20210525
Target: CPU - Model: resnet18
ms, Fewer Is Better
  GCC 8.5: 10.54 (SE +/- 0.23, N = 3; MIN: 10.19 / MAX: 17.98)
  GCC 9.4: 10.58 (SE +/- 0.28, N = 3; MIN: 10.2 / MAX: 11.57)
  GCC 12.0.0 20210701: 10.93 (SE +/- 0.01, N = 3; MIN: 10.84 / MAX: 11.27)
  GCC 11.1: 10.97 (SE +/- 0.04, N = 3; MIN: 10.84 / MAX: 20.49)
  GCC 10.3: 11.16 (SE +/- 0.04, N = 3; MIN: 11.03 / MAX: 11.45)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread -pthread

NCNN 20210525
Target: CPU - Model: mnasnet
ms, Fewer Is Better
  GCC 12.0.0 20210701: 4.58 (SE +/- 0.05, N = 3; MIN: 4.39 / MAX: 10.87)
  GCC 11.1: 4.72 (SE +/- 0.03, N = 3; MIN: 4.46 / MAX: 10.72)
  GCC 9.4: 4.73 (SE +/- 0.02, N = 3; MIN: 4.44 / MAX: 11.54)
  GCC 10.3: 4.80 (SE +/- 0.10, N = 3; MIN: 4.42 / MAX: 16.22)
  GCC 8.5: 4.82 (SE +/- 0.02, N = 3; MIN: 4.55 / MAX: 12.4)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread -pthread

VP9 libvpx Encoding 1.10.0
Speed: Speed 5 - Input: Bosphorus 4K
Frames Per Second, More Is Better
  GCC 12.0.0 20210701: 8.65 (SE +/- 0.01, N = 3)
  GCC 11.1: 8.63 (SE +/- 0.01, N = 3)
  GCC 9.4: 8.61 (SE +/- 0.01, N = 3)
  GCC 10.3: 8.59 (SE +/- 0.01, N = 3)
  GCC 8.5: 8.22 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -m64 -lm -lpthread -O3 -march=native -fPIC -U_FORTIFY_SOURCE -std=gnu++11

Mobile Neural Network 1.2
Model: mobilenetV3
ms, Fewer Is Better
  GCC 9.4: 2.297 (SE +/- 0.015, N = 15; MIN: 1.96 / MAX: 2.53)
  GCC 10.3: 2.341 (SE +/- 0.011, N = 3; MIN: 2.16 / MAX: 2.53)
  GCC 12.0.0 20210701: 2.372 (SE +/- 0.011, N = 3; MIN: 2.25 / MAX: 2.5)
  GCC 8.5: 2.403 (SE +/- 0.012, N = 3; MIN: 2.28 / MAX: 2.54)
  GCC 11.1: 2.412 (SE +/- 0.032, N = 3; MIN: 2.23 / MAX: 2.61)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

WebP Image Encode 1.1
Encode Settings: Quality 100, Highest Compression
Encode Time - Seconds, Fewer Is Better
  GCC 12.0.0 20210701: 6.548 (SE +/- 0.043, N = 3)
  GCC 9.4: 6.560 (SE +/- 0.007, N = 3)
  GCC 8.5: 6.642 (SE +/- 0.007, N = 3)
  GCC 11.1: 6.780 (SE +/- 0.002, N = 3)
  GCC 10.3: 6.866 (SE +/- 0.003, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -pthread -lm -ljpeg -lpng16

Crypto++ 8.2
Test: Unkeyed Algorithms
MiB/second, More Is Better
  GCC 8.5: 377.78 (SE +/- 0.02, N = 3)
  GCC 9.4: 375.23 (SE +/- 0.65, N = 3)
  GCC 12.0.0 20210701: 374.64 (SE +/- 0.68, N = 3)
  GCC 10.3: 372.58 (SE +/- 0.17, N = 3)
  GCC 11.1: 360.41 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -O3 -march=native -fPIC -pthread -pipe

oneDNN 2.1.2
Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
  GCC 12.0.0 20210701: 1564.71 (SE +/- 1.54, N = 3; MIN: 1557.51)
  GCC 11.1: 1566.25 (SE +/- 3.72, N = 3; MIN: 1553.36)
  GCC 10.3: 1566.75 (SE +/- 1.06, N = 3; MIN: 1558.77)
  GCC 8.5: 1573.18 (SE +/- 0.81, N = 3; MIN: 1566.72)
  GCC 9.4: 1639.78 (SE +/- 1.53, N = 3; MIN: 1630.86)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

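Several results in this chart sit within a few milliseconds of each other, so it helps to weigh each gap against the reported standard errors. The following is a rough sketch of one such check, not something the Phoronix Test Suite itself reports: the function name separation is hypothetical, and pairing each SE with a compiler assumes the export lists the standard errors in the same order as the results. With N = 3 per result this is only a coarse screen, not a formal significance test.

```python
import math


def separation(mean_a, se_a, mean_b, se_b):
    """Gap between two means, expressed in units of their combined standard error."""
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


# (mean in ms, standard error) pairs copied from the chart above.
gcc12 = (1564.71, 1.54)
gcc11 = (1566.25, 3.72)
gcc94 = (1639.78, 1.53)

# A small value suggests the two results are indistinguishable from run-to-run noise;
# a large value suggests a real difference between the compilers.
print(f"GCC 12 vs GCC 11.1: {separation(*gcc12, *gcc11):.1f} combined SEs")
print(f"GCC 12 vs GCC 9.4:  {separation(*gcc12, *gcc94):.1f} combined SEs")
```

Read this way, the GCC 12 and GCC 11.1 results differ by well under one combined standard error, while GCC 9.4 is separated from both by dozens.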
oneDNN 2.1.2
Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  GCC 11.1: 1563.05 (SE +/- 2.29, N = 3; MIN: 1555.64)
  GCC 10.3: 1565.41 (SE +/- 0.93, N = 3; MIN: 1559.12)
  GCC 12.0.0 20210701: 1567.41 (SE +/- 0.75, N = 3; MIN: 1561.17)
  GCC 8.5: 1573.03 (SE +/- 1.09, N = 3; MIN: 1566.77)
  GCC 9.4: 1636.96 (SE +/- 2.09, N = 3; MIN: 1629.18)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU
  ms, Fewer Is Better
  GCC 10.3: 9.39062 (SE +/- 0.02137, N = 3; MIN: 9.29)
  GCC 11.1: 9.41346 (SE +/- 0.01742, N = 3; MIN: 9.27)
  GCC 12.0.0 20210701: 9.59452 (SE +/- 0.02038, N = 3; MIN: 9.4)
  GCC 9.4: 9.79185 (SE +/- 0.00795, N = 3; MIN: 9.62)
  GCC 8.5: 9.82858 (SE +/- 0.02091, N = 3; MIN: 9.53)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Zstd Compression 1.5.0 - Compression Level: 8 - Decompression Speed
  MB/s, More Is Better
  GCC 9.4: 3436.8 (SE +/- 2.27, N = 5)
  GCC 8.5: 3361.7 (SE +/- 2.28, N = 3)
  GCC 11.1: 3351.9 (SE +/- 5.37, N = 3)
  GCC 12.0.0 20210701: 3332.7 (SE +/- 2.91, N = 3)
  GCC 10.3: 3285.5 (SE +/- 3.06, N = 3)
  1. (CC) gcc options: -O3 -march=native -pthread -lz
Etcpak 0.7 - Configuration: DXT1
  Mpx/s, More Is Better
  GCC 10.3: 1484.23 (SE +/- 0.37, N = 3)
  GCC 11.1: 1468.97 (SE +/- 0.47, N = 3)
  GCC 9.4: 1450.56 (SE +/- 0.45, N = 3)
  GCC 8.5: 1445.24 (SE +/- 1.51, N = 3)
  GCC 12.0.0 20210701: 1419.37 (SE +/- 1.30, N = 3)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
Stockfish 13 - Total Time
  Nodes Per Second, More Is Better
  GCC 11.1: 52206963 (SE +/- 211973.29, N = 3)
  GCC 8.5: 51140013 (SE +/- 562135.06, N = 15)
  GCC 12.0.0 20210701: 50734571 (SE +/- 432778.20, N = 8)
  GCC 9.4: 50622552 (SE +/- 508293.85, N = 15)
  GCC 10.3: 49942276 (SE +/- 623930.99, N = 3)
  1. (CXX) g++ options: -lgcov -m64 -lpthread -O3 -march=native -fno-exceptions -std=c++17 -pedantic -msse -msse3 -mpopcnt -mavx2 -mavx512f -mavx512bw -mavx512vnni -mavx512dq -mavx512vl -msse4.1 -mssse3 -msse2 -mbmi2 -flto -fprofile-use -fno-peel-loops -fno-tracer -flto=jobserver
oneDNN 2.1.2 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  GCC 12.0.0 20210701: 1.22496 (SE +/- 0.00255, N = 3; MIN: 1.18)
  GCC 10.3: 1.22703 (SE +/- 0.00554, N = 3; MIN: 1.18)
  GCC 11.1: 1.23012 (SE +/- 0.00400, N = 3; MIN: 1.19)
  GCC 8.5: 1.23239 (SE +/- 0.00683, N = 3; MIN: 1.18)
  GCC 9.4: 1.27991 (SE +/- 0.00348, N = 3; MIN: 1.23)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Mobile Neural Network 1.2 - Model: MobileNetV2_224
  ms, Fewer Is Better
  GCC 9.4: 3.594 (SE +/- 0.040, N = 15; MIN: 3.07 / MAX: 4.08)
  GCC 10.3: 3.669 (SE +/- 0.090, N = 3; MIN: 3.42 / MAX: 4.19)
  GCC 8.5: 3.704 (SE +/- 0.069, N = 3; MIN: 3.31 / MAX: 3.95)
  GCC 12.0.0 20210701: 3.729 (SE +/- 0.058, N = 3; MIN: 3.41 / MAX: 3.94)
  GCC 11.1: 3.754 (SE +/- 0.018, N = 3; MIN: 3.47 / MAX: 3.95)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Zstd Compression 1.5.0 - Compression Level: 8, Long Mode - Decompression Speed
  MB/s, More Is Better
  GCC 9.4: 3628.4 (SE +/- 2.52, N = 13)
  GCC 11.1: 3553.2 (SE +/- 3.35, N = 3)
  GCC 8.5: 3547.4 (SE +/- 5.55, N = 3)
  GCC 12.0.0 20210701: 3531.4 (SE +/- 3.04, N = 15)
  GCC 10.3: 3479.0 (SE +/- 3.49, N = 15)
  1. (CC) gcc options: -O3 -march=native -pthread -lz
SVT-VP9 0.3 - Tuning: VMAF Optimized - Input: Bosphorus 1080p
  Frames Per Second, More Is Better
  GCC 8.5: 306.08 (SE +/- 2.56, N = 13)
  GCC 10.3: 299.97 (SE +/- 1.76, N = 14)
  GCC 11.1: 297.32 (SE +/- 2.83, N = 6)
  GCC 9.4: 295.07 (SE +/- 4.16, N = 3)
  GCC 12.0.0 20210701: 293.71 (SE +/- 3.01, N = 5)
  1. (CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
NCNN 20210525 - Target: CPU - Model: resnet50
  ms, Fewer Is Better
  GCC 9.4: 17.59 (SE +/- 0.29, N = 3; MIN: 17.07 / MAX: 18.69)
  GCC 12.0.0 20210701: 17.74 (SE +/- 0.24, N = 3; MIN: 17.09 / MAX: 18.96)
  GCC 11.1: 17.77 (SE +/- 0.24, N = 3; MIN: 17.07 / MAX: 28.68)
  GCC 10.3: 17.87 (SE +/- 0.29, N = 3; MIN: 17.16 / MAX: 18.62)
  GCC 8.5: 18.33 (SE +/- 0.30, N = 3; MIN: 17.58 / MAX: 24.57)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread -pthread
GraphicsMagick 1.3.33 - Operation: Noise-Gaussian
  Iterations Per Minute, More Is Better
  GCC 12.0.0 20210701: 403
  GCC 11.1: 403
  GCC 10.3: 403
  GCC 8.5: 390
  GCC 9.4: 387
  Standard errors as exported (only four are listed for the five results, so they are not assigned to individual compilers here): SE +/- 0.58, N = 3; SE +/- 0.33, N = 3; SE +/- 0.67, N = 3; SE +/- 0.33, N = 3
  1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -lz -lm -lpthread
Liquid-DSP 2021.01.31 - Threads: 36 - Buffer Length: 256 - Filter Length: 57
  samples/s, More Is Better
  GCC 11.1: 954536667 (SE +/- 1013283.99, N = 3)
  GCC 10.3: 940660000 (SE +/- 272213.15, N = 3)
  GCC 12.0.0 20210701: 937553333 (SE +/- 1056729.76, N = 3)
  GCC 9.4: 921670000 (SE +/- 588132.64, N = 3)
  GCC 8.5: 916790000 (SE +/- 120554.28, N = 3)
  1. (CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid
NCNN 20210525 - Target: CPU-v3-v3 - Model: mobilenet-v3
  ms, Fewer Is Better
  GCC 12.0.0 20210701: 4.54 (SE +/- 0.05, N = 3; MIN: 4.37 / MAX: 11.47)
  GCC 9.4: 4.57 (SE +/- 0.03, N = 3; MIN: 4.37 / MAX: 9.05)
  GCC 10.3: 4.64 (SE +/- 0.02, N = 3; MIN: 4.36 / MAX: 10.01)
  GCC 11.1: 4.66 (SE +/- 0.02, N = 3; MIN: 4.46 / MAX: 12.92)
  GCC 8.5: 4.72 (SE +/- 0.03, N = 3; MIN: 4.49 / MAX: 7.39)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread -pthread
Mobile Neural Network 1.2 - Model: SqueezeNetV1.0
  ms, Fewer Is Better
  GCC 12.0.0 20210701: 5.506 (SE +/- 0.079, N = 3; MIN: 5.22 / MAX: 5.91)
  GCC 9.4: 5.540 (SE +/- 0.039, N = 15; MIN: 5.06 / MAX: 6.72)
  GCC 8.5: 5.594 (SE +/- 0.056, N = 3; MIN: 5.4 / MAX: 5.85)
  GCC 10.3: 5.612 (SE +/- 0.083, N = 3; MIN: 5.24 / MAX: 6.01)
  GCC 11.1: 5.723 (SE +/- 0.020, N = 3; MIN: 5.47 / MAX: 6.72)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Crypto++ 8.2 - Test: Keyed Algorithms
  MiB/second, More Is Better
  GCC 9.4: 719.76 (SE +/- 0.35, N = 3)
  GCC 10.3: 717.17 (SE +/- 0.48, N = 3)
  GCC 8.5: 714.49 (SE +/- 0.10, N = 3)
  GCC 12.0.0 20210701: 714.21 (SE +/- 0.24, N = 3)
  GCC 11.1: 692.99 (SE +/- 0.20, N = 3)
  1. (CXX) g++ options: -O3 -march=native -fPIC -pthread -pipe
WebP Image Encode 1.1 - Encode Settings: Quality 100, Lossless, Highest Compression
  Encode Time - Seconds, Fewer Is Better
  GCC 11.1: 36.47 (SE +/- 0.00, N = 3)
  GCC 9.4: 36.87 (SE +/- 0.13, N = 3)
  GCC 12.0.0 20210701: 37.16 (SE +/- 0.01, N = 3)
  GCC 8.5: 37.70 (SE +/- 0.11, N = 3)
  GCC 10.3: 37.87 (SE +/- 0.01, N = 3)
  1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -pthread -lm -ljpeg -lpng16
dav1d 0.9.0 - Video Input: Chimera 1080p 10-bit
  FPS, More Is Better
  GCC 11.1: 223.06 (SE +/- 1.18, N = 3; MIN: 157.45 / MAX: 397.96)
  GCC 10.3: 222.27 (SE +/- 0.50, N = 3; MIN: 157.09 / MAX: 436.51)
  GCC 12.0.0 20210701: 221.94 (SE +/- 0.48, N = 3; MIN: 157.38 / MAX: 404.98)
  GCC 9.4: 219.69 (SE +/- 0.34, N = 3; MIN: 156.35 / MAX: 406.23; additional option: -lm)
  GCC 8.5: 215.16 (SE +/- 0.39, N = 3; MIN: 151.62 / MAX: 411.26; additional option: -lm)
  1. (CC) gcc options: -O3 -march=native -pthread
NCNN 20210525 - Target: CPU - Model: squeezenet_ssd
  ms, Fewer Is Better
  GCC 9.4: 15.08 (SE +/- 0.06, N = 3; MIN: 14.88 / MAX: 21.62)
  GCC 11.1: 15.23 (SE +/- 0.08, N = 3; MIN: 14.88 / MAX: 17.07)
  GCC 10.3: 15.26 (SE +/- 0.05, N = 3; MIN: 14.97 / MAX: 18.92)
  GCC 12.0.0 20210701: 15.28 (SE +/- 0.13, N = 3; MIN: 14.89 / MAX: 16.1)
  GCC 8.5: 15.63 (SE +/- 0.36, N = 3; MIN: 15.02 / MAX: 16.85)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread -pthread
x265 3.4 - Video Input: Bosphorus 4K
  Frames Per Second, More Is Better
  GCC 10.3: 21.84 (SE +/- 0.13, N = 3)
  GCC 9.4: 21.64 (SE +/- 0.12, N = 3)
  GCC 8.5: 21.53 (SE +/- 0.10, N = 3)
  GCC 11.1: 21.16 (SE +/- 0.11, N = 3)
  GCC 12.0.0 20210701: 21.08 (SE +/- 0.11, N = 3)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lpthread -lrt -ldl -lnuma
dav1d 0.9.0 - Video Input: Summer Nature 4K
  FPS, More Is Better
  GCC 9.4: 199.89 (SE +/- 1.65, N = 3; MIN: 143.79 / MAX: 228.44; additional option: -lm)
  GCC 8.5: 197.46 (SE +/- 0.89, N = 3; MIN: 150.4 / MAX: 226.05; additional option: -lm)
  GCC 10.3: 195.15 (SE +/- 2.16, N = 3; MIN: 149.2 / MAX: 222.59)
  GCC 12.0.0 20210701: 194.06 (SE +/- 1.97, N = 6; MIN: 131.48 / MAX: 225.93)
  GCC 11.1: 192.95 (SE +/- 1.13, N = 3; MIN: 132.83 / MAX: 217.9)
  1. (CC) gcc options: -O3 -march=native -pthread
VP9 libvpx Encoding 1.10.0 - Speed: Speed 0 - Input: Bosphorus 4K
  Frames Per Second, More Is Better
  GCC 12.0.0 20210701: 4.92 (SE +/- 0.03, N = 3)
  GCC 11.1: 4.87 (SE +/- 0.04, N = 3)
  GCC 9.4: 4.87 (SE +/- 0.04, N = 3)
  GCC 10.3: 4.84 (SE +/- 0.03, N = 3)
  GCC 8.5: 4.75 (SE +/- 0.03, N = 3)
  1. (CXX) g++ options: -m64 -lm -lpthread -O3 -march=native -fPIC -U_FORTIFY_SOURCE -std=gnu++11
ViennaCL 1.7.1 - Test: CPU BLAS - sAXPY
  GB/s, More Is Better
  GCC 9.4: 71.2 (SE +/- 0.12, N = 3)
  GCC 11.1: 70.7 (SE +/- 0.31, N = 3)
  GCC 10.3: 70.3 (SE +/- 0.30, N = 3)
  GCC 12.0.0 20210701: 70.1 (SE +/- 0.35, N = 3)
  GCC 8.5: 68.8 (SE +/- 1.48, N = 3)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -rdynamic -lOpenCL
TNN 0.3 - Target: CPU - Model: SqueezeNet v1.1
  ms, Fewer Is Better
  GCC 10.3: 286.22 (SE +/- 0.10, N = 3; MIN: 285.02 / MAX: 287.82)
  GCC 12.0.0 20210701: 287.49 (SE +/- 0.19, N = 3; MIN: 285.88 / MAX: 299.62)
  GCC 11.1: 288.91 (SE +/- 0.63, N = 3; MIN: 286.05 / MAX: 294.45)
  GCC 8.5: 289.96 (SE +/- 0.07, N = 3; MIN: 288.43 / MAX: 291.61)
  GCC 9.4: 296.14 (SE +/- 0.07, N = 3; MIN: 294.66 / MAX: 298.56)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl
AOM AV1 3.1 - Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 4K
  Frames Per Second, More Is Better
  GCC 10.3: 4.48 (SE +/- 0.04, N = 3)
  GCC 8.5: 4.46 (SE +/- 0.06, N = 3)
  GCC 9.4: 4.44 (SE +/- 0.05, N = 3)
  GCC 12.0.0 20210701: 4.40 (SE +/- 0.03, N = 10)
  GCC 11.1: 4.33 (SE +/- 0.04, N = 3)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
AOM AV1 3.1 - Encoder Mode: Speed 9 Realtime - Input: Bosphorus 4K
  Frames Per Second, More Is Better
  GCC 12.0.0 20210701: 28.10 (SE +/- 0.04, N = 3)
  GCC 11.1: 28.07 (SE +/- 0.08, N = 3)
  GCC 10.3: 27.86 (SE +/- 0.15, N = 3)
  GCC 9.4: 27.24 (SE +/- 0.07, N = 3)
  GCC 8.5: 27.18 (SE +/- 0.04, N = 3)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
NCNN 20210525 - Target: CPU - Model: regnety_400m
  ms, Fewer Is Better
  GCC 12.0.0 20210701: 13.48 (SE +/- 0.05, N = 3; MIN: 13.11 / MAX: 14.04)
  GCC 10.3: 13.63 (SE +/- 0.14, N = 3; MIN: 13.01 / MAX: 14.57)
  GCC 8.5: 13.77 (SE +/- 0.14, N = 3; MIN: 12.93 / MAX: 14.62)
  GCC 9.4: 13.87 (SE +/- 0.21, N = 3; MIN: 13.18 / MAX: 15.01)
  GCC 11.1: 13.93 (SE +/- 0.11, N = 3; MIN: 13.17 / MAX: 14.48)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread -pthread
Etcpak 0.7 - Configuration: ETC1 + Dithering
  Mpx/s, More Is Better
  GCC 9.4: 336.11 (SE +/- 0.09, N = 3)
  GCC 10.3: 329.49 (SE +/- 0.38, N = 3)
  GCC 8.5: 329.13 (SE +/- 2.72, N = 3)
  GCC 11.1: 327.33 (SE +/- 0.20, N = 3)
  GCC 12.0.0 20210701: 325.28 (SE +/- 0.05, N = 3)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMV-N
  GB/s, More Is Better
  GCC 9.4: 71.8 (SE +/- 0.07, N = 3)
  GCC 11.1: 71.7 (SE +/- 0.09, N = 3)
  GCC 10.3: 71.6 (SE +/- 0.03, N = 3)
  GCC 12.0.0 20210701: 71.4 (SE +/- 0.31, N = 3)
  GCC 8.5: 69.5 (SE +/- 0.12, N = 3)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -rdynamic -lOpenCL
Opus Codec Encoding 1.3.1 - WAV To Opus Encode
  Seconds, Fewer Is Better
  GCC 12.0.0 20210701: 8.186 (SE +/- 0.011, N = 5)
  GCC 9.4: 8.241 (SE +/- 0.021, N = 5)
  GCC 8.5: 8.283 (SE +/- 0.011, N = 5)
  GCC 10.3: 8.283 (SE +/- 0.031, N = 5)
  GCC 11.1: 8.456 (SE +/- 0.013, N = 5)
  1. (CXX) g++ options: -O3 -march=native -fvisibility=hidden -logg -lm
Zstd Compression 1.5.0 - Compression Level: 8 - Compression Speed
  MB/s, More Is Better
  GCC 12.0.0 20210701: 432.7 (SE +/- 5.56, N = 3)
  GCC 9.4: 429.4 (SE +/- 4.23, N = 5)
  GCC 8.5: 425.7 (SE +/- 4.99, N = 3)
  GCC 10.3: 424.0 (SE +/- 4.87, N = 3)
  GCC 11.1: 419.5 (SE +/- 5.64, N = 3)
  1. (CC) gcc options: -O3 -march=native -pthread -lz
oneDNN 2.1.2 - Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU
  ms, Fewer Is Better
  GCC 9.4: 2.86008 (SE +/- 0.01601, N = 3; MIN: 2.77)
  GCC 11.1: 2.93428 (SE +/- 0.01584, N = 3; MIN: 2.83)
  GCC 10.3: 2.93978 (SE +/- 0.01261, N = 3; MIN: 2.83)
  GCC 12.0.0 20210701: 2.94670 (SE +/- 0.01797, N = 3; MIN: 2.84)
  GCC 8.5: 2.94903 (SE +/- 0.01360, N = 3; MIN: 2.85)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Tachyon 0.99b6 - Total Time
  Seconds, Fewer Is Better
  GCC 11.1: 47.87 (SE +/- 0.26, N = 3)
  GCC 9.4: 47.88 (SE +/- 0.08, N = 3)
  GCC 10.3: 47.90 (SE +/- 0.08, N = 3)
  GCC 8.5: 48.16 (SE +/- 0.08, N = 3)
  GCC 12.0.0 20210701: 49.32 (SE +/- 0.18, N = 3)
  1. (CC) gcc options: -m64 -O3 -fomit-frame-pointer -ffast-math -ltachyon -lm -lpthread
Liquid-DSP 2021.01.31 - Threads: 32 - Buffer Length: 256 - Filter Length: 57
  samples/s, More Is Better
  GCC 11.1: 951170000 (SE +/- 4781007.56, N = 3)
  GCC 12.0.0 20210701: 944433333 (SE +/- 4623189.13, N = 3)
  GCC 10.3: 939716667 (SE +/- 2904171.33, N = 3)
  GCC 9.4: 930236667 (SE +/- 539269.05, N = 3)
  GCC 8.5: 924630000 (SE +/- 353836.12, N = 3)
  1. (CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid
SVT-VP9 0.3 - Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p
  Frames Per Second, More Is Better
  GCC 8.5: 310.79 (SE +/- 0.21, N = 3)
  GCC 10.3: 306.89 (SE +/- 1.79, N = 3)
  GCC 11.1: 305.72 (SE +/- 2.35, N = 3)
  GCC 12.0.0 20210701: 303.54 (SE +/- 0.68, N = 3)
  GCC 9.4: 302.25 (SE +/- 1.42, N = 3)
  1. (CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
oneDNN 2.1.2 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
  ms, Fewer Is Better
  GCC 12.0.0 20210701: 935.38 (SE +/- 0.15, N = 3; MIN: 931.25)
  GCC 11.1: 936.88 (SE +/- 0.70, N = 3; MIN: 931.54)
  GCC 10.3: 938.22 (SE +/- 0.16, N = 3; MIN: 930.52)
  GCC 8.5: 939.92 (SE +/- 1.32, N = 3; MIN: 933.27)
  GCC 9.4: 961.38 (SE +/- 1.46, N = 3; MIN: 955.17)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
NCNN 20210525 - Target: CPU - Model: blazeface
  ms, Fewer Is Better
  GCC 9.4: 2.54 (SE +/- 0.02, N = 3; MIN: 2.45 / MAX: 3.32)
  GCC 12.0.0 20210701: 2.54 (SE +/- 0.02, N = 3; MIN: 2.46 / MAX: 3.12)
  GCC 11.1: 2.55 (SE +/- 0.01, N = 3; MIN: 2.47 / MAX: 3.17)
  GCC 8.5: 2.56 (SE +/- 0.01, N = 3; MIN: 2.5 / MAX: 3.31)
  GCC 10.3: 2.61 (SE +/- 0.07, N = 3; MIN: 2.47 / MAX: 3.3)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU - Model: mobilenet
  ms, Fewer Is Better
  GCC 11.1: 13.79 (SE +/- 0.01, N = 3; MIN: 13.61 / MAX: 14.23)
  GCC 10.3: 13.81 (SE +/- 0.06, N = 3; MIN: 13.51 / MAX: 20.36)
  GCC 12.0.0 20210701: 13.81 (SE +/- 0.01, N = 3; MIN: 13.64 / MAX: 14.54)
  GCC 9.4: 13.94 (SE +/- 0.04, N = 3; MIN: 13.68 / MAX: 22.27)
  GCC 8.5: 14.16 (SE +/- 0.02, N = 3; MIN: 13.93 / MAX: 14.76)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread -pthread
oneDNN 2.1.2 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU
  ms, Fewer Is Better
  GCC 12.0.0 20210701: 1.74471 (SE +/- 0.00613, N = 3; MIN: 1.66)
  GCC 10.3: 1.75520 (SE +/- 0.00494, N = 3; MIN: 1.69)
  GCC 11.1: 1.76264 (SE +/- 0.00508, N = 3; MIN: 1.69)
  GCC 8.5: 1.76308 (SE +/- 0.00719, N = 3; MIN: 1.7)
  GCC 9.4: 1.79150 (SE +/- 0.00782, N = 3; MIN: 1.74)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
WebP Image Encode 1.1 - Encode Settings: Quality 100, Lossless
  Encode Time - Seconds, Fewer Is Better
  GCC 11.1: 16.80 (SE +/- 0.05, N = 3)
  GCC 9.4: 16.82 (SE +/- 0.02, N = 3)
  GCC 12.0.0 20210701: 16.89 (SE +/- 0.01, N = 3)
  GCC 8.5: 17.20 (SE +/- 0.06, N = 3)
  GCC 10.3: 17.25 (SE +/- 0.06, N = 3)
  1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -pthread -lm -ljpeg -lpng16
ViennaCL 1.7.1 - Test: CPU BLAS - sCOPY
  GB/s, More Is Better
  GCC 9.4: 47.1 (SE +/- 0.17, N = 3)
  GCC 11.1: 46.7 (SE +/- 0.07, N = 3)
  GCC 12.0.0 20210701: 46.4 (SE +/- 0.09, N = 3)
  GCC 10.3: 46.4 (SE +/- 0.12, N = 3)
  GCC 8.5: 45.9 (SE +/- 0.27, N = 3)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -rdynamic -lOpenCL
oneDNN 2.1.2 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  GCC 10.3: 935.96 (SE +/- 0.69, N = 3; MIN: 930.6)
  GCC 12.0.0 20210701: 936.49 (SE +/- 0.46, N = 3; MIN: 932.18)
  GCC 8.5: 937.61 (SE +/- 0.50, N = 3; MIN: 932.91)
  GCC 11.1: 938.04 (SE +/- 0.60, N = 3; MIN: 933.04)
  GCC 9.4: 960.13 (SE +/- 0.63, N = 3; MIN: 955.07)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Timed HMMer Search 3.3.2 - Pfam Database Search
  Seconds, Fewer Is Better
  GCC 8.5: 126.20 (SE +/- 0.06, N = 3)
  GCC 9.4: 126.51 (SE +/- 0.10, N = 3)
  GCC 11.1: 126.62 (SE +/- 0.11, N = 3)
  GCC 10.3: 126.69 (SE +/- 0.32, N = 3)
  GCC 12.0.0 20210701: 129.45 (SE +/- 0.10, N = 3)
  1. (CC) gcc options: -O3 -march=native -pthread -lhmmer -leasel -lm -lmpi
Botan 2.17.3 - Test: KASUMI
  MiB/s, More Is Better
  GCC 11.1: 100.70 (SE +/- 0.07, N = 3)
  GCC 12.0.0 20210701: 100.61 (SE +/- 0.25, N = 3)
  GCC 9.4: 100.41 (SE +/- 0.12, N = 3)
  GCC 8.5: 99.72 (SE +/- 0.01, N = 3)
  GCC 10.3: 98.20 (SE +/- 0.06, N = 3)
  1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Ngspice 34 - Circuit: C7552
  Seconds, Fewer Is Better
  GCC 8.5: 126.20 (SE +/- 1.35, N = 3)
  GCC 10.3: 126.88 (SE +/- 1.51, N = 3)
  GCC 9.4: 127.69 (SE +/- 1.54, N = 3)
  GCC 12.0.0 20210701: 128.32 (SE +/- 1.05, N = 3)
  GCC 11.1: 129.40 (SE +/- 0.26, N = 3)
  1. (CC) gcc options: -O3 -march=native -fopenmp -lm -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lSM -lICE
LAME MP3 Encoding 3.100 - WAV To MP3
  Seconds, Fewer Is Better
  GCC 9.4: 8.525 (SE +/- 0.011, N = 3)
  GCC 12.0.0 20210701: 8.599 (SE +/- 0.003, N = 3)
  GCC 8.5: 8.716 (SE +/- 0.002, N = 3)
  GCC 10.3: 8.730 (SE +/- 0.004, N = 3)
  GCC 11.1: 8.732 (SE +/- 0.005, N = 3)
  1. (CC) gcc options: -O3 -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -pipe -march=native -lm
7-Zip Compression 16.02 - Compress Speed Test
  MIPS, More Is Better
  GCC 8.5: 99711 (SE +/- 69.57, N = 3)
  GCC 12.0.0 20210701: 98493 (SE +/- 228.39, N = 3)
  GCC 9.4: 98298 (SE +/- 343.09, N = 3)
  GCC 11.1: 98149 (SE +/- 304.44, N = 3)
  GCC 10.3: 97426 (SE +/- 46.23, N = 3)
  1. (CXX) g++ options: -pipe -lpthread
Zstd Compression 1.5.0 - Compression Level: 19, Long Mode - Compression Speed
  MB/s, More Is Better
  GCC 11.1: 44.3 (SE +/- 0.07, N = 3)
  GCC 12.0.0 20210701: 43.9 (SE +/- 0.15, N = 3)
  GCC 10.3: 43.9 (SE +/- 0.19, N = 3)
  GCC 9.4: 43.5 (SE +/- 0.13, N = 3)
  GCC 8.5: 43.3 (SE +/- 0.15, N = 3)
  1. (CC) gcc options: -O3 -march=native -pthread -lz
oneDNN 2.1.2 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  GCC 12.0.0 20210701: 9.34206 (SE +/- 0.00808, N = 3; MIN: 9.28)
  GCC 8.5: 9.34766 (SE +/- 0.01311, N = 3; MIN: 9.29)
  GCC 10.3: 9.35095 (SE +/- 0.01276, N = 3; MIN: 9.29)
  GCC 11.1: 9.35550 (SE +/- 0.01164, N = 3; MIN: 9.29)
  GCC 9.4: 9.55455 (SE +/- 0.01517, N = 3; MIN: 9.5)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Etcpak 0.7 - Configuration: ETC2
  Mpx/s, More Is Better
  GCC 9.4: 199.10 (SE +/- 0.03, N = 3)
  GCC 8.5: 198.11 (SE +/- 0.03, N = 3)
  GCC 10.3: 197.63 (SE +/- 0.08, N = 3)
  GCC 12.0.0 20210701: 197.13 (SE +/- 0.03, N = 3)
  GCC 11.1: 194.76 (SE +/- 0.01, N = 3)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
PJSIP 2.11 - Method: OPTIONS, Stateless
  Responses Per Second, More Is Better
  GCC 10.3: 138222 (SE +/- 1006.09, N = 3)
  GCC 12.0.0 20210701: 137966 (SE +/- 734.59, N = 3)
  GCC 11.1: 136444 (SE +/- 946.90, N = 3)
  GCC 9.4: 135903 (SE +/- 1635.40, N = 4)
  GCC 8.5: 135302 (SE +/- 578.10, N = 3)
  1. (CC) gcc options: -lstdc++ -lssl -lcrypto -lm -lrt -lpthread -O3 -march=native
NCNN 20210525 - Target: CPU - Model: alexnet
  ms, Fewer Is Better
  GCC 12.0.0 20210701: 8.99 (SE +/- 0.09, N = 3; MIN: 8.73 / MAX: 9.39)
  GCC 8.5: 9.05 (SE +/- 0.02, N = 3; MIN: 8.96 / MAX: 19.48)
  GCC 11.1: 9.08 (SE +/- 0.02, N = 3; MIN: 9 / MAX: 11.81)
  GCC 9.4: 9.15 (SE +/- 0.01, N = 3; MIN: 9.08 / MAX: 9.59)
  GCC 10.3: 9.18 (SE +/- 0.00, N = 3; MIN: 9.11 / MAX: 9.74)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread -pthread
PJSIP 2.11 - Method: INVITE
  Responses Per Second, More Is Better
  GCC 8.5: 3307 (SE +/- 25.16, N = 15)
  GCC 12.0.0 20210701: 3304 (SE +/- 25.40, N = 15)
  GCC 10.3: 3281 (SE +/- 6.36, N = 3)
  GCC 9.4: 3252 (SE +/- 27.02, N = 3)
  GCC 11.1: 3240 (SE +/- 7.22, N = 3)
  1. (CC) gcc options: -lstdc++ -lssl -lcrypto -lm -lrt -lpthread -O3 -march=native
C-Ray 1.1 - Total Time - 4K, 16 Rays Per Pixel
  Seconds, Fewer Is Better
  GCC 8.5: 29.82 (SE +/- 0.03, N = 3)
  GCC 11.1: 29.96 (SE +/- 0.01, N = 3)
  GCC 12.0.0 20210701: 29.97 (SE +/- 0.01, N = 3)
  GCC 9.4: 30.04 (SE +/- 0.01, N = 3)
  GCC 10.3: 30.43 (SE +/- 0.01, N = 3)
  1. (CC) gcc options: -lm -lpthread -O3 -march=native
VOSK Speech Recognition Toolkit 0.3.21
  Seconds, Fewer Is Better
  GCC 12.0.0 20210701: 20.57 (SE +/- 0.07, N = 3)
  GCC 8.5: 20.72 (SE +/- 0.15, N = 3)
  GCC 10.3: 20.75 (SE +/- 0.12, N = 3)
  GCC 11.1: 20.89 (SE +/- 0.26, N = 3)
  GCC 9.4: 20.99 (SE +/- 0.13, N = 3)
AOM AV1 3.1 - Encoder Mode: Speed 8 Realtime - Input: Bosphorus 4K
  Frames Per Second, More Is Better
  GCC 12.0.0 20210701: 19.67 (SE +/- 0.09, N = 3)
  GCC 11.1: 19.64 (SE +/- 0.04, N = 3)
  GCC 8.5: 19.40 (SE +/- 0.04, N = 3)
  GCC 9.4: 19.34 (SE +/- 0.06, N = 3)
  GCC 10.3: 19.28 (SE +/- 0.01, N = 3)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Botan 2.17.3 - Test: KASUMI - Decrypt
  MiB/s, More Is Better
  GCC 9.4: 98.56 (SE +/- 0.03, N = 3)
  GCC 8.5: 97.64 (SE +/- 0.02, N = 3)
  GCC 11.1: 97.00 (SE +/- 0.01, N = 3)
  GCC 10.3: 96.88 (SE +/- 0.02, N = 3)
  GCC 12.0.0 20210701: 96.61 (SE +/- 0.07, N = 3)
  1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Himeno Benchmark 3.0 - Poisson Pressure Solver
  MFLOPS, More Is Better
  GCC 9.4: 4609.40 (SE +/- 5.70, N = 3)
  GCC 11.1: 4592.95 (SE +/- 0.64, N = 3)
  GCC 12.0.0 20210701: 4580.19 (SE +/- 2.82, N = 3)
  GCC 10.3: 4538.66 (SE +/- 13.08, N = 3)
  GCC 8.5: 4522.73 (SE +/- 0.61, N = 3)
  1. (CC) gcc options: -O3 -march=native -mavx2
NCNN 20210525 - Target: CPU - Model: vgg16
  ms, Fewer Is Better
  GCC 9.4: 36.01 (SE +/- 0.49, N = 3; MIN: 35.37 / MAX: 37.68)
  GCC 8.5: 36.20 (SE +/- 0.37, N = 3; MIN: 35.36 / MAX: 47.25)
  GCC 10.3: 36.35 (SE +/- 0.47, N = 3; MIN: 35.3 / MAX: 37.7)
  GCC 11.1: 36.56 (SE +/- 0.52, N = 3; MIN: 35.42 / MAX: 58.41)
  GCC 12.0.0 20210701: 36.70 (SE +/- 0.53, N = 3; MIN: 35.5 / MAX: 41.99)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread -pthread
SecureMark 1.0.4 - Benchmark: SecureMark-TLS
  marks, More Is Better
  GCC 12.0.0 20210701: 264514 (SE +/- 101.43, N = 3)
  GCC 10.3: 263472 (SE +/- 95.66, N = 3)
  GCC 9.4: 263457 (SE +/- 247.29, N = 3)
  GCC 8.5: 259869 (SE +/- 106.27, N = 3)
  GCC 11.1: 259565 (SE +/- 68.80, N = 3)
  1. (CC) gcc options: -pedantic -O3
GraphicsMagick 1.3.33 - Operation: Enhanced
  Iterations Per Minute, More Is Better
  GCC 11.1: 432
  GCC 12.0.0 20210701: 429
  GCC 10.3: 427
  GCC 9.4: 424
  GCC 8.5: 424
  1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -lz -lm -lpthread
C-Blosc 2.0 - Compressor: blosclz
  MB/s, More Is Better
  GCC 11.1: 11926.8 (SE +/- 69.56, N = 3)
  GCC 12.0.0 20210701: 11889.4 (SE +/- 37.20, N = 3)
  GCC 9.4: 11802.6 (SE +/- 18.15, N = 3)
  GCC 8.5: 11800.7 (SE +/- 11.42, N = 3)
  GCC 10.3: 11713.4 (SE +/- 21.15, N = 3)
  1. (CC) gcc options: -std=gnu99 -O3 -pthread -lrt -lm
Mobile Neural Network 1.2 - Model: mobilenet-v1-1.0
  ms, Fewer Is Better
  GCC 8.5: 2.433 (SE +/- 0.025, N = 3; MIN: 2.32 / MAX: 2.62)
  GCC 9.4: 2.455 (SE +/- 0.012, N = 15; MIN: 2.23 / MAX: 3.16)
  GCC 10.3: 2.467 (SE +/- 0.025, N = 3; MIN: 2.32 / MAX: 2.68)
  GCC 11.1: 2.477 (SE +/- 0.024, N = 3; MIN: 2.3 / MAX: 2.65)
  GCC 12.0.0 20210701: 2.477 (SE +/- 0.029, N = 3; MIN: 2.3 / MAX: 2.74)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
AOM AV1 3.1 - Encoder Mode: Speed 6 Realtime - Input: Bosphorus 4K
  Frames Per Second, More Is Better
  GCC 11.1: 7.51 (SE +/- 0.04, N = 3)
  GCC 8.5: 7.51 (SE +/- 0.05, N = 3)
  GCC 12.0.0 20210701: 7.49 (SE +/- 0.08, N = 12)
  GCC 9.4: 7.43 (SE +/- 0.08, N = 15)
  GCC 10.3: 7.38 (SE +/- 0.04, N = 3)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Crypto++ 8.2 - Test: Integer + Elliptic Curve Public Key Algorithms
  MiB/second, More Is Better
  GCC 11.1: 5593.13 (SE +/- 1.99, N = 3)
  GCC 9.4: 5538.55 (SE +/- 6.16, N = 3)
  GCC 12.0.0 20210701: 5532.47 (SE +/- 5.19, N = 3)
  GCC 8.5: 5519.14 (SE +/- 1.33, N = 3)
  GCC 10.3: 5503.08 (SE +/- 1.85, N = 3)
  1. (CXX) g++ options: -O3 -march=native -fPIC -pthread -pipe
SVT-VP9 0.3 - Tuning: Visual Quality Optimized - Input: Bosphorus 1080p
  Frames Per Second, More Is Better
  GCC 8.5: 247.45 (SE +/- 1.34, N = 3)
  GCC 10.3: 246.75 (SE +/- 3.33, N = 3)
  GCC 12.0.0 20210701: 244.56 (SE +/- 1.58, N = 3)
  GCC 11.1: 244.05 (SE +/- 3.10, N = 3)
  GCC 9.4: 243.48 (SE +/- 1.22, N = 3)
  1. (CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
Kvazaar 2.0 - Video Input: Bosphorus 4K - Video Preset: Very Fast
  Frames Per Second, More Is Better
  GCC 8.5: 21.35 (SE +/- 0.03, N = 3)
  GCC 9.4: 21.20 (SE +/- 0.04, N = 3)
  GCC 12.0.0 20210701: 21.15 (SE +/- 0.03, N = 3)
  GCC 10.3: 21.09 (SE +/- 0.06, N = 3)
  GCC 11.1: 21.01 (SE +/- 0.03, N = 3)
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lpthread -lm -lrt
NCNN 20210525 - Target: CPU - Model: shufflenet-v2
  ms, Fewer Is Better
  GCC 9.4: 4.97 (SE +/- 0.01, N = 3; MIN: 4.8 / MAX: 8.9)
  GCC 12.0.0 20210701: 4.98 (SE +/- 0.02, N = 3; MIN: 4.88 / MAX: 8.6)
  GCC 10.3: 5.02 (SE +/- 0.02, N = 3; MIN: 4.83 / MAX: 15.94)
  GCC 8.5: 5.05 (SE +/- 0.02, N = 3; MIN: 4.83 / MAX: 14.14)
  GCC 11.1: 5.05 (SE +/- 0.04, N = 3; MIN: 4.88 / MAX: 9.37)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread -pthread
SQLite Speedtest 3.30 - Timed Time - Size 1,000
  Seconds, Fewer Is Better
  GCC 12.0.0 20210701: 56.72 (SE +/- 0.11, N = 3)
  GCC 9.4: 57.03 (SE +/- 0.09, N = 3)
  GCC 10.3: 57.30 (SE +/- 0.19, N = 3)
  GCC 11.1: 57.41 (SE +/- 0.10, N = 3)
  GCC 8.5: 57.61 (SE +/- 0.22, N = 3)
  1. (CC) gcc options: -O3 -march=native -ldl -lz -lpthread
FLAC Audio Encoding 1.3.2 - WAV To FLAC
  Seconds, Fewer Is Better
  GCC 10.3: 8.369 (SE +/- 0.004, N = 5)
  GCC 12.0.0 20210701: 8.379 (SE +/- 0.014, N = 5)
  GCC 11.1: 8.411 (SE +/- 0.009, N = 5)
  GCC 8.5: 8.436 (SE +/- 0.012, N = 5)
  GCC 9.4: 8.500 (SE +/- 0.006, N = 5)
  1. (CXX) g++ options: -O3 -march=native -fvisibility=hidden -logg -lm
Zstd Compression 1.5.0 - Compression Level: 19 - Compression Speed
  MB/s, More Is Better
  GCC 11.1: 61.0 (SE +/- 0.50, N = 9)
  GCC 12.0.0 20210701: 60.8 (SE +/- 0.47, N = 3)
  GCC 8.5: 60.3 (SE +/- 0.19, N = 3)
  GCC 9.4: 60.2 (SE +/- 0.50, N = 3)
  GCC 10.3: 60.1 (SE +/- 0.56, N = 3)
  1. (CC) gcc options: -O3 -march=native -pthread -lz
libjpeg-turbo tjbench 2.1.0 - Test: Decompression Throughput
Megapixels/sec, More Is Better
GCC 9.4: 220.62 (SE +/- 0.84, N = 3)
GCC 10.3: 219.54 (SE +/- 0.30, N = 3)
GCC 8.5: 218.81 (SE +/- 0.03, N = 3)
GCC 11.1: 218.62 (SE +/- 0.47, N = 3)
GCC 12.0.0 20210701: 217.43 (SE +/- 0.26, N = 3)
1. (CC) gcc options: -O3 -march=native -rdynamic
SVT-AV1 0.8.7 - Encoder Mode: Preset 4 - Input: Bosphorus 4K
Frames Per Second, More Is Better
GCC 10.3: 1.359 (SE +/- 0.003, N = 3)
GCC 9.4: 1.355 (SE +/- 0.001, N = 3)
GCC 12.0.0 20210701: 1.348 (SE +/- 0.002, N = 3)
GCC 11.1: 1.347 (SE +/- 0.003, N = 3)
GCC 8.5: 1.340 (SE +/- 0.005, N = 3)
1. (CXX) g++ options: -O3 -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie
SVT-AV1 0.8.7 - Encoder Mode: Preset 8 - Input: Bosphorus 4K
Frames Per Second, More Is Better
GCC 11.1: 12.09 (SE +/- 0.02, N = 3)
GCC 10.3: 12.06 (SE +/- 0.04, N = 3)
GCC 9.4: 12.02 (SE +/- 0.08, N = 3)
GCC 12.0.0 20210701: 11.97 (SE +/- 0.03, N = 3)
GCC 8.5: 11.93 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -O3 -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie
ViennaCL 1.7.1 - Test: CPU BLAS - sDOT
GB/s, More Is Better
GCC 9.4: 78.0 (SE +/- 0.26, N = 3)
GCC 11.1: 77.6 (SE +/- 0.43, N = 3)
GCC 10.3: 77.4 (SE +/- 0.33, N = 3)
GCC 12.0.0 20210701: 77.1 (SE +/- 0.49, N = 3)
GCC 8.5: 77.0 (SE +/- 0.42, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp -rdynamic -lOpenCL
Kvazaar 2.0 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast
Frames Per Second, More Is Better
GCC 10.3: 40.64 (SE +/- 0.15, N = 3)
GCC 11.1: 40.31 (SE +/- 0.11, N = 3)
GCC 12.0.0 20210701: 40.20 (SE +/- 0.08, N = 3)
GCC 9.4: 40.16 (SE +/- 0.07, N = 3)
GCC 8.5: 40.13 (SE +/- 0.07, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lpthread -lm -lrt
PJSIP 2.11 - Method: OPTIONS, Stateful
Responses Per Second, More Is Better
GCC 8.5: 5801 (SE +/- 16.26, N = 3)
GCC 11.1: 5769 (SE +/- 53.69, N = 3)
GCC 12.0.0 20210701: 5763 (SE +/- 23.13, N = 3)
GCC 10.3: 5744 (SE +/- 8.67, N = 3)
GCC 9.4: 5729 (SE +/- 24.67, N = 3)
1. (CC) gcc options: -lstdc++ -lssl -lcrypto -lm -lrt -lpthread -O3 -march=native
SVT-HEVC 1.5.0 - Tuning: 1 - Input: Bosphorus 1080p
Frames Per Second, More Is Better
GCC 10.3: 13.04 (SE +/- 0.00, N = 3)
GCC 11.1: 12.96 (SE +/- 0.01, N = 3)
GCC 12.0.0 20210701: 12.94 (SE +/- 0.01, N = 3)
GCC 9.4: 12.91 (SE +/- 0.02, N = 3)
GCC 8.5: 12.88 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt
Ngspice 34 - Circuit: C2670
Seconds, Fewer Is Better
GCC 10.3: 133.91 (SE +/- 1.13, N = 3)
GCC 9.4: 134.22 (SE +/- 0.71, N = 3)
GCC 8.5: 134.35 (SE +/- 0.81, N = 3)
GCC 11.1: 134.71 (SE +/- 0.92, N = 3)
GCC 12.0.0 20210701: 135.43 (SE +/- 0.83, N = 3)
1. (CC) gcc options: -O3 -march=native -fopenmp -lm -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lSM -lICE
SVT-HEVC 1.5.0 - Tuning: 10 - Input: Bosphorus 1080p
Frames Per Second, More Is Better
GCC 9.4: 376.89 (SE +/- 0.96, N = 3)
GCC 11.1: 375.65 (SE +/- 1.99, N = 3)
GCC 12.0.0 20210701: 375.32 (SE +/- 0.75, N = 3)
GCC 8.5: 374.62 (SE +/- 0.95, N = 3)
GCC 10.3: 372.69 (SE +/- 1.74, N = 3)
1. (CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt
oneDNN 2.1.2 - Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
GCC 10.3: 5.53334 (SE +/- 0.01959, N = 3; MIN: 5.38)
GCC 12.0.0 20210701: 5.53672 (SE +/- 0.02089, N = 3; MIN: 5.38)
GCC 8.5: 5.54062 (SE +/- 0.02371, N = 3; MIN: 5.4)
GCC 11.1: 5.54544 (SE +/- 0.02335, N = 3; MIN: 5.4)
GCC 9.4: 5.59533 (SE +/- 0.02285, N = 3; MIN: 5.45)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
GCC 10.3: 10.92 (SE +/- 0.01, N = 3; MIN: 10.74)
GCC 11.1: 10.92 (SE +/- 0.02, N = 3; MIN: 10.74)
GCC 12.0.0 20210701: 10.93 (SE +/- 0.02, N = 3; MIN: 10.77)
GCC 9.4: 10.96 (SE +/- 0.02, N = 3; MIN: 10.74)
GCC 8.5: 11.02 (SE +/- 0.02, N = 3; MIN: 10.79)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
SVT-HEVC 1.5.0 - Tuning: 7 - Input: Bosphorus 1080p
Frames Per Second, More Is Better
GCC 12.0.0 20210701: 191.33 (SE +/- 0.32, N = 3)
GCC 11.1: 191.02 (SE +/- 0.31, N = 3)
GCC 9.4: 190.62 (SE +/- 0.18, N = 3)
GCC 8.5: 190.03 (SE +/- 0.58, N = 3)
GCC 10.3: 189.88 (SE +/- 0.44, N = 3)
1. (CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMV-T
GB/s, More Is Better
GCC 8.5: 79.9 (SE +/- 0.25, N = 3)
GCC 9.4: 79.7 (SE +/- 0.10, N = 3)
GCC 10.3: 79.6 (SE +/- 0.07, N = 3)
GCC 12.0.0 20210701: 79.5 (SE +/- 0.09, N = 3)
GCC 11.1: 79.3 (SE +/- 0.38, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp -rdynamic -lOpenCL
Botan 2.17.3 - Test: AES-256
MiB/s, More Is Better
GCC 10.3: 3999.25 (SE +/- 4.08, N = 3)
GCC 8.5: 3998.63 (SE +/- 0.87, N = 3)
GCC 9.4: 3987.08 (SE +/- 0.82, N = 3)
GCC 11.1: 3985.27 (SE +/- 2.67, N = 3)
GCC 12.0.0 20210701: 3972.02 (SE +/- 7.30, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
GnuPG 2.2.27 - 2.7GB Sample File Encryption
Seconds, Fewer Is Better
GCC 11.1: 64.20 (SE +/- 0.23, N = 3)
GCC 9.4: 64.23 (SE +/- 0.19, N = 3)
GCC 8.5: 64.24 (SE +/- 0.17, N = 3)
GCC 10.3: 64.42 (SE +/- 0.36, N = 3)
GCC 12.0.0 20210701: 64.61 (SE +/- 0.56, N = 3)
1. (CC) gcc options: -O3 -march=native
TNN 0.3 - Target: CPU - Model: DenseNet
ms, Fewer Is Better
GCC 8.5: 3505.93 (SE +/- 0.16, N = 3; MIN: 3487.54 / MAX: 3535.34)
GCC 10.3: 3508.29 (SE +/- 0.59, N = 3; MIN: 3489.27 / MAX: 3603.98)
GCC 11.1: 3508.39 (SE +/- 0.77, N = 3; MIN: 3486.98 / MAX: 3606.8)
GCC 12.0.0 20210701: 3524.75 (SE +/- 0.21, N = 3; MIN: 3509.67 / MAX: 3548.51)
GCC 9.4: 3527.68 (SE +/- 2.55, N = 3; MIN: 3508.67 / MAX: 3981.67)
1. (CXX) g++ options: -O3 -march=native -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl
ViennaCL 1.7.1 - Test: CPU BLAS - dAXPY
GB/s, More Is Better
GCC 11.1: 57.4 (SE +/- 0.00, N = 3)
GCC 9.4: 57.4 (SE +/- 0.03, N = 3)
GCC 10.3: 57.3 (SE +/- 0.03, N = 3)
GCC 12.0.0 20210701: 57.2 (SE +/- 0.12, N = 3)
GCC 8.5: 57.1 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - dCOPY
GB/s, More Is Better
GCC 11.1: 38.3 (SE +/- 0.00, N = 3)
GCC 10.3: 38.3 (SE +/- 0.03, N = 3)
GCC 12.0.0 20210701: 38.2 (SE +/- 0.03, N = 3)
GCC 9.4: 38.1 (SE +/- 0.15, N = 3)
GCC 8.5: 38.1 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp -rdynamic -lOpenCL
oneDNN 2.1.2 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
GCC 10.3: 0.526014 (SE +/- 0.003324, N = 3; MIN: 0.5)
GCC 9.4: 0.526798 (SE +/- 0.003361, N = 3; MIN: 0.5)
GCC 12.0.0 20210701: 0.527939 (SE +/- 0.003225, N = 3; MIN: 0.5)
GCC 11.1: 0.528133 (SE +/- 0.003036, N = 3; MIN: 0.5)
GCC 8.5: 0.528416 (SE +/- 0.003001, N = 3; MIN: 0.5)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
GCC 9.4: 0.459456 (SE +/- 0.001062, N = 3; MIN: 0.45)
GCC 12.0.0 20210701: 0.459660 (SE +/- 0.001427, N = 3; MIN: 0.45)
GCC 11.1: 0.459956 (SE +/- 0.000687, N = 3; MIN: 0.45)
GCC 10.3: 0.460482 (SE +/- 0.001585, N = 3; MIN: 0.45)
GCC 8.5: 0.461459 (SE +/- 0.000631, N = 3; MIN: 0.45)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
WavPack Audio Encoding 5.3 - WAV To WavPack
Seconds, Fewer Is Better
GCC 12.0.0 20210701: 13.33 (SE +/- 0.00, N = 5)
GCC 10.3: 13.34 (SE +/- 0.02, N = 5)
GCC 11.1: 13.35 (SE +/- 0.00, N = 5)
GCC 8.5: 13.36 (SE +/- 0.00, N = 5)
GCC 9.4: 13.38 (SE +/- 0.00, N = 5)
1. (CXX) g++ options: -O3 -march=native -rdynamic
Botan 2.17.3 - Test: AES-256 - Decrypt
MiB/s, More Is Better
GCC 10.3: 3998.48 (SE +/- 0.87, N = 3)
GCC 11.1: 3995.23 (SE +/- 0.79, N = 3)
GCC 9.4: 3993.89 (SE +/- 0.39, N = 3)
GCC 12.0.0 20210701: 3993.17 (SE +/- 4.17, N = 3)
GCC 8.5: 3991.20 (SE +/- 1.28, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
libgav1 0.16.3 - Video Input: Chimera 1080p 10-bit
FPS, More Is Better
GCC 12.0.0 20210701: 21.32 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -march=native -lpthread -lrt
libgav1 0.16.3 - Video Input: Summer Nature 4K
FPS, More Is Better
GCC 12.0.0 20210701: 28.24 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -march=native -lpthread -lrt
NCNN 20210525 - Target: CPU - Model: yolov4-tiny
ms, Fewer Is Better
GCC 8.5: 20.84 (SE +/- 0.25, N = 3; MIN: 19.92 / MAX: 24.91)
GCC 10.3: 21.26 (SE +/- 0.17, N = 3; MIN: 20 / MAX: 24.4)
GCC 9.4: 21.35 (SE +/- 0.19, N = 3; MIN: 20.42 / MAX: 33.9)
GCC 11.1: 21.37 (SE +/- 0.10, N = 3; MIN: 20.44 / MAX: 22.72)
GCC 12.0.0 20210701: 22.83 (SE +/- 1.86, N = 3; MIN: 20.18 / MAX: 937.4)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread -pthread
Mobile Neural Network 1.2 - Model: squeezenetv1.1
ms, Fewer Is Better
GCC 10.3: 4.271 (SE +/- 0.162, N = 3; MIN: 3.97 / MAX: 4.72)
GCC 12.0.0 20210701: 4.283 (SE +/- 0.149, N = 3; MIN: 3.97 / MAX: 4.71)
GCC 9.4: 4.420 (SE +/- 0.061, N = 15; MIN: 3.98 / MAX: 4.76)
GCC 8.5: 4.564 (SE +/- 0.036, N = 3; MIN: 4.42 / MAX: 4.75)
GCC 11.1: 4.609 (SE +/- 0.007, N = 3; MIN: 4.51 / MAX: 4.78)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
ViennaCL 1.7.1 - Test: CPU BLAS - dDOT
GB/s, More Is Better
GCC 12.0.0 20210701: 63.7 (SE +/- 0.03, N = 3)
GCC 11.1: 63.7 (SE +/- 0.00, N = 3)
GCC 9.4: 63.7 (SE +/- 0.03, N = 3)
GCC 8.5: 63.6 (SE +/- 0.03, N = 3)
GCC 10.3: 54.7 (SE +/- 9.10, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp -rdynamic -lOpenCL
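Reading note on the dDOT result: the GCC 10.3 figure (54.7 GB/s) carries a standard error of 9.10, far larger than the other compilers' errors. A rough way to judge whether such a gap is meaningful is to express it in units of the combined standard error of the two results. The Python sketch below does that; the helper name gap_in_standard_errors is hypothetical and not part of the Phoronix Test Suite, and treating the two results as independent is an assumption. With only N = 3 runs per compiler this is a sanity check, not a formal significance test.

```python
import math


def gap_in_standard_errors(mean_a, se_a, mean_b, se_b):
    """Return |mean_a - mean_b| divided by the combined standard error.

    Assumes the two results are independent, so their standard errors
    combine in quadrature.
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


# ViennaCL dDOT figures as reported in this result file (GB/s, N = 3 each).
gcc_12 = (63.7, 0.03)
gcc_10_3 = (54.7, 9.10)

ratio = gap_in_standard_errors(*gcc_12, *gcc_10_3)
print(f"gap = {ratio:.2f} combined standard errors")
```

A ratio below roughly 1 means the gap is about the size of the run-to-run noise, so the 9 GB/s deficit here is more plausibly a noisy run than a compiler regression.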
oneDNN 2.1.2 - Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
GCC 8.5: 7.89032 (SE +/- 0.02602, N = 3; MIN: 7.58)
GCC 10.3: 7.90163 (SE +/- 0.03159, N = 3; MIN: 7.61)
GCC 11.1: 7.90477 (SE +/- 0.03513, N = 3; MIN: 7.56)
GCC 12.0.0 20210701: 7.91258 (SE +/- 0.03413, N = 3; MIN: 7.63)
GCC 9.4: 8.10038 (SE +/- 0.17046, N = 14; MIN: 7.58)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Phoronix Test Suite v10.8.4