AMD EPYC 7763 64-Core testing with a Supermicro H12SSL-i v1.01 (2.0 BIOS) and ASPEED on Ubuntu 20.04 via the Phoronix Test Suite.
Compare your own system(s) to this result file with the Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2104140-IB-EPYC7763L31

EPYC 7763 LLVM Clang Compiler Tests
HTML result view exported from: https://openbenchmarking.org/result/2104140-IB-EPYC7763L31&grt&sro .
Processor: AMD EPYC 7763 64-Core @ 2.45GHz (64 Cores / 128 Threads)
Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS)
Chipset: AMD Starship/Matisse
Memory: 126GB
Disk: 3841GB Micron_9300_MTFDHAL3T8TDP
Graphics: ASPEED
Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe
OS: Ubuntu 20.04
Kernel: 5.12.0-051200rc6daily20210408-generic (x86_64) 20210407
Desktop: GNOME Shell 3.36.4
Display Server: X Server 1.20.8
File-System: ext4
Screen Resolution: 1024x768
Compiler:
  Clang 12.0: Clang 12.0.0-++20210409092622+fa0971b87fb2-1~exp1~20210409193326.73
  Clang 11.0: Clang 11.0.0-2~ubuntu20.04.1
  Clang 12.0 LTO: Clang 12.0.0-++20210409092622+fa0971b87fb2-1~exp1~20210409193326.73
  GCC 9.3: GCC 9.3.0
  GCC 10.3: GCC 10.3.0
  GCC 11.0.1: GCC 11.0.1 20210413
  AMD AOCC 3.0: Clang 12.0.0

Kernel Details
- Transparent Huge Pages: madvise

Environment Details
- Clang 12.0: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
- Clang 11.0: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
- Clang 12.0 LTO: CXXFLAGS="-O3 -march=native -flto" CFLAGS="-O3 -march=native -flto"
- GCC 9.3: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
- GCC 10.3: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
- GCC 11.0.1: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
- AMD AOCC 3.0: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"

Processor Details
- Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled)
- CPU Microcode: 0xa001119

Python Details
- Python 3.8.2

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected

Compiler Details
- GCC 9.3: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- GCC 10.3: --disable-multilib --enable-checking=release
- GCC 11.0.1: --disable-multilib --enable-checking=release
- AMD AOCC 3.0: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: (unknown)
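The environment details above boil down to a pair of flag variables exported before each run. The following is a minimal Python sketch of how one such configuration might be set up locally. The `CFLAGS`/`CXXFLAGS` values and the `phoronix-test-suite benchmark` command come from this result file; the `CC`/`CXX` binary names and the helpers `build_env` and `benchmark_command` are assumptions made for illustration only.

```python
import os
import shlex

# Flags copied from the "Environment Details" of this result file.
FLAGS = {
    "Clang 12.0": "-O3 -march=native",
    "Clang 12.0 LTO": "-O3 -march=native -flto",
    "GCC 10.3": "-O3 -march=native",
}

# Hypothetical compiler binary names; adjust to the local installation.
COMPILERS = {
    "Clang 12.0": ("clang", "clang++"),
    "Clang 12.0 LTO": ("clang", "clang++"),
    "GCC 10.3": ("gcc", "g++"),
}

RESULT_ID = "2104140-IB-EPYC7763L31"


def build_env(config, base=None):
    """Return a copy of the environment with CC, CXX, CFLAGS and CXXFLAGS set."""
    env = dict(os.environ if base is None else base)
    cc, cxx = COMPILERS[config]
    env.update(CC=cc, CXX=cxx, CFLAGS=FLAGS[config], CXXFLAGS=FLAGS[config])
    return env


def benchmark_command(result_id=RESULT_ID):
    """Command line that compares a local system against the result file."""
    return ["phoronix-test-suite", "benchmark", result_id]


if __name__ == "__main__":
    env = build_env("Clang 12.0 LTO")
    print("CFLAGS=" + env["CFLAGS"])
    print(shlex.join(benchmark_command()))
    # To launch the run: subprocess.run(benchmark_command(), env=env, check=True)
```

The sketch only prepares the environment and prints the command, so it can be inspected before anything is executed.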
EPYC 7763 LLVM Clang Compiler Tests aom-av1: Speed 0 Two-Pass - Bosphorus 4K aom-av1: Speed 4 Two-Pass - Bosphorus 4K aom-av1: Speed 6 Realtime - Bosphorus 4K aom-av1: Speed 6 Two-Pass - Bosphorus 4K aom-av1: Speed 8 Realtime - Bosphorus 4K aom-av1: Speed 9 Realtime - Bosphorus 4K aom-av1: Speed 0 Two-Pass - Bosphorus 1080p aom-av1: Speed 4 Two-Pass - Bosphorus 1080p aom-av1: Speed 6 Realtime - Bosphorus 1080p aom-av1: Speed 6 Two-Pass - Bosphorus 1080p aom-av1: Speed 8 Realtime - Bosphorus 1080p aom-av1: Speed 9 Realtime - Bosphorus 1080p astcenc: Medium astcenc: Thorough astcenc: Exhaustive botan: KASUMI botan: KASUMI - Decrypt botan: AES-256 botan: AES-256 - Decrypt botan: Twofish botan: Twofish - Decrypt botan: Blowfish botan: Blowfish - Decrypt botan: CAST-256 botan: CAST-256 - Decrypt botan: ChaCha20Poly1305 botan: ChaCha20Poly1305 - Decrypt c-ray: Total Time - 4K, 16 Rays Per Pixel coremark: CoreMark Size 666 - Iterations Per Second dav1d: Chimera 1080p dav1d: Summer Nature 4K dav1d: Summer Nature 1080p dav1d: Chimera 1080p 10-bit etcpak: DXT1 etcpak: ETC1 etcpak: ETC2 fftw: Stock - 1D FFT Size 32 fftw: Stock - 1D FFT Size 1024 fftw: Stock - 1D FFT Size 2048 fftw: Stock - 1D FFT Size 4096 fftw: Stock - 2D FFT Size 1024 fftw: Stock - 2D FFT Size 2048 fftw: Stock - 2D FFT Size 4096 fftw: Float + SSE - 1D FFT Size 32 fftw: Float + SSE - 1D FFT Size 1024 fftw: Float + SSE - 1D FFT Size 2048 fftw: Float + SSE - 1D FFT Size 4096 fftw: Float + SSE - 2D FFT Size 1024 fftw: Float + SSE - 2D FFT Size 2048 fftw: Float + SSE - 2D FFT Size 4096 financebench: Repo OpenMP financebench: Bonds OpenMP encode-flac: WAV To FLAC gcrypt: graphics-magick: Swirl graphics-magick: Rotate graphics-magick: Sharpen graphics-magick: Enhanced graphics-magick: Resizing graphics-magick: Noise-Gaussian graphics-magick: HWB Color Space jpegxl: PNG - 5 jpegxl: PNG - 7 jpegxl: PNG - 8 jpegxl: JPEG - 5 jpegxl: JPEG - 7 jpegxl: JPEG - 8 encode-mp3: WAV To MP3 avifenc: 0 avifenc: 2 avifenc: 6 
avifenc: 10 avifenc: 6, Lossless avifenc: 10, Lossless libraw: Post-Processing Benchmark liquid-dsp: 1 - 256 - 57 liquid-dsp: 32 - 256 - 57 liquid-dsp: 64 - 256 - 57 liquid-dsp: 128 - 256 - 57 compress-lz4: 3 - Compression Speed compress-lz4: 3 - Decompression Speed compress-lz4: 9 - Compression Speed compress-lz4: 9 - Decompression Speed ngspice: C2670 ngspice: C7552 onednn: IP Shapes 1D - f32 - CPU onednn: IP Shapes 3D - f32 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU onednn: IP Shapes 3D - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: Deconvolution Batch shapes_1d - f32 - CPU onednn: Deconvolution Batch shapes_3d - f32 - CPU onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU onednn: Recurrent Neural Network Training - f32 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU onednn: Recurrent Neural Network Training - u8s8f32 - CPU onednn: Recurrent Neural Network Inference - u8s8f32 - CPU onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU onnx: yolov4 - OpenMP CPU onnx: bertsquad-10 - OpenMP CPU onnx: fcn-resnet101-11 - OpenMP CPU onnx: shufflenet-v2-10 - OpenMP CPU onnx: super-resolution-10 - OpenMP CPU encode-opus: WAV To Opus Encode pgbench: 100 - 1 - Read Only pgbench: 100 - 1 - Read Only - Average Latency pgbench: 100 - 1 - Read Write pgbench: 100 - 1 - Read Write - Average Latency pgbench: 100 - 100 - Read Only pgbench: 100 - 100 - Read Only - Average Latency pgbench: 100 - 250 - Read Only pgbench: 100 - 250 - Read Only - Average Latency pgbench: 100 - 100 - Read Write pgbench: 100 - 100 - Read Write - Average Latency pgbench: 100 - 250 - Read Write pgbench: 100 - 250 - Read Write - Average Latency povray: Trace Time 
quantlib: scimark2: Composite scimark2: Monte Carlo scimark2: Fast Fourier Transform scimark2: Sparse Matrix Multiply scimark2: Dense LU Matrix Factorization scimark2: Jacobi Successive Over-Relaxation securemark: SecureMark-TLS simdjson: Kostya simdjson: LargeRand simdjson: PartialTweets simdjson: DistinctUserID svt-av1: Enc Mode 0 - 1080p svt-av1: Enc Mode 4 - 1080p svt-av1: Enc Mode 8 - 1080p svt-hevc: 1 - Bosphorus 1080p svt-hevc: 7 - Bosphorus 1080p svt-hevc: 10 - Bosphorus 1080p svt-vp9: VMAF Optimized - Bosphorus 1080p svt-vp9: PSNR/SSIM Optimized - Bosphorus 1080p svt-vp9: Visual Quality Optimized - Bosphorus 1080p tachyon: Total Time mrbayes: Primate Phylogeny Analysis toybrot: TBB toybrot: OpenMP toybrot: C++ Tasks toybrot: C++ Threads tscp: AI Chess Performance viennacl: CPU BLAS - sCOPY viennacl: CPU BLAS - sAXPY viennacl: CPU BLAS - sDOT viennacl: CPU BLAS - dCOPY viennacl: CPU BLAS - dAXPY viennacl: CPU BLAS - dDOT viennacl: CPU BLAS - dGEMV-N viennacl: CPU BLAS - dGEMV-T viennacl: CPU BLAS - dGEMM-NN viennacl: CPU BLAS - dGEMM-NT viennacl: CPU BLAS - dGEMM-TN viennacl: CPU BLAS - dGEMM-TT webp: Default webp: Quality 100 webp: Quality 100, Lossless webp: Quality 100, Highest Compression webp: Quality 100, Lossless, Highest Compression webp2: Default webp2: Quality 75, Compression Effort 7 webp2: Quality 95, Compression Effort 7 webp2: Quality 100, Compression Effort 5 webp2: Quality 100, Lossless Compression x265: Bosphorus 4K x265: Bosphorus 1080p Clang 12.0 Clang 11.0 Clang 12.0 LTO GCC 9.3 GCC 10.3 GCC 11.0.1 AMD AOCC 3.0 0.21 4.87 17.22 8.99 33.39 38.11 0.53 7.10 26.85 22.13 88.78 103.17 4.0058 6.7647 18.9936 82.644 84.229 4659.338 4682.455 315.409 321.190 380.054 351.284 132.820 133.048 850.496 843.404 15.870 1785466.283969 1198.22 541.56 1244.11 308.32 2718.525 284.642 202.085 13333 10805 10467 9862.0 9088.3 7789.9 6744.1 15649 50350 51254 45428 36239 31935 22797 33246.837239 51596.867187 7.854 236.924 1993 712 614 1076 2136 457 605 74.27 12.15 
0.82 66.66 66.38 28.13 8.256 47.884 25.175 9.510 3.361 25.220 5.746 41.78 55663000 1564833333 3070633333 3643766667 52.07 13911.5 48.50 13926.5 118.870 95.956 1.07701 3.28507 1.07507 0.710124 1.22132 1.44425 2.36797 2.03606 0.491940 0.779776 1302.70 593.972 1307.49 590.182 0.313689 1305.10 597.481 1.17258 333 498 112 9904 4456 7.567 24310 0.041 3281 0.305 1069022 0.094 1071209 0.234 62319 1.607 56684 4.431 9.296 2653.8 3190.62 675.13 363.85 4280.22 8848.40 1785.50 265204 2.75 0.84 4.60 4.62 0.183 11.474 118.067 41.09 345.30 643.58 487.43 488.23 372.49 16.0468 89.116 6780 7507 7437 7220 1570966 471 357 434 604 878 819 69.1 626 48.6 65.7 51.9 73.0 1.331 2.199 19.016 6.309 38.449 2.739 109.525 207.008 6.690 374.035 30.32 74.00 0.21 4.95 17.13 9.14 33.14 37.28 0.53 7.20 26.61 22.00 86.09 100.55 3.9837 6.7674 19.0255 79.149 80.221 4901.127 4895.558 299.214 302.405 319.234 351.075 128.586 127.740 848.236 840.637 15.599 1790837.010000 1190.41 543.43 1251.25 184.19 1872.759 205.065 168.819 13324 10564 10004.2 9438.6 8809.6 7878.5 6823.8 14590 50740 50084 46676 36181 31741 22913 33178.498698 51900.434896 7.979 240.205 1915 665 613 1068 2034 463 616 78.41 12.01 0.8 65.58 65.43 27.24 8.250 47.894 25.472 9.536 3.429 26.034 5.879 38.71 56307000 1578400000 3051366667 3596533333 52.35 13840.3 49.01 13927.9 103.826 90.527 1.08011 3.52787 1.07577 0.594729 0.841169 1.45757 2.31859 1.60540 0.489278 0.779101 1276.04 563.200 1277.62 562.970 0.315522 1271.91 563.247 1.15140 346 471 108 9797 4523 7.392 24943 0.040 3312 0.302 1069367 0.094 1065506 0.235 61616 1.626 54488 4.603 9.408 2640.2 3319.34 674.86 399.16 4590.37 9146.88 1785.42 260119 2.68 0.81 4.41 4.41 0.181 11.821 117.392 41.01 346.89 652.74 481.05 482.02 373.99 16.4099 88.620 6247 7029 6836 6395 1638265 495 412 462 1877 1043 933 51.2 677 83.6 79.3 88.3 84.0 1.336 2.240 18.573 6.243 37.727 2.743 109.636 203.634 7.366 392.849 29.94 73.36 2719.985 284.763 202.101 50.93 13715.0 48.47 13698.7 2657.8 93.633 7085 7367 7143 0.2 4.78 
16.29 9.57 34.56 39.12 0.5 6.69 24.84 21.42 91.97 106.55 4.8745 7.8537 19.4794 84.864 84.130 5484.676 5391.990 337.355 339.069 412.846 412.072 127.298 127.343 616.096 611.977 9.158 2086609.978010 1145.50 530.82 1228.63 305.36 1082.365 269.673 174.812 14399 11689 11053 10548 9798.6 8408.5 7007.3 16590 53275 52749 52099 36321 31341 25068 42399.807757 76805.580729 8.534 232.572 2129 709 806 1217 1238 547 785 7.011 52.217 27.784 10.399 3.659 29.080 6.131 60.20 61404000 1721900000 2940466667 3012066667 53.83 13793.4 51.97 13895.3 101.535 89.091 1.17486 3.67278 1.17434 0.654010 0.869308 7.19213 2.99759 1.66260 0.599140 0.786762 1357.29 658.660 1358.56 659.191 0.376992 1356.91 657.876 0.717782 351 495 116 9419 5183 7.504 23895 0.042 3298 0.303 1057125 0.095 1067486 0.235 59364 1.688 53825 4.657 9.968 2338.9 3229.22 668.10 384.03 3765.88 9178.97 2149.15 238935 2.75 0.94 3.93 3.98 0.129 9.325 92.984 38.41 322.42 605.50 463.12 464.57 354.21 15.6837 89.163 5107 5451 5414 5142 1446372 1217 813 636 1587 1521 1133 65.0 798 98.5 95.3 100.9 97.9 1.397 2.273 19.298 7.053 39.072 2.778 118.447 220.944 6.753 388.946 28.91 72.14 0.21 4.84 17.03 9.10 35.26 39.32 0.52 6.87 26.49 21.64 93.05 107.46 4.8699 7.8370 19.4583 79.115 81.453 5525.710 5529.402 341.847 325.389 422.138 420.853 127.741 127.775 485.019 476.175 9.029 2110880.427978 1171.04 536.71 1245.11 316.14 1114.603 281.146 173.226 12576 11319 10711 10179 9247.3 8134.5 6974.0 16650 52054 53497 52130 35973 32061 23774 34979.294271 51770.509114 8.567 231.238 2112 689 807 1039 1208 544 772 7.231 51.454 27.386 10.417 3.643 26.911 6.107 58.90 62467333 1718000000 2942866667 3005033333 52.87 13906.1 52.36 13806.6 103.598 90.432 1.17894 3.61144 1.19747 0.646252 0.870784 7.23686 3.00341 1.64268 0.602155 0.782476 1382.41 659.265 1379.51 658.277 0.377733 1375.71 658.038 0.788192 351 505 115 10197 5559 7.469 24845 0.040 3369 0.297 1076357 0.093 1089731 0.230 58894 1.701 53019 4.731 9.570 2392.6 3235.94 682.87 388.98 3820.77 9248.89 2038.15 
242700 2.77 0.9 4.02 4.13 0.169 11.230 109.697 39.03 330.53 615.62 472.61 477.67 364.12 16.1468 93.656 5181 5524 5610 5383 1467179 1065.60 1350.0 592.97 1461.2 2158.4 1056.42 56.2 741.4 98.7 94.4 104 98.5 1.372 2.225 18.883 7.078 38.548 2.918 116.655 215.565 6.934 406.027 28.60 72.60 0.21 4.84 17.37 9.41 35.26 39.71 0.52 6.95 27.01 22.11 94.40 111.27 4.8160 7.6989 19.6189 9.227 2176407.665929 1180.44 538.28 1249.74 334.35 12765 11044 10675 10205 9238.4 8231.1 6948.2 16590 51706 54710 51391 35718 31662 24888 34199.600260 51376.816406 8.709 233.514 2161 694 809 1082 1188 550 771 7.473 51.034 27.103 10.291 3.607 27.057 6.149 57.24 60886333 1679800000 2989400000 3055766667 51.32 13882.2 51.17 13857.4 103.005 90.264 7.381 23661 0.042 3383 0.296 1090824 0.092 1090160 0.230 56369 1.777 53102 4.722 3182.35 647.82 388.88 3462.66 9263.55 2148.84 243861 0.176 11.905 110.702 38.86 329.32 611.73 472.32 478.16 366.39 15.4989 89.432 1494250 1210 1496 649 1599 2359 1153 63.9 794 100.5 95.0 104 99.3 1.386 2.274 18.314 7.003 37.948 28.79 71.79 3.8811 6.6409 18.9127 82.827 82.949 4891.072 4887.573 304.996 303.806 319.787 355.059 127.768 128.008 845.141 838.089 15.649 1720060.441307 1188.43 541.58 1251.91 192.00 2654.721 211.733 178.852 13192 10669 10227 9603.2 8902.1 7784.8 6875.3 16146 49685 44412 45521 36100 31013 23111 33146.028646 51885.519531 9.280 240.405 1929 660 617 1057 1866 466 614 79.23 11.37 0.81 65.57 65.68 27.29 8.142 48.127 25.598 9.725 3.543 25.783 5.948 41.64 57411333 1609633333 3100400000 3606466667 53.77 13562.5 50.32 13561.5 103.929 91.986 1.03899 3.41583 1.04484 0.554231 0.833921 1.37059 2.28755 1.59597 0.459724 0.773233 1267.18 544.099 1259.59 544.306 0.301885 1268.08 544.600 1.17044 386 459 122 11325 4383 9.494 2725.7 3298.29 690.94 398.96 4594.27 9021.83 1785.45 264637 2.73 0.82 4.33 4.47 0.183 11.690 116.493 40.95 343.85 638.10 476.95 478.62 373.89 16.0581 86.742 6945 7477 7189 7144 1697846 531 326 477 1944 1017 1165 55.2 783 84.0 78.8 90.0 84.4 1.351 2.262 
19.126 6.578 38.338 2.816 109.811 205.034 7.403 382.985 30.44 73.51 OpenBenchmarking.org
AOM AV1 3.0 - Encoder Mode: Speed 0 Two-Pass - Input: Bosphorus 4K
Frames Per Second, More Is Better
  Clang 11.0     0.21    SE +/- 0.00, N = 3
  Clang 12.0     0.21    SE +/- 0.00, N = 3
  GCC 10.3       0.21    SE +/- 0.00, N = 3
  GCC 11.0.1     0.21    SE +/- 0.00, N = 3
  GCC 9.3        0.20    SE +/- 0.00, N = 3
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1 3.0 - Encoder Mode: Speed 4 Two-Pass - Input: Bosphorus 4K
Frames Per Second, More Is Better
  Clang 11.0     4.95    SE +/- 0.07, N = 3
  Clang 12.0     4.87    SE +/- 0.04, N = 3
  GCC 10.3       4.84    SE +/- 0.04, N = 3
  GCC 11.0.1     4.84    SE +/- 0.06, N = 3
  GCC 9.3        4.78    SE +/- 0.01, N = 3
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1 3.0 - Encoder Mode: Speed 6 Realtime - Input: Bosphorus 4K
Frames Per Second, More Is Better
  Clang 11.0     17.13   SE +/- 0.11, N = 3
  Clang 12.0     17.22   SE +/- 0.11, N = 3
  GCC 10.3       17.03   SE +/- 0.05, N = 3
  GCC 11.0.1     17.37   SE +/- 0.08, N = 3
  GCC 9.3        16.29   SE +/- 0.05, N = 3
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1 3.0 - Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 4K
Frames Per Second, More Is Better
  Clang 11.0     9.14    SE +/- 0.03, N = 3
  Clang 12.0     8.99    SE +/- 0.10, N = 3
  GCC 10.3       9.10    SE +/- 0.04, N = 3
  GCC 11.0.1     9.41    SE +/- 0.01, N = 3
  GCC 9.3        9.57    SE +/- 0.11, N = 6
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1 3.0 - Encoder Mode: Speed 8 Realtime - Input: Bosphorus 4K
Frames Per Second, More Is Better
  Clang 11.0     33.14   SE +/- 0.22, N = 3
  Clang 12.0     33.39   SE +/- 0.48, N = 3
  GCC 10.3       35.26   SE +/- 0.19, N = 3
  GCC 11.0.1     35.26   SE +/- 0.47, N = 3
  GCC 9.3        34.56   SE +/- 0.12, N = 3
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1 3.0 - Encoder Mode: Speed 9 Realtime - Input: Bosphorus 4K
Frames Per Second, More Is Better
  Clang 11.0     37.28   SE +/- 0.31, N = 3
  Clang 12.0     38.11   SE +/- 0.43, N = 3
  GCC 10.3       39.32   SE +/- 0.19, N = 3
  GCC 11.0.1     39.71   SE +/- 0.29, N = 3
  GCC 9.3        39.12   SE +/- 0.38, N = 3
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1 3.0 - Encoder Mode: Speed 0 Two-Pass - Input: Bosphorus 1080p
Frames Per Second, More Is Better
  Clang 11.0     0.53    SE +/- 0.00, N = 3
  Clang 12.0     0.53    SE +/- 0.00, N = 3
  GCC 10.3       0.52    SE +/- 0.00, N = 3
  GCC 11.0.1     0.52    SE +/- 0.00, N = 3
  GCC 9.3        0.50    SE +/- 0.00, N = 3
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1 3.0 - Encoder Mode: Speed 4 Two-Pass - Input: Bosphorus 1080p
Frames Per Second, More Is Better
  Clang 11.0     7.20    SE +/- 0.01, N = 3
  Clang 12.0     7.10    SE +/- 0.04, N = 3
  GCC 10.3       6.87    SE +/- 0.00, N = 3
  GCC 11.0.1     6.95    SE +/- 0.01, N = 3
  GCC 9.3        6.69    SE +/- 0.03, N = 3
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1 3.0 - Encoder Mode: Speed 6 Realtime - Input: Bosphorus 1080p
Frames Per Second, More Is Better
  Clang 11.0     26.61   SE +/- 0.13, N = 3
  Clang 12.0     26.85   SE +/- 0.27, N = 3
  GCC 10.3       26.49   SE +/- 0.25, N = 3
  GCC 11.0.1     27.01   SE +/- 0.28, N = 3
  GCC 9.3        24.84   SE +/- 0.13, N = 3
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1 3.0 - Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 1080p
Frames Per Second, More Is Better
  Clang 11.0     22.00   SE +/- 0.15, N = 3
  Clang 12.0     22.13   SE +/- 0.05, N = 3
  GCC 10.3       21.64   SE +/- 0.09, N = 3
  GCC 11.0.1     22.11   SE +/- 0.06, N = 3
  GCC 9.3        21.42   SE +/- 0.11, N = 3
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1 3.0 - Encoder Mode: Speed 8 Realtime - Input: Bosphorus 1080p
Frames Per Second, More Is Better
  Clang 11.0     86.09   SE +/- 0.51, N = 3
  Clang 12.0     88.78   SE +/- 1.07, N = 3
  GCC 10.3       93.05   SE +/- 0.65, N = 3
  GCC 11.0.1     94.40   SE +/- 0.47, N = 3
  GCC 9.3        91.97   SE +/- 0.89, N = 3
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1 3.0 - Encoder Mode: Speed 9 Realtime - Input: Bosphorus 1080p
Frames Per Second, More Is Better
  Clang 11.0     100.55  SE +/- 0.53, N = 3
  Clang 12.0     103.17  SE +/- 0.31, N = 3
  GCC 10.3       107.46  SE +/- 1.76, N = 3
  GCC 11.0.1     111.27  SE +/- 1.15, N = 8
  GCC 9.3        106.55  SE +/- 1.10, N = 8
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
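Many of the AOM AV1 results above differ by only a few percent, so it helps to weigh each gap against the reported standard errors. The sketch below is one rough way to do that, not a method used by the Phoronix Test Suite: it divides the difference of two means by their combined standard error. The helper name `se_gap` is made up for this example, and the inputs are the Speed 9 Realtime 1080p figures for Clang 11.0 and GCC 11.0.1, read in the order the results are listed.

```python
import math


def se_gap(mean_a, se_a, mean_b, se_b):
    """Difference between two means in units of their combined standard error."""
    combined = math.sqrt(se_a ** 2 + se_b ** 2)
    if combined == 0:
        # Reported SEs are rounded, so "0.00" can hide a small nonzero value.
        raise ValueError("both standard errors are zero; the gap is undefined")
    return (mean_b - mean_a) / combined


# AOM AV1 3.0, Speed 9 Realtime, Bosphorus 1080p (frames per second).
clang11 = (100.55, 0.53)  # Clang 11.0: mean, SE
gcc11 = (111.27, 1.15)    # GCC 11.0.1: mean, SE

gap = se_gap(*clang11, *gcc11)
percent = (gcc11[0] / clang11[0] - 1) * 100
print(f"GCC 11.0.1 vs Clang 11.0: {percent:+.1f}% ({gap:.1f} combined SEs)")
```

A gap of several combined standard errors suggests the difference is unlikely to be run-to-run noise, while a gap near or below one should be read as a tie.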
ASTC Encoder 2.4 - Preset: Medium
Seconds, Fewer Is Better
  AMD AOCC 3.0   3.8811  SE +/- 0.0042, N = 3
  Clang 11.0     3.9837  SE +/- 0.0013, N = 3
  Clang 12.0     4.0058  SE +/- 0.0116, N = 3
  GCC 10.3       4.8699  SE +/- 0.0047, N = 3
  GCC 11.0.1     4.8160  SE +/- 0.0099, N = 3
  GCC 9.3        4.8745  SE +/- 0.0035, N = 3
  1. (CXX) g++ options: -O3 -march=native -flto -pthread

ASTC Encoder 2.4 - Preset: Thorough
Seconds, Fewer Is Better
  AMD AOCC 3.0   6.6409  SE +/- 0.0015, N = 3
  Clang 11.0     6.7674  SE +/- 0.0026, N = 3
  Clang 12.0     6.7647  SE +/- 0.0028, N = 3
  GCC 10.3       7.8370  SE +/- 0.0011, N = 3
  GCC 11.0.1     7.6989  SE +/- 0.0034, N = 3
  GCC 9.3        7.8537  SE +/- 0.0029, N = 3
  1. (CXX) g++ options: -O3 -march=native -flto -pthread

ASTC Encoder 2.4 - Preset: Exhaustive
Seconds, Fewer Is Better
  AMD AOCC 3.0   18.91   SE +/- 0.00, N = 3
  Clang 11.0     19.03   SE +/- 0.01, N = 3
  Clang 12.0     18.99   SE +/- 0.01, N = 3
  GCC 10.3       19.46   SE +/- 0.01, N = 3
  GCC 11.0.1     19.62   SE +/- 0.00, N = 3
  GCC 9.3        19.48   SE +/- 0.01, N = 3
  1. (CXX) g++ options: -O3 -march=native -flto -pthread
Botan 2.17.3 - Test: KASUMI
MiB/s, More Is Better
  AMD AOCC 3.0   82.83    SE +/- 0.02, N = 3
  Clang 11.0     79.15    SE +/- 0.06, N = 3
  Clang 12.0     82.64    SE +/- 0.01, N = 3
  GCC 10.3       79.12    SE +/- 0.01, N = 3
  GCC 9.3        84.86    SE +/- 0.02, N = 3
  1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan 2.17.3 - Test: KASUMI - Decrypt
MiB/s, More Is Better
  AMD AOCC 3.0   82.95    SE +/- 0.02, N = 3
  Clang 11.0     80.22    SE +/- 0.04, N = 3
  Clang 12.0     84.23    SE +/- 0.06, N = 3
  GCC 10.3       81.45    SE +/- 0.01, N = 3
  GCC 9.3        84.13    SE +/- 0.01, N = 3
  1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan 2.17.3 - Test: AES-256
MiB/s, More Is Better
  AMD AOCC 3.0   4891.07  SE +/- 0.05, N = 3
  Clang 11.0     4901.13  SE +/- 2.16, N = 3
  Clang 12.0     4659.34  SE +/- 2.14, N = 3
  GCC 10.3       5525.71  SE +/- 4.47, N = 3
  GCC 9.3        5484.68  SE +/- 42.69, N = 3
  1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan 2.17.3 - Test: AES-256 - Decrypt
MiB/s, More Is Better
  AMD AOCC 3.0   4887.57  SE +/- 3.70, N = 3
  Clang 11.0     4895.56  SE +/- 1.35, N = 3
  Clang 12.0     4682.46  SE +/- 4.78, N = 3
  GCC 10.3       5529.40  SE +/- 5.42, N = 3
  GCC 9.3        5391.99  SE +/- 11.31, N = 3
  1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan 2.17.3 - Test: Twofish
MiB/s, More Is Better
  AMD AOCC 3.0   305.00   SE +/- 0.03, N = 3
  Clang 11.0     299.21   SE +/- 0.09, N = 3
  Clang 12.0     315.41   SE +/- 0.13, N = 3
  GCC 10.3       341.85   SE +/- 0.52, N = 3
  GCC 9.3        337.36   SE +/- 0.04, N = 3
  1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan 2.17.3 - Test: Twofish - Decrypt
MiB/s, More Is Better
  AMD AOCC 3.0   303.81   SE +/- 0.06, N = 3
  Clang 11.0     302.41   SE +/- 0.15, N = 3
  Clang 12.0     321.19   SE +/- 0.16, N = 3
  GCC 10.3       325.39   SE +/- 0.44, N = 3
  GCC 9.3        339.07   SE +/- 0.04, N = 3
  1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan 2.17.3 - Test: Blowfish
MiB/s, More Is Better
  AMD AOCC 3.0   319.79   SE +/- 1.14, N = 3
  Clang 11.0     319.23   SE +/- 1.73, N = 3
  Clang 12.0     380.05   SE +/- 0.05, N = 3
  GCC 10.3       422.14   SE +/- 0.11, N = 3
  GCC 9.3        412.85   SE +/- 0.09, N = 3
  1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan 2.17.3 - Test: Blowfish - Decrypt
MiB/s, More Is Better
  AMD AOCC 3.0   355.06   SE +/- 1.17, N = 3
  Clang 11.0     351.08   SE +/- 2.03, N = 3
  Clang 12.0     351.28   SE +/- 0.04, N = 3
  GCC 10.3       420.85   SE +/- 0.95, N = 3
  GCC 9.3        412.07   SE +/- 0.12, N = 3
  1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan 2.17.3 - Test: CAST-256
MiB/s, More Is Better
  AMD AOCC 3.0   127.77   SE +/- 0.06, N = 3
  Clang 11.0     128.59   SE +/- 0.02, N = 3
  Clang 12.0     132.82   SE +/- 0.02, N = 3
  GCC 10.3       127.74   SE +/- 0.33, N = 3
  GCC 9.3        127.30   SE +/- 0.09, N = 3
  1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan 2.17.3 - Test: CAST-256 - Decrypt
MiB/s, More Is Better
  AMD AOCC 3.0   128.01   SE +/- 0.08, N = 3
  Clang 11.0     127.74   SE +/- 0.01, N = 3
  Clang 12.0     133.05   SE +/- 0.01, N = 3
  GCC 10.3       127.78   SE +/- 0.32, N = 3
  GCC 9.3        127.34   SE +/- 0.08, N = 3
  1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan 2.17.3 - Test: ChaCha20Poly1305
MiB/s, More Is Better
  AMD AOCC 3.0   845.14   SE +/- 3.15, N = 3
  Clang 11.0     848.24   SE +/- 0.62, N = 3
  Clang 12.0     850.50   SE +/- 4.85, N = 3
  GCC 10.3       485.02   SE +/- 0.28, N = 3
  GCC 9.3        616.10   SE +/- 0.13, N = 3
  1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan 2.17.3 - Test: ChaCha20Poly1305 - Decrypt
MiB/s, More Is Better
  AMD AOCC 3.0   838.09   SE +/- 3.17, N = 3
  Clang 11.0     840.64   SE +/- 0.16, N = 3
  Clang 12.0     843.40   SE +/- 4.64, N = 3
  GCC 10.3       476.18   SE +/- 0.02, N = 3
  GCC 9.3        611.98   SE +/- 0.40, N = 3
  1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
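The ChaCha20Poly1305 results show the widest spread of the Botan tests, and the size of that spread is easier to see once every compiler is expressed relative to one baseline. The Python sketch below does this for the encrypt figures listed above. The function `relative_to` and its `more_is_better` switch are illustrative names, not part of any Phoronix tooling.

```python
# Botan 2.17.3, ChaCha20Poly1305 (MiB/s, more is better), from the result above.
chacha = {
    "AMD AOCC 3.0": 845.14,
    "Clang 11.0": 848.24,
    "Clang 12.0": 850.50,
    "GCC 10.3": 485.02,
    "GCC 9.3": 616.10,
}


def relative_to(results, baseline, more_is_better=True):
    """Scale results so the baseline is 1.0 and a larger ratio always means faster."""
    base = results[baseline]
    if more_is_better:
        return {name: value / base for name, value in results.items()}
    # For timings ("fewer is better") the ratio is inverted.
    return {name: base / value for name, value in results.items()}


ratios = relative_to(chacha, "GCC 10.3")
for name, ratio in sorted(ratios.items(), key=lambda item: -item[1]):
    print(f"{name:<14}{ratio:.2f}x")
```

Passing `more_is_better=False` lets the same helper handle tests reported in seconds, so that throughput and timing results can be compared on one scale.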
C-Ray 1.1 - Total Time - 4K, 16 Rays Per Pixel
Seconds, Fewer Is Better
  AMD AOCC 3.0   15.649  SE +/- 0.063, N = 3
  Clang 11.0     15.599  SE +/- 0.009, N = 3
  Clang 12.0     15.870  SE +/- 0.023, N = 3
  GCC 10.3       9.029   SE +/- 0.014, N = 3
  GCC 11.0.1     9.227   SE +/- 0.027, N = 3
  GCC 9.3        9.158   SE +/- 0.014, N = 3
  1. (CC) gcc options: -lm -lpthread -O3 -march=native
Coremark 1.0 - CoreMark Size 666 - Iterations Per Second
Iterations/Sec, More Is Better
  AMD AOCC 3.0   1720060.44  SE +/- 3670.84, N = 3
  Clang 11.0     1790837.01  SE +/- 971.31, N = 3
  Clang 12.0     1785466.28  SE +/- 984.68, N = 3
  GCC 10.3       2110880.43  SE +/- 2170.85, N = 3
  GCC 11.0.1     2176407.67  SE +/- 5755.65, N = 3
  GCC 9.3        2086609.98  SE +/- 4791.32, N = 3
  1. (CC) gcc options: -O2 -O3 -march=native -lrt" -lrt
dav1d 0.8.2 - Video Input: Chimera 1080p
FPS, More Is Better
  AMD AOCC 3.0   1188.43  SE +/- 0.97, N = 3   -lm - MIN: 703.73 / MAX: 1484.94
  Clang 11.0     1190.41  SE +/- 6.69, N = 3   -lm - MIN: 685.16 / MAX: 1496.36
  Clang 12.0     1198.22  SE +/- 2.95, N = 3   MIN: 700.24 / MAX: 1494.16
  GCC 10.3       1171.04  SE +/- 3.74, N = 3   -lm - MIN: 683.28 / MAX: 1473.51
  GCC 11.0.1     1180.44  SE +/- 1.75, N = 3   -lm - MIN: 680.31 / MAX: 1485.74
  GCC 9.3        1145.50  SE +/- 5.12, N = 3   -lm - MIN: 664.19 / MAX: 1441.54
  1. (CC) gcc options: -O3 -march=native -pthread

dav1d 0.8.2 - Video Input: Summer Nature 4K
FPS, More Is Better
  AMD AOCC 3.0   541.58   SE +/- 1.13, N = 3   -lm - MIN: 259.4 / MAX: 585.8
  Clang 11.0     543.43   SE +/- 1.43, N = 3   -lm - MIN: 256.75 / MAX: 593.99
  Clang 12.0     541.56   SE +/- 1.79, N = 3   MIN: 252.01 / MAX: 587.53
  GCC 10.3       536.71   SE +/- 0.67, N = 3   -lm - MIN: 256.44 / MAX: 577.82
  GCC 11.0.1     538.28   SE +/- 2.51, N = 3   -lm - MIN: 251.6 / MAX: 584.38
  GCC 9.3        530.82   SE +/- 1.35, N = 3   -lm - MIN: 248.84 / MAX: 574.28
  1. (CC) gcc options: -O3 -march=native -pthread

dav1d 0.8.2 - Video Input: Summer Nature 1080p
FPS, More Is Better
  AMD AOCC 3.0   1251.91  SE +/- 4.95, N = 3   -lm - MIN: 543.89 / MAX: 1394.16
  Clang 11.0     1251.25  SE +/- 2.13, N = 3   -lm - MIN: 556.46 / MAX: 1394.06
  Clang 12.0     1244.11  SE +/- 7.87, N = 3   MIN: 549.81 / MAX: 1390.03
  GCC 10.3       1245.11  SE +/- 8.15, N = 3   -lm - MIN: 539.07 / MAX: 1398.87
  GCC 11.0.1     1249.74  SE +/- 1.96, N = 3   -lm - MIN: 559.74 / MAX: 1387.11
  GCC 9.3        1228.63  SE +/- 2.25, N = 3   -lm - MIN: 555.28 / MAX: 1361.68
  1. (CC) gcc options: -O3 -march=native -pthread

dav1d 0.8.2 - Video Input: Chimera 1080p 10-bit
FPS, More Is Better
  AMD AOCC 3.0   192.00   SE +/- 0.39, N = 3   -lm - MIN: 118.57 / MAX: 324.98
  Clang 11.0     184.19   SE +/- 0.48, N = 3   -lm - MIN: 114.52 / MAX: 310.5
  Clang 12.0     308.32   SE +/- 0.93, N = 3   MIN: 220.53 / MAX: 490.51
  GCC 10.3       316.14   SE +/- 0.21, N = 3   -lm - MIN: 218.19 / MAX: 515.85
  GCC 11.0.1     334.35   SE +/- 1.11, N = 3   -lm - MIN: 234.24 / MAX: 544.9
  GCC 9.3        305.36   SE +/- 0.71, N = 3   -lm - MIN: 210.86 / MAX: 493.21
  1. (CC) gcc options: -O3 -march=native -pthread
Etcpak 0.7 - Configuration: DXT1
Mpx/s, More Is Better
  AMD AOCC 3.0     2654.72  SE +/- 8.09, N = 3
  Clang 11.0       1872.76  SE +/- 1.69, N = 3
  Clang 12.0       2718.53  SE +/- 2.64, N = 3
  Clang 12.0 LTO   2719.99  SE +/- 6.09, N = 3
  GCC 10.3         1114.60  SE +/- 0.48, N = 3
  GCC 9.3          1082.37  SE +/- 0.16, N = 3
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread

Etcpak 0.7 - Configuration: ETC1
Mpx/s, More Is Better
  AMD AOCC 3.0     211.73   SE +/- 0.01, N = 3
  Clang 11.0       205.07   SE +/- 0.03, N = 3
  Clang 12.0       284.64   SE +/- 0.11, N = 3
  Clang 12.0 LTO   284.76   SE +/- 0.06, N = 3
  GCC 10.3         281.15   SE +/- 0.03, N = 3
  GCC 9.3          269.67   SE +/- 0.00, N = 3
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread

Etcpak 0.7 - Configuration: ETC2
Mpx/s, More Is Better
  AMD AOCC 3.0     178.85   SE +/- 0.03, N = 3
  Clang 11.0       168.82   SE +/- 0.02, N = 3
  Clang 12.0       202.09   SE +/- 0.02, N = 3
  Clang 12.0 LTO   202.10   SE +/- 0.04, N = 3
  GCC 10.3         173.23   SE +/- 0.04, N = 3
  GCC 9.3          174.81   SE +/- 0.09, N = 3
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
FFTW 3.3.6 - Build: Stock - Size: 1D FFT Size 32 (Mflops, More Is Better): AMD AOCC 3.0: 13192 (SE +/- 41.35, N = 3); Clang 11.0: 13324 (SE +/- 20.33, N = 3); Clang 12.0: 13333 (SE +/- 24.25, N = 3); GCC 10.3: 12576 (SE +/- 16.05, N = 3); GCC 11.0.1: 12765 (SE +/- 45.16, N = 3); GCC 9.3: 14399 (SE +/- 67.28, N = 3). 1. (CC) gcc options: -pthread -O3 -march=native -lm
FFTW 3.3.6 - Build: Stock - Size: 1D FFT Size 1024 (Mflops, More Is Better): AMD AOCC 3.0: 10669 (SE +/- 34.64, N = 3); Clang 11.0: 10564 (SE +/- 35.53, N = 3); Clang 12.0: 10805 (SE +/- 27.10, N = 3); GCC 10.3: 11319 (SE +/- 32.26, N = 3); GCC 11.0.1: 11044 (SE +/- 189.35, N = 3); GCC 9.3: 11689 (SE +/- 44.20, N = 3). 1. (CC) gcc options: -pthread -O3 -march=native -lm
FFTW 3.3.6 - Build: Stock - Size: 1D FFT Size 2048 (Mflops, More Is Better): AMD AOCC 3.0: 10227.0 (SE +/- 39.89, N = 3); Clang 11.0: 10004.2 (SE +/- 28.76, N = 3); Clang 12.0: 10467.0 (SE +/- 7.75, N = 3); GCC 10.3: 10711.0 (SE +/- 14.75, N = 3); GCC 11.0.1: 10675.0 (SE +/- 55.19, N = 3); GCC 9.3: 11053.0 (SE +/- 37.69, N = 3). 1. (CC) gcc options: -pthread -O3 -march=native -lm
FFTW 3.3.6 - Build: Stock - Size: 1D FFT Size 4096 (Mflops, More Is Better): AMD AOCC 3.0: 9603.2 (SE +/- 43.38, N = 3); Clang 11.0: 9438.6 (SE +/- 15.16, N = 3); Clang 12.0: 9862.0 (SE +/- 101.36, N = 3); GCC 10.3: 10179.0 (SE +/- 57.26, N = 3); GCC 11.0.1: 10205.0 (SE +/- 48.56, N = 3); GCC 9.3: 10548.0 (SE +/- 20.21, N = 3). 1. (CC) gcc options: -pthread -O3 -march=native -lm
FFTW 3.3.6 - Build: Stock - Size: 2D FFT Size 1024 (Mflops, More Is Better): AMD AOCC 3.0: 8902.1 (SE +/- 14.28, N = 3); Clang 11.0: 8809.6 (SE +/- 45.95, N = 3); Clang 12.0: 9088.3 (SE +/- 48.25, N = 3); GCC 10.3: 9247.3 (SE +/- 41.68, N = 3); GCC 11.0.1: 9238.4 (SE +/- 25.87, N = 3); GCC 9.3: 9798.6 (SE +/- 19.46, N = 3). 1. (CC) gcc options: -pthread -O3 -march=native -lm
FFTW 3.3.6 - Build: Stock - Size: 2D FFT Size 2048 (Mflops, More Is Better): AMD AOCC 3.0: 7784.8 (SE +/- 19.99, N = 3); Clang 11.0: 7878.5 (SE +/- 27.38, N = 3); Clang 12.0: 7789.9 (SE +/- 65.76, N = 3); GCC 10.3: 8134.5 (SE +/- 56.49, N = 3); GCC 11.0.1: 8231.1 (SE +/- 36.00, N = 3); GCC 9.3: 8408.5 (SE +/- 50.36, N = 3). 1. (CC) gcc options: -pthread -O3 -march=native -lm
FFTW 3.3.6 - Build: Stock - Size: 2D FFT Size 4096 (Mflops, More Is Better): AMD AOCC 3.0: 6875.3 (SE +/- 65.81, N = 3); Clang 11.0: 6823.8 (SE +/- 60.67, N = 3); Clang 12.0: 6744.1 (SE +/- 35.20, N = 3); GCC 10.3: 6974.0 (SE +/- 25.90, N = 3); GCC 11.0.1: 6948.2 (SE +/- 23.67, N = 3); GCC 9.3: 7007.3 (SE +/- 30.40, N = 3). 1. (CC) gcc options: -pthread -O3 -march=native -lm
FFTW 3.3.6 - Build: Float + SSE - Size: 1D FFT Size 32 (Mflops, More Is Better): AMD AOCC 3.0: 16146 (SE +/- 5.33, N = 3); Clang 11.0: 14590 (SE +/- 129.55, N = 3); Clang 12.0: 15649 (SE +/- 48.79, N = 3); GCC 10.3: 16650 (SE +/- 108.41, N = 3); GCC 11.0.1: 16590 (SE +/- 168.99, N = 3); GCC 9.3: 16590 (SE +/- 170.19, N = 8). 1. (CC) gcc options: -pthread -O3 -march=native -lm
FFTW 3.3.6 - Build: Float + SSE - Size: 1D FFT Size 1024 (Mflops, More Is Better): AMD AOCC 3.0: 49685 (SE +/- 621.84, N = 15); Clang 11.0: 50740 (SE +/- 585.78, N = 3); Clang 12.0: 50350 (SE +/- 952.64, N = 12); GCC 10.3: 52054 (SE +/- 439.64, N = 15); GCC 11.0.1: 51706 (SE +/- 568.96, N = 3); GCC 9.3: 53275 (SE +/- 788.42, N = 3). 1. (CC) gcc options: -pthread -O3 -march=native -lm
FFTW 3.3.6 - Build: Float + SSE - Size: 1D FFT Size 2048 (Mflops, More Is Better): AMD AOCC 3.0: 44412 (SE +/- 756.91, N = 3); Clang 11.0: 50084 (SE +/- 582.34, N = 3); Clang 12.0: 51254 (SE +/- 439.50, N = 3); GCC 10.3: 53497 (SE +/- 743.81, N = 3); GCC 11.0.1: 54710 (SE +/- 156.75, N = 3); GCC 9.3: 52749 (SE +/- 725.00, N = 3). 1. (CC) gcc options: -pthread -O3 -march=native -lm
FFTW 3.3.6 - Build: Float + SSE - Size: 1D FFT Size 4096 (Mflops, More Is Better): AMD AOCC 3.0: 45521 (SE +/- 542.47, N = 15); Clang 11.0: 46676 (SE +/- 413.24, N = 15); Clang 12.0: 45428 (SE +/- 671.66, N = 15); GCC 10.3: 52130 (SE +/- 844.19, N = 3); GCC 11.0.1: 51391 (SE +/- 227.13, N = 3); GCC 9.3: 52099 (SE +/- 228.68, N = 3). 1. (CC) gcc options: -pthread -O3 -march=native -lm
FFTW 3.3.6 - Build: Float + SSE - Size: 2D FFT Size 1024 (Mflops, More Is Better): AMD AOCC 3.0: 36100 (SE +/- 455.21, N = 12); Clang 11.0: 36181 (SE +/- 530.09, N = 4); Clang 12.0: 36239 (SE +/- 165.99, N = 3); GCC 10.3: 35973 (SE +/- 301.69, N = 3); GCC 11.0.1: 35718 (SE +/- 442.82, N = 3); GCC 9.3: 36321 (SE +/- 79.87, N = 3). 1. (CC) gcc options: -pthread -O3 -march=native -lm
FFTW 3.3.6 - Build: Float + SSE - Size: 2D FFT Size 2048 (Mflops, More Is Better): AMD AOCC 3.0: 31013 (SE +/- 378.89, N = 6); Clang 11.0: 31741 (SE +/- 146.10, N = 3); Clang 12.0: 31935 (SE +/- 77.17, N = 3); GCC 10.3: 32061 (SE +/- 14.99, N = 3); GCC 11.0.1: 31662 (SE +/- 209.56, N = 3); GCC 9.3: 31341 (SE +/- 37.37, N = 3). 1. (CC) gcc options: -pthread -O3 -march=native -lm
FFTW 3.3.6 - Build: Float + SSE - Size: 2D FFT Size 4096 (Mflops, More Is Better): AMD AOCC 3.0: 23111 (SE +/- 349.17, N = 9); Clang 11.0: 22913 (SE +/- 220.77, N = 3); Clang 12.0: 22797 (SE +/- 348.10, N = 9); GCC 10.3: 23774 (SE +/- 538.47, N = 9); GCC 11.0.1: 24888 (SE +/- 160.97, N = 3); GCC 9.3: 25068 (SE +/- 106.49, N = 3). 1. (CC) gcc options: -pthread -O3 -march=native -lm
FinanceBench 2016-07-25 - Benchmark: Repo OpenMP (ms, Fewer Is Better): AMD AOCC 3.0: 33146.03 (SE +/- 9.32, N = 3); Clang 11.0: 33178.50 (SE +/- 0.81, N = 3); Clang 12.0: 33246.84 (SE +/- 64.93, N = 3); GCC 10.3: 34979.29 (SE +/- 102.23, N = 3); GCC 11.0.1: 34199.60 (SE +/- 3.94, N = 3); GCC 9.3: 42399.81 (SE +/- 453.41, N = 14). 1. (CXX) g++ options: -O3 -march=native -fopenmp
FinanceBench 2016-07-25 - Benchmark: Bonds OpenMP (ms, Fewer Is Better): AMD AOCC 3.0: 51885.52 (SE +/- 242.64, N = 3); Clang 11.0: 51900.43 (SE +/- 4.51, N = 3); Clang 12.0: 51596.87 (SE +/- 10.95, N = 3); GCC 10.3: 51770.51 (SE +/- 23.55, N = 3); GCC 11.0.1: 51376.82 (SE +/- 42.62, N = 3); GCC 9.3: 76805.58 (SE +/- 971.24, N = 3). 1. (CXX) g++ options: -O3 -march=native -fopenmp
FLAC Audio Encoding 1.3.2 - WAV To FLAC (Seconds, Fewer Is Better): AMD AOCC 3.0: 9.280 (SE +/- 0.006, N = 5); Clang 11.0: 7.979 (SE +/- 0.006, N = 5); Clang 12.0: 7.854 (SE +/- 0.007, N = 5); GCC 10.3: 8.567 (SE +/- 0.008, N = 5); GCC 11.0.1: 8.709 (SE +/- 0.006, N = 5); GCC 9.3: 8.534 (SE +/- 0.011, N = 5). Additional option listed three times without attribution to a compiler: -fvisibility=hidden. 1. (CXX) g++ options: -O3 -march=native -logg -lm
Gcrypt Library 1.9 (Seconds, Fewer Is Better): AMD AOCC 3.0: 240.41 (SE +/- 0.82, N = 3); Clang 11.0: 240.21 (SE +/- 0.28, N = 3); Clang 12.0: 236.92 (SE +/- 0.44, N = 3); GCC 10.3: 231.24 (SE +/- 0.32, N = 3); GCC 11.0.1: 233.51 (SE +/- 0.18, N = 3); GCC 9.3: 232.57 (SE +/- 0.54, N = 3). 1. (CC) gcc options: -O3 -march=native -fvisibility=hidden
GraphicsMagick 1.3.33 - Operation: Swirl (Iterations Per Minute, More Is Better): AMD AOCC 3.0: 1929 (SE +/- 4.63, N = 3); Clang 11.0: 1915 (SE +/- 12.41, N = 3); Clang 12.0: 1993 (SE +/- 6.57, N = 3); GCC 10.3: 2112 (SE +/- 1.20, N = 3); GCC 11.0.1: 2161 (SE +/- 4.81, N = 3); GCC 9.3: 2129 (SE +/- 1.20, N = 3). 1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick 1.3.33 - Operation: Rotate (Iterations Per Minute, More Is Better): AMD AOCC 3.0: 660; Clang 11.0: 665; Clang 12.0: 712; GCC 10.3: 689; GCC 11.0.1: 694; GCC 9.3: 709. SE values as listed (four for six compilers, so not attributable): SE +/- 1.33, N = 3; SE +/- 2.60, N = 3; SE +/- 5.21, N = 3; SE +/- 6.43, N = 3. 1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick 1.3.33 - Operation: Sharpen (Iterations Per Minute, More Is Better): AMD AOCC 3.0: 617; Clang 11.0: 613; Clang 12.0: 614; GCC 10.3: 807; GCC 11.0.1: 809; GCC 9.3: 806. SE values as listed (two for six compilers, so not attributable): SE +/- 0.58, N = 3; SE +/- 2.03, N = 3. 1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick 1.3.33 - Operation: Enhanced (Iterations Per Minute, More Is Better): AMD AOCC 3.0: 1057; Clang 11.0: 1068; Clang 12.0: 1076; GCC 10.3: 1039; GCC 11.0.1: 1082; GCC 9.3: 1217. SE values as listed (five for six compilers, so not attributable): SE +/- 1.53, N = 3; SE +/- 1.86, N = 3; SE +/- 0.88, N = 3; SE +/- 0.33, N = 3; SE +/- 1.53, N = 3. 1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick 1.3.33 - Operation: Resizing (Iterations Per Minute, More Is Better): AMD AOCC 3.0: 1866 (SE +/- 52.84, N = 15); Clang 11.0: 2034 (SE +/- 27.29, N = 3); Clang 12.0: 2136 (SE +/- 41.63, N = 12); GCC 10.3: 1208 (SE +/- 14.93, N = 3); GCC 11.0.1: 1188 (SE +/- 17.34, N = 3); GCC 9.3: 1238 (SE +/- 18.77, N = 3). 1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick 1.3.33 - Operation: Noise-Gaussian (Iterations Per Minute, More Is Better): AMD AOCC 3.0: 466; Clang 11.0: 463; Clang 12.0: 457; GCC 10.3: 544; GCC 11.0.1: 550; GCC 9.3: 547. SE values as listed (three for six compilers, so not attributable): SE +/- 0.33, N = 3; SE +/- 1.00, N = 3; SE +/- 1.00, N = 3. 1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick 1.3.33 - Operation: HWB Color Space (Iterations Per Minute, More Is Better): AMD AOCC 3.0: 614; Clang 11.0: 616; Clang 12.0: 605; GCC 10.3: 772; GCC 11.0.1: 771; GCC 9.3: 785. SE values as listed (four for six compilers, so not attributable): SE +/- 1.20, N = 3; SE +/- 0.88, N = 3; SE +/- 0.67, N = 3; SE +/- 0.67, N = 3. 1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
JPEG XL 0.3.3 - Input: PNG - Encode Speed: 5 (MP/s, More Is Better): AMD AOCC 3.0: 79.23 (SE +/- 0.41, N = 3); Clang 11.0: 78.41 (SE +/- 0.24, N = 3); Clang 12.0: 74.27 (SE +/- 0.17, N = 3). 1. (CXX) g++ options: -O3 -march=native -funwind-tables -Xclang -mrelax-all -O2 -fPIE -pie -pthread -ldl
JPEG XL 0.3.3 - Input: PNG - Encode Speed: 7 (MP/s, More Is Better): AMD AOCC 3.0: 11.37 (SE +/- 0.08, N = 3); Clang 11.0: 12.01 (SE +/- 0.02, N = 3); Clang 12.0: 12.15 (SE +/- 0.05, N = 3). 1. (CXX) g++ options: -O3 -march=native -funwind-tables -Xclang -mrelax-all -O2 -fPIE -pie -pthread -ldl
JPEG XL 0.3.3 - Input: PNG - Encode Speed: 8 (MP/s, More Is Better): AMD AOCC 3.0: 0.81 (SE +/- 0.00, N = 3); Clang 11.0: 0.80 (SE +/- 0.00, N = 3); Clang 12.0: 0.82 (SE +/- 0.00, N = 3). 1. (CXX) g++ options: -O3 -march=native -funwind-tables -Xclang -mrelax-all -O2 -fPIE -pie -pthread -ldl
JPEG XL 0.3.3 - Input: JPEG - Encode Speed: 5 (MP/s, More Is Better): AMD AOCC 3.0: 65.57 (SE +/- 0.17, N = 3); Clang 11.0: 65.58 (SE +/- 0.20, N = 3); Clang 12.0: 66.66 (SE +/- 0.14, N = 3). 1. (CXX) g++ options: -O3 -march=native -funwind-tables -Xclang -mrelax-all -O2 -fPIE -pie -pthread -ldl
JPEG XL 0.3.3 - Input: JPEG - Encode Speed: 7 (MP/s, More Is Better): AMD AOCC 3.0: 65.68 (SE +/- 0.05, N = 3); Clang 11.0: 65.43 (SE +/- 0.08, N = 3); Clang 12.0: 66.38 (SE +/- 0.16, N = 3). 1. (CXX) g++ options: -O3 -march=native -funwind-tables -Xclang -mrelax-all -O2 -fPIE -pie -pthread -ldl
JPEG XL 0.3.3 - Input: JPEG - Encode Speed: 8 (MP/s, More Is Better): AMD AOCC 3.0: 27.29 (SE +/- 0.06, N = 3); Clang 11.0: 27.24 (SE +/- 0.01, N = 3); Clang 12.0: 28.13 (SE +/- 0.03, N = 3). 1. (CXX) g++ options: -O3 -march=native -funwind-tables -Xclang -mrelax-all -O2 -fPIE -pie -pthread -ldl
LAME MP3 Encoding 3.100 - WAV To MP3 (Seconds, Fewer Is Better): AMD AOCC 3.0: 8.142 (SE +/- 0.008, N = 3); Clang 11.0: 8.250 (SE +/- 0.021, N = 3); Clang 12.0: 8.256 (SE +/- 0.003, N = 3); GCC 10.3: 7.231 (SE +/- 0.006, N = 3); GCC 11.0.1: 7.473 (SE +/- 0.005, N = 3); GCC 9.3: 7.011 (SE +/- 0.019, N = 3). Additional options listed three times without attribution to a compiler: -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr. 1. (CC) gcc options: -O3 -pipe -march=native -lncurses -lm
libavif avifenc 0.9.0 - Encoder Speed: 0 (Seconds, Fewer Is Better): AMD AOCC 3.0: 48.13 (SE +/- 0.06, N = 3); Clang 11.0: 47.89 (SE +/- 0.07, N = 3); Clang 12.0: 47.88 (SE +/- 0.04, N = 3); GCC 10.3: 51.45 (SE +/- 0.05, N = 3); GCC 11.0.1: 51.03 (SE +/- 0.07, N = 3); GCC 9.3: 52.22 (SE +/- 0.08, N = 3). 1. (CXX) g++ options: -O3 -fPIC -lm
libavif avifenc 0.9.0 - Encoder Speed: 2 (Seconds, Fewer Is Better): AMD AOCC 3.0: 25.60 (SE +/- 0.01, N = 3); Clang 11.0: 25.47 (SE +/- 0.06, N = 3); Clang 12.0: 25.18 (SE +/- 0.06, N = 3); GCC 10.3: 27.39 (SE +/- 0.07, N = 3); GCC 11.0.1: 27.10 (SE +/- 0.03, N = 3); GCC 9.3: 27.78 (SE +/- 0.04, N = 3). 1. (CXX) g++ options: -O3 -fPIC -lm
libavif avifenc 0.9.0 - Encoder Speed: 6 (Seconds, Fewer Is Better): AMD AOCC 3.0: 9.725 (SE +/- 0.016, N = 3); Clang 11.0: 9.536 (SE +/- 0.022, N = 3); Clang 12.0: 9.510 (SE +/- 0.014, N = 3); GCC 10.3: 10.417 (SE +/- 0.032, N = 3); GCC 11.0.1: 10.291 (SE +/- 0.052, N = 3); GCC 9.3: 10.399 (SE +/- 0.031, N = 3). 1. (CXX) g++ options: -O3 -fPIC -lm
libavif avifenc 0.9.0 - Encoder Speed: 10 (Seconds, Fewer Is Better): AMD AOCC 3.0: 3.543 (SE +/- 0.004, N = 3); Clang 11.0: 3.429 (SE +/- 0.010, N = 3); Clang 12.0: 3.361 (SE +/- 0.014, N = 3); GCC 10.3: 3.643 (SE +/- 0.022, N = 3); GCC 11.0.1: 3.607 (SE +/- 0.002, N = 3); GCC 9.3: 3.659 (SE +/- 0.016, N = 3). 1. (CXX) g++ options: -O3 -fPIC -lm
libavif avifenc 0.9.0 - Encoder Speed: 6, Lossless (Seconds, Fewer Is Better): AMD AOCC 3.0: 25.78 (SE +/- 0.06, N = 3); Clang 11.0: 26.03 (SE +/- 0.22, N = 3); Clang 12.0: 25.22 (SE +/- 0.04, N = 3); GCC 10.3: 26.91 (SE +/- 0.04, N = 3); GCC 11.0.1: 27.06 (SE +/- 0.05, N = 3); GCC 9.3: 29.08 (SE +/- 0.06, N = 3). 1. (CXX) g++ options: -O3 -fPIC -lm
libavif avifenc 0.9.0 - Encoder Speed: 10, Lossless (Seconds, Fewer Is Better): AMD AOCC 3.0: 5.948 (SE +/- 0.022, N = 3); Clang 11.0: 5.879 (SE +/- 0.011, N = 3); Clang 12.0: 5.746 (SE +/- 0.013, N = 3); GCC 10.3: 6.107 (SE +/- 0.007, N = 3); GCC 11.0.1: 6.149 (SE +/- 0.017, N = 3); GCC 9.3: 6.131 (SE +/- 0.022, N = 3). 1. (CXX) g++ options: -O3 -fPIC -lm
LibRaw 0.20 - Post-Processing Benchmark (Mpix/sec, More Is Better): AMD AOCC 3.0: 41.64 (SE +/- 0.04, N = 3); Clang 11.0: 38.71 (SE +/- 0.33, N = 3); Clang 12.0: 41.78 (SE +/- 0.12, N = 3); GCC 10.3: 58.90 (SE +/- 0.23, N = 3); GCC 11.0.1: 57.24 (SE +/- 0.16, N = 3); GCC 9.3: 60.20 (SE +/- 0.19, N = 3). 1. (CXX) g++ options: -O3 -march=native -fopenmp -ljpeg -lz -lm
Liquid-DSP 2021.01.31 - Threads: 1 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): AMD AOCC 3.0: 57411333 (SE +/- 47026.00, N = 3); Clang 11.0: 56307000 (SE +/- 40360.87, N = 3); Clang 12.0: 55663000 (SE +/- 790005.27, N = 3); GCC 10.3: 62467333 (SE +/- 6887.99, N = 3); GCC 11.0.1: 60886333 (SE +/- 318169.94, N = 3); GCC 9.3: 61404000 (SE +/- 870702.21, N = 3). 1. (CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 - Threads: 32 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): AMD AOCC 3.0: 1609633333 (SE +/- 2130988.29, N = 3); Clang 11.0: 1578400000 (SE +/- 1331665.62, N = 3); Clang 12.0: 1564833333 (SE +/- 2255610.29, N = 3); GCC 10.3: 1718000000 (SE +/- 15763988.50, N = 3); GCC 11.0.1: 1679800000 (SE +/- 17297784.06, N = 3); GCC 9.3: 1721900000 (SE +/- 4864497.23, N = 3). 1. (CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 - Threads: 64 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): AMD AOCC 3.0: 3100400000 (SE +/- 1234233.91, N = 3); Clang 11.0: 3051366667 (SE +/- 2452436.43, N = 3); Clang 12.0: 3070633333 (SE +/- 6045475.81, N = 3); GCC 10.3: 2942866667 (SE +/- 4643753.27, N = 3); GCC 11.0.1: 2989400000 (SE +/- 1154700.54, N = 3); GCC 9.3: 2940466667 (SE +/- 2961043.36, N = 3). 1. (CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 - Threads: 128 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): AMD AOCC 3.0: 3606466667 (SE +/- 1543084.93, N = 3); Clang 11.0: 3596533333 (SE +/- 1559202.08, N = 3); Clang 12.0: 3643766667 (SE +/- 883804.91, N = 3); GCC 10.3: 3005033333 (SE +/- 1679616.36, N = 3); GCC 11.0.1: 3055766667 (SE +/- 6016181.88, N = 3); GCC 9.3: 3012066667 (SE +/- 3384441.53, N = 3). 1. (CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid
LZ4 Compression 1.9.3 - Compression Level: 3 - Compression Speed (MB/s, More Is Better): AMD AOCC 3.0: 53.77 (SE +/- 0.48, N = 3); Clang 11.0: 52.35 (SE +/- 0.33, N = 3); Clang 12.0: 52.07 (SE +/- 0.80, N = 3); Clang 12.0 LTO: 50.93 (SE +/- 0.02, N = 3); GCC 10.3: 52.87 (SE +/- 0.01, N = 3); GCC 11.0.1: 51.32 (SE +/- 0.73, N = 4); GCC 9.3: 53.83 (SE +/- 0.77, N = 4). 1. (CC) gcc options: -O3
LZ4 Compression 1.9.3 - Compression Level: 3 - Decompression Speed (MB/s, More Is Better): AMD AOCC 3.0: 13562.5 (SE +/- 73.30, N = 3); Clang 11.0: 13840.3 (SE +/- 15.91, N = 3); Clang 12.0: 13911.5 (SE +/- 71.01, N = 3); Clang 12.0 LTO: 13715.0 (SE +/- 60.82, N = 3); GCC 10.3: 13906.1 (SE +/- 42.32, N = 3); GCC 11.0.1: 13882.2 (SE +/- 34.44, N = 4); GCC 9.3: 13793.4 (SE +/- 37.19, N = 4). 1. (CC) gcc options: -O3
LZ4 Compression 1.9.3 - Compression Level: 9 - Compression Speed (MB/s, More Is Better): AMD AOCC 3.0: 50.32 (SE +/- 0.26, N = 3); Clang 11.0: 49.01 (SE +/- 0.46, N = 3); Clang 12.0: 48.50 (SE +/- 0.42, N = 3); Clang 12.0 LTO: 48.47 (SE +/- 0.74, N = 3); GCC 10.3: 52.36 (SE +/- 0.72, N = 4); GCC 11.0.1: 51.17 (SE +/- 0.65, N = 3); GCC 9.3: 51.97 (SE +/- 0.65, N = 5). 1. (CC) gcc options: -O3
LZ4 Compression 1.9.3 - Compression Level: 9 - Decompression Speed (MB/s, More Is Better): AMD AOCC 3.0: 13561.5 (SE +/- 33.89, N = 3); Clang 11.0: 13927.9 (SE +/- 23.21, N = 3); Clang 12.0: 13926.5 (SE +/- 65.90, N = 3); Clang 12.0 LTO: 13698.7 (SE +/- 46.50, N = 3); GCC 10.3: 13806.6 (SE +/- 6.60, N = 4); GCC 11.0.1: 13857.4 (SE +/- 62.74, N = 3); GCC 9.3: 13895.3 (SE +/- 17.75, N = 5). 1. (CC) gcc options: -O3
Ngspice 34 - Circuit: C2670 (Seconds, Fewer Is Better): AMD AOCC 3.0: 103.93 (SE +/- 0.22, N = 3); Clang 11.0: 103.83 (SE +/- 0.06, N = 3); Clang 12.0: 118.87 (SE +/- 0.53, N = 3); GCC 10.3: 103.60 (SE +/- 0.48, N = 3); GCC 11.0.1: 103.01 (SE +/- 1.53, N = 3); GCC 9.3: 101.54 (SE +/- 1.32, N = 3). Additional option listed four times without attribution to a compiler: -lstdc++. 1. (CC) gcc options: -O3 -march=native -fopenmp -lm -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lSM -lICE
Ngspice 34 - Circuit: C7552 (Seconds, Fewer Is Better): AMD AOCC 3.0: 91.99 (SE +/- 0.12, N = 3); Clang 11.0: 90.53 (SE +/- 1.37, N = 3); Clang 12.0: 95.96 (SE +/- 1.11, N = 6); GCC 10.3: 90.43 (SE +/- 0.12, N = 3); GCC 11.0.1: 90.26 (SE +/- 0.43, N = 3); GCC 9.3: 89.09 (SE +/- 0.60, N = 3). Additional option listed four times without attribution to a compiler: -lstdc++. 1. (CC) gcc options: -O3 -march=native -fopenmp -lm -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lSM -lICE
oneDNN 2.1.2 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): AMD AOCC 3.0: 1.03899 (SE +/- 0.00160, N = 3; MIN: 0.99; -fopenmp=libomp); Clang 11.0: 1.08011 (SE +/- 0.00127, N = 3; MIN: 1.03; -fopenmp=libomp); Clang 12.0: 1.07701 (SE +/- 0.00199, N = 3; MIN: 1.04; -fopenmp=libomp); GCC 10.3: 1.17894 (SE +/- 0.00296, N = 3; MIN: 1.12; -fopenmp); GCC 9.3: 1.17486 (SE +/- 0.00349, N = 3; MIN: 1.12; -fopenmp). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): AMD AOCC 3.0: 3.41583 (SE +/- 0.02018, N = 3; MIN: 3.24; -fopenmp=libomp); Clang 11.0: 3.52787 (SE +/- 0.04735, N = 3; MIN: 3.29; -fopenmp=libomp); Clang 12.0: 3.28507 (SE +/- 0.01639, N = 3; MIN: 3.15; -fopenmp=libomp); GCC 10.3: 3.61144 (SE +/- 0.02637, N = 3; MIN: 3.37; -fopenmp); GCC 9.3: 3.67278 (SE +/- 0.03246, N = 3; MIN: 3.39; -fopenmp). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): AMD AOCC 3.0: 1.04484 (SE +/- 0.00668, N = 3; MIN: 0.83; -fopenmp=libomp); Clang 11.0: 1.07577 (SE +/- 0.00395, N = 3; MIN: 0.86; -fopenmp=libomp); Clang 12.0: 1.07507 (SE +/- 0.00286, N = 3; MIN: 0.87; -fopenmp=libomp); GCC 10.3: 1.19747 (SE +/- 0.00438, N = 3; MIN: 0.98; -fopenmp); GCC 9.3: 1.17434 (SE +/- 0.00597, N = 3; MIN: 0.96; -fopenmp). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): AMD AOCC 3.0: 0.554231 (SE +/- 0.000764, N = 3; MIN: 0.5; -fopenmp=libomp); Clang 11.0: 0.594729 (SE +/- 0.008914, N = 3; MIN: 0.53; -fopenmp=libomp); Clang 12.0: 0.710124 (SE +/- 0.011383, N = 3; MIN: 0.64; -fopenmp=libomp); GCC 10.3: 0.646252 (SE +/- 0.003112, N = 3; MIN: 0.6; -fopenmp); GCC 9.3: 0.654010 (SE +/- 0.003317, N = 3; MIN: 0.59; -fopenmp). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): AMD AOCC 3.0: 0.833921 (SE +/- 0.000645, N = 3; MIN: 0.81; -fopenmp=libomp); Clang 11.0: 0.841169 (SE +/- 0.000480, N = 3; MIN: 0.82; -fopenmp=libomp); Clang 12.0: 1.221320 (SE +/- 0.018279, N = 4; MIN: 1.13; -fopenmp=libomp); GCC 10.3: 0.870784 (SE +/- 0.001247, N = 3; MIN: 0.84; -fopenmp); GCC 9.3: 0.869308 (SE +/- 0.001032, N = 3; MIN: 0.84; -fopenmp). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): AMD AOCC 3.0: 1.37059 (SE +/- 0.00485, N = 3; MIN: 1.28; -fopenmp=libomp); Clang 11.0: 1.45757 (SE +/- 0.00568, N = 3; MIN: 1.35; -fopenmp=libomp); Clang 12.0: 1.44425 (SE +/- 0.00123, N = 3; MIN: 1.34; -fopenmp=libomp); GCC 10.3: 7.23686 (SE +/- 0.03687, N = 3; MIN: 6.18; -fopenmp); GCC 9.3: 7.19213 (SE +/- 0.02683, N = 3; MIN: 6.14; -fopenmp). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): AMD AOCC 3.0: 2.28755 (SE +/- 0.00564, N = 3; MIN: 1.91; -fopenmp=libomp); Clang 11.0: 2.31859 (SE +/- 0.02389, N = 3; MIN: 1.92; -fopenmp=libomp); Clang 12.0: 2.36797 (SE +/- 0.02100, N = 3; MIN: 2.01; -fopenmp=libomp); GCC 10.3: 3.00341 (SE +/- 0.00883, N = 3; MIN: 2.35; -fopenmp); GCC 9.3: 2.99759 (SE +/- 0.00845, N = 3; MIN: 2.24; -fopenmp). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): AMD AOCC 3.0: 1.59597 (SE +/- 0.00195, N = 3; MIN: 1.54; -fopenmp=libomp); Clang 11.0: 1.60540 (SE +/- 0.00118, N = 3; MIN: 1.55; -fopenmp=libomp); Clang 12.0: 2.03606 (SE +/- 0.01922, N = 12; MIN: 1.81; -fopenmp=libomp); GCC 10.3: 1.64268 (SE +/- 0.00384, N = 3; MIN: 1.58; -fopenmp); GCC 9.3: 1.66260 (SE +/- 0.01150, N = 3; MIN: 1.59; -fopenmp). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): AMD AOCC 3.0: 0.459724 (SE +/- 0.000365, N = 3; MIN: 0.44; -fopenmp=libomp); Clang 11.0: 0.489278 (SE +/- 0.001652, N = 3; MIN: 0.46; -fopenmp=libomp); Clang 12.0: 0.491940 (SE +/- 0.002843, N = 3; MIN: 0.47; -fopenmp=libomp); GCC 10.3: 0.602155 (SE +/- 0.001964, N = 3; MIN: 0.57; -fopenmp); GCC 9.3: 0.599140 (SE +/- 0.001469, N = 3; MIN: 0.56; -fopenmp). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): AMD AOCC 3.0: 0.773233 (SE +/- 0.001713, N = 3; MIN: 0.72; -fopenmp=libomp); Clang 11.0: 0.779101 (SE +/- 0.001200, N = 3; MIN: 0.73; -fopenmp=libomp); Clang 12.0: 0.779776 (SE +/- 0.004246, N = 3; MIN: 0.73; -fopenmp=libomp); GCC 10.3: 0.782476 (SE +/- 0.002532, N = 3; MIN: 0.73; -fopenmp); GCC 9.3: 0.786762 (SE +/- 0.002405, N = 3; MIN: 0.75; -fopenmp). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): AMD AOCC 3.0: 1267.18 (SE +/- 5.94, N = 3; MIN: 1248.35; -fopenmp=libomp); Clang 11.0: 1276.04 (SE +/- 9.46, N = 3; MIN: 1249.65; -fopenmp=libomp); Clang 12.0: 1302.70 (SE +/- 3.92, N = 3; MIN: 1289.86; -fopenmp=libomp); GCC 10.3: 1382.41 (SE +/- 3.72, N = 3; MIN: 1360.58; -fopenmp); GCC 9.3: 1357.29 (SE +/- 4.44, N = 3; MIN: 1335.63; -fopenmp). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): AMD AOCC 3.0: 544.10 (SE +/- 0.53, N = 3; MIN: 532.32; -fopenmp=libomp); Clang 11.0: 563.20 (SE +/- 0.83, N = 3; MIN: 550.23; -fopenmp=libomp); Clang 12.0: 593.97 (SE +/- 9.50, N = 3; MIN: 570.44; -fopenmp=libomp); GCC 10.3: 659.27 (SE +/- 0.61, N = 3; MIN: 642.67; -fopenmp); GCC 9.3: 658.66 (SE +/- 0.64, N = 3; MIN: 639.86; -fopenmp). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): AMD AOCC 3.0: 1259.59 (SE +/- 1.97, N = 3; MIN: 1247.29; -fopenmp=libomp); Clang 11.0: 1277.62 (SE +/- 7.11, N = 3; MIN: 1252.39; -fopenmp=libomp); Clang 12.0: 1307.49 (SE +/- 3.61, N = 3; MIN: 1293.38; -fopenmp=libomp); GCC 10.3: 1379.51 (SE +/- 1.75, N = 3; MIN: 1361.6; -fopenmp); GCC 9.3: 1358.56 (SE +/- 3.05, N = 3; MIN: 1337.17; -fopenmp). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): AMD AOCC 3.0: 544.31 (SE +/- 0.90, N = 3; MIN: 531.9; -fopenmp=libomp); Clang 11.0: 562.97 (SE +/- 0.25, N = 3; MIN: 551.49; -fopenmp=libomp); Clang 12.0: 590.18 (SE +/- 1.89, N = 3; MIN: 575.41; -fopenmp=libomp); GCC 10.3: 658.28 (SE +/- 0.83, N = 3; MIN: 639.78; -fopenmp); GCC 9.3: 659.19 (SE +/- 1.25, N = 3; MIN: 642.05; -fopenmp). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): AMD AOCC 3.0: 0.301885 (SE +/- 0.000492, N = 3; MIN: 0.29; -fopenmp=libomp); Clang 11.0: 0.315522 (SE +/- 0.000247, N = 3; MIN: 0.3; -fopenmp=libomp); Clang 12.0: 0.313689 (SE +/- 0.000321, N = 3; MIN: 0.3; -fopenmp=libomp); GCC 10.3: 0.377733 (SE +/- 0.004341, N = 3; MIN: 0.36; -fopenmp); GCC 9.3: 0.376992 (SE +/- 0.000576, N = 3; MIN: 0.36; -fopenmp). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): AMD AOCC 3.0: 1268.08 (SE +/- 0.58, N = 3; MIN: 1257.35; -fopenmp=libomp); Clang 11.0: 1271.91 (SE +/- 9.75, N = 3; MIN: 1252.33; -fopenmp=libomp); Clang 12.0: 1305.10 (SE +/- 1.78, N = 3; MIN: 1294.76; -fopenmp=libomp); GCC 10.3: 1375.71 (SE +/- 2.65, N = 3; MIN: 1355.68; -fopenmp); GCC 9.3: 1356.91 (SE +/- 4.57, N = 3; MIN: 1335.04; -fopenmp). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): AMD AOCC 3.0: 544.60 (SE +/- 0.62, N = 3; MIN: 532.91; -fopenmp=libomp); Clang 11.0: 563.25 (SE +/- 0.10, N = 3; MIN: 551.31; -fopenmp=libomp); Clang 12.0: 597.48 (SE +/- 3.02, N = 3; MIN: 580.8; -fopenmp=libomp); GCC 10.3: 658.04 (SE +/- 1.86, N = 3; MIN: 635.78; -fopenmp); GCC 9.3: 657.88 (SE +/- 0.52, N = 3; MIN: 638.35; -fopenmp). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): AMD AOCC 3.0: 1.170440 (SE +/- 0.004625, N = 3; MIN: 1.11; -fopenmp=libomp); Clang 11.0: 1.151400 (SE +/- 0.006530, N = 3; MIN: 1.09; -fopenmp=libomp); Clang 12.0: 1.172580 (SE +/- 0.004576, N = 3; MIN: 1.12; -fopenmp=libomp); GCC 10.3: 0.788192 (SE +/- 0.005622, N = 3; MIN: 0.74; -fopenmp); GCC 9.3: 0.717782 (SE +/- 0.003430, N = 3; MIN: 0.67; -fopenmp). 1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
ONNX Runtime 1.6 - Model: yolov4 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  AMD AOCC 3.0: 386 (SE +/- 2.50, N = 3; -fopenmp=libomp)
  Clang 11.0: 346 (SE +/- 1.42, N = 3; -fopenmp=libomp)
  Clang 12.0: 333 (SE +/- 4.15, N = 4; -fopenmp=libomp)
  GCC 10.3: 351 (SE +/- 0.17, N = 3; -fopenmp)
  GCC 9.3: 351 (SE +/- 0.50, N = 3; -fopenmp)
  1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -ldl -lrt
ONNX Runtime 1.6 - Model: bertsquad-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  AMD AOCC 3.0: 459 (SE +/- 10.39, N = 12; -fopenmp=libomp)
  Clang 11.0: 471 (SE +/- 5.55, N = 3; -fopenmp=libomp)
  Clang 12.0: 498 (SE +/- 10.30, N = 12; -fopenmp=libomp)
  GCC 10.3: 505 (SE +/- 0.87, N = 3; -fopenmp)
  GCC 9.3: 495 (SE +/- 4.64, N = 12; -fopenmp)
  1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -ldl -lrt
ONNX Runtime 1.6 - Model: fcn-resnet101-11 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  AMD AOCC 3.0: 122 (SE +/- 0.50, N = 3; -fopenmp=libomp)
  Clang 11.0: 108 (SE +/- 0.29, N = 3; -fopenmp=libomp)
  Clang 12.0: 112 (SE +/- 0.50, N = 3; -fopenmp=libomp)
  GCC 10.3: 115 (SE +/- 0.17, N = 3; -fopenmp)
  GCC 9.3: 116 (SE +/- 0.44, N = 3; -fopenmp)
  1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -ldl -lrt
ONNX Runtime 1.6 - Model: shufflenet-v2-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  AMD AOCC 3.0: 11325 (SE +/- 171.77, N = 3; -fopenmp=libomp)
  Clang 11.0: 9797 (SE +/- 102.76, N = 8; -fopenmp=libomp)
  Clang 12.0: 9904 (SE +/- 88.25, N = 12; -fopenmp=libomp)
  GCC 10.3: 10197 (SE +/- 7.52, N = 3; -fopenmp)
  GCC 9.3: 9419 (SE +/- 138.76, N = 3; -fopenmp)
  1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -ldl -lrt
ONNX Runtime 1.6 - Model: super-resolution-10 - Device: OpenMP CPU (Inferences Per Minute, More Is Better)
  AMD AOCC 3.0: 4383 (SE +/- 174.98, N = 12; -fopenmp=libomp)
  Clang 11.0: 4523 (SE +/- 169.87, N = 9; -fopenmp=libomp)
  Clang 12.0: 4456 (SE +/- 126.29, N = 12; -fopenmp=libomp)
  GCC 10.3: 5559 (SE +/- 17.50, N = 3; -fopenmp)
  GCC 9.3: 5183 (SE +/- 2.40, N = 3; -fopenmp)
  1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -ldl -lrt
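Several of the ONNX Runtime results carry large standard errors, so a gap between two compilers is easier to judge when expressed in units of the combined standard error. The following is a minimal Python sketch of that arithmetic using the super-resolution-10 figures above; the helper name `gap_in_standard_errors` is hypothetical, and treating the two results as independent is an assumption, not something the result file states.

```python
import math


def gap_in_standard_errors(mean_a, se_a, mean_b, se_b):
    """Difference between two means in units of their combined standard error.

    Assumes the two results are independent, so the errors add in quadrature.
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return (mean_a - mean_b) / combined_se


# (mean inferences per minute, standard error) copied from the
# super-resolution-10 result above.
gcc_10 = (5559, 17.50)
clang_11 = (4523, 169.87)
aocc = (4383, 174.98)

# GCC 10.3 versus Clang 11.0: a gap of several combined standard errors.
print(round(gap_in_standard_errors(*gcc_10, *clang_11), 1))

# Clang 11.0 versus AMD AOCC 3.0: a gap of less than one combined standard error.
print(round(gap_in_standard_errors(*clang_11, *aocc), 1))
```

A gap well above one combined standard error suggests the ordering is unlikely to be run-to-run noise, while a gap below one means the two compilers cannot be separated on this test.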
Opus Codec Encoding 1.3.1 - WAV To Opus Encode (Seconds, Fewer Is Better)
  Clang 11.0: 7.392 (SE +/- 0.002, N = 5)
  Clang 12.0: 7.567 (SE +/- 0.013, N = 5)
  GCC 10.3: 7.469 (SE +/- 0.003, N = 5)
  GCC 11.0.1: 7.381 (SE +/- 0.002, N = 5)
  GCC 9.3: 7.504 (SE +/- 0.002, N = 5)
  Per-result flags as exported (compiler attribution not recoverable): -fvisibility=hidden -fvisibility=hidden -fvisibility=hidden
  1. (CXX) g++ options: -O3 -march=native -logg -lm
PostgreSQL pgbench 13.0 - Scaling Factor: 100 - Clients: 1 - Mode: Read Only (TPS, More Is Better)
  Clang 11.0: 24943 (SE +/- 289.16, N = 3)
  Clang 12.0: 24310 (SE +/- 303.43, N = 3)
  GCC 10.3: 24845 (SE +/- 118.05, N = 3)
  GCC 11.0.1: 23661 (SE +/- 281.76, N = 15)
  GCC 9.3: 23895 (SE +/- 41.57, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
PostgreSQL pgbench 13.0 - Scaling Factor: 100 - Clients: 1 - Mode: Read Only - Average Latency (ms, Fewer Is Better)
  Clang 11.0: 0.040 (SE +/- 0.001, N = 3)
  Clang 12.0: 0.041 (SE +/- 0.001, N = 3)
  GCC 10.3: 0.040 (SE +/- 0.000, N = 3)
  GCC 11.0.1: 0.042 (SE +/- 0.001, N = 15)
  GCC 9.3: 0.042 (SE +/- 0.000, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
PostgreSQL pgbench 13.0 - Scaling Factor: 100 - Clients: 1 - Mode: Read Write (TPS, More Is Better)
  Clang 11.0: 3312 (SE +/- 14.62, N = 3)
  Clang 12.0: 3281 (SE +/- 3.48, N = 3)
  GCC 10.3: 3369 (SE +/- 11.40, N = 3)
  GCC 11.0.1: 3383 (SE +/- 28.00, N = 3)
  GCC 9.3: 3298 (SE +/- 4.79, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
PostgreSQL pgbench 13.0 - Scaling Factor: 100 - Clients: 1 - Mode: Read Write - Average Latency (ms, Fewer Is Better)
  Clang 11.0: 0.302 (SE +/- 0.002, N = 3)
  Clang 12.0: 0.305 (SE +/- 0.000, N = 3)
  GCC 10.3: 0.297 (SE +/- 0.001, N = 3)
  GCC 11.0.1: 0.296 (SE +/- 0.002, N = 3)
  GCC 9.3: 0.303 (SE +/- 0.001, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
PostgreSQL pgbench 13.0 - Scaling Factor: 100 - Clients: 100 - Mode: Read Only (TPS, More Is Better)
  Clang 11.0: 1069367 (SE +/- 1740.88, N = 3)
  Clang 12.0: 1069022 (SE +/- 720.87, N = 3)
  GCC 10.3: 1076357 (SE +/- 183.22, N = 3)
  GCC 11.0.1: 1090824 (SE +/- 1514.63, N = 3)
  GCC 9.3: 1057125 (SE +/- 1623.23, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
PostgreSQL pgbench 13.0 - Scaling Factor: 100 - Clients: 100 - Mode: Read Only - Average Latency (ms, Fewer Is Better)
  Clang 11.0: 0.094 (SE +/- 0.000, N = 3)
  Clang 12.0: 0.094 (SE +/- 0.000, N = 3)
  GCC 10.3: 0.093 (SE +/- 0.000, N = 3)
  GCC 11.0.1: 0.092 (SE +/- 0.000, N = 3)
  GCC 9.3: 0.095 (SE +/- 0.000, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
PostgreSQL pgbench 13.0 - Scaling Factor: 100 - Clients: 250 - Mode: Read Only (TPS, More Is Better)
  Clang 11.0: 1065506 (SE +/- 13844.42, N = 3)
  Clang 12.0: 1071209 (SE +/- 6289.60, N = 3)
  GCC 10.3: 1089731 (SE +/- 8859.63, N = 3)
  GCC 11.0.1: 1090160 (SE +/- 8885.95, N = 3)
  GCC 9.3: 1067486 (SE +/- 8843.08, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
PostgreSQL pgbench 13.0 - Scaling Factor: 100 - Clients: 250 - Mode: Read Only - Average Latency (ms, Fewer Is Better)
  Clang 11.0: 0.235 (SE +/- 0.003, N = 3)
  Clang 12.0: 0.234 (SE +/- 0.001, N = 3)
  GCC 10.3: 0.230 (SE +/- 0.002, N = 3)
  GCC 11.0.1: 0.230 (SE +/- 0.002, N = 3)
  GCC 9.3: 0.235 (SE +/- 0.002, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
PostgreSQL pgbench 13.0 - Scaling Factor: 100 - Clients: 100 - Mode: Read Write (TPS, More Is Better)
  Clang 11.0: 61616 (SE +/- 400.92, N = 3)
  Clang 12.0: 62319 (SE +/- 162.92, N = 3)
  GCC 10.3: 58894 (SE +/- 469.64, N = 3)
  GCC 11.0.1: 56369 (SE +/- 899.82, N = 3)
  GCC 9.3: 59364 (SE +/- 994.44, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
PostgreSQL pgbench 13.0 - Scaling Factor: 100 - Clients: 100 - Mode: Read Write - Average Latency (ms, Fewer Is Better)
  Clang 11.0: 1.626 (SE +/- 0.011, N = 3)
  Clang 12.0: 1.607 (SE +/- 0.004, N = 3)
  GCC 10.3: 1.701 (SE +/- 0.013, N = 3)
  GCC 11.0.1: 1.777 (SE +/- 0.029, N = 3)
  GCC 9.3: 1.688 (SE +/- 0.028, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
PostgreSQL pgbench 13.0 - Scaling Factor: 100 - Clients: 250 - Mode: Read Write (TPS, More Is Better)
  Clang 11.0: 54488 (SE +/- 883.12, N = 3)
  Clang 12.0: 56684 (SE +/- 702.52, N = 15)
  GCC 10.3: 53019 (SE +/- 591.89, N = 7)
  GCC 11.0.1: 53102 (SE +/- 211.73, N = 3)
  GCC 9.3: 53825 (SE +/- 396.40, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
PostgreSQL pgbench 13.0 - Scaling Factor: 100 - Clients: 250 - Mode: Read Write - Average Latency (ms, Fewer Is Better)
  Clang 11.0: 4.603 (SE +/- 0.074, N = 3)
  Clang 12.0: 4.431 (SE +/- 0.054, N = 15)
  GCC 10.3: 4.731 (SE +/- 0.052, N = 7)
  GCC 11.0.1: 4.722 (SE +/- 0.021, N = 3)
  GCC 9.3: 4.657 (SE +/- 0.034, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm
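The pgbench TPS and average-latency results above are two views of the same runs: with every client kept busy, the average latency is roughly the client count divided by the throughput. The Python sketch below checks that relation against the Clang 11.0 figures. The helper name `approx_latency_ms` is hypothetical. Each reported value is an average over several runs, so this is an approximation and a mismatch of around one percent is expected.

```python
def approx_latency_ms(clients, tps):
    """Approximate average latency in milliseconds for a closed-loop run.

    Assumes all clients issue transactions back to back with no think time.
    """
    return clients / tps * 1000.0


# (clients, reported TPS, reported average latency in ms) for Clang 11.0,
# copied from the pgbench results above.
rows = [
    (1, 24943, 0.040),      # read only
    (100, 1069367, 0.094),  # read only
    (250, 1065506, 0.235),  # read only
    (100, 61616, 1.626),    # read write
    (250, 54488, 4.603),    # read write
]

for clients, tps, reported_ms in rows:
    estimate = approx_latency_ms(clients, tps)
    print(f"{clients:>3} clients: estimated {estimate:.3f} ms, reported {reported_ms:.3f} ms")
```

Because the two metrics track each other this closely, the TPS and latency results for a given configuration should not be read as independent evidence for or against a compiler.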
POV-Ray 3.7.0.7 - Trace Time (Seconds, Fewer Is Better)
  AMD AOCC 3.0: 9.494 (SE +/- 0.026, N = 3)
  Clang 11.0: 9.408 (SE +/- 0.032, N = 3)
  Clang 12.0: 9.296 (SE +/- 0.041, N = 3)
  GCC 10.3: 9.570 (SE +/- 0.049, N = 3)
  GCC 9.3: 9.968 (SE +/- 0.053, N = 3)
  1. (CXX) g++ options: -pipe -O3 -ffast-math -march=native -pthread -lSDL -lXpm -lSM -lICE -lX11 -lIlmImf -lImath -lHalf -lIex -lIexMath -lIlmThread -lpthread -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system
QuantLib 1.21 (MFLOPS, More Is Better)
  AMD AOCC 3.0: 2725.7 (SE +/- 2.28, N = 3)
  Clang 11.0: 2640.2 (SE +/- 1.01, N = 3)
  Clang 12.0: 2653.8 (SE +/- 1.92, N = 3)
  Clang 12.0 LTO: 2657.8 (SE +/- 1.62, N = 3)
  GCC 10.3: 2392.6 (SE +/- 2.06, N = 3)
  GCC 9.3: 2338.9 (SE +/- 4.53, N = 3)
  1. (CXX) g++ options: -O3 -march=native -rdynamic
SciMark 2.0 - Computational Test: Composite (Mflops, More Is Better)
  AMD AOCC 3.0: 3298.29 (SE +/- 1.29, N = 3)
  Clang 11.0: 3319.34 (SE +/- 15.12, N = 3)
  Clang 12.0: 3190.62 (SE +/- 1.11, N = 3)
  GCC 10.3: 3235.94 (SE +/- 6.50, N = 3)
  GCC 11.0.1: 3182.35 (SE +/- 5.19, N = 3)
  GCC 9.3: 3229.22 (SE +/- 5.86, N = 3)
  1. (CC) gcc options: -O3 -march=native -lm
SciMark 2.0 - Computational Test: Monte Carlo (Mflops, More Is Better)
  AMD AOCC 3.0: 690.94 (SE +/- 0.18, N = 3)
  Clang 11.0: 674.86 (SE +/- 0.40, N = 3)
  Clang 12.0: 675.13 (SE +/- 0.40, N = 3)
  GCC 10.3: 682.87 (SE +/- 1.71, N = 3)
  GCC 11.0.1: 647.82 (SE +/- 0.29, N = 3)
  GCC 9.3: 668.10 (SE +/- 0.14, N = 3)
  1. (CC) gcc options: -O3 -march=native -lm
SciMark 2.0 - Computational Test: Fast Fourier Transform (Mflops, More Is Better)
  AMD AOCC 3.0: 398.96 (SE +/- 0.70, N = 3)
  Clang 11.0: 399.16 (SE +/- 0.67, N = 3)
  Clang 12.0: 363.85 (SE +/- 0.46, N = 3)
  GCC 10.3: 388.98 (SE +/- 0.25, N = 3)
  GCC 11.0.1: 388.88 (SE +/- 1.03, N = 3)
  GCC 9.3: 384.03 (SE +/- 0.66, N = 3)
  1. (CC) gcc options: -O3 -march=native -lm
SciMark 2.0 - Computational Test: Sparse Matrix Multiply (Mflops, More Is Better)
  AMD AOCC 3.0: 4594.27 (SE +/- 5.98, N = 3)
  Clang 11.0: 4590.37 (SE +/- 3.87, N = 3)
  Clang 12.0: 4280.22 (SE +/- 10.41, N = 3)
  GCC 10.3: 3820.77 (SE +/- 0.86, N = 3)
  GCC 11.0.1: 3462.66 (SE +/- 0.39, N = 3)
  GCC 9.3: 3765.88 (SE +/- 1.69, N = 3)
  1. (CC) gcc options: -O3 -march=native -lm
SciMark 2.0 - Computational Test: Dense LU Matrix Factorization (Mflops, More Is Better)
  AMD AOCC 3.0: 9021.83 (SE +/- 0.22, N = 3)
  Clang 11.0: 9146.88 (SE +/- 77.81, N = 3)
  Clang 12.0: 8848.40 (SE +/- 7.16, N = 3)
  GCC 10.3: 9248.89 (SE +/- 33.93, N = 3)
  GCC 11.0.1: 9263.55 (SE +/- 25.06, N = 3)
  GCC 9.3: 9178.97 (SE +/- 28.39, N = 3)
  1. (CC) gcc options: -O3 -march=native -lm
SciMark 2.0 - Computational Test: Jacobi Successive Over-Relaxation (Mflops, More Is Better)
  AMD AOCC 3.0: 1785.45 (SE +/- 0.02, N = 3)
  Clang 11.0: 1785.42 (SE +/- 0.12, N = 3)
  Clang 12.0: 1785.50 (SE +/- 0.08, N = 3)
  GCC 10.3: 2038.15 (SE +/- 0.07, N = 3)
  GCC 11.0.1: 2148.84 (SE +/- 0.10, N = 3)
  GCC 9.3: 2149.15 (SE +/- 0.12, N = 3)
  1. (CC) gcc options: -O3 -march=native -lm
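The SciMark 2.0 Composite score is the arithmetic mean of the five kernel scores (Monte Carlo, Fast Fourier Transform, Sparse Matrix Multiply, Dense LU Matrix Factorization, Jacobi Successive Over-Relaxation), and the figures above are consistent with that. The Python sketch below recomputes the composite for each compiler from the per-kernel results. The helper name `composite` and the dictionary layout are illustrative only. Since each reported value is itself a rounded average of three runs, a deviation of about 0.01 Mflops is expected.

```python
def composite(kernel_scores):
    """Arithmetic mean of the five SciMark kernel scores."""
    return sum(kernel_scores) / len(kernel_scores)


# Per compiler: (Monte Carlo, FFT, Sparse, Dense LU, Jacobi SOR) in Mflops,
# followed by the reported Composite, all copied from the results above.
results = {
    "AMD AOCC 3.0": ((690.94, 398.96, 4594.27, 9021.83, 1785.45), 3298.29),
    "Clang 11.0": ((674.86, 399.16, 4590.37, 9146.88, 1785.42), 3319.34),
    "Clang 12.0": ((675.13, 363.85, 4280.22, 8848.40, 1785.50), 3190.62),
    "GCC 10.3": ((682.87, 388.98, 3820.77, 9248.89, 2038.15), 3235.94),
    "GCC 11.0.1": ((647.82, 388.88, 3462.66, 9263.55, 2148.84), 3182.35),
    "GCC 9.3": ((668.10, 384.03, 3765.88, 9178.97, 2149.15), 3229.22),
}

for compiler, (kernels, reported) in results.items():
    print(f"{compiler}: recomputed {composite(kernels):.2f}, reported {reported:.2f}")
```

Because Dense LU and Sparse Matrix Multiply have by far the largest absolute scores, they dominate the composite; a compiler can lead the composite while trailing on Jacobi SOR, as the Clang-based builds do here.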
SecureMark 1.0.4 - Benchmark: SecureMark-TLS (marks, More Is Better)
  AMD AOCC 3.0: 264637 (SE +/- 251.99, N = 3)
  Clang 11.0: 260119 (SE +/- 407.86, N = 3)
  Clang 12.0: 265204 (SE +/- 1778.47, N = 3)
  GCC 10.3: 242700 (SE +/- 1024.96, N = 3)
  GCC 11.0.1: 243861 (SE +/- 675.55, N = 3)
  GCC 9.3: 238935 (SE +/- 537.86, N = 3)
  1. (CC) gcc options: -pedantic -O3
simdjson 0.8.2 - Throughput Test: Kostya (GB/s, More Is Better)
  AMD AOCC 3.0: 2.73 (SE +/- 0.00, N = 3)
  Clang 11.0: 2.68 (SE +/- 0.00, N = 3)
  Clang 12.0: 2.75 (SE +/- 0.01, N = 3)
  GCC 10.3: 2.77 (SE +/- 0.00, N = 3)
  GCC 9.3: 2.75 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -march=native -pthread
simdjson 0.8.2 - Throughput Test: LargeRandom (GB/s, More Is Better)
  AMD AOCC 3.0: 0.82 (SE +/- 0.00, N = 3)
  Clang 11.0: 0.81 (SE +/- 0.00, N = 3)
  Clang 12.0: 0.84 (SE +/- 0.00, N = 3)
  GCC 10.3: 0.90 (SE +/- 0.00, N = 3)
  GCC 9.3: 0.94 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -march=native -pthread
simdjson 0.8.2 - Throughput Test: PartialTweets (GB/s, More Is Better)
  AMD AOCC 3.0: 4.33 (SE +/- 0.01, N = 3)
  Clang 11.0: 4.41 (SE +/- 0.01, N = 3)
  Clang 12.0: 4.60 (SE +/- 0.01, N = 3)
  GCC 10.3: 4.02 (SE +/- 0.00, N = 3)
  GCC 9.3: 3.93 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -march=native -pthread
simdjson 0.8.2 - Throughput Test: DistinctUserID (GB/s, More Is Better)
  AMD AOCC 3.0: 4.47 (SE +/- 0.00, N = 3)
  Clang 11.0: 4.41 (SE +/- 0.00, N = 3)
  Clang 12.0: 4.62 (SE +/- 0.00, N = 3)
  GCC 10.3: 4.13 (SE +/- 0.01, N = 3)
  GCC 9.3: 3.98 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -march=native -pthread
SVT-AV1 0.8 - Encoder Mode: Enc Mode 0 - Input: 1080p (Frames Per Second, More Is Better)
  AMD AOCC 3.0: 0.183 (SE +/- 0.000, N = 3)
  Clang 11.0: 0.181 (SE +/- 0.000, N = 3)
  Clang 12.0: 0.183 (SE +/- 0.000, N = 3)
  GCC 10.3: 0.169 (SE +/- 0.000, N = 3)
  GCC 11.0.1: 0.176 (SE +/- 0.000, N = 3)
  GCC 9.3: 0.129 (SE +/- 0.000, N = 3)
  1. (CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie
SVT-AV1 0.8 - Encoder Mode: Enc Mode 4 - Input: 1080p (Frames Per Second, More Is Better)
  AMD AOCC 3.0: 11.690 (SE +/- 0.189, N = 3)
  Clang 11.0: 11.821 (SE +/- 0.164, N = 4)
  Clang 12.0: 11.474 (SE +/- 0.170, N = 3)
  GCC 10.3: 11.230 (SE +/- 0.111, N = 9)
  GCC 11.0.1: 11.905 (SE +/- 0.139, N = 3)
  GCC 9.3: 9.325 (SE +/- 0.086, N = 3)
  1. (CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie
SVT-AV1 0.8 - Encoder Mode: Enc Mode 8 - Input: 1080p (Frames Per Second, More Is Better)
  AMD AOCC 3.0: 116.49 (SE +/- 0.33, N = 3)
  Clang 11.0: 117.39 (SE +/- 0.46, N = 3)
  Clang 12.0: 118.07 (SE +/- 0.10, N = 3)
  GCC 10.3: 109.70 (SE +/- 1.05, N = 3)
  GCC 11.0.1: 110.70 (SE +/- 0.18, N = 3)
  GCC 9.3: 92.98 (SE +/- 0.83, N = 3)
  1. (CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie
SVT-HEVC 1.5.0 - Tuning: 1 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
  AMD AOCC 3.0: 40.95 (SE +/- 0.08, N = 3)
  Clang 11.0: 41.01 (SE +/- 0.09, N = 3)
  Clang 12.0: 41.09 (SE +/- 0.17, N = 3)
  GCC 10.3: 39.03 (SE +/- 0.05, N = 3)
  GCC 11.0.1: 38.86 (SE +/- 0.17, N = 3)
  GCC 9.3: 38.41 (SE +/- 0.18, N = 3)
  1. (CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt
SVT-HEVC 1.5.0 - Tuning: 7 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
  AMD AOCC 3.0: 343.85 (SE +/- 1.09, N = 3)
  Clang 11.0: 346.89 (SE +/- 3.43, N = 3)
  Clang 12.0: 345.30 (SE +/- 1.56, N = 3)
  GCC 10.3: 330.53 (SE +/- 1.54, N = 3)
  GCC 11.0.1: 329.32 (SE +/- 1.51, N = 3)
  GCC 9.3: 322.42 (SE +/- 1.20, N = 3)
  1. (CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt
SVT-HEVC 1.5.0 - Tuning: 10 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
  AMD AOCC 3.0: 638.10 (SE +/- 3.03, N = 3)
  Clang 11.0: 652.74 (SE +/- 5.55, N = 3)
  Clang 12.0: 643.58 (SE +/- 3.01, N = 3)
  GCC 10.3: 615.62 (SE +/- 2.42, N = 3)
  GCC 11.0.1: 611.73 (SE +/- 5.75, N = 3)
  GCC 9.3: 605.50 (SE +/- 3.83, N = 3)
  1. (CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt
SVT-VP9 0.3 - Tuning: VMAF Optimized - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
  AMD AOCC 3.0: 476.95 (SE +/- 2.67, N = 3)
  Clang 11.0: 481.05 (SE +/- 0.23, N = 3)
  Clang 12.0: 487.43 (SE +/- 1.37, N = 3)
  GCC 10.3: 472.61 (SE +/- 0.24, N = 3)
  GCC 11.0.1: 472.32 (SE +/- 1.15, N = 3)
  GCC 9.3: 463.12 (SE +/- 0.82, N = 3)
  1. (CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
SVT-VP9 0.3 - Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
  AMD AOCC 3.0: 478.62 (SE +/- 1.94, N = 3)
  Clang 11.0: 482.02 (SE +/- 1.76, N = 3)
  Clang 12.0: 488.23 (SE +/- 0.73, N = 3)
  GCC 10.3: 477.67 (SE +/- 2.08, N = 3)
  GCC 11.0.1: 478.16 (SE +/- 1.13, N = 3)
  GCC 9.3: 464.57 (SE +/- 0.32, N = 3)
  1. (CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
SVT-VP9 0.3 - Tuning: Visual Quality Optimized - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
  AMD AOCC 3.0: 373.89 (SE +/- 2.72, N = 3)
  Clang 11.0: 373.99 (SE +/- 1.91, N = 3)
  Clang 12.0: 372.49 (SE +/- 1.11, N = 3)
  GCC 10.3: 364.12 (SE +/- 0.47, N = 3)
  GCC 11.0.1: 366.39 (SE +/- 0.70, N = 3)
  GCC 9.3: 354.21 (SE +/- 3.83, N = 3)
  1. (CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
Tachyon 0.99b6 - Total Time (Seconds, Fewer Is Better)
  AMD AOCC 3.0: 16.06 (SE +/- 0.01, N = 3)
  Clang 11.0: 16.41 (SE +/- 0.02, N = 3)
  Clang 12.0: 16.05 (SE +/- 0.06, N = 3)
  GCC 10.3: 16.15 (SE +/- 0.02, N = 3)
  GCC 11.0.1: 15.50 (SE +/- 0.02, N = 3)
  GCC 9.3: 15.68 (SE +/- 0.02, N = 3)
  1. (CC) gcc options: -m64 -O3 -fomit-frame-pointer -ffast-math -ltachyon -lm -lpthread
Timed MrBayes Analysis 3.2.7 - Primate Phylogeny Analysis (Seconds, Fewer Is Better)
  AMD AOCC 3.0: 86.74 (SE +/- 0.26, N = 3)
  Clang 11.0: 88.62 (SE +/- 0.98, N = 3)
  Clang 12.0: 89.12 (SE +/- 0.98, N = 3)
  Clang 12.0 LTO: 93.63 (SE +/- 1.09, N = 3)
  GCC 10.3: 93.66 (SE +/- 1.29, N = 4)
  GCC 11.0.1: 89.43 (SE +/- 0.33, N = 3)
  GCC 9.3: 89.16 (SE +/- 0.16, N = 3)
  Per-result flags as exported (compiler attribution not recoverable): -flto -mabm -mabm -mabm
  1. (CC) gcc options: -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -msha -maes -mavx -mfma -mavx2 -mrdrnd -mbmi -mbmi2 -madx -O3 -std=c99 -pedantic -march=native -lm
toyBrot Fractal Generator 2020-11-18 - Implementation: TBB (ms, Fewer Is Better)
  AMD AOCC 3.0: 6945 (SE +/- 52.54, N = 3)
  Clang 11.0: 6247 (SE +/- 67.11, N = 7)
  Clang 12.0: 6780 (SE +/- 87.21, N = 3)
  Clang 12.0 LTO: 7085 (SE +/- 86.43, N = 3)
  GCC 10.3: 5181 (SE +/- 67.68, N = 3)
  GCC 9.3: 5107 (SE +/- 74.84, N = 3)
  Per-result flags as exported (compiler attribution not recoverable): -lm -lgcc -lgcc_s -lc -lm -lgcc -lgcc_s -lc -lm -lgcc -lgcc_s -lc -flto -lm -lgcc -lgcc_s -lc -lm -lgcc -lgcc_s -lc
  1. (CXX) g++ options: -O3 -march=native -lpthread
toyBrot Fractal Generator 2020-11-18 - Implementation: OpenMP (ms, Fewer Is Better)
  AMD AOCC 3.0: 7477 (SE +/- 22.73, N = 3)
  Clang 11.0: 7029 (SE +/- 20.42, N = 3)
  Clang 12.0: 7507 (SE +/- 14.89, N = 3)
  GCC 10.3: 5524 (SE +/- 2.60, N = 3)
  GCC 9.3: 5451 (SE +/- 3.18, N = 3)
  1. (CXX) g++ options: -O3 -march=native -lpthread -lm -lgcc -lgcc_s -lc
toyBrot Fractal Generator 2020-11-18 - Implementation: C++ Tasks (ms, Fewer Is Better)
  AMD AOCC 3.0: 7189 (SE +/- 41.46, N = 3)
  Clang 11.0: 6836 (SE +/- 7.31, N = 3)
  Clang 12.0: 7437 (SE +/- 33.67, N = 3)
  Clang 12.0 LTO: 7367 (SE +/- 17.21, N = 3)
  GCC 10.3: 5610 (SE +/- 31.52, N = 3)
  GCC 9.3: 5414 (SE +/- 49.08, N = 3)
  Per-result flags as exported (compiler attribution not recoverable): -lm -lgcc -lgcc_s -lc -lm -lgcc -lgcc_s -lc -lm -lgcc -lgcc_s -lc -flto -lm -lgcc -lgcc_s -lc -lm -lgcc -lgcc_s -lc
  1. (CXX) g++ options: -O3 -march=native -lpthread
toyBrot Fractal Generator 2020-11-18 - Implementation: C++ Threads (ms, Fewer Is Better)
  AMD AOCC 3.0: 7144 (SE +/- 24.26, N = 3)
  Clang 11.0: 6395 (SE +/- 25.04, N = 3)
  Clang 12.0: 7220 (SE +/- 30.90, N = 3)
  Clang 12.0 LTO: 7143 (SE +/- 15.06, N = 3)
  GCC 10.3: 5383 (SE +/- 6.12, N = 3)
  GCC 9.3: 5142 (SE +/- 8.33, N = 3)
  Per-result flags as exported (compiler attribution not recoverable): -lm -lgcc -lgcc_s -lc -lm -lgcc -lgcc_s -lc -lm -lgcc -lgcc_s -lc -flto -lm -lgcc -lgcc_s -lc -lm -lgcc -lgcc_s -lc
  1. (CXX) g++ options: -O3 -march=native -lpthread
TSCP 1.81 - AI Chess Performance (Nodes Per Second, More Is Better)
  AMD AOCC 3.0: 1697846 (SE +/- 2098.00, N = 5)
  Clang 11.0: 1638265 (SE +/- 2852.59, N = 5)
  Clang 12.0: 1570966 (SE +/- 1798.40, N = 5)
  GCC 10.3: 1467179 (SE +/- 956.77, N = 5)
  GCC 11.0.1: 1494250 (SE +/- 1626.80, N = 5)
  GCC 9.3: 1446372 (SE +/- 760.80, N = 5)
  1. (CC) gcc options: -O3 -march=native
ViennaCL 1.7.1 - Test: CPU BLAS - sCOPY (GB/s, More Is Better)
  AMD AOCC 3.0: 531.00 (SE +/- 32.29, N = 12; -fopenmp=libomp)
  Clang 11.0: 495.00 (SE +/- 36.50, N = 15; -fopenmp=libomp)
  Clang 12.0: 471.00 (SE +/- 15.30, N = 12; -fopenmp=libomp)
  GCC 10.3: 1065.60 (SE +/- 101.07, N = 12; -fopenmp)
  GCC 11.0.1: 1210.00 (SE +/- 25.34, N = 15; -fopenmp)
  GCC 9.3: 1217.00 (SE +/- 25.85, N = 15; -fopenmp)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - sAXPY (GB/s, More Is Better)
  AMD AOCC 3.0: 326.0 (SE +/- 26.90, N = 12; -fopenmp=libomp)
  Clang 11.0: 412.0 (SE +/- 34.43, N = 15; -fopenmp=libomp)
  Clang 12.0: 357.0 (SE +/- 15.69, N = 12; -fopenmp=libomp)
  GCC 10.3: 1350.0 (SE +/- 132.58, N = 12; -fopenmp)
  GCC 11.0.1: 1496.0 (SE +/- 62.40, N = 15; -fopenmp)
  GCC 9.3: 813.0 (SE +/- 2.85, N = 15; -fopenmp)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - sDOT (GB/s, More Is Better)
  AMD AOCC 3.0: 477.00 (SE +/- 37.59, N = 12; -fopenmp=libomp)
  Clang 11.0: 462.00 (SE +/- 38.96, N = 15; -fopenmp=libomp)
  Clang 12.0: 434.00 (SE +/- 35.24, N = 12; -fopenmp=libomp)
  GCC 10.3: 592.97 (SE +/- 53.43, N = 12; -fopenmp)
  GCC 11.0.1: 649.00 (SE +/- 2.60, N = 15; -fopenmp)
  GCC 9.3: 636.00 (SE +/- 0.80, N = 15; -fopenmp)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - dCOPY (GB/s, More Is Better)
  AMD AOCC 3.0: 1944.0 (SE +/- 9.88, N = 12; -fopenmp=libomp)
  Clang 11.0: 1877.0 (SE +/- 8.32, N = 15; -fopenmp=libomp)
  Clang 12.0: 604.0 (SE +/- 15.32, N = 11; -fopenmp=libomp)
  GCC 10.3: 1461.2 (SE +/- 131.59, N = 12; -fopenmp)
  GCC 11.0.1: 1599.0 (SE +/- 2.67, N = 15; -fopenmp)
  GCC 9.3: 1587.0 (SE +/- 9.19, N = 15; -fopenmp)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - dAXPY (GB/s, More Is Better)
  AMD AOCC 3.0: 1017.0 (SE +/- 3.59, N = 12; -fopenmp=libomp)
  Clang 11.0: 1043.0 (SE +/- 1.59, N = 15; -fopenmp=libomp)
  Clang 12.0: 878.0 (SE +/- 20.06, N = 12; -fopenmp=libomp)
  GCC 10.3: 2158.4 (SE +/- 194.35, N = 12; -fopenmp)
  GCC 11.0.1: 2359.0 (SE +/- 2.74, N = 15; -fopenmp)
  GCC 9.3: 1521.0 (SE +/- 2.06, N = 15; -fopenmp)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - dDOT (GB/s, More Is Better)
  AMD AOCC 3.0: 1165.00 (SE +/- 2.61, N = 12; -fopenmp=libomp)
  Clang 11.0: 933.00 (SE +/- 1.49, N = 15; -fopenmp=libomp)
  Clang 12.0: 819.00 (SE +/- 17.06, N = 12; -fopenmp=libomp)
  GCC 10.3: 1056.42 (SE +/- 95.41, N = 12; -fopenmp)
  GCC 11.0.1: 1153.00 (SE +/- 1.87, N = 15; -fopenmp)
  GCC 9.3: 1133.00 (SE +/- 1.59, N = 15; -fopenmp)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMV-N (GB/s, More Is Better)
  AMD AOCC 3.0: 55.2 (SE +/- 3.44, N = 12; -fopenmp=libomp)
  Clang 11.0: 51.2 (SE +/- 3.65, N = 15; -fopenmp=libomp)
  Clang 12.0: 69.1 (SE +/- 2.22, N = 12; -fopenmp=libomp)
  GCC 10.3: 56.2 (SE +/- 5.30, N = 12; -fopenmp)
  GCC 11.0.1: 63.9 (SE +/- 3.83, N = 15; -fopenmp)
  GCC 9.3: 65.0 (SE +/- 4.17, N = 15; -fopenmp)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMV-T (GB/s, More Is Better)
  AMD AOCC 3.0: 783.0 (SE +/- 1.94, N = 12; -fopenmp=libomp)
  Clang 11.0: 677.0 (SE +/- 1.41, N = 14; -fopenmp=libomp)
  Clang 12.0: 626.0 (SE +/- 4.04, N = 12; -fopenmp=libomp)
  GCC 10.3: 741.4 (SE +/- 66.49, N = 12; -fopenmp)
  GCC 11.0.1: 794.0 (SE +/- 2.88, N = 15; -fopenmp)
  GCC 9.3: 798.0 (SE +/- 2.10, N = 14; -fopenmp)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-NN (GFLOPs/s, More Is Better)
  AMD AOCC 3.0: 84.0 (SE +/- 0.04, N = 12; -fopenmp=libomp)
  Clang 11.0: 83.6 (SE +/- 0.06, N = 15; -fopenmp=libomp)
  Clang 12.0: 48.6 (SE +/- 0.05, N = 12; -fopenmp=libomp)
  GCC 10.3: 98.7 (SE +/- 1.05, N = 12; -fopenmp)
  GCC 11.0.1: 100.5 (SE +/- 0.29, N = 15; -fopenmp)
  GCC 9.3: 98.5 (SE +/- 0.16, N = 15; -fopenmp)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-NT (GFLOPs/s, More Is Better)
  AMD AOCC 3.0: 78.8 (SE +/- 0.07, N = 12; -fopenmp=libomp)
  Clang 11.0: 79.3 (SE +/- 0.03, N = 15; -fopenmp=libomp)
  Clang 12.0: 65.7 (SE +/- 0.56, N = 12; -fopenmp=libomp)
  GCC 10.3: 94.4 (SE +/- 0.59, N = 12; -fopenmp)
  GCC 11.0.1: 95.0 (SE +/- 0.08, N = 15; -fopenmp)
  GCC 9.3: 95.3 (SE +/- 0.07, N = 15; -fopenmp)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-TN (GFLOPs/s, More Is Better)
  AMD AOCC 3.0: 90.0 (-fopenmp=libomp)
  Clang 11.0: 88.3 (-fopenmp=libomp)
  Clang 12.0: 51.9 (-fopenmp=libomp)
  GCC 10.3: 104.0 (-fopenmp)
  GCC 11.0.1: 104.0 (-fopenmp)
  GCC 9.3: 100.9 (-fopenmp)
  Standard errors as exported (only five values for six results, so attribution is not recoverable): SE +/- 0.05, N = 12; SE +/- 0.02, N = 15; SE +/- 0.09, N = 12; SE +/- 0.62, N = 12; SE +/- 0.08, N = 15
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-TT (GFLOPs/s, More Is Better)
  AMD AOCC 3.0: 84.4 (SE +/- 0.08, N = 12; -fopenmp=libomp)
  Clang 11.0: 84.0 (SE +/- 0.02, N = 14; -fopenmp=libomp)
  Clang 12.0: 73.0 (SE +/- 0.07, N = 12; -fopenmp=libomp)
  GCC 10.3: 98.5 (SE +/- 0.60, N = 12; -fopenmp)
  GCC 11.0.1: 99.3 (SE +/- 0.05, N = 15; -fopenmp)
  GCC 9.3: 97.9 (SE +/- 0.05, N = 15; -fopenmp)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lOpenCL
WebP Image Encode 1.1 - Encode Settings: Default (Encode Time - Seconds, Fewer Is Better)
  AMD AOCC 3.0: 1.351 (SE +/- 0.002, N = 3)
  Clang 11.0: 1.336 (SE +/- 0.001, N = 3)
  Clang 12.0: 1.331 (SE +/- 0.001, N = 3)
  GCC 10.3: 1.372 (SE +/- 0.002, N = 3)
  GCC 11.0.1: 1.386 (SE +/- 0.000, N = 3)
  GCC 9.3: 1.397 (SE +/- 0.000, N = 3)
  Per-result flags as exported (compiler attribution not recoverable): -ltiff
  1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -pthread -lm -ljpeg -lpng16
WebP Image Encode 1.1 - Encode Settings: Quality 100 (Encode Time - Seconds, Fewer Is Better)
  AMD AOCC 3.0: 2.262 (SE +/- 0.001, N = 3)
  Clang 11.0: 2.240 (SE +/- 0.000, N = 3)
  Clang 12.0: 2.199 (SE +/- 0.001, N = 3)
  GCC 10.3: 2.225 (SE +/- 0.010, N = 3)
  GCC 11.0.1: 2.274 (SE +/- 0.007, N = 3)
  GCC 9.3: 2.273 (SE +/- 0.005, N = 3)
  Per-result flags as exported (compiler attribution not recoverable): -ltiff
  1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -pthread -lm -ljpeg -lpng16
WebP Image Encode 1.1 - Encode Settings: Quality 100, Lossless (Encode Time - Seconds, Fewer Is Better)
  AMD AOCC 3.0: 19.13 (SE +/- 0.03, N = 3)
  Clang 11.0: 18.57 (SE +/- 0.13, N = 3)
  Clang 12.0: 19.02 (SE +/- 0.02, N = 3)
  GCC 10.3: 18.88 (SE +/- 0.05, N = 3)
  GCC 11.0.1: 18.31 (SE +/- 0.04, N = 3)
  GCC 9.3: 19.30 (SE +/- 0.13, N = 3)
  Per-result flags as exported (compiler attribution not recoverable): -ltiff
  1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -pthread -lm -ljpeg -lpng16
WebP Image Encode 1.1 - Encode Settings: Quality 100, Highest Compression (Encode Time - Seconds, Fewer Is Better)
  AMD AOCC 3.0: 6.578 (SE +/- 0.009, N = 3)
  Clang 11.0: 6.243 (SE +/- 0.018, N = 3)
  Clang 12.0: 6.309 (SE +/- 0.004, N = 3)
  GCC 10.3: 7.078 (SE +/- 0.006, N = 3)
  GCC 11.0.1: 7.003 (SE +/- 0.021, N = 3)
  GCC 9.3: 7.053 (SE +/- 0.009, N = 3)
  Per-result flags as exported (compiler attribution not recoverable): -ltiff
  1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -pthread -lm -ljpeg -lpng16
WebP Image Encode 1.1 - Encode Settings: Quality 100, Lossless, Highest Compression (Encode Time - Seconds, Fewer Is Better)
  AMD AOCC 3.0: 38.34 (SE +/- 0.05, N = 3)
  Clang 11.0: 37.73 (SE +/- 0.08, N = 3)
  Clang 12.0: 38.45 (SE +/- 0.07, N = 3)
  GCC 10.3: 38.55 (SE +/- 0.06, N = 3)
  GCC 11.0.1: 37.95 (SE +/- 0.07, N = 3)
  GCC 9.3: 39.07 (SE +/- 0.03, N = 3)
  Per-result flags as exported (compiler attribution not recoverable): -ltiff
  1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -pthread -lm -ljpeg -lpng16
WebP2 Image Encode 20210126 - Encode Settings: Default (Seconds, Fewer Is Better)
  AMD AOCC 3.0: 2.816 (SE +/- 0.010, N = 3)
  Clang 11.0: 2.743 (SE +/- 0.031, N = 3)
  Clang 12.0: 2.739 (SE +/- 0.027, N = 3)
  GCC 10.3: 2.918 (SE +/- 0.032, N = 7)
  GCC 9.3: 2.778 (SE +/- 0.038, N = 3)
  1. (CXX) g++ options: -O3 -march=native -fno-rtti -rdynamic -lpthread -ljpeg -lgif -lwebp -lwebpdemux
WebP2 Image Encode 20210126 - Encode Settings: Quality 75, Compression Effort 7 (Seconds, Fewer Is Better)
  AMD AOCC 3.0: 109.81 (SE +/- 0.09, N = 3)
  Clang 11.0: 109.64 (SE +/- 0.10, N = 3)
  Clang 12.0: 109.53 (SE +/- 0.10, N = 3)
  GCC 10.3: 116.66 (SE +/- 0.16, N = 3)
  GCC 9.3: 118.45 (SE +/- 0.10, N = 3)
  1. (CXX) g++ options: -O3 -march=native -fno-rtti -rdynamic -lpthread -ljpeg -lgif -lwebp -lwebpdemux
WebP2 Image Encode 20210126 - Encode Settings: Quality 95, Compression Effort 7
OpenBenchmarking.org - Seconds, Fewer Is Better
  AMD AOCC 3.0    205.03    SE +/- 0.17, N = 3
  Clang 11.0      203.63    SE +/- 0.66, N = 3
  Clang 12.0      207.01    SE +/- 0.07, N = 3
  GCC 10.3        215.57    SE +/- 0.46, N = 3
  GCC 9.3         220.94    SE +/- 1.32, N = 3
  1. (CXX) g++ options: -O3 -march=native -fno-rtti -rdynamic -lpthread -ljpeg -lgif -lwebp -lwebpdemux
WebP2 Image Encode 20210126 - Encode Settings: Quality 100, Compression Effort 5
OpenBenchmarking.org - Seconds, Fewer Is Better
  AMD AOCC 3.0    7.403    SE +/- 0.028, N = 3
  Clang 11.0      7.366    SE +/- 0.022, N = 3
  Clang 12.0      6.690    SE +/- 0.006, N = 3
  GCC 10.3        6.934    SE +/- 0.017, N = 3
  GCC 9.3         6.753    SE +/- 0.017, N = 3
  1. (CXX) g++ options: -O3 -march=native -fno-rtti -rdynamic -lpthread -ljpeg -lgif -lwebp -lwebpdemux
WebP2 Image Encode 20210126 - Encode Settings: Quality 100, Lossless Compression
OpenBenchmarking.org - Seconds, Fewer Is Better
  AMD AOCC 3.0    382.99    SE +/- 0.39, N = 3
  Clang 11.0      392.85    SE +/- 0.17, N = 3
  Clang 12.0      374.04    SE +/- 0.49, N = 3
  GCC 10.3        406.03    SE +/- 3.10, N = 3
  GCC 9.3         388.95    SE +/- 1.92, N = 3
  1. (CXX) g++ options: -O3 -march=native -fno-rtti -rdynamic -lpthread -ljpeg -lgif -lwebp -lwebpdemux
x265 3.4 - Video Input: Bosphorus 4K
OpenBenchmarking.org - Frames Per Second, More Is Better
  AMD AOCC 3.0    30.44    SE +/- 0.13, N = 3
  Clang 11.0      29.94    SE +/- 0.25, N = 3
  Clang 12.0      30.32    SE +/- 0.23, N = 3
  GCC 10.3        28.60    SE +/- 0.09, N = 3
  GCC 11.0.1      28.79    SE +/- 0.07, N = 3
  GCC 9.3         28.91    SE +/- 0.09, N = 3
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lpthread -lrt -ldl -lnuma
x265 3.4 - Video Input: Bosphorus 1080p
OpenBenchmarking.org - Frames Per Second, More Is Better
  AMD AOCC 3.0    73.51    SE +/- 0.63, N = 3
  Clang 11.0      73.36    SE +/- 0.49, N = 3
  Clang 12.0      74.00    SE +/- 0.49, N = 3
  GCC 10.3        72.60    SE +/- 0.32, N = 3
  GCC 11.0.1      71.79    SE +/- 0.56, N = 3
  GCC 9.3         72.14    SE +/- 0.26, N = 3
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lpthread -lrt -ldl -lnuma
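Several of the x265 1080p results lie within a frame per second of each other, so it can help to read the means together with their reported standard errors. The following is a minimal sketch, not part of the Phoronix Test Suite: the helper names `relative_to_best` and `se_intervals_overlap` are hypothetical, and treating overlapping +/- 1 SE intervals as "too close to call" is an assumed rule of thumb rather than a formal significance test. The numbers are the Bosphorus 1080p means and SE values reported above.

```python
# Means and standard errors (frames per second, more is better) copied from
# the x265 3.4 Bosphorus 1080p results.
results = {
    "AMD AOCC 3.0": (73.51, 0.63),
    "Clang 11.0": (73.36, 0.49),
    "Clang 12.0": (74.00, 0.49),
    "GCC 10.3": (72.60, 0.32),
    "GCC 11.0.1": (71.79, 0.56),
    "GCC 9.3": (72.14, 0.26),
}


def relative_to_best(data):
    """Percent difference of each mean from the highest mean (0.0 for the best)."""
    best = max(mean for mean, _ in data.values())
    return {
        name: round(100.0 * (mean - best) / best, 2)
        for name, (mean, _) in data.items()
    }


def se_intervals_overlap(a, b):
    """True when the +/- 1 SE intervals of two (mean, se) pairs overlap.

    This is only a rough heuristic (an assumption of this sketch); with N = 3
    runs per result it should not be read as a statistical test.
    """
    (mean_a, se_a), (mean_b, se_b) = a, b
    return abs(mean_a - mean_b) <= se_a + se_b


if __name__ == "__main__":
    deltas = relative_to_best(results)
    best_name = max(results, key=lambda name: results[name][0])
    for name, delta in deltas.items():
        close = se_intervals_overlap(results[name], results[best_name])
        print(f"{name:<14}{delta:>7.2f}%   overlaps best: {close}")
```

For a "fewer is better" chart such as the WebP encode times, the same approach applies with `min` in place of `max`.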
Phoronix Test Suite v10.8.4