AMD EPYC 7773X GCC / Clang / AOCC compiler benchmarking by Michael Larabel for a future article.
Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2204117-NE-EPYC7773X86
HTML result view exported from: https://openbenchmarking.org/result/2204117-NE-EPYC7773X86&export=txt&grs&rdt .
AMD EPYC 7773X Compilers
Runs: GCC 11.2, Clang 14.0, AMD AOCC 3.2
    Processor:         2 x AMD EPYC 7773X 64-Core @ 2.20GHz (128 Cores / 256 Threads)
    Motherboard:       AMD DAYTONA_X (TYM1008C BIOS)
    Chipset:           AMD Starship/Matisse
    Memory:            16 x 32 GB DDR4-3200MT/s 36ASF4G72PZ-3G2E2
    Disk:              800GB INTEL SSDPF21Q800GB
    Graphics:          ASPEED
    Monitor:           VE228
    Network:           2 x Mellanox MT27710
    OS:                Ubuntu 22.04
    Kernel:            5.17.0-051700rc8-generic (x86_64)
    Desktop:           GNOME Shell 42.0
    Display Server:    X Server
    Vulkan:            1.2.204
    Compiler:          GCC 11.2.0 (GCC 11.2 run); Clang 14.0.0-1ubuntu1 (Clang 14.0 run); Clang 13.0.0 (AMD AOCC 3.2 run)
    File-System:       ext4
    Screen Resolution: 1920x1080
Kernel Details
    Transparent Huge Pages: madvise
Environment Details
    CXXFLAGS="-O3 -march=native -flto" CFLAGS="-O3 -march=native -flto"
Compiler Details
    GCC 11.2: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
    AMD AOCC 3.2: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver3
Disk Details
    GCC 11.2: NONE / errors=remount-ro,relatime,rw / Block Size: 4096
Processor Details
    Scaling Governor: acpi-cpufreq performance (Boost: Enabled)
    CPU Microcode: 0xa001228
Python Details
    Python 3.10.4
Security Details
    itlb_multihit: Not affected
    l1tf: Not affected
    mds: Not affected
    meltdown: Not affected
    spec_store_bypass: Mitigation of SSB disabled via prctl
    spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
    spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling
    srbds: Not affected
    tsx_async_abort: Not affected
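The Environment Details above give the flags exported for every run, and the header gives the comparison command. A minimal sketch of how one such run could be prepared from Python follows. The use of the CC and CXX variables to select the compiler is an assumption (the result file does not say how compilers were switched), and `build_run` is a hypothetical helper name. Nothing is executed; the function only returns the environment and argument list.

```python
import os

RESULT_ID = "2204117-NE-EPYC7773X86"   # result file ID from the command above
FLAGS = "-O3 -march=native -flto"      # CFLAGS/CXXFLAGS from Environment Details


def build_run(cc, cxx, base_env=None):
    """Return (env, argv) for one comparison run; nothing is launched here.

    cc/cxx are compiler driver names; selecting them through CC/CXX is an
    assumption, not something recorded in the result file.
    """
    env = dict(os.environ if base_env is None else base_env)
    env.update({"CC": cc, "CXX": cxx, "CFLAGS": FLAGS, "CXXFLAGS": FLAGS})
    argv = ["phoronix-test-suite", "benchmark", RESULT_ID]
    return env, argv


if __name__ == "__main__":
    env, argv = build_run("clang", "clang++")
    print("CC=%s CXX=%s CFLAGS=%s" % (env["CC"], env["CXX"], env["CFLAGS"]))
    print(" ".join(argv))
    # To actually launch the run: subprocess.run(argv, env=env, check=True)
```

Keeping the launch step commented out makes the sketch safe to run on a machine without the Phoronix Test Suite installed.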
AMD EPYC 7773X Compilers
Test                                                  GCC 11.2        Clang 14.0      AMD AOCC 3.2
etcpak: DXT1                                          844.045         1926.467        2012.565
lczero: Eigen                                         4187            5107            5570
x265: Bosphorus 4K                                    19.37           21.31           25.03
svt-hevc: 10 - Bosphorus 1080p                        376.34          437.24          472.24
etcpak: ETC2                                          134.043         166.392         158.347
jpegxl-decode: 1                                      46.70           53.78           57.14
coremark: CoreMark Size 666 - Iterations Per Second   4447513.211801  3705497.015609  4242648.798194
jpegxl: PNG - 8                                       0.72            0.67            0.79
liquid-dsp: 256 - 256 - 57                            6127133333      6229033333      7010600000
xmrig: Wownero - 1M                                   42309.7         40922.7         36998.0
openssl: SHA256                                       156404481030    170350109150    176385148413
tscp: AI Chess Performance                            1094141         1231667         1172438
webp: Quality 100, Lossless, Highest Compression      50.642          45.089          45.901
svt-av1: Preset 4 - Bosphorus 4K                      4.082           4.005           4.483
astcenc: Thorough                                     6.4222          5.9580          5.7378
svt-vp9: PSNR/SSIM Optimized - Bosphorus 1080p        330.00          362.42          367.81
encode-mp3: WAV To MP3                                8.988           9.986           9.782
webp: Quality 100, Highest Compression                8.798           8.585           8.026
lczero: BLAS                                          4159            4224            4551
svt-av1: Preset 12 - Bosphorus 4K                     140.461         129.325         132.782
tjbench: Decompression Throughput                     163.486990      150.720021      160.291046
avifenc: 6                                            5.020           4.911           4.635
webp: Quality 100, Lossless                           24.561          23.868          22.759
jpegxl-decode: All                                    564.28          605.18          599.49
toktx: Zstd Compression 19                            21.441          22.989          21.813
svt-hevc: 7 - Bosphorus 1080p                         288.97          303.09          308.11
lammps: Rhodopsin Protein                             28.468          29.947          30.353
avifenc: 6, Lossless                                  8.318           7.976           7.822
svt-av1: Preset 10 - Bosphorus 4K                     107.652         104.559         110.839
xmrig: Monero - 1M                                    41100.5         40338.2         38790.7
quantlib                                              2125.3          2240.3          2251.4
astcenc: Exhaustive                                   5.8792          5.7737          5.5686
avifenc: 2                                            48.775          47.512          46.271
draco: Lion                                           5952            6176            5859
kvazaar: Bosphorus 4K - Medium                        31.52           33.21           32.92
webp: Quality 100                                     2.906           2.871           2.759
aobench: 2048 x 2048 - Total Time                     49.744          50.303          47.941
avifenc: 0                                            88.757          87.251          84.815
toktx: UASTC 3 + Zstd Compression 19                  9.015           9.389           9.071
kvazaar: Bosphorus 4K - Very Fast                     43.64           45.28           45.36
openjpeg: NASA Curiosity Panorama M34                 362483          349040          352848
toktx: UASTC 3                                        4.604           4.738           4.684
webp: Default                                         1.695           1.701           1.660
liquid-dsp: 128 - 256 - 57                            5825600000      5867933333      5949000000
primesieve: 1e12 Prime Number Generation              2.613           2.611           2.564
encode-flac: WAV To FLAC                              21.516          21.834          21.804
toktx: Zstd Compression 9                             3.839           3.886           3.832
lammps: 20k Atoms                                     35.972          36.010          36.316
toktx: UASTC 4 + Zstd Compression 19                  35.583          35.531          35.386
openssl: RSA4096 (verify/s)                           1770209.9       1774479.0       1768233.2
compress-zstd: 19 - Compression Speed                 98.8            99.0            98.7
openssl: RSA4096 (sign/s)                             26996.8         26924.9         26972.4
graphics-magick: Resizing                             164             93              268
graphics-magick: Rotate                               537             450             616
jpegxl: JPEG - 7                                      71.13           78.02           82.75
compress-zstd: 19 - Decompression Speed               2269.9          2086.0          2311.1
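The overview mixes units and directions (some tests are "More Is Better", others "Fewer Is Better"), so raw numbers cannot be compared across rows. A minimal sketch of one common way to normalise them follows: express each result as a speed ratio against GCC 11.2 and summarise with a geometric mean. The five tests are an arbitrary subset picked for illustration, the direction flags are taken from the per-test sections, and the resulting mean is not a figure reported by the result file.

```python
import math

# (test, higher_is_better, GCC 11.2, Clang 14.0, AMD AOCC 3.2)
# Values copied from the overview; the subset itself is an arbitrary choice.
RESULTS = [
    ("etcpak: DXT1", True, 844.045, 1926.467, 2012.565),
    ("coremark", True, 4447513.211801, 3705497.015609, 4242648.798194),
    ("webp: Quality 100, Lossless", False, 24.561, 23.868, 22.759),
    ("encode-mp3: WAV To MP3", False, 8.988, 9.986, 9.782),
    ("graphics-magick: Resizing", True, 164, 93, 268),
]


def relative_to_gcc(higher_is_better, gcc, other):
    """Speed ratio versus GCC; a value above 1.0 means faster than GCC."""
    return other / gcc if higher_is_better else gcc / other


def geometric_mean(ratios):
    """Geometric mean, the usual summary for ratios of unlike benchmarks."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


clang = [relative_to_gcc(h, g, c) for _, h, g, c, _ in RESULTS]
aocc = [relative_to_gcc(h, g, a) for _, h, g, _, a in RESULTS]

for (name, *_), c, a in zip(RESULTS, clang, aocc):
    print(f"{name:30s} Clang {c:.3f}x   AOCC {a:.3f}x")
print(f"geometric mean over {len(RESULTS)} tests: "
      f"Clang {geometric_mean(clang):.3f}x   AOCC {geometric_mean(aocc):.3f}x")
```

Inverting the ratio for time-based tests keeps "above 1.0 is faster" true for every row, which is what allows the ratios to be averaged together.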
Etcpak 0.7, Configuration: DXT1 (Mpx/s, More Is Better)
    GCC 11.2:     844.05 (SE +/- 2.54, N = 3)
    Clang 14.0:   1926.47 (SE +/- 29.34, N = 15)
    AMD AOCC 3.2: 2012.57 (SE +/- 23.71, N = 15)
    1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
LeelaChessZero 0.28, Backend: Eigen (Nodes Per Second, More Is Better)
    GCC 11.2:     4187 (SE +/- 48.22, N = 3)
    Clang 14.0:   5107 (SE +/- 57.43, N = 9)
    AMD AOCC 3.2: 5570 (SE +/- 67.86, N = 9)
    1. (CXX) g++ options: -flto -O3 -march=native -pthread
x265 3.4, Video Input: Bosphorus 4K (Frames Per Second, More Is Better)
    GCC 11.2:     19.37 (SE +/- 0.21, N = 15)
    Clang 14.0:   21.31 (SE +/- 0.21, N = 15)
    AMD AOCC 3.2: 25.03 (SE +/- 0.18, N = 3)
    1. (CXX) g++ options: -O3 -march=native -flto -rdynamic -lpthread -lrt -ldl -lnuma
SVT-HEVC 1.5.0, Tuning: 10 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
    GCC 11.2:     376.34 (SE +/- 3.15, N = 13)
    Clang 14.0:   437.24 (SE +/- 5.37, N = 3)
    AMD AOCC 3.2: 472.24 (SE +/- 3.33, N = 3)
    1. (CC) gcc options: -O3 -march=native -flto -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt
Etcpak 0.7, Configuration: ETC2 (Mpx/s, More Is Better)
    GCC 11.2:     134.04 (SE +/- 1.46, N = 3)
    Clang 14.0:   166.39 (SE +/- 0.04, N = 3)
    AMD AOCC 3.2: 158.35 (SE +/- 1.75, N = 4)
    1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread
JPEG XL Decoding libjxl 0.6.1, CPU Threads: 1 (MP/s, More Is Better)
    GCC 11.2:     46.70 (SE +/- 0.08, N = 3)
    Clang 14.0:   53.78 (SE +/- 0.08, N = 3)
    AMD AOCC 3.2: 57.14 (SE +/- 0.15, N = 3)
Coremark 1.0, CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better)
    GCC 11.2:     4447513.21 (SE +/- 9967.53, N = 3)
    Clang 14.0:   3705497.02 (SE +/- 34949.41, N = 3)
    AMD AOCC 3.2: 4242648.80 (SE +/- 14710.43, N = 3)
    1. (CC) gcc options: -O2 -O3 -march=native -flto -lrt" -lrt
JPEG XL libjxl 0.6.1, Input: PNG - Encode Speed: 8 (MP/s, More Is Better)
    GCC 11.2:     0.72 (SE +/- 0.00, N = 3)
    Clang 14.0:   0.67 (SE +/- 0.00, N = 3)
    AMD AOCC 3.2: 0.79 (SE +/- 0.00, N = 3)
    Per-result options: -Xclang -mrelax-all -Xclang -mrelax-all
    1. (CXX) g++ options: -O3 -march=native -flto -funwind-tables -O2 -fPIE -pie
Liquid-DSP 2021.01.31, Threads: 256 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
    GCC 11.2:     6127133333 (SE +/- 1604507.54, N = 3)
    Clang 14.0:   6229033333 (SE +/- 1414606.34, N = 3)
    AMD AOCC 3.2: 7010600000 (SE +/- 1311487.70, N = 3)
    1. (CC) gcc options: -O3 -march=native -flto -pthread -lm -lc -lliquid
Xmrig 6.12.1, Variant: Wownero - Hash Count: 1M (H/s, More Is Better)
    GCC 11.2:     42309.7 (SE +/- 63.43, N = 3)
    Clang 14.0:   40922.7 (SE +/- 526.98, N = 3)
    AMD AOCC 3.2: 36998.0 (SE +/- 352.95, N = 3)
    Per-result options: -static-libgcc -static-libstdc++ -funroll-loops -funroll-loops
    1. (CXX) g++ options: -O3 -march=native -flto -fexceptions -fno-rtti -maes -Ofast -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
OpenSSL 3.0, Algorithm: SHA256 (byte/s, More Is Better)
    GCC 11.2:     156404481030 (SE +/- 330832787.20, N = 3)
    Clang 14.0:   170350109150 (SE +/- 326147908.01, N = 3)
    AMD AOCC 3.2: 176385148413 (SE +/- 144731144.77, N = 3)
    Per-result options: -Qunused-arguments -Qunused-arguments
    1. (CC) gcc options: -pthread -m64 -O3 -march=native -flto -lssl -lcrypto -ldl
TSCP 1.81, AI Chess Performance (Nodes Per Second, More Is Better)
    GCC 11.2:     1094141 (SE +/- 2612.05, N = 5)
    Clang 14.0:   1231667 (SE +/- 4052.15, N = 5)
    AMD AOCC 3.2: 1172438 (SE +/- 4599.34, N = 5)
    1. (CC) gcc options: -O3 -march=native -flto
WebP Image Encode 1.1, Encode Settings: Quality 100, Lossless, Highest Compression (Encode Time - Seconds, Fewer Is Better)
    GCC 11.2:     50.64 (SE +/- 0.51, N = 5)
    Clang 14.0:   45.09 (SE +/- 0.11, N = 3)
    AMD AOCC 3.2: 45.90 (SE +/- 0.02, N = 3)
    1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -flto -lm -lpng16 -ljpeg
SVT-AV1 0.9, Encoder Mode: Preset 4 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
    GCC 11.2:     4.082 (SE +/- 0.036, N = 3)
    Clang 14.0:   4.005 (SE +/- 0.015, N = 3)
    AMD AOCC 3.2: 4.483 (SE +/- 0.052, N = 3)
    1. (CXX) g++ options: -O3 -march=native -flto -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie
ASTC Encoder 3.2, Preset: Thorough (Seconds, Fewer Is Better)
    GCC 11.2:     6.4222 (SE +/- 0.0328, N = 3)
    Clang 14.0:   5.9580 (SE +/- 0.0253, N = 3)
    AMD AOCC 3.2: 5.7378 (SE +/- 0.0468, N = 15)
    1. (CXX) g++ options: -O3 -march=native -flto -pthread
SVT-VP9 0.3, Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
    GCC 11.2:     330.00 (SE +/- 1.46, N = 3)
    Clang 14.0:   362.42 (SE +/- 3.50, N = 3)
    AMD AOCC 3.2: 367.81 (SE +/- 0.54, N = 3)
    1. (CC) gcc options: -O3 -fcommon -march=native -flto -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
LAME MP3 Encoding 3.100, WAV To MP3 (Seconds, Fewer Is Better)
    GCC 11.2:     8.988 (SE +/- 0.024, N = 3)
    Clang 14.0:   9.986 (SE +/- 0.017, N = 3)
    AMD AOCC 3.2: 9.782 (SE +/- 0.020, N = 3)
    Per-result options: -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr
    1. (CC) gcc options: -O3 -pipe -march=native -flto -lncurses -lm
WebP Image Encode 1.1, Encode Settings: Quality 100, Highest Compression (Encode Time - Seconds, Fewer Is Better)
    GCC 11.2:     8.798 (SE +/- 0.004, N = 3)
    Clang 14.0:   8.585 (SE +/- 0.013, N = 3)
    AMD AOCC 3.2: 8.026 (SE +/- 0.005, N = 3)
    1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -flto -lm -lpng16 -ljpeg
LeelaChessZero 0.28, Backend: BLAS (Nodes Per Second, More Is Better)
    GCC 11.2:     4159 (SE +/- 38.97, N = 3)
    Clang 14.0:   4224 (SE +/- 23.81, N = 3)
    AMD AOCC 3.2: 4551 (SE +/- 44.95, N = 9)
    1. (CXX) g++ options: -flto -O3 -march=native -pthread
SVT-AV1 0.9, Encoder Mode: Preset 12 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
    GCC 11.2:     140.46 (SE +/- 1.52, N = 5)
    Clang 14.0:   129.33 (SE +/- 0.12, N = 3)
    AMD AOCC 3.2: 132.78 (SE +/- 1.72, N = 3)
    1. (CXX) g++ options: -O3 -march=native -flto -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie
libjpeg-turbo tjbench 2.1.0, Test: Decompression Throughput (Megapixels/sec, More Is Better)
    GCC 11.2:     163.49 (SE +/- 0.17, N = 3)
    Clang 14.0:   150.72 (SE +/- 0.18, N = 3)
    AMD AOCC 3.2: 160.29 (SE +/- 0.62, N = 3)
    1. (CC) gcc options: -O3 -march=native -flto -rdynamic -lm
libavif avifenc 0.10, Encoder Speed: 6 (Seconds, Fewer Is Better)
    GCC 11.2:     5.020 (SE +/- 0.046, N = 7)
    Clang 14.0:   4.911 (SE +/- 0.057, N = 4)
    AMD AOCC 3.2: 4.635 (SE +/- 0.047, N = 3)
    1. (CXX) g++ options: -O3 -fPIC -march=native -flto -lm
WebP Image Encode 1.1, Encode Settings: Quality 100, Lossless (Encode Time - Seconds, Fewer Is Better)
    GCC 11.2:     24.56 (SE +/- 0.22, N = 3)
    Clang 14.0:   23.87 (SE +/- 0.22, N = 15)
    AMD AOCC 3.2: 22.76 (SE +/- 0.01, N = 3)
    1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -flto -lm -lpng16 -ljpeg
JPEG XL Decoding libjxl 0.6.1, CPU Threads: All (MP/s, More Is Better)
    GCC 11.2:     564.28 (SE +/- 7.30, N = 3)
    Clang 14.0:   605.18 (SE +/- 1.33, N = 3)
    AMD AOCC 3.2: 599.49 (SE +/- 6.36, N = 3)
KTX-Software toktx 4.0, Settings: Zstd Compression 19 (Seconds, Fewer Is Better)
    GCC 11.2:     21.44 (SE +/- 0.03, N = 3)
    Clang 14.0:   22.99 (SE +/- 0.08, N = 3)
    AMD AOCC 3.2: 21.81 (SE +/- 0.06, N = 3)
SVT-HEVC 1.5.0, Tuning: 7 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
    GCC 11.2:     288.97 (SE +/- 2.37, N = 3)
    Clang 14.0:   303.09 (SE +/- 1.03, N = 3)
    AMD AOCC 3.2: 308.11 (SE +/- 2.88, N = 3)
    1. (CC) gcc options: -O3 -march=native -flto -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt
LAMMPS Molecular Dynamics Simulator 29Oct2020, Model: Rhodopsin Protein (ns/day, More Is Better)
    GCC 11.2:     28.47 (SE +/- 0.40, N = 15)
    Clang 14.0:   29.95 (SE +/- 0.30, N = 15)
    AMD AOCC 3.2: 30.35 (SE +/- 0.44, N = 15)
    1. (CXX) g++ options: -O3 -march=native -flto -lm
libavif avifenc 0.10, Encoder Speed: 6, Lossless (Seconds, Fewer Is Better)
    GCC 11.2:     8.318 (SE +/- 0.057, N = 3)
    Clang 14.0:   7.976 (SE +/- 0.038, N = 3)
    AMD AOCC 3.2: 7.822 (SE +/- 0.109, N = 3)
    1. (CXX) g++ options: -O3 -fPIC -march=native -flto -lm
SVT-AV1 0.9, Encoder Mode: Preset 10 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
    GCC 11.2:     107.65 (SE +/- 0.35, N = 3)
    Clang 14.0:   104.56 (SE +/- 0.85, N = 3)
    AMD AOCC 3.2: 110.84 (SE +/- 0.22, N = 3)
    1. (CXX) g++ options: -O3 -march=native -flto -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie
Xmrig 6.12.1, Variant: Monero - Hash Count: 1M (H/s, More Is Better)
    GCC 11.2:     41100.5 (SE +/- 47.93, N = 3)
    Clang 14.0:   40338.2 (SE +/- 142.04, N = 3)
    AMD AOCC 3.2: 38790.7 (SE +/- 276.55, N = 3)
    Per-result options: -static-libgcc -static-libstdc++ -funroll-loops -funroll-loops
    1. (CXX) g++ options: -O3 -march=native -flto -fexceptions -fno-rtti -maes -Ofast -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
QuantLib 1.21 (MFLOPS, More Is Better)
    GCC 11.2:     2125.3 (SE +/- 9.30, N = 3)
    Clang 14.0:   2240.3 (SE +/- 5.25, N = 3)
    AMD AOCC 3.2: 2251.4 (SE +/- 14.43, N = 3)
    1. (CXX) g++ options: -O3 -march=native -rdynamic
ASTC Encoder 3.2, Preset: Exhaustive (Seconds, Fewer Is Better)
    GCC 11.2:     5.8792 (SE +/- 0.0090, N = 3)
    Clang 14.0:   5.7737 (SE +/- 0.0041, N = 3)
    AMD AOCC 3.2: 5.5686 (SE +/- 0.0070, N = 3)
    1. (CXX) g++ options: -O3 -march=native -flto -pthread
libavif avifenc 0.10, Encoder Speed: 2 (Seconds, Fewer Is Better)
    GCC 11.2:     48.78 (SE +/- 0.13, N = 3)
    Clang 14.0:   47.51 (SE +/- 0.27, N = 3)
    AMD AOCC 3.2: 46.27 (SE +/- 0.17, N = 3)
    1. (CXX) g++ options: -O3 -fPIC -march=native -flto -lm
Google Draco 1.5.0, Model: Lion (ms, Fewer Is Better)
    GCC 11.2:     5952 (SE +/- 7.94, N = 3)
    Clang 14.0:   6176 (SE +/- 17.33, N = 3)
    AMD AOCC 3.2: 5859 (SE +/- 8.33, N = 3)
    1. (CXX) g++ options: -O3 -march=native -flto
Kvazaar 2.1, Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better)
    GCC 11.2:     31.52 (SE +/- 0.17, N = 3)
    Clang 14.0:   33.21 (SE +/- 0.03, N = 3)
    AMD AOCC 3.2: 32.92 (SE +/- 0.20, N = 3)
    Per-result options: -lpthread
    1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -flto -lm -lrt
WebP Image Encode 1.1, Encode Settings: Quality 100 (Encode Time - Seconds, Fewer Is Better)
    GCC 11.2:     2.906 (SE +/- 0.029, N = 3)
    Clang 14.0:   2.871 (SE +/- 0.029, N = 15)
    AMD AOCC 3.2: 2.759 (SE +/- 0.004, N = 3)
    1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -flto -lm -lpng16 -ljpeg
AOBench, Size: 2048 x 2048 - Total Time (Seconds, Fewer Is Better)
    GCC 11.2:     49.74 (SE +/- 0.13, N = 3)
    Clang 14.0:   50.30 (SE +/- 0.25, N = 3)
    AMD AOCC 3.2: 47.94 (SE +/- 0.41, N = 3)
    1. (CC) gcc options: -lm -O3 -march=native -flto
libavif avifenc 0.10, Encoder Speed: 0 (Seconds, Fewer Is Better)
    GCC 11.2:     88.76 (SE +/- 0.43, N = 3)
    Clang 14.0:   87.25 (SE +/- 0.37, N = 3)
    AMD AOCC 3.2: 84.82 (SE +/- 0.48, N = 3)
    1. (CXX) g++ options: -O3 -fPIC -march=native -flto -lm
KTX-Software toktx 4.0, Settings: UASTC 3 + Zstd Compression 19 (Seconds, Fewer Is Better)
    GCC 11.2:     9.015 (SE +/- 0.100, N = 3)
    Clang 14.0:   9.389 (SE +/- 0.066, N = 15)
    AMD AOCC 3.2: 9.071 (SE +/- 0.024, N = 3)
Kvazaar 2.1, Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better)
    GCC 11.2:     43.64 (SE +/- 0.56, N = 3)
    Clang 14.0:   45.28 (SE +/- 0.50, N = 15)
    AMD AOCC 3.2: 45.36 (SE +/- 0.64, N = 15)
    Per-result options: -lpthread
    1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -flto -lm -lrt
OpenJPEG 2.4, Encode: NASA Curiosity Panorama M34 (ms, Fewer Is Better)
    GCC 11.2:     362483 (SE +/- 3417.00, N = 3)
    Clang 14.0:   349040 (SE +/- 935.91, N = 3)
    AMD AOCC 3.2: 352848 (SE +/- 3533.23, N = 15)
    1. (CXX) g++ options: -O3 -march=native -flto -rdynamic
KTX-Software toktx 4.0, Settings: UASTC 3 (Seconds, Fewer Is Better)
    GCC 11.2:     4.604 (SE +/- 0.054, N = 3)
    Clang 14.0:   4.738 (SE +/- 0.045, N = 15)
    AMD AOCC 3.2: 4.684 (SE +/- 0.059, N = 3)
WebP Image Encode 1.1, Encode Settings: Default (Encode Time - Seconds, Fewer Is Better)
    GCC 11.2:     1.695 (SE +/- 0.001, N = 3)
    Clang 14.0:   1.701 (SE +/- 0.015, N = 15)
    AMD AOCC 3.2: 1.660 (SE +/- 0.003, N = 3)
    1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -flto -lm -lpng16 -ljpeg
Liquid-DSP 2021.01.31, Threads: 128 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
    GCC 11.2:     5825600000 (SE +/- 7813023.32, N = 3)
    Clang 14.0:   5867933333 (SE +/- 2434018.17, N = 3)
    AMD AOCC 3.2: 5949000000 (SE +/- 3257811.13, N = 3)
    1. (CC) gcc options: -O3 -march=native -flto -pthread -lm -lc -lliquid
Primesieve 7.7, 1e12 Prime Number Generation (Seconds, Fewer Is Better)
    GCC 11.2:     2.613 (SE +/- 0.029, N = 5)
    Clang 14.0:   2.611 (SE +/- 0.030, N = 15)
    AMD AOCC 3.2: 2.564 (SE +/- 0.028, N = 5)
    1. (CXX) g++ options: -O3 -march=native -flto
FLAC Audio Encoding 1.3.3, WAV To FLAC (Seconds, Fewer Is Better)
    GCC 11.2:     21.52 (SE +/- 0.02, N = 5)
    Clang 14.0:   21.83 (SE +/- 0.07, N = 5)
    AMD AOCC 3.2: 21.80 (SE +/- 0.04, N = 5)
    Per-result options: -fvisibility=hidden
    1. (CXX) g++ options: -O3 -march=native -flto -logg -lm
KTX-Software toktx 4.0, Settings: Zstd Compression 9 (Seconds, Fewer Is Better)
    GCC 11.2:     3.839 (SE +/- 0.052, N = 3)
    Clang 14.0:   3.886 (SE +/- 0.036, N = 3)
    AMD AOCC 3.2: 3.832 (SE +/- 0.021, N = 3)
LAMMPS Molecular Dynamics Simulator 29Oct2020, Model: 20k Atoms (ns/day, More Is Better)
    GCC 11.2:     35.97 (SE +/- 0.02, N = 3)
    Clang 14.0:   36.01 (SE +/- 0.14, N = 3)
    AMD AOCC 3.2: 36.32 (SE +/- 0.19, N = 3)
    1. (CXX) g++ options: -O3 -march=native -flto -lm
KTX-Software toktx 4.0, Settings: UASTC 4 + Zstd Compression 19 (Seconds, Fewer Is Better)
    GCC 11.2:     35.58 (SE +/- 0.46, N = 3)
    Clang 14.0:   35.53 (SE +/- 0.09, N = 3)
    AMD AOCC 3.2: 35.39 (SE +/- 0.17, N = 3)
OpenSSL 3.0, Algorithm: RSA4096 (verify/s, More Is Better)
    GCC 11.2:     1770209.9 (SE +/- 368.08, N = 3)
    Clang 14.0:   1774479.0 (SE +/- 686.16, N = 3)
    AMD AOCC 3.2: 1768233.2 (SE +/- 526.74, N = 3)
    Per-result options: -Qunused-arguments -Qunused-arguments
    1. (CC) gcc options: -pthread -m64 -O3 -march=native -flto -lssl -lcrypto -ldl
Zstd Compression 1.5.0, Compression Level: 19 - Compression Speed (MB/s, More Is Better)
    GCC 11.2:     98.8 (SE +/- 1.34, N = 3)
    Clang 14.0:   99.0 (SE +/- 1.18, N = 4)
    AMD AOCC 3.2: 98.7 (SE +/- 1.21, N = 3)
    1. (CC) gcc options: -O3 -march=native -flto -pthread -lz -llzma
OpenSSL 3.0, Algorithm: RSA4096 (sign/s, More Is Better)
    GCC 11.2:     26996.8 (SE +/- 8.41, N = 3)
    Clang 14.0:   26924.9 (SE +/- 26.21, N = 3)
    AMD AOCC 3.2: 26972.4 (SE +/- 9.62, N = 3)
    Per-result options: -Qunused-arguments -Qunused-arguments
    1. (CC) gcc options: -pthread -m64 -O3 -march=native -flto -lssl -lcrypto -ldl
GraphicsMagick 1.3.33, Operation: Resizing (Iterations Per Minute, More Is Better)
    GCC 11.2:     164 (SE +/- 10.44, N = 15)
    Clang 14.0:   93 (SE +/- 1.33, N = 3)
    AMD AOCC 3.2: 268 (SE +/- 3.79, N = 3)
    1. (CC) gcc options: -fopenmp -O3 -march=native -flto -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick 1.3.33, Operation: Rotate (Iterations Per Minute, More Is Better)
    GCC 11.2:     537 (SE +/- 7.17, N = 3)
    Clang 14.0:   450 (SE +/- 8.14, N = 15)
    AMD AOCC 3.2: 616 (SE +/- 3.76, N = 3)
    1. (CC) gcc options: -fopenmp -O3 -march=native -flto -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
JPEG XL libjxl 0.6.1, Input: JPEG - Encode Speed: 7 (MP/s, More Is Better)
    GCC 11.2:     71.13 (SE +/- 1.50, N = 15)
    Clang 14.0:   78.02 (SE +/- 1.11, N = 3)
    AMD AOCC 3.2: 82.75 (SE +/- 0.90, N = 4)
    Per-result options: -Xclang -mrelax-all -Xclang -mrelax-all
    1. (CXX) g++ options: -O3 -march=native -flto -funwind-tables -O2 -fPIE -pie
Zstd Compression 1.5.0, Compression Level: 19 - Decompression Speed (MB/s, More Is Better)
    GCC 11.2:     2269.9 (SE +/- 4.90, N = 3)
    Clang 14.0:   2086.0 (SE +/- 64.15, N = 4)
    AMD AOCC 3.2: 2311.1 (SE +/- 6.09, N = 3)
    1. (CC) gcc options: -O3 -march=native -flto -pthread -lz -llzma
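Every result above carries an "SE +/-" figure and a run count N, which helps judge whether a gap between two compilers is larger than run-to-run noise. The sketch below is one rough way to do that, assuming the reported SE is the standard error of the mean and that the runs are independent: divide the difference of two means by the standard error of that difference. The helper name `diff_in_se_units` is hypothetical, and the "about 2 or more" threshold is only a rule of thumb, not something the result file states.

```python
import math


def diff_in_se_units(mean_a, se_a, mean_b, se_b):
    """(mean_b - mean_a) divided by the standard error of the difference.

    Assumes independent runs, so the two standard errors add in quadrature.
    """
    return (mean_b - mean_a) / math.sqrt(se_a ** 2 + se_b ** 2)


# Etcpak DXT1 (Mpx/s): (mean, SE) per compiler, copied from the result above.
gcc = (844.05, 2.54)
clang = (1926.47, 29.34)
aocc = (2012.57, 23.71)

# OpenSSL RSA4096 sign/s: a case where the compilers are nearly tied.
rsa_gcc = (26996.8, 8.41)
rsa_clang = (26924.9, 26.21)

print("etcpak DXT1, Clang vs GCC :", round(diff_in_se_units(*gcc, *clang), 1))
print("etcpak DXT1, AOCC vs Clang:", round(diff_in_se_units(*clang, *aocc), 1))
print("RSA4096 sign, Clang vs GCC:", round(diff_in_se_units(*rsa_gcc, *rsa_clang), 1))
```

Magnitudes well above 2 suggest a real difference under these assumptions; values near or below 2 are hard to distinguish from noise, particularly with N = 3.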
Phoronix Test Suite v10.8.4