AMD EPYC 9654 GCC 13 development compiler benchmarks by Michael Larabel for a future article.
Znver4
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native -flto" CFLAGS="-O3 -march=native -flto"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: amd-pstate performance (Boost: Enabled) - CPU Microcode: 0xa10110d
Python Notes: Python 3.10.9
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Znver4 + Prefer AVX-512
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native -mprefer-vector-width=512 -flto" CFLAGS="-O3 -march=native -mprefer-vector-width=512 -flto"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: amd-pstate performance (Boost: Enabled) - CPU Microcode: 0xa10110d
Python Notes: Python 3.10.9
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Znver3
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=znver3 -flto" CFLAGS="-O3 -march=znver3 -flto"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: amd-pstate performance (Boost: Enabled) - CPU Microcode: 0xa10110d
Python Notes: Python 3.10.9
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Znver3 + AVX-512
Processor: 2 x AMD EPYC 9654 96-Core @ 3.71GHz (192 Cores / 384 Threads)
Motherboard: AMD Titanite_4G (RTI1002E BIOS)
Chipset: AMD Device 14a4
Memory: 1520GB
Disk: 800GB INTEL SSDPF21Q800GB
Graphics: ASPEED
Monitor: VGA HDMI
Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 23.04
Kernel: 5.19.0-21-generic (x86_64)
Desktop: GNOME Shell 43.1
Display Server: X Server 1.21.1.4
Vulkan: 1.3.224
Compiler: GCC 13.0.0 20230103
File-System: ext4
Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -flto" CFLAGS="-O3 -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -flto"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: amd-pstate performance (Boost: Enabled) - CPU Microcode: 0xa10110d
Python Notes: Python 3.10.9
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
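The four configurations differ only in the CFLAGS/CXXFLAGS strings recorded in their Environment Notes. As a rough sketch, that mapping can be written down in Python. The names `CONFIG_FLAGS`, `AVX512_EXTRAS`, and `build_env` are hypothetical helpers introduced for illustration; they are not part of the Phoronix Test Suite, and nothing here claims to reproduce how the benchmark runs were actually launched.

```python
import os

# AVX-512 feature flags added on top of -march=znver3 in the
# "Znver3 + AVX-512" configuration (copied from the Environment Notes).
AVX512_EXTRAS = (
    "-mavx512f -mavx512cd -mavx512vl -mavx512bw "
    "-mavx512dq -mavx512ifma -mavx512vbmi"
)

# One flag string per configuration; the notes use the same string
# for both CFLAGS and CXXFLAGS.
CONFIG_FLAGS = {
    "Znver4": "-O3 -march=native -flto",
    "Znver4 + Prefer AVX-512": "-O3 -march=native -mprefer-vector-width=512 -flto",
    "Znver3": "-O3 -march=znver3 -flto",
    "Znver3 + AVX-512": f"-O3 -march=znver3 {AVX512_EXTRAS} -flto",
}


def build_env(config_name, base_env=None):
    """Return a copy of the environment with CFLAGS and CXXFLAGS set.

    Hypothetical helper: base_env defaults to the current process environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    flags = CONFIG_FLAGS[config_name]
    env["CFLAGS"] = flags
    env["CXXFLAGS"] = flags
    return env


for name in CONFIG_FLAGS:
    print(f"{name}: CFLAGS={build_env(name, {})['CFLAGS']!r}")
```

A dictionary like the one returned by `build_env` could be passed as the `env` argument of `subprocess.run` when launching a build, which keeps the flag sets in one place instead of scattered across shell sessions.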
GCC 13 Compiler Benchmarks AMD EPYC Genoa (OpenBenchmarking.org, Phoronix Test Suite): system details table and Performance System Logs. The values repeat the system details and the per-configuration notes listed above.
Result Overview (Phoronix Test Suite): relative performance of Znver4, Znver4 + Prefer AVX-512, Znver3, and Znver3 + AVX-512 on an axis from 100% to 127%, covering Cpuminer-Opt, CP2K Molecular Dynamics, GROMACS, Kripke, GraphicsMagick, ASTC Encoder, GPAW, SVT-AV1, WebP Image Encode, JPEG XL Decoding libjxl, Coremark, Stargate Digital Audio Workstation, libavif avifenc, Zstd Compression, Kvazaar, JPEG XL libjxl, simdjson, Ngspice, miniBUDE, Liquid-DSP, SecureMark, oneDNN, LAMMPS Molecular Dynamics Simulator, QuantLib, PJSIP, ACES DGEMM, OpenSSL, and SMHasher.
GCC 13 Compiler Benchmarks AMD EPYC Genoa (OpenBenchmarking.org)
Test | Znver4 | Znver4 + Prefer AVX-512 | Znver3 | Znver3 + AVX-512
quantlib | 3096.9 | 3112.6 | 3120.5 | 3114.9
minibude: OpenMP - BM1 (GFInst/s) | 5362.542 | 5275.426 | 5351.944 | 5365.612
minibude: OpenMP - BM1 (Billion Interactions/s) | 214.502 | 211.017 | 214.078 | 214.625
minibude: OpenMP - BM2 (GFInst/s) | 6603.684 | 6638.007 | 6607.091 | 6653.864
minibude: OpenMP - BM2 (Billion Interactions/s) | 264.147 | 265.520 | 264.284 | 266.155
cp2k: Fayalite-FIST | 1174.665 | 1263.707 | 1211.126 | 1407.125
smhasher: FarmHash32 x86_64 AVX (MiB/sec) | 40565.07 | 40563.96 | 40565.72 | 40559.33
smhasher: FarmHash32 x86_64 AVX (cycles/hash) | 26.494 | 26.503 | 26.494 | 26.494
smhasher: t1ha0_aes_avx2 x86_64 (MiB/sec) | 102354.57 | 102399.74 | 102351.87 | 102403.92
smhasher: t1ha0_aes_avx2 x86_64 (cycles/hash) | 20.524 | 20.806 | 20.533 | 20.527
smhasher: MeowHash x86_64 AES-NI (MiB/sec) | 54272.14 | 54297.04 | 54281.94 | 54284.65
smhasher: MeowHash x86_64 AES-NI (cycles/hash) | 44.977 | 44.950 | 44.957 | 44.968
lammps: 20k Atoms | 55.618 | 55.739 | 55.721 | 56.036
lammps: Rhodopsin Protein | 51.585 | 52.287 | 51.455 | 51.816
simdjson: Kostya | 4.17 | 4.19 | 4.20 | 4.16
simdjson: TopTweet | 6.96 | 6.91 | 6.61 | 6.73
simdjson: LargeRand | 1.27 | 1.24 | 1.25 | 1.25
simdjson: PartialTweets | 6.63 | 6.43 | 6.74 | 6.68
simdjson: DistinctUserID | 6.53 | 6.86 | 6.46 | 6.61
compress-zstd: 19 - Compression Speed | 102.1 | 105.3 | 104.4 | 102.9
compress-zstd: 19 - Decompression Speed | 3584.1 | 3581.6 | 3574.9 | 3594.8
compress-zstd: 19, Long Mode - Compression Speed | 40.8 | 42.5 | 39.8 | 43.9
compress-zstd: 19, Long Mode - Decompression Speed | 3708.8 | 3685.7 | 3695.5 | 3684.7
jpegxl: PNG - 90 | 9.83 | 9.81 | 9.61 | 9.86
jpegxl: JPEG - 90 | 9.48 | 9.43 | 9.18 | 9.27
jpegxl: PNG - 100 | 0.83 | 0.82 | 0.82 | 0.81
jpegxl: JPEG - 100 | 0.73 | 0.75 | 0.74 | 0.71
jpegxl-decode: 1 | 48.64 | 48.21 | 48.22 | 47.02
jpegxl-decode: All | 277.20 | 272.31 | 269.59 | 266.53
webp: Default | 18.95 | 18.97 | 18.99 | 18.85
webp: Quality 100 | 11.54 | 11.54 | 11.48 | 11.38
webp: Quality 100, Lossless | 1.45 | 1.47 | 1.47 | 1.45
webp: Quality 100, Highest Compression | 3.25 | 3.23 | 3.64 | 3.11
webp: Quality 100, Lossless, Highest Compression | 0.57 | 0.58 | 0.58 | 0.58
graphics-magick: Swirl | 2862 | 2681 | 2563 | 2826
graphics-magick: Rotate | 673 | 656 | 645 | 605
graphics-magick: Sharpen | 1359 | 1314 | 1285 | 1321
graphics-magick: Enhanced | 2234 | 2150 | 1837 | 2208
graphics-magick: Resizing | 88 | 87 | 89 | 86
graphics-magick: Noise-Gaussian | 1024 | 1013 | 1018 | 975
graphics-magick: HWB Color Space | 1180 | 1167 | 1134 | 1062
kvazaar: Bosphorus 4K - Medium | 64.35 | 64.05 | 64.27 | 63.80
kvazaar: Bosphorus 4K - Very Fast | 70.71 | 70.81 | 72.01 | 70.71
kvazaar: Bosphorus 4K - Ultra Fast | 78.69 | 74.98 | 78.01 | 76.76
svt-av1: Preset 4 - Bosphorus 4K | 5.392 | 5.423 | 5.360 | 5.374
svt-av1: Preset 8 - Bosphorus 4K | 93.756 | 94.340 | 95.116 | 92.964
svt-av1: Preset 12 - Bosphorus 4K | 210.386 | 196.804 | 222.902 | 206.676
svt-av1: Preset 13 - Bosphorus 4K | 196.115 | 208.643 | 219.570 | 210.426
mt-dgemm: Sustained Floating-Point Rate | 70.052952 | 70.298282 | 70.378065 | 70.186467
coremark: CoreMark Size 666 - Iterations Per Second | 7694653.899016 | 7861097.856640 | 7640546.272074 | 7871273.932762
stargate: 96000 - 1024 | 4.331230 | 4.408290 | 4.356446 | 4.413379
stargate: 192000 - 1024 | 2.783717 | 2.867233 | 2.820079 | 2.868373
pjsip: INVITE | 5200 | 5084 | 5149 | 5132
pjsip: OPTIONS, Stateful | 9237 | 9288 | 9226 | 9236
pjsip: OPTIONS, Stateless | 335767 | 336791 | 336885 | 336615
avifenc: 0 | 61.122 | 61.073 | 61.658 | 62.684
avifenc: 2 | 34.104 | 33.902 | 33.911 | 34.230
avifenc: 6 | 2.347 | 2.317 | 2.331 | 2.406
avifenc: 6, Lossless | 4.398 | 4.462 | 4.437 | 4.514
avifenc: 10, Lossless | 3.541 | 3.572 | 3.581 | 3.647
onednn: IP Shapes 1D - u8s8f32 - CPU | 14.8168 | 14.2928 | 14.6179 | 14.25021
onednn: IP Shapes 3D - u8s8f32 - CPU | 0.886642 | 0.881303 | 0.907941 | 0.902303
onednn: IP Shapes 1D - bf16bf16bf16 - CPU | 21.1359 | 23.5341 | 19.5885 | 22.9431
onednn: IP Shapes 3D - bf16bf16bf16 - CPU | 4.26600 | 4.33366 | 4.52440 | 4.33658
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 0.380486 | 0.388483 | 0.392765 | 0.392875
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 0.937164 | 0.95414 | 0.947013 | 0.936374
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 0.274769 | 0.274662 | 0.279655 | 0.276965
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 2020.24 | 2123.35 | 2108.46 | 2099.43
onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU | 0.443891 | 0.445532 | 0.442748 | 0.431299
onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU | 2.33363 | 2.29880 | 2.31858 | 2.36166
onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU | 0.658101 | 0.650067 | 0.656342 | 0.647087
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 2405.63 | 2359.29 | 2442.35 | 2377.17
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 2134.85 | 2070.13 | 2093.75 | 2103.41
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 2479.72 | 2444.52 | 2418.28 | 2531.42
onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU | 0.500579 | 0.483979 | 0.510939 | 0.504842
ngspice: C2670 | 95.066 | 95.599 | 95.175 | 95.186
ngspice: C7552 | 92.102 | 93.453 | 92.943 | 93.524
cpuminer-opt: Magi | 8440.79 | 8467.24 | 8490.73 | 8355.75
cpuminer-opt: x25x | 8042.88 | 8217.70 | 6116.97 | 7941.38
cpuminer-opt: scrypt | 4790.11 | 4782.74 | 2959.15 | 4763.91
cpuminer-opt: Deepcoin | 159147 | 162242 | 164993 | 160157
cpuminer-opt: Garlicoin | 72413 | 72130 | 49523 | 72837
cpuminer-opt: Skeincoin | 2014770 | 2009367 | 1414047 | 2004990
cpuminer-opt: LBC, LBRY Credits | 1065487 | 1085827 | 497020 | 1067743
cpuminer-opt: Quad SHA-256, Pyrite | 2251067 | 2323995 | 1378987 | 2264747
cpuminer-opt: Triple SHA-256, Onecoin | 3306643 | 3301217 | 3255253 | 3323680
securemark: SecureMark-TLS | 296548 | 294122 | 294057 | 296575
openssl: SHA256 | 265326713587 | 266230124070 | 266464361193 | 266899089453
openssl: RSA4096 | 44490.1 | 44301.8 | 44435.3 | 44499.3
openssl: RSA4096 | 2939503.5 | 2924586.4 | 2938488.5 | 2935372.1
liquid-dsp: 128 - 256 - 57 | 6990866667 | 6977633333 | 6940833333 | 6999233333
liquid-dsp: 256 - 256 - 57 | 9789500000 | 9809800000 | 9735666667 | 9813700000
liquid-dsp: 384 - 256 - 57 | 11249333333 | 11270666667 | 11176000000 | 11301666667
astcenc: Medium | 493.2248 | 420.6992 | 459.6924 | 511.5248
astcenc: Thorough | 118.7392 | 118.9932 | 116.0912 | 117.6910
astcenc: Exhaustive | 12.9412 | 13.0936 | 12.2221 | 13.0305
gromacs: MPI CPU - water_GMX50_bare | 19.493 | 19.437 | 18.231 | 19.087
gpaw: Carbon Nanotube | 22.350 | 22.819 | 21.721 | 22.823
kripke | 261562280 | 254648533 | 263812847 | 271735708
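To compare the four configurations across tests with very different units, one common approach is to normalize each row to a baseline column and then take a geometric mean. The sketch below does that for four rows from the table whose charts are labeled "More Is Better". The helper names (`relative_to_baseline`, `geometric_mean`, `summarize`) are hypothetical, the choice of Znver4 as baseline is an assumption, and this is not claimed to be how the Result Overview percentages on the page were computed. Fewer-is-better rows would need to be inverted before being mixed in.

```python
import math

# Column order used throughout, matching the results table.
CONFIGS = ["Znver4", "Znver4 + Prefer AVX-512", "Znver3", "Znver3 + AVX-512"]

# A few more-is-better rows copied from the results table.
RESULTS = {
    "quantlib": [3096.9, 3112.6, 3120.5, 3114.9],
    "graphics-magick: Enhanced": [2234, 2150, 1837, 2208],
    "webp: Quality 100, Highest Compression": [3.25, 3.23, 3.64, 3.11],
    "simdjson: DistinctUserID": [6.53, 6.86, 6.46, 6.61],
}


def relative_to_baseline(values, baseline_index=0):
    """Divide every value in a row by the baseline column's value."""
    base = values[baseline_index]
    return [v / base for v in values]


def geometric_mean(xs):
    """Geometric mean of positive numbers, computed via logarithms."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


def summarize(results, baseline_index=0):
    """Return one geometric-mean ratio per configuration."""
    rows = [relative_to_baseline(v, baseline_index) for v in results.values()]
    return [geometric_mean([row[i] for row in rows]) for i in range(len(CONFIGS))]


for name, score in zip(CONFIGS, summarize(RESULTS)):
    print(f"{name}: {score * 100:.1f}% of baseline")
```

The geometric mean is used rather than the arithmetic mean so that a 10% gain in one test and a 10% loss in another are weighted symmetrically, regardless of each test's absolute scale.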
QuantLib
QuantLib is an open-source library/framework for quantitative finance, covering modeling, trading, and risk management scenarios. QuantLib is written in C++ with Boost, and the built-in benchmark used here reports the QuantLib Benchmark Index score. Learn more via the OpenBenchmarking.org test page.

QuantLib 1.21 (MFLOPS, More Is Better)
Znver4: 3096.9 (SE +/- 1.99, N = 3)
Znver4 + Prefer AVX-512: 3112.6 (SE +/- 1.71, N = 3)
Znver3 + AVX-512: 3114.9 (SE +/- 2.74, N = 3)
Znver3: 3120.5 (SE +/- 4.52, N = 3)
(CXX) g++ options: -O3 -march=native -rdynamic
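Each result is reported with a standard error (SE) and a run count (N), which gives a quick way to judge whether a gap between two configurations stands out from run-to-run noise. The sketch below expresses the QuantLib gap between Znver4 and Znver3 in units of the combined standard error. It assumes the reported SE is the standard error of the mean and that the two sets of runs are independent; `difference_in_se_units` is a hypothetical helper, and with N = 3 this is only a rough heuristic, not a formal significance test.

```python
import math


def difference_in_se_units(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means divided by their combined SE.

    Assumes independent measurements, so the standard errors add in quadrature.
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


# QuantLib 1.21 values from the results above (MFLOPS, more is better).
znver4_mean, znver4_se = 3096.9, 1.99
znver3_mean, znver3_se = 3120.5, 4.52

ratio = difference_in_se_units(znver4_mean, znver4_se, znver3_mean, znver3_se)
gap_percent = (znver3_mean - znver4_mean) / znver4_mean * 100

print(f"gap: {gap_percent:.2f}% of the Znver4 score")
print(f"gap in combined-SE units: {ratio:.1f}")
```

For these numbers the gap is several combined standard errors wide yet under 1% of the score, which illustrates why a repeatable difference is not necessarily a practically large one.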
miniBUDE
MiniBUDE is a mini application for the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking.

miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM1 (GFInst/s, More Is Better)
Znver4 + Prefer AVX-512: 5275.43 (SE +/- 62.65, N = 4)
Znver3: 5351.94 (SE +/- 62.89, N = 3)
Znver4: 5362.54 (SE +/- 51.63, N = 3)
Znver3 + AVX-512: 5365.61 (SE +/- 53.26, N = 3)

miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM1 (Billion Interactions/s, More Is Better)
Znver4 + Prefer AVX-512: 211.02 (SE +/- 2.51, N = 4)
Znver3: 214.08 (SE +/- 2.52, N = 3)
Znver4: 214.50 (SE +/- 2.07, N = 3)
Znver3 + AVX-512: 214.63 (SE +/- 2.13, N = 3)

miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM2 (GFInst/s, More Is Better)
Znver4: 6603.68 (SE +/- 43.78, N = 3)
Znver3: 6607.09 (SE +/- 18.35, N = 3)
Znver4 + Prefer AVX-512: 6638.01 (SE +/- 18.27, N = 3)
Znver3 + AVX-512: 6653.86 (SE +/- 26.43, N = 3)

miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM2 (Billion Interactions/s, More Is Better)
Znver4: 264.15 (SE +/- 1.75, N = 3)
Znver3: 264.28 (SE +/- 0.73, N = 3)
Znver4 + Prefer AVX-512: 265.52 (SE +/- 0.73, N = 3)
Znver3 + AVX-512: 266.16 (SE +/- 1.06, N = 3)

Options for all four miniBUDE results: (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm
SMHasher
SMHasher is a hash function tester supporting various algorithms and able to make use of AVX and other modern CPU instruction set extensions.

SMHasher 2022-08-22, Hash: FarmHash32 x86_64 AVX (MiB/sec, More Is Better)
Znver3 + AVX-512: 40559.33 (SE +/- 1.84, N = 3)
Znver4 + Prefer AVX-512: 40563.96 (SE +/- 1.44, N = 3)
Znver4: 40565.07 (SE +/- 0.95, N = 3)
Znver3: 40565.72 (SE +/- 0.13, N = 3)
(CXX) g++ options: -O3 -flto -march=native -flto=auto -fno-fat-lto-objects

SMHasher 2022-08-22, Hash: FarmHash32 x86_64 AVX (cycles/hash, Fewer Is Better)
Znver4 + Prefer AVX-512: 26.50 (SE +/- 0.00, N = 3)
Znver3 + AVX-512: 26.49 (SE +/- 0.00, N = 3)
Znver3: 26.49 (SE +/- 0.00, N = 3)
Znver4: 26.49 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -march=native -flto -flto=auto -fno-fat-lto-objects

SMHasher 2022-08-22, Hash: t1ha0_aes_avx2 x86_64 (MiB/sec, More Is Better)
Znver3: 102351.87 (SE +/- 8.85, N = 3)
Znver4: 102354.57 (SE +/- 20.00, N = 3)
Znver4 + Prefer AVX-512: 102399.74 (SE +/- 11.55, N = 3)
Znver3 + AVX-512: 102403.92 (SE +/- 19.58, N = 3)
(CXX) g++ options: -O3 -flto -march=native -flto=auto -fno-fat-lto-objects

SMHasher 2022-08-22, Hash: t1ha0_aes_avx2 x86_64 (cycles/hash, Fewer Is Better)
Znver4 + Prefer AVX-512: 20.81 (SE +/- 0.00, N = 3)
Znver3: 20.53 (SE +/- 0.01, N = 3)
Znver3 + AVX-512: 20.53 (SE +/- 0.01, N = 3)
Znver4: 20.52 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -march=native -flto -flto=auto -fno-fat-lto-objects

SMHasher 2022-08-22, Hash: MeowHash x86_64 AES-NI (MiB/sec, More Is Better)
Znver4: 54272.14 (SE +/- 7.42, N = 3)
Znver3: 54281.94 (SE +/- 9.98, N = 3)
Znver3 + AVX-512: 54284.65 (SE +/- 10.52, N = 3)
Znver4 + Prefer AVX-512: 54297.04 (SE +/- 9.47, N = 3)
(CXX) g++ options: -O3 -march=native -flto -flto=auto -fno-fat-lto-objects

SMHasher 2022-08-22, Hash: MeowHash x86_64 AES-NI (cycles/hash, Fewer Is Better)
Znver4: 44.98 (SE +/- 0.01, N = 3)
Znver3 + AVX-512: 44.97 (SE +/- 0.00, N = 3)
Znver3: 44.96 (SE +/- 0.01, N = 3)
Znver4 + Prefer AVX-512: 44.95 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -march=native -flto -flto=auto -fno-fat-lto-objects

Per-configuration flags listed for all six SMHasher results: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
LAMMPS Molecular Dynamics Simulator
LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator.

LAMMPS Molecular Dynamics Simulator 23Jun2022, Model: 20k Atoms (ns/day, More Is Better)
Znver4: 55.62 (SE +/- 0.15, N = 3)
Znver3: 55.72 (SE +/- 0.04, N = 3)
Znver4 + Prefer AVX-512: 55.74 (SE +/- 0.21, N = 3)
Znver3 + AVX-512: 56.04 (SE +/- 0.08, N = 3)

LAMMPS Molecular Dynamics Simulator 23Jun2022, Model: Rhodopsin Protein (ns/day, More Is Better)
Znver3: 51.46 (SE +/- 0.43, N = 3)
Znver4: 51.59 (SE +/- 0.21, N = 3)
Znver3 + AVX-512: 51.82 (SE +/- 0.40, N = 10)
Znver4 + Prefer AVX-512: 52.29 (SE +/- 0.48, N = 3)

Per-configuration flags for both LAMMPS results: Znver4: -march=native; Znver4 + Prefer AVX-512: -march=native; Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
Shared options: (CXX) g++ options: -O3 -flto -lm -ldl
simdjson
This is a benchmark of SIMDJSON, a high performance JSON parser. SIMDJSON aims to be the fastest JSON parser and is used by projects like Microsoft FishStore, Yandex ClickHouse, Shopify, and others.

simdjson 2.0, Throughput Test: Kostya (GB/s, More Is Better)
Znver3 + AVX-512: 4.16 (SE +/- 0.01, N = 3)
Znver4: 4.17 (SE +/- 0.02, N = 3)
Znver4 + Prefer AVX-512: 4.19 (SE +/- 0.01, N = 3)
Znver3: 4.20 (SE +/- 0.01, N = 3)

simdjson 2.0, Throughput Test: TopTweet (GB/s, More Is Better)
Znver3: 6.61 (SE +/- 0.05, N = 3)
Znver3 + AVX-512: 6.73 (SE +/- 0.02, N = 3)
Znver4 + Prefer AVX-512: 6.91 (SE +/- 0.07, N = 3)
Znver4: 6.96 (SE +/- 0.05, N = 3)

simdjson 2.0, Throughput Test: LargeRandom (GB/s, More Is Better)
Znver4 + Prefer AVX-512: 1.24 (SE +/- 0.00, N = 3)
Znver3: 1.25 (SE +/- 0.00, N = 3)
Znver3 + AVX-512: 1.25 (SE +/- 0.00, N = 3)
Znver4: 1.27 (SE +/- 0.00, N = 3)

simdjson 2.0, Throughput Test: PartialTweets (GB/s, More Is Better)
Znver4 + Prefer AVX-512: 6.43 (SE +/- 0.05, N = 3)
Znver4: 6.63 (SE +/- 0.03, N = 3)
Znver3 + AVX-512: 6.68 (SE +/- 0.03, N = 3)
Znver3: 6.74 (SE +/- 0.03, N = 3)

simdjson 2.0, Throughput Test: DistinctUserID (GB/s, More Is Better)
Znver3: 6.46 (SE +/- 0.02, N = 3)
Znver4: 6.53 (SE +/- 0.04, N = 3)
Znver3 + AVX-512: 6.61 (SE +/- 0.04, N = 3)
Znver4 + Prefer AVX-512: 6.86 (SE +/- 0.07, N = 3)

Per-configuration flags for all five simdjson results: Znver4: -march=native; Znver4 + Prefer AVX-512: -march=native; Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
Shared options: (CXX) g++ options: -O3 -flto
Zstd Compression
This test measures the time needed to compress/decompress a sample file (a FreeBSD disk image, FreeBSD-12.2-RELEASE-amd64-memstick.img) using Zstd compression with options for different compression levels/settings.

Zstd Compression 1.5.0, Compression Level: 19 - Compression Speed (MB/s, More Is Better)
Znver4: 102.1 (SE +/- 1.03, N = 6)
Znver3 + AVX-512: 102.9 (SE +/- 1.27, N = 3)
Znver3: 104.4 (SE +/- 0.91, N = 15)
Znver4 + Prefer AVX-512: 105.3 (SE +/- 1.30, N = 3)

Zstd Compression 1.5.0, Compression Level: 19 - Decompression Speed (MB/s, More Is Better)
Znver3: 3574.9 (SE +/- 12.56, N = 15)
Znver4 + Prefer AVX-512: 3581.6 (SE +/- 36.60, N = 3)
Znver4: 3584.1 (SE +/- 17.00, N = 6)
Znver3 + AVX-512: 3594.8 (SE +/- 32.49, N = 3)

Zstd Compression 1.5.0, Compression Level: 19, Long Mode - Compression Speed (MB/s, More Is Better)
Znver3: 39.8 (SE +/- 0.40, N = 15)
Znver4: 40.8 (SE +/- 0.40, N = 15)
Znver4 + Prefer AVX-512: 42.5 (SE +/- 0.59, N = 15)
Znver3 + AVX-512: 43.9 (SE +/- 0.79, N = 15)

Zstd Compression 1.5.0, Compression Level: 19, Long Mode - Decompression Speed (MB/s, More Is Better)
Znver3 + AVX-512: 3684.7 (SE +/- 12.89, N = 15)
Znver4 + Prefer AVX-512: 3685.7 (SE +/- 13.90, N = 15)
Znver3: 3695.5 (SE +/- 11.99, N = 15)
Znver4: 3708.8 (SE +/- 12.59, N = 15)

Per-configuration flags for all four Zstd results: Znver4: -march=native; Znver4 + Prefer AVX-512: -march=native; Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
Shared options: (CC) gcc options: -O3 -flto -pthread -lz -llzma
JPEG XL libjxl
The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities, with JPEG XL offering better image quality and compression than legacy JPEG. This test profile is currently focused on multi-threaded JPEG XL image encode performance using the reference libjxl library.

JPEG XL libjxl 0.7, Input: PNG - Quality: 90 (MP/s, More Is Better)
Znver3: 9.61 (SE +/- 0.01, N = 3)
Znver4 + Prefer AVX-512: 9.81 (SE +/- 0.04, N = 3)
Znver4: 9.83 (SE +/- 0.03, N = 3)
Znver3 + AVX-512: 9.86 (SE +/- 0.07, N = 3)

JPEG XL libjxl 0.7, Input: JPEG - Quality: 90 (MP/s, More Is Better)
Znver3: 9.18 (SE +/- 0.03, N = 3)
Znver3 + AVX-512: 9.27 (SE +/- 0.13, N = 3)
Znver4 + Prefer AVX-512: 9.43 (SE +/- 0.03, N = 3)
Znver4: 9.48 (SE +/- 0.05, N = 3)

JPEG XL libjxl 0.7, Input: PNG - Quality: 100 (MP/s, More Is Better)
Znver3 + AVX-512: 0.81 (SE +/- 0.00, N = 3)
Znver4 + Prefer AVX-512: 0.82 (SE +/- 0.00, N = 3)
Znver3: 0.82 (SE +/- 0.00, N = 3)
Znver4: 0.83 (SE +/- 0.00, N = 3)

JPEG XL libjxl 0.7, Input: JPEG - Quality: 100 (MP/s, More Is Better)
Znver3 + AVX-512: 0.71 (SE +/- 0.01, N = 9)
Znver4: 0.73 (SE +/- 0.01, N = 9)
Znver3: 0.74 (SE +/- 0.01, N = 6)
Znver4 + Prefer AVX-512: 0.75 (SE +/- 0.01, N = 3)

Per-configuration flags for all four JPEG XL encode results: Znver4: -march=native; Znver4 + Prefer AVX-512: -march=native; Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
Shared options: (CXX) g++ options: -O3 -flto -fno-rtti -funwind-tables -O2 -fPIE -pie -latomic
JPEG XL Decoding libjxl
The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities, with JPEG XL offering better image quality and compression than legacy JPEG. This test profile measures JPEG XL decode performance to a PNG output file; the pts/jpegxl test covers encode performance. The JPEG XL encoding/decoding is done using the libjxl codebase.

JPEG XL Decoding libjxl 0.7, CPU Threads: 1 (MP/s, More Is Better)
Znver3 + AVX-512: 47.02 (SE +/- 0.25, N = 3)
Znver4 + Prefer AVX-512: 48.21 (SE +/- 0.12, N = 3)
Znver3: 48.22 (SE +/- 0.12, N = 3)
Znver4: 48.64 (SE +/- 0.12, N = 3)

JPEG XL Decoding libjxl 0.7, CPU Threads: All (MP/s, More Is Better)
Znver3 + AVX-512: 266.53 (SE +/- 1.24, N = 3)
Znver3: 269.59 (SE +/- 1.29, N = 3)
Znver4 + Prefer AVX-512: 272.31 (SE +/- 0.30, N = 3)
Znver4: 277.20 (SE +/- 1.41, N = 3)
WebP Image Encode
This is a test of Google's libwebp with the cwebp image encode utility, using a sample 6000x4000 pixel JPEG image as the input.

WebP Image Encode 1.2.4, Encode Settings: Default (MP/s, More Is Better)
Znver3 + AVX-512: 18.85 (SE +/- 0.07, N = 3)
Znver4: 18.95 (SE +/- 0.00, N = 3)
Znver4 + Prefer AVX-512: 18.97 (SE +/- 0.01, N = 3)
Znver3: 18.99 (SE +/- 0.02, N = 3)

WebP Image Encode 1.2.4, Encode Settings: Quality 100 (MP/s, More Is Better)
Znver3 + AVX-512: 11.38 (SE +/- 0.01, N = 3)
Znver3: 11.48 (SE +/- 0.01, N = 3)
Znver4: 11.54 (SE +/- 0.03, N = 3)
Znver4 + Prefer AVX-512: 11.54 (SE +/- 0.01, N = 3)

WebP Image Encode 1.2.4, Encode Settings: Quality 100, Lossless (MP/s, More Is Better)
Znver4: 1.45 (SE +/- 0.00, N = 3)
Znver3 + AVX-512: 1.45 (SE +/- 0.00, N = 3)
Znver4 + Prefer AVX-512: 1.47 (SE +/- 0.00, N = 3)
Znver3: 1.47 (SE +/- 0.00, N = 3)

WebP Image Encode 1.2.4, Encode Settings: Quality 100, Highest Compression (MP/s, More Is Better)
Znver3 + AVX-512: 3.11 (SE +/- 0.00, N = 3)
Znver4 + Prefer AVX-512: 3.23 (SE +/- 0.01, N = 3)
Znver4: 3.25 (SE +/- 0.00, N = 3)
Znver3: 3.64 (SE +/- 0.00, N = 3)

WebP Image Encode 1.2.4, Encode Settings: Quality 100, Lossless, Highest Compression (MP/s, More Is Better)
Znver4: 0.57 (SE +/- 0.00, N = 3)
Znver4 + Prefer AVX-512: 0.58 (SE +/- 0.00, N = 3)
Znver3: 0.58 (SE +/- 0.00, N = 3)
Znver3 + AVX-512: 0.58 (SE +/- 0.00, N = 3)

Per-configuration flags for all five WebP results: Znver4: -march=native -ljpeg; Znver4 + Prefer AVX-512: -march=native -ljpeg -ltiff; Znver3: -march=znver3 -lgif -ltiff; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -ljpeg -ltiff
Shared options: (CC) gcc options: -fvisibility=hidden -O3 -flto -lm -lpng16
GraphicsMagick
This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image.

GraphicsMagick 1.3.38, Operation: Swirl (Iterations Per Minute, More Is Better)
Znver3: 2563 (SE +/- 36.67, N = 3)
Znver4 + Prefer AVX-512: 2681 (SE +/- 8.09, N = 3)
Znver3 + AVX-512: 2826 (SE +/- 28.99, N = 3)
Znver4: 2862 (SE +/- 26.19, N = 7)

GraphicsMagick 1.3.38, Operation: Rotate (Iterations Per Minute, More Is Better)
Znver3 + AVX-512: 605 (SE +/- 3.84, N = 3)
Znver3: 645 (SE +/- 1.20, N = 3)
Znver4 + Prefer AVX-512: 656 (SE +/- 1.00, N = 3)
Znver4: 673 (SE +/- 1.73, N = 3)

GraphicsMagick 1.3.38, Operation: Sharpen (Iterations Per Minute, More Is Better)
Znver3: 1285 (SE +/- 10.73, N = 3)
Znver4 + Prefer AVX-512: 1314 (SE +/- 13.17, N = 3)
Znver3 + AVX-512: 1321 (SE +/- 1.15, N = 3)
Znver4: 1359 (SE +/- 6.56, N = 3)

GraphicsMagick 1.3.38, Operation: Enhanced (Iterations Per Minute, More Is Better)
Znver3: 1837 (SE +/- 19.19, N = 3)
Znver4 + Prefer AVX-512: 2150 (SE +/- 5.00, N = 3)
Znver3 + AVX-512: 2208 (SE +/- 14.95, N = 3)
Znver4: 2234 (SE +/- 2.00, N = 3)

GraphicsMagick 1.3.38, Operation: Resizing (Iterations Per Minute, More Is Better)
Znver3 + AVX-512: 86 (SE +/- 0.90, N = 15)
Znver4 + Prefer AVX-512: 87 (SE +/- 0.94, N = 15)
Znver4: 88 (SE +/- 1.00, N = 15)
Znver3: 89 (SE +/- 1.27, N = 15)

GraphicsMagick 1.3.38, Operation: Noise-Gaussian (Iterations Per Minute, More Is Better)
Znver3 + AVX-512: 975 (SE +/- 11.43, N = 15)
Znver4 + Prefer AVX-512: 1013 (SE +/- 11.39, N = 15)
Znver3: 1018 (SE +/- 5.13, N = 3)
Znver4: 1024 (SE +/- 6.60, N = 15)

GraphicsMagick 1.3.38, Operation: HWB Color Space (Iterations Per Minute, More Is Better)
Znver3 + AVX-512: 1062 (SE +/- 1.86, N = 3)
Znver3: 1134 (SE +/- 11.42, N = 15)
Znver4 + Prefer AVX-512: 1167 (SE +/- 15.31, N = 15)
Znver4: 1180 (SE +/- 4.84, N = 3)

Per-configuration flags for all seven GraphicsMagick results: Znver4: -march=native; Znver4 + Prefer AVX-512: -march=native; Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
Shared options: (CC) gcc options: -fopenmp -O3 -flto -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread
Kvazaar
This is a test of Kvazaar, a CPU-based H.265/HEVC video encoder written in the C programming language and optimized in Assembly. Kvazaar won the 2016 ACM Open-Source Software Competition and is developed at the Ultra Video Group, Tampere University, Finland. Learn more via the OpenBenchmarking.org test page.
Kvazaar 2.1 - Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better)
  Znver3 + AVX-512: 63.80, SE +/- 0.10, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 64.05, SE +/- 0.46, N = 3 [-march=native]
  Znver3: 64.27, SE +/- 0.36, N = 3 [-march=znver3]
  Znver4: 64.35, SE +/- 0.52, N = 3 [-march=native]
  (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -flto -lpthread -lm -lrt
Kvazaar 2.1 - Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better)
  Znver4: 70.71, SE +/- 0.83, N = 3 [-march=native]
  Znver3 + AVX-512: 70.71, SE +/- 0.87, N = 4 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 70.81, SE +/- 0.92, N = 3 [-march=native]
  Znver3: 72.01, SE +/- 0.28, N = 3 [-march=znver3]
  (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -flto -lpthread -lm -lrt
Kvazaar 2.1 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, More Is Better)
  Znver4 + Prefer AVX-512: 74.98, SE +/- 0.73, N = 15 [-march=native]
  Znver3 + AVX-512: 76.76, SE +/- 0.64, N = 15 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver3: 78.01, SE +/- 0.62, N = 3 [-march=znver3]
  Znver4: 78.69, SE +/- 1.01, N = 3 [-march=native]
  (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -flto -lpthread -lm -lrt
SVT-AV1
This is a benchmark of the SVT-AV1 open-source video encoder/decoder. SVT-AV1 was originally developed by Intel as part of their Open Visual Cloud / Scalable Video Technology (SVT). Development of SVT-AV1 has since moved to the Alliance for Open Media as part of upstream AV1 development. SVT-AV1 is a CPU-based multi-threaded video encoder for the AV1 video format; this test uses a sample YUV video file. Learn more via the OpenBenchmarking.org test page.
SVT-AV1 1.4 - Encoder Mode: Preset 4 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
  Znver3: 5.360, SE +/- 0.012, N = 3 [-march=znver3]
  Znver3 + AVX-512: 5.374, SE +/- 0.037, N = 3 [-march=znver3 -mavx512cd -mavx512vl -mavx512ifma -mavx512vbmi]
  Znver4: 5.392, SE +/- 0.015, N = 3
  Znver4 + Prefer AVX-512: 5.423, SE +/- 0.043, N = 3
  (CXX) g++ options: -O3 -flto -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 1.4 - Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
  Znver3 + AVX-512: 92.96, SE +/- 0.52, N = 3 [-march=znver3 -mavx512cd -mavx512vl -mavx512ifma -mavx512vbmi]
  Znver4: 93.76, SE +/- 0.33, N = 3
  Znver4 + Prefer AVX-512: 94.34, SE +/- 0.32, N = 3
  Znver3: 95.12, SE +/- 0.46, N = 3 [-march=znver3]
  (CXX) g++ options: -O3 -mavx512f -mavx512bw -mavx512dq -flto -march=native -mno-avx -mavx2
SVT-AV1 1.4 - Encoder Mode: Preset 12 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
  Znver4 + Prefer AVX-512: 196.80, SE +/- 1.78, N = 3
  Znver3 + AVX-512: 206.68, SE +/- 3.02, N = 15 [-march=znver3 -mavx512cd -mavx512vl -mavx512ifma -mavx512vbmi]
  Znver4: 210.39, SE +/- 4.58, N = 15
  Znver3: 222.90, SE +/- 5.39, N = 15 [-march=znver3]
  (CXX) g++ options: -O3 -march=native -flto -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 1.4 - Encoder Mode: Preset 13 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
  Znver4: 196.12, SE +/- 3.33, N = 15
  Znver4 + Prefer AVX-512: 208.64, SE +/- 3.20, N = 15
  Znver3 + AVX-512: 210.43, SE +/- 2.81, N = 15 [-march=znver3 -mavx512cd -mavx512vl -mavx512ifma -mavx512vbmi]
  Znver3: 219.57, SE +/- 4.56, N = 12 [-march=znver3]
  (CXX) g++ options: -O3 -march=native -flto -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
ACES DGEMM
This is a multi-threaded DGEMM benchmark. Learn more via the OpenBenchmarking.org test page.
ACES DGEMM 1.0 - Sustained Floating-Point Rate (GFLOP/s, More Is Better)
  Znver4: 70.05, SE +/- 0.10, N = 3
  Znver3 + AVX-512: 70.19, SE +/- 0.28, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 70.30, SE +/- 0.07, N = 3
  Znver3: 70.38, SE +/- 0.06, N = 3 [-march=znver3]
  (CC) gcc options: -O3 -march=native -fopenmp -flto
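For readers unfamiliar with the term, DGEMM is the BLAS double-precision general matrix multiply, C = alpha * A * B + beta * C. The sketch below is a deliberately naive, single-threaded pure-Python illustration of that operation and of the conventional 2 * n^3 floating-point operation count for square n x n matrices. It is not the ACES DGEMM source; the function names `dgemm` and `gflops` are hypothetical, and the assumption that the benchmark counts operations this way is not confirmed by this page.

```python
def dgemm(alpha, a, b, beta, c):
    """Return alpha * (a x b) + beta * c for square matrices given as lists of lists."""
    n = len(a)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = 0.0
            for k in range(n):
                acc += a[i][k] * b[k][j]  # one multiply and one add per step
            out[i][j] = alpha * acc + beta * c[i][j]
    return out


def gflops(n, seconds):
    """GFLOP/s for one n x n DGEMM, assuming the conventional 2 * n**3 operation count."""
    return 2.0 * n ** 3 / seconds / 1e9


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    zero = [[0.0, 0.0], [0.0, 0.0]]
    print(dgemm(1.0, a, b, 0.0, zero))
```

A real benchmark would use an optimized, multi-threaded kernel and much larger matrices; the sketch only shows what is being timed.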
Coremark
This is a test of the EEMBC CoreMark processor benchmark. Learn more via the OpenBenchmarking.org test page.
Coremark 1.0 - CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better)
  Znver3: 7640546.27, SE +/- 62802.17, N = 3 [-march=znver3]
  Znver4: 7694653.90, SE +/- 11460.97, N = 3 [-march=native]
  Znver4 + Prefer AVX-512: 7861097.86, SE +/- 83462.96, N = 3 [-march=native]
  Znver3 + AVX-512: 7871273.93, SE +/- 2823.14, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CC) gcc options: -O2 -O3 -flto -lrt" -lrt
Stargate Digital Audio Workstation
Stargate is an open-source, cross-platform digital audio workstation (DAW) software package with "a unique and carefully curated experience" with scalability from old systems up through modern multi-core systems. Stargate is GPLv3 licensed and makes use of Qt5 (PyQt5) for its user-interface. Learn more via the OpenBenchmarking.org test page.
Stargate Digital Audio Workstation 22.11.5 - Sample Rate: 96000 - Buffer Size: 1024 (Render Ratio, More Is Better)
  Znver4: 4.331230, SE +/- 0.006484, N = 3
  Znver3: 4.356446, SE +/- 0.016378, N = 3
  Znver4 + Prefer AVX-512: 4.408290, SE +/- 0.013873, N = 3
  Znver3 + AVX-512: 4.413379, SE +/- 0.011844, N = 3
  (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions
Stargate Digital Audio Workstation 22.11.5 - Sample Rate: 192000 - Buffer Size: 1024 (Render Ratio, More Is Better)
  Znver4: 2.783717, SE +/- 0.009628, N = 3
  Znver3: 2.820079, SE +/- 0.014903, N = 3
  Znver4 + Prefer AVX-512: 2.867233, SE +/- 0.003508, N = 3
  Znver3 + AVX-512: 2.868373, SE +/- 0.000776, N = 3
  (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions
PJSIP
PJSIP is a free and open source multimedia communication library written in the C language, implementing standards-based protocols such as SIP, SDP, RTP, STUN, TURN, and ICE. It combines a signaling protocol (SIP) with a rich multimedia framework and NAT traversal functionality into a high-level API that is portable and suitable for almost any type of system, ranging from desktops and embedded systems to mobile handsets. This test profile makes use of pjsip-perf with both the client and server on the same system. More details on the PJSIP benchmark at https://www.pjsip.org/high-performance-sip.htm Learn more via the OpenBenchmarking.org test page.
PJSIP 2.11 - Method: INVITE (Responses Per Second, More Is Better)
  Znver4 + Prefer AVX-512: 5084, SE +/- 51.36, N = 5 [-march=native]
  Znver3 + AVX-512: 5132, SE +/- 24.85, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver3: 5149, SE +/- 15.00, N = 3 [-march=znver3]
  Znver4: 5200, SE +/- 19.91, N = 3 [-march=native]
  (CC) gcc options: -lavformat -lavcodec -lswscale -lavutil -lstdc++ -lopus -lssl -lcrypto -lm -lrt -lpthread -lasound -O3 -flto
PJSIP 2.11 - Method: OPTIONS, Stateful (Responses Per Second, More Is Better)
  Znver3: 9226, SE +/- 72.95, N = 3 [-march=znver3]
  Znver3 + AVX-512: 9236, SE +/- 36.35, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 9237, SE +/- 23.25, N = 3 [-march=native]
  Znver4 + Prefer AVX-512: 9288, SE +/- 32.42, N = 3 [-march=native]
  (CC) gcc options: -lavformat -lavcodec -lswscale -lavutil -lstdc++ -lopus -lssl -lcrypto -lm -lrt -lpthread -lasound -O3 -flto
PJSIP 2.11 - Method: OPTIONS, Stateless (Responses Per Second, More Is Better)
  Znver4: 335767, SE +/- 2531.21, N = 3 [-march=native]
  Znver3 + AVX-512: 336615, SE +/- 3613.75, N = 15 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 336791, SE +/- 5240.59, N = 15 [-march=native]
  Znver3: 336885, SE +/- 4433.83, N = 12 [-march=znver3]
  (CC) gcc options: -lavformat -lavcodec -lswscale -lavutil -lstdc++ -lopus -lssl -lcrypto -lm -lrt -lpthread -lasound -O3 -flto
libavif avifenc
This is a test of the AOMedia libavif library testing the encoding of a JPEG image to AV1 Image Format (AVIF). Learn more via the OpenBenchmarking.org test page.
libavif avifenc 0.11 - Encoder Speed: 0 (Seconds, Fewer Is Better)
  Znver3 + AVX-512: 62.68, SE +/- 0.02, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver3: 61.66, SE +/- 0.30, N = 3 [-march=znver3]
  Znver4: 61.12, SE +/- 0.13, N = 3 [-march=native]
  Znver4 + Prefer AVX-512: 61.07, SE +/- 0.07, N = 3 [-march=native]
  (CXX) g++ options: -O3 -fPIC -flto -lm
libavif avifenc 0.11 - Encoder Speed: 2 (Seconds, Fewer Is Better)
  Znver3 + AVX-512: 34.23, SE +/- 0.15, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 34.10, SE +/- 0.26, N = 3 [-march=native]
  Znver3: 33.91, SE +/- 0.25, N = 3 [-march=znver3]
  Znver4 + Prefer AVX-512: 33.90, SE +/- 0.19, N = 3 [-march=native]
  (CXX) g++ options: -O3 -fPIC -flto -lm
libavif avifenc 0.11 - Encoder Speed: 6 (Seconds, Fewer Is Better)
  Znver3 + AVX-512: 2.406, SE +/- 0.005, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 2.347, SE +/- 0.014, N = 3 [-march=native]
  Znver3: 2.331, SE +/- 0.010, N = 3 [-march=znver3]
  Znver4 + Prefer AVX-512: 2.317, SE +/- 0.002, N = 3 [-march=native]
  (CXX) g++ options: -O3 -fPIC -flto -lm
libavif avifenc 0.11 - Encoder Speed: 6, Lossless (Seconds, Fewer Is Better)
  Znver3 + AVX-512: 4.514, SE +/- 0.010, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 4.462, SE +/- 0.064, N = 3 [-march=native]
  Znver3: 4.437, SE +/- 0.027, N = 3 [-march=znver3]
  Znver4: 4.398, SE +/- 0.052, N = 4 [-march=native]
  (CXX) g++ options: -O3 -fPIC -flto -lm
libavif avifenc 0.11 - Encoder Speed: 10, Lossless (Seconds, Fewer Is Better)
  Znver3 + AVX-512: 3.647, SE +/- 0.038, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver3: 3.581, SE +/- 0.016, N = 3 [-march=znver3]
  Znver4 + Prefer AVX-512: 3.572, SE +/- 0.017, N = 3 [-march=native]
  Znver4: 3.541, SE +/- 0.012, N = 3 [-march=native]
  (CXX) g++ options: -O3 -fPIC -flto -lm
oneDNN
This is a test of Intel oneDNN, an Intel-optimized library for deep neural networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.
oneDNN 3.0 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Znver4: 14.82, SE +/- 0.11, N = 3, MIN: 9.81
  Znver3: 14.62, SE +/- 0.34, N = 12, MIN: 8.56 [-march=znver3]
  Znver4 + Prefer AVX-512: 14.29, SE +/- 0.24, N = 15, MIN: 8.38
  Znver3 + AVX-512: 14.25, SE +/- 0.42, N = 12, MIN: 6.05 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Znver3: 0.907941, SE +/- 0.011617, N = 3, MIN: 0.75 [-march=znver3]
  Znver3 + AVX-512: 0.902303, SE +/- 0.002136, N = 3, MIN: 0.75 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 0.886642, SE +/- 0.004516, N = 3, MIN: 0.76
  Znver4 + Prefer AVX-512: 0.881303, SE +/- 0.004696, N = 3, MIN: 0.74
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Znver4 + Prefer AVX-512: 23.53, SE +/- 0.56, N = 12, MIN: 10.06
  Znver3 + AVX-512: 22.94, SE +/- 0.76, N = 15, MIN: 9.35 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 21.14, SE +/- 0.95, N = 15, MIN: 9.61
  Znver3: 19.59, SE +/- 0.82, N = 15, MIN: 9.21 [-march=znver3]
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Znver3: 4.52440, SE +/- 0.04256, N = 15, MIN: 3.03 [-march=znver3]
  Znver3 + AVX-512: 4.33658, SE +/- 0.03566, N = 9, MIN: 2.98 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 4.33366, SE +/- 0.04436, N = 15, MIN: 2.77
  Znver4: 4.26600, SE +/- 0.04728, N = 15, MIN: 2.83
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Znver3 + AVX-512: 0.392875, SE +/- 0.004423, N = 3, MIN: 0.28 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver3: 0.392765, SE +/- 0.000630, N = 3, MIN: 0.28 [-march=znver3]
  Znver4 + Prefer AVX-512: 0.388483, SE +/- 0.003460, N = 3, MIN: 0.28
  Znver4: 0.380486, SE +/- 0.000892, N = 3, MIN: 0.28
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Znver4 + Prefer AVX-512: 0.954140, SE +/- 0.009537, N = 3, MIN: 0.78
  Znver3: 0.947013, SE +/- 0.009578, N = 5, MIN: 0.79 [-march=znver3]
  Znver4: 0.937164, SE +/- 0.012915, N = 3, MIN: 0.77
  Znver3 + AVX-512: 0.936374, SE +/- 0.011238, N = 4, MIN: 0.79 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Znver3: 0.279655, SE +/- 0.002168, N = 15, MIN: 0.23 [-march=znver3]
  Znver3 + AVX-512: 0.276965, SE +/- 0.003530, N = 3, MIN: 0.24 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 0.274769, SE +/- 0.001957, N = 12, MIN: 0.24
  Znver4 + Prefer AVX-512: 0.274662, SE +/- 0.001864, N = 3, MIN: 0.24
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Znver4 + Prefer AVX-512: 2123.35, SE +/- 26.64, N = 12, MIN: 1873.09
  Znver3: 2108.46, SE +/- 26.49, N = 3, MIN: 1945.83 [-march=znver3]
  Znver3 + AVX-512: 2099.43, SE +/- 18.84, N = 15, MIN: 1917.53 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 2020.24, SE +/- 16.34, N = 3, MIN: 1878.54
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Znver4 + Prefer AVX-512: 0.445532, SE +/- 0.000431, N = 3, MIN: 0.34
  Znver4: 0.443891, SE +/- 0.001884, N = 3, MIN: 0.37
  Znver3: 0.442748, SE +/- 0.002906, N = 3, MIN: 0.34 [-march=znver3]
  Znver3 + AVX-512: 0.431299, SE +/- 0.004441, N = 15, MIN: 0.33 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Znver3 + AVX-512: 2.36166, SE +/- 0.00584, N = 3, MIN: 1.92 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 2.33363, SE +/- 0.00839, N = 3, MIN: 1.95
  Znver3: 2.31858, SE +/- 0.00532, N = 3, MIN: 1.91 [-march=znver3]
  Znver4 + Prefer AVX-512: 2.29880, SE +/- 0.00909, N = 3, MIN: 1.92
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Znver4: 0.658101, SE +/- 0.005902, N = 3, MIN: 0.53
  Znver3: 0.656342, SE +/- 0.006925, N = 4, MIN: 0.54 [-march=znver3]
  Znver4 + Prefer AVX-512: 0.650067, SE +/- 0.001081, N = 3, MIN: 0.53
  Znver3 + AVX-512: 0.647087, SE +/- 0.003927, N = 3, MIN: 0.56 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Znver3: 2442.35, SE +/- 32.77, N = 12, MIN: 2123.49 [-march=znver3]
  Znver4: 2405.63, SE +/- 20.65, N = 15, MIN: 2158.4
  Znver3 + AVX-512: 2377.17, SE +/- 26.39, N = 15, MIN: 2080.2 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 2359.29, SE +/- 31.29, N = 15, MIN: 2066.84
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Znver4: 2134.85, SE +/- 9.82, N = 3, MIN: 2008.7
  Znver3 + AVX-512: 2103.41, SE +/- 15.62, N = 15, MIN: 1883.21 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver3: 2093.75, SE +/- 27.68, N = 15, MIN: 1826.26 [-march=znver3]
  Znver4 + Prefer AVX-512: 2070.13, SE +/- 16.85, N = 3, MIN: 1938.32
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Znver3 + AVX-512: 2531.42, SE +/- 16.53, N = 3, MIN: 2281.87 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 2479.72, SE +/- 24.81, N = 3, MIN: 2315.89
  Znver4 + Prefer AVX-512: 2444.52, SE +/- 25.52, N = 3, MIN: 2258.45
  Znver3: 2418.28, SE +/- 24.11, N = 15, MIN: 2149.67 [-march=znver3]
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Znver3: 0.510939, SE +/- 0.006281, N = 3, MIN: 0.39 [-march=znver3]
  Znver3 + AVX-512: 0.504842, SE +/- 0.006958, N = 3, MIN: 0.39 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 0.500579, SE +/- 0.004211, N = 3, MIN: 0.39
  Znver4 + Prefer AVX-512: 0.483979, SE +/- 0.003087, N = 3, MIN: 0.39
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Ngspice
Ngspice is an open-source SPICE circuit simulator. Ngspice was originally based on the Berkeley SPICE electronic circuit simulator. Ngspice supports basic threading using OpenMP. This test profile is making use of the ISCAS 85 benchmark circuits. Learn more via the OpenBenchmarking.org test page.
Ngspice 34 - Circuit: C2670 (Seconds, Fewer Is Better)
  Znver4 + Prefer AVX-512: 95.60, SE +/- 0.17, N = 3 [-march=native]
  Znver3 + AVX-512: 95.19, SE +/- 0.24, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver3: 95.18, SE +/- 0.09, N = 3 [-march=znver3]
  Znver4: 95.07, SE +/- 0.76, N = 3 [-march=native]
  (CC) gcc options: -O3 -flto -fopenmp -lm -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lSM -lICE
Ngspice 34 - Circuit: C7552 (Seconds, Fewer Is Better)
  Znver3 + AVX-512: 93.52, SE +/- 0.28, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 93.45, SE +/- 0.14, N = 3 [-march=native]
  Znver3: 92.94, SE +/- 0.28, N = 3 [-march=znver3]
  Znver4: 92.10, SE +/- 1.00, N = 3 [-march=native]
  (CC) gcc options: -O3 -flto -fopenmp -lm -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lSM -lICE
Cpuminer-Opt
Cpuminer-Opt is a fork of cpuminer-multi that carries a wide range of CPU performance optimizations for measuring the potential cryptocurrency mining performance of the CPU/processor with a wide variety of cryptocurrencies. The benchmark reports the hash speed for the CPU mining performance for the selected cryptocurrency. Learn more via the OpenBenchmarking.org test page.
Cpuminer-Opt 3.20.3 - Algorithm: Magi (kH/s, More Is Better)
  Znver3 + AVX-512: 8355.75, SE +/- 50.94, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 8440.79, SE +/- 47.28, N = 3 [-march=native]
  Znver4 + Prefer AVX-512: 8467.24, SE +/- 64.85, N = 3 [-march=native]
  Znver3: 8490.73, SE +/- 51.27, N = 3 [-march=znver3]
  (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.20.3 - Algorithm: x25x (kH/s, More Is Better)
  Znver3: 6116.97, SE +/- 17.26, N = 3 [-march=znver3]
  Znver3 + AVX-512: 7941.38, SE +/- 15.17, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 8042.88, SE +/- 74.42, N = 3 [-march=native]
  Znver4 + Prefer AVX-512: 8217.70, SE +/- 90.39, N = 3 [-march=native]
  (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.20.3 - Algorithm: scrypt (kH/s, More Is Better)
  Znver3: 2959.15, SE +/- 0.45, N = 3 [-march=znver3]
  Znver3 + AVX-512: 4763.91, SE +/- 1.99, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 4782.74, SE +/- 1.22, N = 3 [-march=native]
  Znver4: 4790.11, SE +/- 0.45, N = 3 [-march=native]
  (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.20.3 - Algorithm: Deepcoin (kH/s, More Is Better)
  Znver4: 159147, SE +/- 81.72, N = 3 [-march=native]
  Znver3 + AVX-512: 160157, SE +/- 880.18, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 162242, SE +/- 1703.12, N = 5 [-march=native]
  Znver3: 164993, SE +/- 218.28, N = 3 [-march=znver3]
  (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.20.3 - Algorithm: Garlicoin (kH/s, More Is Better)
  Znver3: 49523, SE +/- 66.92, N = 3 [-march=znver3]
  Znver4 + Prefer AVX-512: 72130, SE +/- 295.35, N = 3 [-march=native]
  Znver4: 72413, SE +/- 461.75, N = 3 [-march=native]
  Znver3 + AVX-512: 72837, SE +/- 742.93, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.20.3 - Algorithm: Skeincoin (kH/s, More Is Better)
  Znver3: 1414047, SE +/- 5094.00, N = 3 [-march=znver3]
  Znver3 + AVX-512: 2004990, SE +/- 24878.62, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 2009367, SE +/- 7108.55, N = 3 [-march=native]
  Znver4: 2014770, SE +/- 10323.18, N = 3 [-march=native]
  (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.20.3 - Algorithm: LBC, LBRY Credits (kH/s, More Is Better)
  Znver3: 497020, SE +/- 3568.28, N = 3 [-march=znver3]
  Znver4: 1065487, SE +/- 829.54, N = 3 [-march=native]
  Znver3 + AVX-512: 1067743, SE +/- 1800.04, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 1085827, SE +/- 12123.53, N = 3 [-march=native]
  (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.20.3 - Algorithm: Quad SHA-256, Pyrite (kH/s, More Is Better)
  Znver3: 1378987, SE +/- 5899.09, N = 3 [-march=znver3]
  Znver4: 2251067, SE +/- 10306.20, N = 3 [-march=native]
  Znver3 + AVX-512: 2264747, SE +/- 22716.96, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 2323995, SE +/- 25368.27, N = 4 [-march=native]
  (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.20.3 - Algorithm: Triple SHA-256, Onecoin (kH/s, More Is Better)
  Znver3: 3255253, SE +/- 4313.22, N = 3 [-march=znver3]
  Znver4 + Prefer AVX-512: 3301217, SE +/- 22336.69, N = 3 [-march=native]
  Znver4: 3306643, SE +/- 26496.25, N = 3 [-march=native]
  Znver3 + AVX-512: 3323680, SE +/- 26057.45, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp
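As a reading aid for charts like the Cpuminer-Opt ones above, the following sketch computes the percent change of each configuration relative to the plain Znver3 build, using the scrypt figures from this page. The helper names `percent_change` and `combined_se` are hypothetical, and combining the two standard errors in quadrature assumes the runs are independent; it is a rough heuristic for judging whether a gap exceeds run-to-run noise, not the methodology used by OpenBenchmarking.org.

```python
import math


def percent_change(baseline, value):
    """Percent change of value relative to baseline."""
    return (value - baseline) / baseline * 100.0


def combined_se(se_a, se_b):
    """Standard error of the difference of two means, assuming independent runs."""
    return math.sqrt(se_a ** 2 + se_b ** 2)


# scrypt results (kH/s, standard error) as reported in the chart above
scrypt = {
    "Znver3": (2959.15, 0.45),
    "Znver3 + AVX-512": (4763.91, 1.99),
    "Znver4 + Prefer AVX-512": (4782.74, 1.22),
    "Znver4": (4790.11, 0.45),
}

if __name__ == "__main__":
    base_value, base_se = scrypt["Znver3"]
    for name, (value, se) in scrypt.items():
        delta = value - base_value
        noise = combined_se(base_se, se)
        print(f"{name}: {percent_change(base_value, value):+.1f}% vs Znver3 "
              f"(difference {delta:.2f} kH/s, combined SE {noise:.2f})")
```

For the "Fewer Is Better" charts on this page the same arithmetic applies, but a negative percent change is the improvement.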
SecureMark
SecureMark is an objective, standardized benchmarking framework developed by EEMBC for measuring the efficiency of cryptographic processing solutions. SecureMark-TLS benchmarks Transport Layer Security performance with a focus on IoT/edge computing. Learn more via the OpenBenchmarking.org test page.
SecureMark 1.0.4 - Benchmark: SecureMark-TLS (marks, More Is Better)
  Znver3: 294057, SE +/- 1276.92, N = 3
  Znver4 + Prefer AVX-512: 294122, SE +/- 640.61, N = 3
  Znver4: 296548, SE +/- 383.28, N = 3
  Znver3 + AVX-512: 296575, SE +/- 380.50, N = 3
  (CC) gcc options: -pedantic -O3
OpenSSL
OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.
OpenSSL 3.0 - Algorithm: SHA256 (byte/s, More Is Better)
  Znver4: 265326713587, SE +/- 150345864.28, N = 3 [-march=native]
  Znver4 + Prefer AVX-512: 266230124070, SE +/- 172236238.05, N = 3 [-march=native]
  Znver3: 266464361193, SE +/- 18811489.95, N = 3 [-march=znver3]
  Znver3 + AVX-512: 266899089453, SE +/- 143443542.20, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CC) gcc options: -pthread -m64 -O3 -flto -lssl -lcrypto -ldl
OpenSSL 3.0 - Algorithm: RSA4096 (sign/s, More Is Better)
  Znver4 + Prefer AVX-512: 44301.8, SE +/- 151.57, N = 3 [-march=native]
  Znver3: 44435.3, SE +/- 22.49, N = 3 [-march=znver3]
  Znver4: 44490.1, SE +/- 8.03, N = 3 [-march=native]
  Znver3 + AVX-512: 44499.3, SE +/- 0.44, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CC) gcc options: -pthread -m64 -O3 -flto -lssl -lcrypto -ldl
OpenSSL 3.0 - Algorithm: RSA4096 (verify/s, More Is Better)
  Znver4 + Prefer AVX-512: 2924586.4, SE +/- 10617.65, N = 3 [-march=native]
  Znver3 + AVX-512: 2935372.1, SE +/- 1641.03, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver3: 2938488.5, SE +/- 228.34, N = 3 [-march=znver3]
  Znver4: 2939503.5, SE +/- 140.73, N = 3 [-march=native]
  (CC) gcc options: -pthread -m64 -O3 -flto -lssl -lcrypto -ldl
Liquid-DSP
LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.
Liquid-DSP 2021.01.31 - Threads: 128 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
  Znver3: 6940833333, SE +/- 6548367.06, N = 3 [-march=znver3]
  Znver4 + Prefer AVX-512: 6977633333, SE +/- 6145549.43, N = 3 [-march=native]
  Znver4: 6990866667, SE +/- 3192874.01, N = 3 [-march=native]
  Znver3 + AVX-512: 6999233333, SE +/- 4658445.14, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CC) gcc options: -O3 -flto -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 - Threads: 256 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
  Znver3: 9735666667, SE +/- 24004606.04, N = 3 [-march=znver3]
  Znver4: 9789500000, SE +/- 5651843.36, N = 3 [-march=native]
  Znver4 + Prefer AVX-512: 9809800000, SE +/- 8533658.85, N = 3 [-march=native]
  Znver3 + AVX-512: 9813700000, SE +/- 4697162.26, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CC) gcc options: -O3 -flto -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 - Threads: 384 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
  Znver3: 11176000000, SE +/- 10066445.91, N = 3 [-march=znver3]
  Znver4: 11249333333, SE +/- 7356025.50, N = 3 [-march=native]
  Znver4 + Prefer AVX-512: 11270666667, SE +/- 7218802.61, N = 3 [-march=native]
  Znver3 + AVX-512: 11301666667, SE +/- 2403700.85, N = 3 [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CC) gcc options: -O3 -flto -pthread -lm -lc -lliquid
ASTC Encoder
ASTC Encoder (astcenc) is for the Adaptive Scalable Texture Compression (ASTC) format commonly used with OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile does a coding test of both compression and decompression. Learn more via the OpenBenchmarking.org test page.
ASTC Encoder 4.0, Preset: Medium (OpenBenchmarking.org, MT/s, More Is Better)
Znver4 + Prefer AVX-512: 420.70 (SE +/- 0.56, N = 3), -march=native
Znver3: 459.69 (SE +/- 6.17, N = 3), -march=znver3
Znver4: 493.22 (SE +/- 3.36, N = 13), -march=native
Znver3 + AVX-512: 511.52 (SE +/- 5.68, N = 3), -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
1. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder 4.0, Preset: Thorough (OpenBenchmarking.org, MT/s, More Is Better)
Znver3: 116.09 (SE +/- 0.10, N = 3), -march=znver3
Znver3 + AVX-512: 117.69 (SE +/- 0.06, N = 3), -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
Znver4: 118.74 (SE +/- 0.12, N = 3), -march=native
Znver4 + Prefer AVX-512: 118.99 (SE +/- 0.08, N = 3), -march=native
1. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder 4.0, Preset: Exhaustive (OpenBenchmarking.org, MT/s, More Is Better)
Znver3: 12.22 (SE +/- 0.03, N = 3), -march=znver3
Znver4: 12.94 (SE +/- 0.00, N = 3), -march=native
Znver3 + AVX-512: 13.03 (SE +/- 0.01, N = 3), -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
Znver4 + Prefer AVX-512: 13.09 (SE +/- 0.05, N = 3), -march=native
1. (CXX) g++ options: -O3 -flto -pthread
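The ASTC Encoder figures are easier to compare as percentage changes against a common baseline. The sketch below does that arithmetic with the plain Znver3 build as the reference; the baseline choice and the helper names `percent_change` and `changes_vs_baseline` are assumptions for illustration only.

```python
# ASTC Encoder 4.0 throughput in MT/s (more is better),
# copied from the three preset results above.
astc_mt_per_s = {
    "Medium": {
        "Znver3": 459.69,
        "Znver3 + AVX-512": 511.52,
        "Znver4": 493.22,
        "Znver4 + Prefer AVX-512": 420.70,
    },
    "Thorough": {
        "Znver3": 116.09,
        "Znver3 + AVX-512": 117.69,
        "Znver4": 118.74,
        "Znver4 + Prefer AVX-512": 118.99,
    },
    "Exhaustive": {
        "Znver3": 12.22,
        "Znver3 + AVX-512": 13.03,
        "Znver4": 12.94,
        "Znver4 + Prefer AVX-512": 13.09,
    },
}


def percent_change(value, baseline):
    """Signed change of value relative to baseline, in percent."""
    return (value - baseline) / baseline * 100.0


def changes_vs_baseline(results, baseline_name="Znver3"):
    """Map each preset to {build: percent change vs. the baseline build}."""
    table = {}
    for preset, by_build in results.items():
        base = by_build[baseline_name]
        table[preset] = {
            name: percent_change(value, base)
            for name, value in by_build.items()
            if name != baseline_name
        }
    return table


changes = changes_vs_baseline(astc_mt_per_s)
for preset, by_build in changes.items():
    for name, change in by_build.items():
        print(f"{preset:<10} {name:<24} {change:+.1f}% vs Znver3")
```

For a "More Is Better" metric such as MT/s, a positive change means the build is faster than the baseline. The Medium preset also has the largest standard errors of the three, so its percentages deserve the most caution.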
GROMACS
This test runs the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.
GROMACS 2022.1, Implementation: MPI CPU - Input: water_GMX50_bare (OpenBenchmarking.org, Ns Per Day, More Is Better)
Znver3: 18.23 (SE +/- 0.01, N = 3), -march=znver3
Znver3 + AVX-512: 19.09 (SE +/- 0.01, N = 3), -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
Znver4 + Prefer AVX-512: 19.44 (SE +/- 0.01, N = 3), -march=native
Znver4: 19.49 (SE +/- 0.02, N = 3), -march=native
1. (CXX) g++ options: -O3 -flto
GPAW
GPAW is a density-functional theory (DFT) Python code based on the projector-augmented wave (PAW) method and the atomic simulation environment (ASE). Learn more via the OpenBenchmarking.org test page.
GPAW 22.1, Input: Carbon Nanotube (OpenBenchmarking.org, Seconds, Fewer Is Better)
Znver3 + AVX-512: 22.82 (SE +/- 0.07, N = 3), -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
Znver4 + Prefer AVX-512: 22.82 (SE +/- 0.28, N = 3), -march=native
Znver4: 22.35 (SE +/- 0.15, N = 3), -march=native
Znver3: 21.72 (SE +/- 0.26, N = 4), -march=znver3
1. (CC) gcc options: -shared -fwrapv -O2 -O3 -flto -lxc -lblas -lmpi
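With run times this close together, it helps to ask how large each gap is compared with the reported standard errors. The sketch below divides the difference between two mean run times by their combined standard error, sqrt(SE_a^2 + SE_b^2). This is a rough heuristic assumed here for illustration, not something the result page computes, and with only 3 or 4 runs per build it is no substitute for a formal significance test. The helper name `gap_in_standard_errors` is likewise hypothetical.

```python
import math

# GPAW 22.1, Carbon Nanotube input: (mean seconds, standard error, runs) per build,
# copied from the result above. Fewer seconds is better.
gpaw_seconds = {
    "Znver3": (21.72, 0.26, 4),
    "Znver4": (22.35, 0.15, 3),
    "Znver4 + Prefer AVX-512": (22.82, 0.28, 3),
    "Znver3 + AVX-512": (22.82, 0.07, 3),
}


def gap_in_standard_errors(a, b):
    """(mean_a - mean_b) divided by the combined standard error of both means."""
    mean_a, se_a, _ = a
    mean_b, se_b, _ = b
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return (mean_a - mean_b) / combined_se


fastest_name = min(gpaw_seconds, key=lambda name: gpaw_seconds[name][0])
fastest = gpaw_seconds[fastest_name]

gaps = {}
for name, result in gpaw_seconds.items():
    if name == fastest_name:
        continue
    gaps[name] = gap_in_standard_errors(result, fastest)
    extra_time = (result[0] / fastest[0] - 1.0) * 100.0
    print(f"{name}: {extra_time:.1f}% more time than {fastest_name}, "
          f"{gaps[name]:.1f} combined standard errors apart")
```

Because this is a "Fewer Is Better" metric, the fastest build is the one with the smallest mean, which is why `min` selects the reference here.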
Kripke
Kripke is a simple, scalable, 3D Sn deterministic particle transport code. Its primary purpose is to research how data layout, programming paradigms, and architectures affect the implementation and performance of Sn transport. Kripke is developed by LLNL. Learn more via the OpenBenchmarking.org test page.
Kripke 1.2.4 (OpenBenchmarking.org, Throughput FoM, More Is Better)
Znver4 + Prefer AVX-512: 254648533 (SE +/- 2000945.99, N = 3), -march=native
Znver4: 261562280 (SE +/- 3293132.01, N = 15), -march=native
Znver3: 263812847 (SE +/- 2950522.72, N = 15), -march=znver3
Znver3 + AVX-512: 271735708 (SE +/- 3107958.27, N = 12), -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
1. (CXX) g++ options: -O3 -flto -fopenmp
Znver4
Testing initiated at 3 January 2023 08:40 by user phoronix.
Znver4 + Prefer AVX-512
Testing initiated at 3 January 2023 18:42 by user phoronix.
Znver3
Testing initiated at 4 January 2023 05:51 by user phoronix.
Znver3 + AVX-512
Processor: 2 x AMD EPYC 9654 96-Core @ 3.71GHz (192 Cores / 384 Threads), Motherboard: AMD Titanite_4G (RTI1002E BIOS), Chipset: AMD Device 14a4, Memory: 1520GB, Disk: 800GB INTEL SSDPF21Q800GB, Graphics: ASPEED, Monitor: VGA HDMI, Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 23.04, Kernel: 5.19.0-21-generic (x86_64), Desktop: GNOME Shell 43.1, Display Server: X Server 1.21.1.4, Vulkan: 1.3.224, Compiler: GCC 13.0.0 20230103, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -flto" CFLAGS="-O3 -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -flto"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: amd-pstate performance (Boost: Enabled) - CPU Microcode: 0xa10110d
Python Notes: Python 3.10.9
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 4 January 2023 13:46 by user phoronix.