AMD EPYC 9654 GCC 13 development compiler benchmarks by Michael Larabel for a future article.
Znver4
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native -flto" CFLAGS="-O3 -march=native -flto"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: amd-pstate performance (Boost: Enabled) - CPU Microcode: 0xa10110d
Python Notes: Python 3.10.9
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Znver4 + Prefer AVX-512
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native -mprefer-vector-width=512 -flto" CFLAGS="-O3 -march=native -mprefer-vector-width=512 -flto"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: amd-pstate performance (Boost: Enabled) - CPU Microcode: 0xa10110d
Python Notes: Python 3.10.9
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Znver3
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=znver3 -flto" CFLAGS="-O3 -march=znver3 -flto"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: amd-pstate performance (Boost: Enabled) - CPU Microcode: 0xa10110d
Python Notes: Python 3.10.9
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Znver3 + AVX-512
Processor: 2 x AMD EPYC 9654 96-Core @ 3.71GHz (192 Cores / 384 Threads), Motherboard: AMD Titanite_4G (RTI1002E BIOS), Chipset: AMD Device 14a4, Memory: 1520GB, Disk: 800GB INTEL SSDPF21Q800GB, Graphics: ASPEED, Monitor: VGA HDMI, Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 23.04, Kernel: 5.19.0-21-generic (x86_64), Desktop: GNOME Shell 43.1, Display Server: X Server 1.21.1.4, Vulkan: 1.3.224, Compiler: GCC 13.0.0 20230103, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -flto" CFLAGS="-O3 -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -flto"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: amd-pstate performance (Boost: Enabled) - CPU Microcode: 0xa10110d
Python Notes: Python 3.10.9
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
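The four configurations differ only in a handful of compiler flags. As a minimal sketch (the names `CONFIG_FLAGS` and `flag_differences` are illustrative and not part of the Phoronix Test Suite), the flags from the Environment Notes can be split into those shared by every build and those specific to each one:

```python
# Compiler flags for the four build configurations, copied from the
# Environment Notes above (CFLAGS and CXXFLAGS are identical in each case).
CONFIG_FLAGS = {
    "Znver4": "-O3 -march=native -flto",
    "Znver4 + Prefer AVX-512": "-O3 -march=native -mprefer-vector-width=512 -flto",
    "Znver3": "-O3 -march=znver3 -flto",
    "Znver3 + AVX-512": (
        "-O3 -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw "
        "-mavx512dq -mavx512ifma -mavx512vbmi -flto"
    ),
}


def flag_differences(configs):
    """Return (flags shared by all configs, per-config flags not in that shared set)."""
    flag_lists = {name: flags.split() for name, flags in configs.items()}
    common = set.intersection(*(set(flags) for flags in flag_lists.values()))
    extras = {
        name: [flag for flag in flags if flag not in common]
        for name, flags in flag_lists.items()
    }
    return sorted(common), extras


common, extras = flag_differences(CONFIG_FLAGS)
print("shared by all builds:", " ".join(common))
for name, flags in extras.items():
    print(f"{name}: {' '.join(flags)}")
```

Only `-O3` and `-flto` are shared by all four builds; everything else is the `-march` target plus the AVX-512 related options.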
GCC 13 Compiler Benchmarks AMD EPYC Genoa
[Figure: Result Overview (Phoronix Test Suite) for Znver4, Znver4 + Prefer AVX-512, Znver3, and Znver3 + AVX-512; axis ticks 100%, 109%, 118%, 127%. Tests shown: Cpuminer-Opt, CP2K Molecular Dynamics, GROMACS, Kripke, GraphicsMagick, ASTC Encoder, GPAW, SVT-AV1, WebP Image Encode, JPEG XL Decoding libjxl, Coremark, Stargate Digital Audio Workstation, libavif avifenc, Zstd Compression, Kvazaar, JPEG XL libjxl, simdjson, Ngspice, miniBUDE, Liquid-DSP, SecureMark, oneDNN, LAMMPS Molecular Dynamics Simulator, QuantLib, PJSIP, ACES DGEMM, OpenSSL, SMHasher.]
Test | Znver4 | Znver4 + Prefer AVX-512 | Znver3 | Znver3 + AVX-512
smhasher: FarmHash32 x86_64 AVX (MiB/sec) | 40565.07 | 40563.96 | 40565.72 | 40559.33
smhasher: t1ha0_aes_avx2 x86_64 (MiB/sec) | 102354.57 | 102399.74 | 102351.87 | 102403.92
smhasher: MeowHash x86_64 AES-NI (MiB/sec) | 54272.14 | 54297.04 | 54281.94 | 54284.65
ngspice: C2670 | 95.066 | 95.599 | 95.175 | 95.186
ngspice: C7552 | 92.102 | 93.453 | 92.943 | 93.524
stargate: 96000 - 1024 | 4.331230 | 4.408290 | 4.356446 | 4.413379
stargate: 192000 - 1024 | 2.783717 | 2.867233 | 2.820079 | 2.868373
astcenc: Medium | 493.2248 | 420.6992 | 459.6924 | 511.5248
astcenc: Thorough | 118.7392 | 118.9932 | 116.0912 | 117.6910
astcenc: Exhaustive | 12.9412 | 13.0936 | 12.2221 | 13.0305
jpegxl-decode: 1 | 48.64 | 48.21 | 48.22 | 47.02
jpegxl-decode: All | 277.20 | 272.31 | 269.59 | 266.53
jpegxl: PNG - 90 | 9.83 | 9.81 | 9.61 | 9.86
jpegxl: JPEG - 90 | 9.48 | 9.43 | 9.18 | 9.27
jpegxl: PNG - 100 | 0.83 | 0.82 | 0.82 | 0.81
jpegxl: JPEG - 100 | 0.73 | 0.75 | 0.74 | 0.71
webp: Default | 18.95 | 18.97 | 18.99 | 18.85
webp: Quality 100 | 11.54 | 11.54 | 11.48 | 11.38
webp: Quality 100, Lossless | 1.45 | 1.47 | 1.47 | 1.45
webp: Quality 100, Highest Compression | 3.25 | 3.23 | 3.64 | 3.11
webp: Quality 100, Lossless, Highest Compression | 0.57 | 0.58 | 0.58 | 0.58
securemark: SecureMark-TLS | 296548 | 294122 | 294057 | 296575
quantlib | 3096.9 | 3112.6 | 3120.5 | 3114.9
minibude: OpenMP - BM1 (GFInst/s) | 5362.542 | 5275.426 | 5351.944 | 5365.612
minibude: OpenMP - BM1 (Billion Interactions/s) | 214.502 | 211.017 | 214.078 | 214.625
minibude: OpenMP - BM2 (GFInst/s) | 6603.684 | 6638.007 | 6607.091 | 6653.864
minibude: OpenMP - BM2 (Billion Interactions/s) | 264.147 | 265.520 | 264.284 | 266.155
gromacs: MPI CPU - water_GMX50_bare | 19.493 | 19.437 | 18.231 | 19.087
lammps: 20k Atoms | 55.618 | 55.739 | 55.721 | 56.036
lammps: Rhodopsin Protein | 51.585 | 52.287 | 51.455 | 51.816
onednn: IP Shapes 1D - u8s8f32 - CPU | 14.8168 | 14.2928 | 14.6179 | 14.25021
onednn: IP Shapes 3D - u8s8f32 - CPU | 0.886642 | 0.881303 | 0.907941 | 0.902303
onednn: IP Shapes 1D - bf16bf16bf16 - CPU | 21.1359 | 23.5341 | 19.5885 | 22.9431
onednn: IP Shapes 3D - bf16bf16bf16 - CPU | 4.26600 | 4.33366 | 4.52440 | 4.33658
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 0.380486 | 0.388483 | 0.392765 | 0.392875
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 0.937164 | 0.95414 | 0.947013 | 0.936374
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 0.274769 | 0.274662 | 0.279655 | 0.276965
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 2020.24 | 2123.35 | 2108.46 | 2099.43
onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU | 0.443891 | 0.445532 | 0.442748 | 0.431299
onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU | 2.33363 | 2.29880 | 2.31858 | 2.36166
onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU | 0.658101 | 0.650067 | 0.656342 | 0.647087
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 2405.63 | 2359.29 | 2442.35 | 2377.17
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 2134.85 | 2070.13 | 2093.75 | 2103.41
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 2479.72 | 2444.52 | 2418.28 | 2531.42
onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU | 0.500579 | 0.483979 | 0.510939 | 0.504842
mt-dgemm: Sustained Floating-Point Rate | 70.052952 | 70.298282 | 70.378065 | 70.186467
kripke | 261562280 | 254648533 | 263812847 | 271735708
gpaw: Carbon Nanotube | 22.350 | 22.819 | 21.721 | 22.823
cp2k: Fayalite-FIST | 1174.665 | 1263.707 | 1211.126 | 1407.125
coremark: CoreMark Size 666 - Iterations Per Second | 7694653.899016 | 7861097.856640 | 7640546.272074 | 7871273.932762
compress-zstd: 19 - Compression Speed | 102.1 | 105.3 | 104.4 | 102.9
compress-zstd: 19 - Decompression Speed | 3584.1 | 3581.6 | 3574.9 | 3594.8
compress-zstd: 19, Long Mode - Compression Speed | 40.8 | 42.5 | 39.8 | 43.9
compress-zstd: 19, Long Mode - Decompression Speed | 3708.8 | 3685.7 | 3695.5 | 3684.7
cpuminer-opt: Magi | 8440.79 | 8467.24 | 8490.73 | 8355.75
cpuminer-opt: x25x | 8042.88 | 8217.70 | 6116.97 | 7941.38
cpuminer-opt: scrypt | 4790.11 | 4782.74 | 2959.15 | 4763.91
cpuminer-opt: Deepcoin | 159147 | 162242 | 164993 | 160157
cpuminer-opt: Garlicoin | 72413 | 72130 | 49523 | 72837
cpuminer-opt: Skeincoin | 2014770 | 2009367 | 1414047 | 2004990
cpuminer-opt: LBC, LBRY Credits | 1065487 | 1085827 | 497020 | 1067743
cpuminer-opt: Quad SHA-256, Pyrite | 2251067 | 2323995 | 1378987 | 2264747
cpuminer-opt: Triple SHA-256, Onecoin | 3306643 | 3301217 | 3255253 | 3323680
kvazaar: Bosphorus 4K - Medium | 64.35 | 64.05 | 64.27 | 63.80
kvazaar: Bosphorus 4K - Very Fast | 70.71 | 70.81 | 72.01 | 70.71
kvazaar: Bosphorus 4K - Ultra Fast | 78.69 | 74.98 | 78.01 | 76.76
graphics-magick: Swirl | 2862 | 2681 | 2563 | 2826
graphics-magick: Rotate | 673 | 656 | 645 | 605
graphics-magick: Sharpen | 1359 | 1314 | 1285 | 1321
graphics-magick: Enhanced | 2234 | 2150 | 1837 | 2208
graphics-magick: Resizing | 88 | 87 | 89 | 86
graphics-magick: Noise-Gaussian | 1024 | 1013 | 1018 | 975
graphics-magick: HWB Color Space | 1180 | 1167 | 1134 | 1062
svt-av1: Preset 4 - Bosphorus 4K | 5.392 | 5.423 | 5.360 | 5.374
svt-av1: Preset 8 - Bosphorus 4K | 93.756 | 94.340 | 95.116 | 92.964
svt-av1: Preset 12 - Bosphorus 4K | 210.386 | 196.804 | 222.902 | 206.676
svt-av1: Preset 13 - Bosphorus 4K | 196.115 | 208.643 | 219.570 | 210.426
avifenc: 0 | 61.122 | 61.073 | 61.658 | 62.684
avifenc: 2 | 34.104 | 33.902 | 33.911 | 34.230
avifenc: 6 | 2.347 | 2.317 | 2.331 | 2.406
avifenc: 6, Lossless | 4.398 | 4.462 | 4.437 | 4.514
avifenc: 10, Lossless | 3.541 | 3.572 | 3.581 | 3.647
liquid-dsp: 128 - 256 - 57 | 6990866667 | 6977633333 | 6940833333 | 6999233333
liquid-dsp: 256 - 256 - 57 | 9789500000 | 9809800000 | 9735666667 | 9813700000
liquid-dsp: 384 - 256 - 57 | 11249333333 | 11270666667 | 11176000000 | 11301666667
openssl: SHA256 | 265326713587 | 266230124070 | 266464361193 | 266899089453
openssl: RSA4096 | 44490.1 | 44301.8 | 44435.3 | 44499.3
openssl: RSA4096 | 2939503.5 | 2924586.4 | 2938488.5 | 2935372.1
simdjson: Kostya | 4.17 | 4.19 | 4.20 | 4.16
simdjson: TopTweet | 6.96 | 6.91 | 6.61 | 6.73
simdjson: LargeRand | 1.27 | 1.24 | 1.25 | 1.25
simdjson: PartialTweets | 6.63 | 6.43 | 6.74 | 6.68
simdjson: DistinctUserID | 6.53 | 6.86 | 6.46 | 6.61
pjsip: INVITE | 5200 | 5084 | 5149 | 5132
pjsip: OPTIONS, Stateful | 9237 | 9288 | 9226 | 9236
pjsip: OPTIONS, Stateless | 335767 | 336791 | 336885 | 336615
smhasher: FarmHash32 x86_64 AVX (cycles/hash) | 26.494 | 26.503 | 26.494 | 26.494
smhasher: t1ha0_aes_avx2 x86_64 (cycles/hash) | 20.524 | 20.806 | 20.533 | 20.527
smhasher: MeowHash x86_64 AES-NI (cycles/hash) | 44.977 | 44.950 | 44.957 | 44.968
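One way to compare the configurations across tests with different units is to normalize every result against a baseline build and then combine the ratios with a geometric mean. The sketch below does this for three rows taken from the results above. Using Znver3 as the baseline and a geometric mean as the summary are assumptions made for illustration; they are not necessarily how the Result Overview chart is computed.

```python
from statistics import geometric_mean

BASELINE = "Znver3"  # assumed baseline, chosen only for this illustration

# (test, higher_is_better, results per configuration), copied from the results above.
ROWS = [
    ("astcenc: Medium", True,
     {"Znver4": 493.2248, "Znver4 + Prefer AVX-512": 420.6992,
      "Znver3": 459.6924, "Znver3 + AVX-512": 511.5248}),
    ("gromacs: MPI CPU - water_GMX50_bare", True,
     {"Znver4": 19.493, "Znver4 + Prefer AVX-512": 19.437,
      "Znver3": 18.231, "Znver3 + AVX-512": 19.087}),
    ("ngspice: C2670", False,  # seconds, so fewer is better
     {"Znver4": 95.066, "Znver4 + Prefer AVX-512": 95.599,
      "Znver3": 95.175, "Znver3 + AVX-512": 95.186}),
]


def relative_scores(rows, baseline=BASELINE):
    """Per-config list of speedups over the baseline (values above 1.0 are faster)."""
    scores = {}
    for _test, higher_is_better, results in rows:
        base = results[baseline]
        for config, value in results.items():
            # Invert the ratio for time-based tests so that larger always means faster.
            ratio = value / base if higher_is_better else base / value
            scores.setdefault(config, []).append(ratio)
    return scores


scores = relative_scores(ROWS)
summary = {config: geometric_mean(ratios) for config, ratios in scores.items()}
for config, gm in summary.items():
    print(f"{config}: {gm * 100:.1f}% of {BASELINE}")
```

The geometric mean is used here because it treats a 10% gain in one test and a 10% loss in another symmetrically, regardless of each test's units.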
SMHasher
SMHasher is a hash function tester supporting various algorithms and able to make use of AVX and other modern CPU instruction set extensions.
SMHasher 2022-08-22 - Hash: FarmHash32 x86_64 AVX (MiB/sec, More Is Better)
  Znver3: 40565.72 (SE +/- 0.13, N = 3; -march=znver3)
  Znver4: 40565.07 (SE +/- 0.95, N = 3)
  Znver4 + Prefer AVX-512: 40563.96 (SE +/- 1.44, N = 3)
  Znver3 + AVX-512: 40559.33 (SE +/- 1.84, N = 3; -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi)
  (CXX) g++ options: -O3 -flto -march=native -flto=auto -fno-fat-lto-objects
SMHasher 2022-08-22 - Hash: FarmHash32 x86_64 AVX (cycles/hash, Fewer Is Better)
  Znver4: 26.49 (SE +/- 0.00, N = 3)
  Znver3: 26.49 (SE +/- 0.00, N = 3; -march=znver3)
  Znver3 + AVX-512: 26.49 (SE +/- 0.00, N = 3; -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi)
  Znver4 + Prefer AVX-512: 26.50 (SE +/- 0.00, N = 3)
  (CXX) g++ options: -O3 -march=native -flto -flto=auto -fno-fat-lto-objects
SMHasher 2022-08-22 - Hash: t1ha0_aes_avx2 x86_64 (MiB/sec, More Is Better)
  Znver3 + AVX-512: 102403.92 (SE +/- 19.58, N = 3; -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi)
  Znver4 + Prefer AVX-512: 102399.74 (SE +/- 11.55, N = 3)
  Znver4: 102354.57 (SE +/- 20.00, N = 3)
  Znver3: 102351.87 (SE +/- 8.85, N = 3; -march=znver3)
  (CXX) g++ options: -O3 -flto -march=native -flto=auto -fno-fat-lto-objects
SMHasher 2022-08-22 - Hash: t1ha0_aes_avx2 x86_64 (cycles/hash, Fewer Is Better)
  Znver4: 20.52 (SE +/- 0.01, N = 3)
  Znver3 + AVX-512: 20.53 (SE +/- 0.01, N = 3; -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi)
  Znver3: 20.53 (SE +/- 0.01, N = 3; -march=znver3)
  Znver4 + Prefer AVX-512: 20.81 (SE +/- 0.00, N = 3)
  (CXX) g++ options: -O3 -march=native -flto -flto=auto -fno-fat-lto-objects
SMHasher 2022-08-22 - Hash: MeowHash x86_64 AES-NI (MiB/sec, More Is Better)
  Znver4 + Prefer AVX-512: 54297.04 (SE +/- 9.47, N = 3)
  Znver3 + AVX-512: 54284.65 (SE +/- 10.52, N = 3; -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi)
  Znver3: 54281.94 (SE +/- 9.98, N = 3; -march=znver3)
  Znver4: 54272.14 (SE +/- 7.42, N = 3)
  (CXX) g++ options: -O3 -march=native -flto -flto=auto -fno-fat-lto-objects
SMHasher 2022-08-22 - Hash: MeowHash x86_64 AES-NI (cycles/hash, Fewer Is Better)
  Znver4 + Prefer AVX-512: 44.95 (SE +/- 0.01, N = 3)
  Znver3: 44.96 (SE +/- 0.01, N = 3; -march=znver3)
  Znver3 + AVX-512: 44.97 (SE +/- 0.00, N = 3; -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi)
  Znver4: 44.98 (SE +/- 0.01, N = 3)
  (CXX) g++ options: -O3 -march=native -flto -flto=auto -fno-fat-lto-objects
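Each result above is reported as a mean with an "SE +/-" value over N runs. Assuming SE denotes the standard error of the mean (sample standard deviation divided by the square root of N), it can be sketched as follows. The per-run values are hypothetical, since the page reports only the aggregate figures, and the exact formula used by the Phoronix Test Suite is not stated here.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two runs to estimate the standard error")
    return statistics.stdev(samples) / math.sqrt(n)


# Hypothetical per-run throughputs in MiB/sec (not taken from the results above).
runs = [40564.0, 40565.0, 40566.0]
mean = statistics.mean(runs)
se = standard_error(runs)
print(f"{mean:.2f} MiB/sec, SE +/- {se:.2f}, N = {len(runs)}")
```

When the differences between configurations are smaller than the reported SE values, as in the SMHasher results, the builds are effectively indistinguishable for that test.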
Ngspice
Ngspice is an open-source SPICE circuit simulator originally based on the Berkeley SPICE electronic circuit simulator. Ngspice supports basic threading using OpenMP. This test profile makes use of the ISCAS 85 benchmark circuits.
Ngspice 34 - Circuit: C2670 (Seconds, Fewer Is Better)
  Znver4: 95.07 (SE +/- 0.76, N = 3; -march=native)
  Znver3: 95.18 (SE +/- 0.09, N = 3; -march=znver3)
  Znver3 + AVX-512: 95.19 (SE +/- 0.24, N = 3; -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi)
  Znver4 + Prefer AVX-512: 95.60 (SE +/- 0.17, N = 3; -march=native)
  (CC) gcc options: -O3 -flto -fopenmp -lm -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lSM -lICE
Ngspice 34 - Circuit: C7552 (Seconds, Fewer Is Better)
  Znver4: 92.10 (SE +/- 1.00, N = 3; -march=native)
  Znver3: 92.94 (SE +/- 0.28, N = 3; -march=znver3)
  Znver4 + Prefer AVX-512: 93.45 (SE +/- 0.14, N = 3; -march=native)
  Znver3 + AVX-512: 93.52 (SE +/- 0.28, N = 3; -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi)
  (CC) gcc options: -O3 -flto -fopenmp -lm -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lSM -lICE
Stargate Digital Audio Workstation
Stargate is an open-source, cross-platform digital audio workstation (DAW) software package with "a unique and carefully curated experience" with scalability from old systems up through modern multi-core systems. Stargate is GPLv3 licensed and makes use of Qt5 (PyQt5) for its user-interface.
Stargate Digital Audio Workstation 22.11.5 - Sample Rate: 96000 - Buffer Size: 1024 (Render Ratio, More Is Better)
  Znver3 + AVX-512: 4.413379 (SE +/- 0.011844, N = 3)
  Znver4 + Prefer AVX-512: 4.408290 (SE +/- 0.013873, N = 3)
  Znver3: 4.356446 (SE +/- 0.016378, N = 3)
  Znver4: 4.331230 (SE +/- 0.006484, N = 3)
  (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions
Stargate Digital Audio Workstation 22.11.5 - Sample Rate: 192000 - Buffer Size: 1024 (Render Ratio, More Is Better)
  Znver3 + AVX-512: 2.868373 (SE +/- 0.000776, N = 3)
  Znver4 + Prefer AVX-512: 2.867233 (SE +/- 0.003508, N = 3)
  Znver3: 2.820079 (SE +/- 0.014903, N = 3)
  Znver4: 2.783717 (SE +/- 0.009628, N = 3)
  (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions
ASTC Encoder
ASTC Encoder (astcenc) is for the Adaptive Scalable Texture Compression (ASTC) format commonly used with OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile does a coding test of both compression and decompression.
ASTC Encoder 4.0 - Preset: Medium (MT/s, More Is Better)
  Znver3 + AVX-512: 511.52 (SE +/- 5.68, N = 3; -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi)
  Znver4: 493.22 (SE +/- 3.36, N = 13; -march=native)
  Znver3: 459.69 (SE +/- 6.17, N = 3; -march=znver3)
  Znver4 + Prefer AVX-512: 420.70 (SE +/- 0.56, N = 3; -march=native)
  (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder 4.0 - Preset: Thorough (MT/s, More Is Better)
  Znver4 + Prefer AVX-512: 118.99 (SE +/- 0.08, N = 3; -march=native)
  Znver4: 118.74 (SE +/- 0.12, N = 3; -march=native)
  Znver3 + AVX-512: 117.69 (SE +/- 0.06, N = 3; -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi)
  Znver3: 116.09 (SE +/- 0.10, N = 3; -march=znver3)
  (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder 4.0 - Preset: Exhaustive (MT/s, More Is Better)
  Znver4 + Prefer AVX-512: 13.09 (SE +/- 0.05, N = 3; -march=native)
  Znver3 + AVX-512: 13.03 (SE +/- 0.01, N = 3; -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi)
  Znver4: 12.94 (SE +/- 0.00, N = 3; -march=native)
  Znver3: 12.22 (SE +/- 0.03, N = 3; -march=znver3)
  (CXX) g++ options: -O3 -flto -pthread
JPEG XL Decoding libjxl
The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities, with JPEG XL offering better image quality and compression than legacy JPEG. This test profile measures JPEG XL decode performance to a PNG output file; the pts/jpegxl test covers encode performance. The JPEG XL encoding/decoding is done using the libjxl codebase.
JPEG XL Decoding libjxl 0.7 - CPU Threads: 1 (MP/s, More Is Better)
  Znver4: 48.64 (SE +/- 0.12, N = 3)
  Znver3: 48.22 (SE +/- 0.12, N = 3)
  Znver4 + Prefer AVX-512: 48.21 (SE +/- 0.12, N = 3)
  Znver3 + AVX-512: 47.02 (SE +/- 0.25, N = 3)
JPEG XL Decoding libjxl 0.7 - CPU Threads: All (MP/s, More Is Better)
  Znver4: 277.20 (SE +/- 1.41, N = 3)
  Znver4 + Prefer AVX-512: 272.31 (SE +/- 0.30, N = 3)
  Znver3: 269.59 (SE +/- 1.29, N = 3)
  Znver3 + AVX-512: 266.53 (SE +/- 1.24, N = 3)
JPEG XL libjxl
The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities, with JPEG XL offering better image quality and compression than legacy JPEG. This test profile is currently focused on multi-threaded JPEG XL image encode performance using the reference libjxl library.
JPEG XL libjxl 0.7 - Input: PNG - Quality: 90 (MP/s, More Is Better)
  Znver3 + AVX-512: 9.86 (SE +/- 0.07, N = 3; -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi)
  Znver4: 9.83 (SE +/- 0.03, N = 3; -march=native)
  Znver4 + Prefer AVX-512: 9.81 (SE +/- 0.04, N = 3; -march=native)
  Znver3: 9.61 (SE +/- 0.01, N = 3; -march=znver3)
  (CXX) g++ options: -O3 -flto -fno-rtti -funwind-tables -O2 -fPIE -pie -latomic
JPEG XL libjxl 0.7 - Input: JPEG - Quality: 90 (MP/s, More Is Better)
  Znver4: 9.48 (SE +/- 0.05, N = 3; -march=native)
  Znver4 + Prefer AVX-512: 9.43 (SE +/- 0.03, N = 3; -march=native)
  Znver3 + AVX-512: 9.27 (SE +/- 0.13, N = 3; -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi)
  Znver3: 9.18 (SE +/- 0.03, N = 3; -march=znver3)
  (CXX) g++ options: -O3 -flto -fno-rtti -funwind-tables -O2 -fPIE -pie -latomic
JPEG XL libjxl 0.7 - Input: PNG - Quality: 100 (MP/s, More Is Better)
  Znver4: 0.83 (SE +/- 0.00, N = 3; -march=native)
  Znver3: 0.82 (SE +/- 0.00, N = 3; -march=znver3)
  Znver4 + Prefer AVX-512: 0.82 (SE +/- 0.00, N = 3; -march=native)
  Znver3 + AVX-512: 0.81 (SE +/- 0.00, N = 3; -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi)
  (CXX) g++ options: -O3 -flto -fno-rtti -funwind-tables -O2 -fPIE -pie -latomic
JPEG XL libjxl 0.7 - Input: JPEG - Quality: 100 (MP/s, More Is Better)
  Znver4 + Prefer AVX-512: 0.75 (SE +/- 0.01, N = 3; -march=native)
  Znver3: 0.74 (SE +/- 0.01, N = 6; -march=znver3)
  Znver4: 0.73 (SE +/- 0.01, N = 9; -march=native)
  Znver3 + AVX-512: 0.71 (SE +/- 0.01, N = 9; -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi)
  (CXX) g++ options: -O3 -flto -fno-rtti -funwind-tables -O2 -fPIE -pie -latomic
WebP Image Encode
This is a test of Google's libwebp with the cwebp image encode utility, using a sample 6000x4000 pixel JPEG image as the input.
WebP Image Encode 1.2.4 - Encode Settings: Default (MP/s, More Is Better)
  Znver3: 18.99 (SE +/- 0.02, N = 3; -march=znver3 -lgif -ltiff)
  Znver4 + Prefer AVX-512: 18.97 (SE +/- 0.01, N = 3; -march=native -ljpeg -ltiff)
  Znver4: 18.95 (SE +/- 0.00, N = 3; -march=native -ljpeg)
  Znver3 + AVX-512: 18.85 (SE +/- 0.07, N = 3; -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -ljpeg -ltiff)
  (CC) gcc options: -fvisibility=hidden -O3 -flto -lpng16 -lm
WebP Image Encode 1.2.4 - Encode Settings: Quality 100 (MP/s, More Is Better)
  Znver4 + Prefer AVX-512: 11.54 (SE +/- 0.01, N = 3; -march=native -ljpeg -ltiff)
  Znver4: 11.54 (SE +/- 0.03, N = 3; -march=native -ljpeg)
  Znver3: 11.48 (SE +/- 0.01, N = 3; -march=znver3 -lgif -ltiff)
  Znver3 + AVX-512: 11.38 (SE +/- 0.01, N = 3; -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -ljpeg -ltiff)
  (CC) gcc options: -fvisibility=hidden -O3 -flto -lm -lpng16
WebP Image Encode 1.2.4 - Encode Settings: Quality 100, Lossless (MP/s, More Is Better)
  Znver3: 1.47 (SE +/- 0.00, N = 3; -march=znver3 -lgif -ltiff)
  Znver4 + Prefer AVX-512: 1.47 (SE +/- 0.00, N = 3; -march=native -ljpeg -ltiff)
  Znver3 + AVX-512: 1.45 (SE +/- 0.00, N = 3; -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -ljpeg -ltiff)
  Znver4: 1.45 (SE +/- 0.00, N = 3; -march=native -ljpeg)
  (CC) gcc options: -fvisibility=hidden -O3 -flto -lpng16 -lm
WebP Image Encode 1.2.4 - Encode Settings: Quality 100, Highest Compression (MP/s, More Is Better)
  Znver3: 3.64 (SE +/- 0.00, N = 3; -march=znver3 -lgif -ltiff)
  Znver4: 3.25 (SE +/- 0.00, N = 3; -march=native -ljpeg)
  Znver4 + Prefer AVX-512: 3.23 (SE +/- 0.01, N = 3; -march=native -ljpeg -ltiff)
  Znver3 + AVX-512: 3.11 (SE +/- 0.00, N = 3; -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -ljpeg -ltiff)
  (CC) gcc options: -fvisibility=hidden -O3 -flto -lpng16 -lm
WebP Image Encode 1.2.4 - Encode Settings: Quality 100, Lossless, Highest Compression (MP/s, More Is Better)
  Znver3 + AVX-512: 0.58 (SE +/- 0.00, N = 3; -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -ljpeg -ltiff)
  Znver3: 0.58 (SE +/- 0.00, N = 3; -march=znver3 -lgif -ltiff)
  Znver4 + Prefer AVX-512: 0.58 (SE +/- 0.00, N = 3; -march=native -ljpeg -ltiff)
  Znver4: 0.57 (SE +/- 0.00, N = 3; -march=native -ljpeg)
  (CC) gcc options: -fvisibility=hidden -O3 -flto -lm -lpng16
SecureMark
SecureMark is an objective, standardized benchmarking framework developed by EEMBC for measuring the efficiency of cryptographic processing solutions. SecureMark-TLS benchmarks Transport Layer Security performance with a focus on IoT/edge computing.
SecureMark 1.0.4 - Benchmark: SecureMark-TLS (marks, More Is Better)
  Znver3 + AVX-512: 296575 (SE +/- 380.50, N = 3)
  Znver4: 296548 (SE +/- 383.28, N = 3)
  Znver4 + Prefer AVX-512: 294122 (SE +/- 640.61, N = 3)
  Znver3: 294057 (SE +/- 1276.92, N = 3)
  (CC) gcc options: -pedantic -O3
QuantLib
QuantLib is an open-source library/framework around quantitative finance for modeling, trading and risk management scenarios. QuantLib is written in C++ with Boost, and its built-in benchmark reports the QuantLib Benchmark Index score.
QuantLib 1.21 (MFLOPS, More Is Better)
  Znver3: 3120.5 (SE +/- 4.52, N = 3)
  Znver3 + AVX-512: 3114.9 (SE +/- 2.74, N = 3)
  Znver4 + Prefer AVX-512: 3112.6 (SE +/- 1.71, N = 3)
  Znver4: 3096.9 (SE +/- 1.99, N = 3)
  (CXX) g++ options: -O3 -march=native -rdynamic
miniBUDE
MiniBUDE is a mini application for the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking.
miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM1 (GFInst/s, More Is Better)
  Znver3 + AVX-512: 5365.61 (SE +/- 53.26, N = 3)
  Znver4: 5362.54 (SE +/- 51.63, N = 3)
  Znver3: 5351.94 (SE +/- 62.89, N = 3)
  Znver4 + Prefer AVX-512: 5275.43 (SE +/- 62.65, N = 4)
  (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm
miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM1 (Billion Interactions/s, More Is Better)
  Znver3 + AVX-512: 214.63 (SE +/- 2.13, N = 3)
  Znver4: 214.50 (SE +/- 2.07, N = 3)
  Znver3: 214.08 (SE +/- 2.52, N = 3)
  Znver4 + Prefer AVX-512: 211.02 (SE +/- 2.51, N = 4)
  (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm
miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM2 (GFInst/s, More Is Better)
  Znver3 + AVX-512: 6653.86 (SE +/- 26.43, N = 3)
  Znver4 + Prefer AVX-512: 6638.01 (SE +/- 18.27, N = 3)
  Znver3: 6607.09 (SE +/- 18.35, N = 3)
  Znver4: 6603.68 (SE +/- 43.78, N = 3)
  (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm
miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM2 (Billion Interactions/s, More Is Better)
  Znver3 + AVX-512: 266.16 (SE +/- 1.06, N = 3)
  Znver4 + Prefer AVX-512: 265.52 (SE +/- 0.73, N = 3)
  Znver3: 264.28 (SE +/- 0.73, N = 3)
  Znver4: 264.15 (SE +/- 1.75, N = 3)
  (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm
GROMACS
This test runs the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds.
GROMACS 2022.1 - Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better)
  Znver4: 19.49 (SE +/- 0.02, N = 3; -march=native)
  Znver4 + Prefer AVX-512: 19.44 (SE +/- 0.01, N = 3; -march=native)
  Znver3 + AVX-512: 19.09 (SE +/- 0.01, N = 3; -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi)
  Znver3: 18.23 (SE +/- 0.01, N = 3; -march=znver3)
  (CXX) g++ options: -O3 -flto
LAMMPS Molecular Dynamics Simulator
LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator.
LAMMPS Molecular Dynamics Simulator 23Jun2022 - Model: 20k Atoms (ns/day, More Is Better)
  Znver3 + AVX-512: 56.04 (SE +/- 0.08, N = 3; -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi)
  Znver4 + Prefer AVX-512: 55.74 (SE +/- 0.21, N = 3; -march=native)
  Znver3: 55.72 (SE +/- 0.04, N = 3; -march=znver3)
  Znver4: 55.62 (SE +/- 0.15, N = 3; -march=native)
  (CXX) g++ options: -O3 -flto -lm -ldl
LAMMPS Molecular Dynamics Simulator 23Jun2022 - Model: Rhodopsin Protein (ns/day, More Is Better)
  Znver4 + Prefer AVX-512: 52.29 (SE +/- 0.48, N = 3; -march=native)
  Znver3 + AVX-512: 51.82 (SE +/- 0.40, N = 10; -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi)
  Znver4: 51.59 (SE +/- 0.21, N = 3; -march=native)
  Znver3: 51.46 (SE +/- 0.43, N = 3; -march=znver3)
  (CXX) g++ options: -O3 -flto -lm -ldl
oneDNN
This is a test of Intel oneDNN, an Intel-optimized library for deep neural networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit.
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.0 Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU Znver3 + AVX-512 Znver4 + Prefer AVX-512 Znver3 Znver4 4 8 12 16 20 SE +/- 0.42, N = 12 SE +/- 0.24, N = 15 SE +/- 0.34, N = 12 SE +/- 0.11, N = 3 14.25 14.29 14.62 14.82 -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 6.05 MIN: 8.38 -march=znver3 - MIN: 8.56 MIN: 9.81 1. (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.0 Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU Znver4 + Prefer AVX-512 Znver4 Znver3 + AVX-512 Znver3 0.2043 0.4086 0.6129 0.8172 1.0215 SE +/- 0.004696, N = 3 SE +/- 0.004516, N = 3 SE +/- 0.002136, N = 3 SE +/- 0.011617, N = 3 0.881303 0.886642 0.902303 0.907941 MIN: 0.74 MIN: 0.76 -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 0.75 -march=znver3 - MIN: 0.75 1. (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.0 Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU Znver3 Znver4 Znver3 + AVX-512 Znver4 + Prefer AVX-512 6 12 18 24 30 SE +/- 0.82, N = 15 SE +/- 0.95, N = 15 SE +/- 0.76, N = 15 SE +/- 0.56, N = 12 19.59 21.14 22.94 23.53 -march=znver3 - MIN: 9.21 MIN: 9.61 -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 9.35 MIN: 10.06 1. (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Znver4: 4.26600 (SE +/- 0.04728, N = 15; MIN: 2.83)
  Znver4 + Prefer AVX-512: 4.33366 (SE +/- 0.04436, N = 15; MIN: 2.77)
  Znver3 + AVX-512: 4.33658 (SE +/- 0.03566, N = 9; MIN: 2.98) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver3: 4.52440 (SE +/- 0.04256, N = 15; MIN: 3.03) [-march=znver3]
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Znver4: 0.380486 (SE +/- 0.000892, N = 3; MIN: 0.28)
  Znver4 + Prefer AVX-512: 0.388483 (SE +/- 0.003460, N = 3; MIN: 0.28)
  Znver3: 0.392765 (SE +/- 0.000630, N = 3; MIN: 0.28) [-march=znver3]
  Znver3 + AVX-512: 0.392875 (SE +/- 0.004423, N = 3; MIN: 0.28) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Znver3 + AVX-512: 0.936374 (SE +/- 0.011238, N = 4; MIN: 0.79) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 0.937164 (SE +/- 0.012915, N = 3; MIN: 0.77)
  Znver3: 0.947013 (SE +/- 0.009578, N = 5; MIN: 0.79) [-march=znver3]
  Znver4 + Prefer AVX-512: 0.954140 (SE +/- 0.009537, N = 3; MIN: 0.78)
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Znver4 + Prefer AVX-512: 0.274662 (SE +/- 0.001864, N = 3; MIN: 0.24)
  Znver4: 0.274769 (SE +/- 0.001957, N = 12; MIN: 0.24)
  Znver3 + AVX-512: 0.276965 (SE +/- 0.003530, N = 3; MIN: 0.24) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver3: 0.279655 (SE +/- 0.002168, N = 15; MIN: 0.23) [-march=znver3]
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Znver4: 2020.24 (SE +/- 16.34, N = 3; MIN: 1878.54)
  Znver3 + AVX-512: 2099.43 (SE +/- 18.84, N = 15; MIN: 1917.53) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver3: 2108.46 (SE +/- 26.49, N = 3; MIN: 1945.83) [-march=znver3]
  Znver4 + Prefer AVX-512: 2123.35 (SE +/- 26.64, N = 12; MIN: 1873.09)
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Znver3 + AVX-512: 0.431299 (SE +/- 0.004441, N = 15; MIN: 0.33) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver3: 0.442748 (SE +/- 0.002906, N = 3; MIN: 0.34) [-march=znver3]
  Znver4: 0.443891 (SE +/- 0.001884, N = 3; MIN: 0.37)
  Znver4 + Prefer AVX-512: 0.445532 (SE +/- 0.000431, N = 3; MIN: 0.34)
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Znver4 + Prefer AVX-512: 2.29880 (SE +/- 0.00909, N = 3; MIN: 1.92)
  Znver3: 2.31858 (SE +/- 0.00532, N = 3; MIN: 1.91) [-march=znver3]
  Znver4: 2.33363 (SE +/- 0.00839, N = 3; MIN: 1.95)
  Znver3 + AVX-512: 2.36166 (SE +/- 0.00584, N = 3; MIN: 1.92) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Znver3 + AVX-512: 0.647087 (SE +/- 0.003927, N = 3; MIN: 0.56) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 0.650067 (SE +/- 0.001081, N = 3; MIN: 0.53)
  Znver3: 0.656342 (SE +/- 0.006925, N = 4; MIN: 0.54) [-march=znver3]
  Znver4: 0.658101 (SE +/- 0.005902, N = 3; MIN: 0.53)
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Znver4 + Prefer AVX-512: 2359.29 (SE +/- 31.29, N = 15; MIN: 2066.84)
  Znver3 + AVX-512: 2377.17 (SE +/- 26.39, N = 15; MIN: 2080.2) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 2405.63 (SE +/- 20.65, N = 15; MIN: 2158.4)
  Znver3: 2442.35 (SE +/- 32.77, N = 12; MIN: 2123.49) [-march=znver3]
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Znver4 + Prefer AVX-512: 2070.13 (SE +/- 16.85, N = 3; MIN: 1938.32)
  Znver3: 2093.75 (SE +/- 27.68, N = 15; MIN: 1826.26) [-march=znver3]
  Znver3 + AVX-512: 2103.41 (SE +/- 15.62, N = 15; MIN: 1883.21) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 2134.85 (SE +/- 9.82, N = 3; MIN: 2008.7)
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Znver3: 2418.28 (SE +/- 24.11, N = 15; MIN: 2149.67) [-march=znver3]
  Znver4 + Prefer AVX-512: 2444.52 (SE +/- 25.52, N = 3; MIN: 2258.45)
  Znver4: 2479.72 (SE +/- 24.81, N = 3; MIN: 2315.89)
  Znver3 + AVX-512: 2531.42 (SE +/- 16.53, N = 3; MIN: 2281.87) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Znver4 + Prefer AVX-512: 0.483979 (SE +/- 0.003087, N = 3; MIN: 0.39)
  Znver4: 0.500579 (SE +/- 0.004211, N = 3; MIN: 0.39)
  Znver3 + AVX-512: 0.504842 (SE +/- 0.006958, N = 3; MIN: 0.39) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver3: 0.510939 (SE +/- 0.006281, N = 3; MIN: 0.39) [-march=znver3]
  (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
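Many of the oneDNN gaps above are small next to the reported standard errors. A minimal sketch of one way to sanity-check that, assuming the runs are independent and treating only a difference larger than two combined standard errors as meaningful. The helper name `within_noise` and the `k = 2` threshold are illustrative assumptions, not something the test suite reports:

```python
import math


def within_noise(mean_a, se_a, mean_b, se_b, k=2.0):
    """Return True if two results differ by no more than k combined standard errors.

    Assumes independent runs; k = 2.0 is a rough rule of thumb, not a formal test.
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) <= k * combined_se


# IP Shapes 1D, u8s8f32: Znver3 + AVX-512 (14.25 ms) vs. Znver4 (14.82 ms)
print(within_noise(14.25, 0.42, 14.82, 0.11))

# IP Shapes 1D, bf16bf16bf16: Znver3 (19.59 ms) vs. Znver4 + Prefer AVX-512 (23.53 ms)
print(within_noise(19.59, 0.82, 23.53, 0.56))
```

With the figures reported above, the first comparison falls inside the noise band and the second does not.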
ACES DGEMM
This is a multi-threaded DGEMM benchmark.
ACES DGEMM 1.0 - Sustained Floating-Point Rate (GFLOP/s, More Is Better)
  Znver3: 70.38 (SE +/- 0.06, N = 3) [-march=znver3]
  Znver4 + Prefer AVX-512: 70.30 (SE +/- 0.07, N = 3)
  Znver3 + AVX-512: 70.19 (SE +/- 0.28, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 70.05 (SE +/- 0.10, N = 3)
  (CC) gcc options: -O3 -march=native -fopenmp -flto
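For context on the GFLOP/s unit: a DGEMM rate is conventionally the operation count of a dense matrix product divided by wall-clock time. A minimal pure-Python sketch, assuming the common 2n^3 flop count for an n x n product. The benchmark's own accounting may differ, and the function names here are illustrative:

```python
import time


def dgemm_flops(n):
    """Conventional flop count for an n x n by n x n product: n^3 multiplies + n^3 adds."""
    return 2 * n ** 3


def matmul(a, b):
    """Naive dense matrix product on lists of lists (illustration only, not optimized)."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)] for i in range(n)]


def measure_gflops(n):
    """Time one n x n product and convert the elapsed time to GFLOP/s."""
    a = [[1.0] * n for _ in range(n)]
    b = [[2.0] * n for _ in range(n)]
    start = time.perf_counter()
    matmul(a, b)
    elapsed = time.perf_counter() - start
    return dgemm_flops(n) / elapsed / 1e9


print(f"{measure_gflops(64):.4f} GFLOP/s (interpreted Python, single thread)")
```

The absolute number from this sketch is orders of magnitude below a compiled, multi-threaded DGEMM; it only illustrates how the unit is derived.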
Kripke
Kripke is a simple, scalable, 3D Sn deterministic particle transport code. Its primary purpose is to research how data layout, programming paradigms and architectures affect the implementation and performance of Sn transport. Kripke is developed by LLNL.
Kripke 1.2.4 (Throughput FoM, More Is Better)
  Znver3 + AVX-512: 271735708 (SE +/- 3107958.27, N = 12) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver3: 263812847 (SE +/- 2950522.72, N = 15) [-march=znver3]
  Znver4: 261562280 (SE +/- 3293132.01, N = 15) [-march=native]
  Znver4 + Prefer AVX-512: 254648533 (SE +/- 2000945.99, N = 3) [-march=native]
  (CXX) g++ options: -O3 -flto -fopenmp
GPAW
GPAW is a density-functional theory (DFT) Python code based on the projector-augmented wave (PAW) method and the atomic simulation environment (ASE).
GPAW 22.1 - Input: Carbon Nanotube (Seconds, Fewer Is Better)
  Znver3: 21.72 (SE +/- 0.26, N = 4) [-march=znver3]
  Znver4: 22.35 (SE +/- 0.15, N = 3) [-march=native]
  Znver4 + Prefer AVX-512: 22.82 (SE +/- 0.28, N = 3) [-march=native]
  Znver3 + AVX-512: 22.82 (SE +/- 0.07, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CC) gcc options: -shared -fwrapv -O2 -O3 -flto -lxc -lblas -lmpi
Coremark
This is a test of the EEMBC CoreMark processor benchmark.
Coremark 1.0 - CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better)
  Znver3 + AVX-512: 7871273.93 (SE +/- 2823.14, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 7861097.86 (SE +/- 83462.96, N = 3) [-march=native]
  Znver4: 7694653.90 (SE +/- 11460.97, N = 3) [-march=native]
  Znver3: 7640546.27 (SE +/- 62802.17, N = 3) [-march=znver3]
  (CC) gcc options: -O2 -O3 -flto -lrt" -lrt
Zstd Compression
This test measures the time needed to compress/decompress a sample file (a FreeBSD disk image - FreeBSD-12.2-RELEASE-amd64-memstick.img) using Zstd compression with options for different compression levels / settings.
Zstd Compression 1.5.0 - Compression Level: 19 - Compression Speed (MB/s, More Is Better)
  Znver4 + Prefer AVX-512: 105.3 (SE +/- 1.30, N = 3) [-march=native]
  Znver3: 104.4 (SE +/- 0.91, N = 15) [-march=znver3]
  Znver3 + AVX-512: 102.9 (SE +/- 1.27, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 102.1 (SE +/- 1.03, N = 6) [-march=native]
  (CC) gcc options: -O3 -flto -pthread -lz -llzma
Zstd Compression 1.5.0 - Compression Level: 19 - Decompression Speed (MB/s, More Is Better)
  Znver3 + AVX-512: 3594.8 (SE +/- 32.49, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 3584.1 (SE +/- 17.00, N = 6) [-march=native]
  Znver4 + Prefer AVX-512: 3581.6 (SE +/- 36.60, N = 3) [-march=native]
  Znver3: 3574.9 (SE +/- 12.56, N = 15) [-march=znver3]
  (CC) gcc options: -O3 -flto -pthread -lz -llzma
Zstd Compression 1.5.0 - Compression Level: 19, Long Mode - Compression Speed (MB/s, More Is Better)
  Znver3 + AVX-512: 43.9 (SE +/- 0.79, N = 15) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 42.5 (SE +/- 0.59, N = 15) [-march=native]
  Znver4: 40.8 (SE +/- 0.40, N = 15) [-march=native]
  Znver3: 39.8 (SE +/- 0.40, N = 15) [-march=znver3]
  (CC) gcc options: -O3 -flto -pthread -lz -llzma
Zstd Compression 1.5.0 - Compression Level: 19, Long Mode - Decompression Speed (MB/s, More Is Better)
  Znver4: 3708.8 (SE +/- 12.59, N = 15) [-march=native]
  Znver3: 3695.5 (SE +/- 11.99, N = 15) [-march=znver3]
  Znver4 + Prefer AVX-512: 3685.7 (SE +/- 13.90, N = 15) [-march=native]
  Znver3 + AVX-512: 3684.7 (SE +/- 12.89, N = 15) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CC) gcc options: -O3 -flto -pthread -lz -llzma
Cpuminer-Opt
Cpuminer-Opt is a fork of cpuminer-multi that carries a wide range of CPU performance optimizations for measuring the potential mining performance of the CPU across a wide variety of cryptocurrencies. The benchmark reports the CPU hash speed for the selected cryptocurrency.
Cpuminer-Opt 3.20.3 - Algorithm: Magi (kH/s, More Is Better)
  Znver3: 8490.73 (SE +/- 51.27, N = 3) [-march=znver3]
  Znver4 + Prefer AVX-512: 8467.24 (SE +/- 64.85, N = 3) [-march=native]
  Znver4: 8440.79 (SE +/- 47.28, N = 3) [-march=native]
  Znver3 + AVX-512: 8355.75 (SE +/- 50.94, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.20.3 - Algorithm: x25x (kH/s, More Is Better)
  Znver4 + Prefer AVX-512: 8217.70 (SE +/- 90.39, N = 3) [-march=native]
  Znver4: 8042.88 (SE +/- 74.42, N = 3) [-march=native]
  Znver3 + AVX-512: 7941.38 (SE +/- 15.17, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver3: 6116.97 (SE +/- 17.26, N = 3) [-march=znver3]
  (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.20.3 - Algorithm: scrypt (kH/s, More Is Better)
  Znver4: 4790.11 (SE +/- 0.45, N = 3) [-march=native]
  Znver4 + Prefer AVX-512: 4782.74 (SE +/- 1.22, N = 3) [-march=native]
  Znver3 + AVX-512: 4763.91 (SE +/- 1.99, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver3: 2959.15 (SE +/- 0.45, N = 3) [-march=znver3]
  (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.20.3 - Algorithm: Deepcoin (kH/s, More Is Better)
  Znver3: 164993 (SE +/- 218.28, N = 3) [-march=znver3]
  Znver4 + Prefer AVX-512: 162242 (SE +/- 1703.12, N = 5) [-march=native]
  Znver3 + AVX-512: 160157 (SE +/- 880.18, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 159147 (SE +/- 81.72, N = 3) [-march=native]
  (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.20.3 - Algorithm: Garlicoin (kH/s, More Is Better)
  Znver3 + AVX-512: 72837 (SE +/- 742.93, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 72413 (SE +/- 461.75, N = 3) [-march=native]
  Znver4 + Prefer AVX-512: 72130 (SE +/- 295.35, N = 3) [-march=native]
  Znver3: 49523 (SE +/- 66.92, N = 3) [-march=znver3]
  (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.20.3 - Algorithm: Skeincoin (kH/s, More Is Better)
  Znver4: 2014770 (SE +/- 10323.18, N = 3) [-march=native]
  Znver4 + Prefer AVX-512: 2009367 (SE +/- 7108.55, N = 3) [-march=native]
  Znver3 + AVX-512: 2004990 (SE +/- 24878.62, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver3: 1414047 (SE +/- 5094.00, N = 3) [-march=znver3]
  (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.20.3 - Algorithm: LBC, LBRY Credits (kH/s, More Is Better)
  Znver4 + Prefer AVX-512: 1085827 (SE +/- 12123.53, N = 3) [-march=native]
  Znver3 + AVX-512: 1067743 (SE +/- 1800.04, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 1065487 (SE +/- 829.54, N = 3) [-march=native]
  Znver3: 497020 (SE +/- 3568.28, N = 3) [-march=znver3]
  (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.20.3 - Algorithm: Quad SHA-256, Pyrite (kH/s, More Is Better)
  Znver4 + Prefer AVX-512: 2323995 (SE +/- 25368.27, N = 4) [-march=native]
  Znver3 + AVX-512: 2264747 (SE +/- 22716.96, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 2251067 (SE +/- 10306.20, N = 3) [-march=native]
  Znver3: 1378987 (SE +/- 5899.09, N = 3) [-march=znver3]
  (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.20.3 - Algorithm: Triple SHA-256, Onecoin (kH/s, More Is Better)
  Znver3 + AVX-512: 3323680 (SE +/- 26057.45, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 3306643 (SE +/- 26496.25, N = 3) [-march=native]
  Znver4 + Prefer AVX-512: 3301217 (SE +/- 22336.69, N = 3) [-march=native]
  Znver3: 3255253 (SE +/- 4313.22, N = 3) [-march=znver3]
  (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp
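In six of the Cpuminer-Opt algorithms above, the plain Znver3 build trails the other three configurations by a wide margin. A small sketch that expresses those gaps as ratios, using the Znver4 and Znver3 figures copied from the charts. The helper name `speedup` is illustrative only:

```python
def speedup(result, baseline):
    """Ratio of a higher-is-better result to a baseline (e.g. kH/s vs. kH/s)."""
    return result / baseline


# (Znver4 kH/s, Znver3 kH/s) taken from the Cpuminer-Opt charts above.
results = {
    "x25x": (8042.88, 6116.97),
    "scrypt": (4790.11, 2959.15),
    "Garlicoin": (72413, 49523),
    "Skeincoin": (2014770, 1414047),
    "LBC, LBRY Credits": (1065487, 497020),
    "Quad SHA-256, Pyrite": (2251067, 1378987),
}

for algorithm, (znver4, znver3) in results.items():
    print(f"{algorithm}: {speedup(znver4, znver3):.2f}x")
```

The largest gap is LBC, where the Znver4 build is a little over twice as fast as the Znver3 build.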
Kvazaar
This is a test of Kvazaar as a CPU-based H.265/HEVC video encoder written in the C programming language and optimized in Assembly. Kvazaar is the winner of the 2016 ACM Open-Source Software Competition and developed at the Ultra Video Group, Tampere University, Finland.
Kvazaar 2.1 - Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better)
  Znver4: 64.35 (SE +/- 0.52, N = 3) [-march=native]
  Znver3: 64.27 (SE +/- 0.36, N = 3) [-march=znver3]
  Znver4 + Prefer AVX-512: 64.05 (SE +/- 0.46, N = 3) [-march=native]
  Znver3 + AVX-512: 63.80 (SE +/- 0.10, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -flto -lpthread -lm -lrt
Kvazaar 2.1 - Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better)
  Znver3: 72.01 (SE +/- 0.28, N = 3) [-march=znver3]
  Znver4 + Prefer AVX-512: 70.81 (SE +/- 0.92, N = 3) [-march=native]
  Znver3 + AVX-512: 70.71 (SE +/- 0.87, N = 4) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 70.71 (SE +/- 0.83, N = 3) [-march=native]
  (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -flto -lpthread -lm -lrt
Kvazaar 2.1 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, More Is Better)
  Znver4: 78.69 (SE +/- 1.01, N = 3) [-march=native]
  Znver3: 78.01 (SE +/- 0.62, N = 3) [-march=znver3]
  Znver3 + AVX-512: 76.76 (SE +/- 0.64, N = 15) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 74.98 (SE +/- 0.73, N = 15) [-march=native]
  (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -flto -lpthread -lm -lrt
GraphicsMagick
This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image.
GraphicsMagick 1.3.38 - Operation: Swirl (Iterations Per Minute, More Is Better)
  Znver4: 2862 (SE +/- 26.19, N = 7) [-march=native]
  Znver3 + AVX-512: 2826 (SE +/- 28.99, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 2681 (SE +/- 8.09, N = 3) [-march=native]
  Znver3: 2563 (SE +/- 36.67, N = 3) [-march=znver3]
  (CC) gcc options: -fopenmp -O3 -flto -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread
GraphicsMagick 1.3.38 - Operation: Rotate (Iterations Per Minute, More Is Better)
  Znver4: 673 (SE +/- 1.73, N = 3) [-march=native]
  Znver4 + Prefer AVX-512: 656 (SE +/- 1.00, N = 3) [-march=native]
  Znver3: 645 (SE +/- 1.20, N = 3) [-march=znver3]
  Znver3 + AVX-512: 605 (SE +/- 3.84, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CC) gcc options: -fopenmp -O3 -flto -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread
GraphicsMagick 1.3.38 - Operation: Sharpen (Iterations Per Minute, More Is Better)
  Znver4: 1359 (SE +/- 6.56, N = 3) [-march=native]
  Znver3 + AVX-512: 1321 (SE +/- 1.15, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 1314 (SE +/- 13.17, N = 3) [-march=native]
  Znver3: 1285 (SE +/- 10.73, N = 3) [-march=znver3]
  (CC) gcc options: -fopenmp -O3 -flto -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread
GraphicsMagick 1.3.38 - Operation: Enhanced (Iterations Per Minute, More Is Better)
  Znver4: 2234 (SE +/- 2.00, N = 3) [-march=native]
  Znver3 + AVX-512: 2208 (SE +/- 14.95, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 2150 (SE +/- 5.00, N = 3) [-march=native]
  Znver3: 1837 (SE +/- 19.19, N = 3) [-march=znver3]
  (CC) gcc options: -fopenmp -O3 -flto -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread
GraphicsMagick 1.3.38 - Operation: Resizing (Iterations Per Minute, More Is Better)
  Znver3: 89 (SE +/- 1.27, N = 15) [-march=znver3]
  Znver4: 88 (SE +/- 1.00, N = 15) [-march=native]
  Znver4 + Prefer AVX-512: 87 (SE +/- 0.94, N = 15) [-march=native]
  Znver3 + AVX-512: 86 (SE +/- 0.90, N = 15) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CC) gcc options: -fopenmp -O3 -flto -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread
GraphicsMagick 1.3.38 - Operation: Noise-Gaussian (Iterations Per Minute, More Is Better)
  Znver4: 1024 (SE +/- 6.60, N = 15) [-march=native]
  Znver3: 1018 (SE +/- 5.13, N = 3) [-march=znver3]
  Znver4 + Prefer AVX-512: 1013 (SE +/- 11.39, N = 15) [-march=native]
  Znver3 + AVX-512: 975 (SE +/- 11.43, N = 15) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CC) gcc options: -fopenmp -O3 -flto -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread
GraphicsMagick 1.3.38 - Operation: HWB Color Space (Iterations Per Minute, More Is Better)
  Znver4: 1180 (SE +/- 4.84, N = 3) [-march=native]
  Znver4 + Prefer AVX-512: 1167 (SE +/- 15.31, N = 15) [-march=native]
  Znver3: 1134 (SE +/- 11.42, N = 15) [-march=znver3]
  Znver3 + AVX-512: 1062 (SE +/- 1.86, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CC) gcc options: -fopenmp -O3 -flto -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread
SVT-AV1
This is a benchmark of the SVT-AV1 open-source video encoder/decoder. SVT-AV1 was originally developed by Intel as part of their Open Visual Cloud / Scalable Video Technology (SVT). Development of SVT-AV1 has since moved to the Alliance for Open Media as part of upstream AV1 development. SVT-AV1 is a CPU-based multi-threaded video encoder for the AV1 video format with a sample YUV video file.
SVT-AV1 1.4 - Encoder Mode: Preset 4 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
  Znver4 + Prefer AVX-512: 5.423 (SE +/- 0.043, N = 3)
  Znver4: 5.392 (SE +/- 0.015, N = 3)
  Znver3 + AVX-512: 5.374 (SE +/- 0.037, N = 3) [-march=znver3 -mavx512cd -mavx512vl -mavx512ifma -mavx512vbmi]
  Znver3: 5.360 (SE +/- 0.012, N = 3) [-march=znver3]
  (CXX) g++ options: -O3 -march=native -flto -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 1.4 - Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
  Znver3: 95.12 (SE +/- 0.46, N = 3) [-march=znver3]
  Znver4 + Prefer AVX-512: 94.34 (SE +/- 0.32, N = 3)
  Znver4: 93.76 (SE +/- 0.33, N = 3)
  Znver3 + AVX-512: 92.96 (SE +/- 0.52, N = 3) [-march=znver3 -mavx512cd -mavx512vl -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -O3 -flto -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 1.4 - Encoder Mode: Preset 12 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
  Znver3: 222.90 (SE +/- 5.39, N = 15) [-march=znver3]
  Znver4: 210.39 (SE +/- 4.58, N = 15)
  Znver3 + AVX-512: 206.68 (SE +/- 3.02, N = 15) [-march=znver3 -mavx512cd -mavx512vl -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 196.80 (SE +/- 1.78, N = 3)
  (CXX) g++ options: -O3 -flto -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 1.4 - Encoder Mode: Preset 13 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
  Znver3: 219.57 (SE +/- 4.56, N = 12) [-march=znver3]
  Znver3 + AVX-512: 210.43 (SE +/- 2.81, N = 15) [-march=znver3 -mavx512cd -mavx512vl -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 208.64 (SE +/- 3.20, N = 15)
  Znver4: 196.12 (SE +/- 3.33, N = 15)
  (CXX) g++ options: -O3 -flto -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
libavif avifenc
This is a test of the AOMedia libavif library testing the encoding of a JPEG image to AV1 Image Format (AVIF).
libavif avifenc 0.11 - Encoder Speed: 0 (Seconds, Fewer Is Better)
  Znver4 + Prefer AVX-512: 61.07 (SE +/- 0.07, N = 3) [-march=native]
  Znver4: 61.12 (SE +/- 0.13, N = 3) [-march=native]
  Znver3: 61.66 (SE +/- 0.30, N = 3) [-march=znver3]
  Znver3 + AVX-512: 62.68 (SE +/- 0.02, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -O3 -fPIC -flto -lm
libavif avifenc 0.11 - Encoder Speed: 2 (Seconds, Fewer Is Better)
  Znver4 + Prefer AVX-512: 33.90 (SE +/- 0.19, N = 3) [-march=native]
  Znver3: 33.91 (SE +/- 0.25, N = 3) [-march=znver3]
  Znver4: 34.10 (SE +/- 0.26, N = 3) [-march=native]
  Znver3 + AVX-512: 34.23 (SE +/- 0.15, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -O3 -fPIC -flto -lm
libavif avifenc 0.11 - Encoder Speed: 6 (Seconds, Fewer Is Better)
  Znver4 + Prefer AVX-512: 2.317 (SE +/- 0.002, N = 3) [-march=native]
  Znver3: 2.331 (SE +/- 0.010, N = 3) [-march=znver3]
  Znver4: 2.347 (SE +/- 0.014, N = 3) [-march=native]
  Znver3 + AVX-512: 2.406 (SE +/- 0.005, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -O3 -fPIC -flto -lm
libavif avifenc 0.11 - Encoder Speed: 6, Lossless (Seconds, Fewer Is Better)
  Znver4: 4.398 (SE +/- 0.052, N = 4) [-march=native]
  Znver3: 4.437 (SE +/- 0.027, N = 3) [-march=znver3]
  Znver4 + Prefer AVX-512: 4.462 (SE +/- 0.064, N = 3) [-march=native]
  Znver3 + AVX-512: 4.514 (SE +/- 0.010, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -O3 -fPIC -flto -lm
libavif avifenc 0.11 - Encoder Speed: 10, Lossless (Seconds, Fewer Is Better)
  Znver4: 3.541 (SE +/- 0.012, N = 3) [-march=native]
  Znver4 + Prefer AVX-512: 3.572 (SE +/- 0.017, N = 3) [-march=native]
  Znver3: 3.581 (SE +/- 0.016, N = 3) [-march=znver3]
  Znver3 + AVX-512: 3.647 (SE +/- 0.038, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -O3 -fPIC -flto -lm
Liquid-DSP
LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage.
Liquid-DSP 2021.01.31 - Threads: 128 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
  Znver3 + AVX-512: 6999233333 (SE +/- 4658445.14, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 6990866667 (SE +/- 3192874.01, N = 3) [-march=native]
  Znver4 + Prefer AVX-512: 6977633333 (SE +/- 6145549.43, N = 3) [-march=native]
  Znver3: 6940833333 (SE +/- 6548367.06, N = 3) [-march=znver3]
  (CC) gcc options: -O3 -flto -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 - Threads: 256 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
  Znver3 + AVX-512: 9813700000 (SE +/- 4697162.26, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 9809800000 (SE +/- 8533658.85, N = 3) [-march=native]
  Znver4: 9789500000 (SE +/- 5651843.36, N = 3) [-march=native]
  Znver3: 9735666667 (SE +/- 24004606.04, N = 3) [-march=znver3]
  (CC) gcc options: -O3 -flto -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31 - Threads: 384 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
  Znver3 + AVX-512: 11301666667 (SE +/- 2403700.85, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 11270666667 (SE +/- 7218802.61, N = 3) [-march=native]
  Znver4: 11249333333 (SE +/- 7356025.50, N = 3) [-march=native]
  Znver3: 11176000000 (SE +/- 10066445.91, N = 3) [-march=znver3]
  (CC) gcc options: -O3 -flto -pthread -lm -lc -lliquid
OpenSSL
OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities.
OpenSSL 3.0 - Algorithm: SHA256 (byte/s, More Is Better)
  Znver3 + AVX-512: 266899089453 (SE +/- 143443542.20, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver3: 266464361193 (SE +/- 18811489.95, N = 3) [-march=znver3]
  Znver4 + Prefer AVX-512: 266230124070 (SE +/- 172236238.05, N = 3) [-march=native]
  Znver4: 265326713587 (SE +/- 150345864.28, N = 3) [-march=native]
  (CC) gcc options: -pthread -m64 -O3 -flto -lssl -lcrypto -ldl
OpenSSL 3.0 - Algorithm: RSA4096 (sign/s, More Is Better)
  Znver3 + AVX-512: 44499.3 (SE +/- 0.44, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4: 44490.1 (SE +/- 8.03, N = 3) [-march=native]
  Znver3: 44435.3 (SE +/- 22.49, N = 3) [-march=znver3]
  Znver4 + Prefer AVX-512: 44301.8 (SE +/- 151.57, N = 3) [-march=native]
  (CC) gcc options: -pthread -m64 -O3 -flto -lssl -lcrypto -ldl
OpenSSL 3.0 - Algorithm: RSA4096 (verify/s, More Is Better)
  Znver4: 2939503.5 (SE +/- 140.73, N = 3) [-march=native]
  Znver3: 2938488.5 (SE +/- 228.34, N = 3) [-march=znver3]
  Znver3 + AVX-512: 2935372.1 (SE +/- 1641.03, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver4 + Prefer AVX-512: 2924586.4 (SE +/- 10617.65, N = 3) [-march=native]
  (CC) gcc options: -pthread -m64 -O3 -flto -lssl -lcrypto -ldl
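The SHA256 byte/s figures are throughput numbers: bytes hashed divided by elapsed time. A rough single-threaded sketch of that kind of measurement, using Python's standard `hashlib` rather than the "openssl speed" tool itself. The function name, buffer size and duration are arbitrary assumptions, and the result is not comparable with the multi-threaded numbers above:

```python
import hashlib
import time


def sha256_bytes_per_second(buffer_size=16384, duration=0.2):
    """Hash a fixed buffer repeatedly for roughly `duration` seconds; return bytes/s."""
    data = b"\x00" * buffer_size
    hashed = 0
    start = time.perf_counter()
    while True:
        hashlib.sha256(data).digest()
        hashed += buffer_size
        elapsed = time.perf_counter() - start
        if elapsed >= duration:
            return hashed / elapsed


print(f"{sha256_bytes_per_second():,.0f} byte/s on one thread")
```

Larger buffers usually report higher throughput because per-call overhead is amortized over more data, so the buffer size should be stated alongside any such figure.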
simdjson
This is a benchmark of SIMDJSON, a high-performance JSON parser. SIMDJSON aims to be the fastest JSON parser and is used by projects like Microsoft FishStore, Yandex ClickHouse, Shopify, and others.
simdjson 2.0 - Throughput Test: Kostya (GB/s, More Is Better)
  Znver3: 4.20 (SE +/- 0.01, N = 3) [-march=znver3]
  Znver4 + Prefer AVX-512: 4.19 (SE +/- 0.01, N = 3) [-march=native]
  Znver4: 4.17 (SE +/- 0.02, N = 3) [-march=native]
  Znver3 + AVX-512: 4.16 (SE +/- 0.01, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -O3 -flto
simdjson 2.0 - Throughput Test: TopTweet (GB/s, More Is Better)
  Znver4: 6.96 (SE +/- 0.05, N = 3) [-march=native]
  Znver4 + Prefer AVX-512: 6.91 (SE +/- 0.07, N = 3) [-march=native]
  Znver3 + AVX-512: 6.73 (SE +/- 0.02, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver3: 6.61 (SE +/- 0.05, N = 3) [-march=znver3]
  (CXX) g++ options: -O3 -flto
simdjson 2.0 - Throughput Test: LargeRandom (GB/s, More Is Better)
  Znver4: 1.27 (SE +/- 0.00, N = 3) [-march=native]
  Znver3 + AVX-512: 1.25 (SE +/- 0.00, N = 3) [-march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  Znver3: 1.25 (SE +/- 0.00, N = 3) [-march=znver3]
  Znver4 + Prefer AVX-512: 1.24 (SE +/- 0.00, N = 3) [-march=native]
  (CXX) g++ options: -O3 -flto
simdjson 2.0, Throughput Test: PartialTweets (GB/s, More Is Better)
Znver3: 6.74 (SE +/- 0.03, N = 3), -march=znver3
Znver3 + AVX-512: 6.68 (SE +/- 0.03, N = 3), -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
Znver4: 6.63 (SE +/- 0.03, N = 3), -march=native
Znver4 + Prefer AVX-512: 6.43 (SE +/- 0.05, N = 3), -march=native
1. (CXX) g++ options: -O3 -flto
simdjson 2.0, Throughput Test: DistinctUserID (GB/s, More Is Better)
Znver4 + Prefer AVX-512: 6.86 (SE +/- 0.07, N = 3), -march=native
Znver3 + AVX-512: 6.61 (SE +/- 0.04, N = 3), -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
Znver4: 6.53 (SE +/- 0.04, N = 3), -march=native
Znver3: 6.46 (SE +/- 0.02, N = 3), -march=znver3
1. (CXX) g++ options: -O3 -flto
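Because no single configuration wins all five simdjson tests, one way to compare them is a geometric mean of the per-test throughput. The sketch below is an illustration under that assumption and is not a figure published with these results: the summarize helper is hypothetical, and the inputs are the GB/s values listed above.

```python
from statistics import geometric_mean

# GB/s figures copied from the five simdjson 2.0 results above (more is better).
throughput = {
    "Znver4": {
        "Kostya": 4.17, "TopTweet": 6.96, "LargeRandom": 1.27,
        "PartialTweets": 6.63, "DistinctUserID": 6.53,
    },
    "Znver4 + Prefer AVX-512": {
        "Kostya": 4.19, "TopTweet": 6.91, "LargeRandom": 1.24,
        "PartialTweets": 6.43, "DistinctUserID": 6.86,
    },
    "Znver3": {
        "Kostya": 4.20, "TopTweet": 6.61, "LargeRandom": 1.25,
        "PartialTweets": 6.74, "DistinctUserID": 6.46,
    },
    "Znver3 + AVX-512": {
        "Kostya": 4.16, "TopTweet": 6.73, "LargeRandom": 1.25,
        "PartialTweets": 6.68, "DistinctUserID": 6.61,
    },
}


def summarize(results):
    """Geometric mean of per-test throughput for each configuration.

    Hypothetical helper for illustration only.
    """
    return {
        config: geometric_mean(list(tests.values()))
        for config, tests in results.items()
    }


if __name__ == "__main__":
    ranked = sorted(summarize(throughput).items(), key=lambda kv: kv[1], reverse=True)
    for config, mean in ranked:
        print(f"{config}: {mean:.3f} GB/s (geometric mean)")
```

A geometric mean is used here rather than an arithmetic one so that the slow LargeRandom test is not drowned out by the faster tests.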
PJSIP
PJSIP is a free and open source multimedia communication library written in C that implements standards-based protocols such as SIP, SDP, RTP, STUN, TURN, and ICE. It combines the SIP signaling protocol with a rich multimedia framework and NAT traversal functionality in a high-level API that is portable and suitable for almost any type of system, from desktops and embedded systems to mobile handsets. This test profile uses pjsip-perf with both the client and server on the same system. More details on the PJSIP benchmark are at https://www.pjsip.org/high-performance-sip.htm Learn more via the OpenBenchmarking.org test page.
PJSIP 2.11, Method: INVITE (Responses Per Second, More Is Better)
Znver4: 5200 (SE +/- 19.91, N = 3), -march=native
Znver3: 5149 (SE +/- 15.00, N = 3), -march=znver3
Znver3 + AVX-512: 5132 (SE +/- 24.85, N = 3), -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
Znver4 + Prefer AVX-512: 5084 (SE +/- 51.36, N = 5), -march=native
1. (CC) gcc options: -lavformat -lavcodec -lswscale -lavutil -lstdc++ -lopus -lssl -lcrypto -lm -lrt -lpthread -lasound -O3 -flto
PJSIP 2.11, Method: OPTIONS, Stateful (Responses Per Second, More Is Better)
Znver4 + Prefer AVX-512: 9288 (SE +/- 32.42, N = 3), -march=native
Znver4: 9237 (SE +/- 23.25, N = 3), -march=native
Znver3 + AVX-512: 9236 (SE +/- 36.35, N = 3), -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
Znver3: 9226 (SE +/- 72.95, N = 3), -march=znver3
1. (CC) gcc options: -lavformat -lavcodec -lswscale -lavutil -lstdc++ -lopus -lssl -lcrypto -lm -lrt -lpthread -lasound -O3 -flto
PJSIP 2.11, Method: OPTIONS, Stateless (Responses Per Second, More Is Better)
Znver3: 336885 (SE +/- 4433.83, N = 12), -march=znver3
Znver4 + Prefer AVX-512: 336791 (SE +/- 5240.59, N = 15), -march=native
Znver3 + AVX-512: 336615 (SE +/- 3613.75, N = 15), -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
Znver4: 335767 (SE +/- 2531.21, N = 3), -march=native
1. (CC) gcc options: -lavformat -lavcodec -lswscale -lavutil -lstdc++ -lopus -lssl -lcrypto -lm -lrt -lpthread -lasound -O3 -flto
Znver4
Testing initiated at 3 January 2023 08:40 by user phoronix.
Znver4 + Prefer AVX-512
Testing initiated at 3 January 2023 18:42 by user phoronix.
Znver3
Testing initiated at 4 January 2023 05:51 by user phoronix.
Znver3 + AVX-512
Processor: 2 x AMD EPYC 9654 96-Core @ 3.71GHz (192 Cores / 384 Threads), Motherboard: AMD Titanite_4G (RTI1002E BIOS), Chipset: AMD Device 14a4, Memory: 1520GB, Disk: 800GB INTEL SSDPF21Q800GB, Graphics: ASPEED, Monitor: VGA HDMI, Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 23.04, Kernel: 5.19.0-21-generic (x86_64), Desktop: GNOME Shell 43.1, Display Server: X Server 1.21.1.4, Vulkan: 1.3.224, Compiler: GCC 13.0.0 20230103, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -flto" CFLAGS="-O3 -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -flto"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: amd-pstate performance (Boost: Enabled) - CPU Microcode: 0xa10110d
Python Notes: Python 3.10.9
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 4 January 2023 13:46 by user phoronix.