GCC 13 Compiler Benchmarks AMD EPYC Genoa

AMD EPYC 9654 GCC 13 development compiler benchmarks by Michael Larabel for a future article.

HTML result view exported from: https://openbenchmarking.org/result/2301045-NE-GCC13DEVE69&grs&sro.

GCC 13 Compiler Benchmarks AMD EPYC Genoa: System Details

Configurations tested (same hardware and software for all): Znver4, Znver4 + Prefer AVX-512, Znver3, Znver3 + AVX-512

Processor: 2 x AMD EPYC 9654 96-Core @ 3.71GHz (192 Cores / 384 Threads)
Motherboard: AMD Titanite_4G (RTI1002E BIOS)
Chipset: AMD Device 14a4
Memory: 1520GB
Disk: 800GB INTEL SSDPF21Q800GB
Graphics: ASPEED
Monitor: VGA HDMI
Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 23.04
Kernel: 5.19.0-21-generic (x86_64)
Desktop: GNOME Shell 43.1
Display Server: X Server 1.21.1.4
Vulkan: 1.3.224
Compiler: GCC 13.0.0 20230103
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details
- Transparent Huge Pages: madvise

Environment Details
- Znver4: CXXFLAGS="-O3 -march=native -flto" CFLAGS="-O3 -march=native -flto"
- Znver4 + Prefer AVX-512: CXXFLAGS="-O3 -march=native -mprefer-vector-width=512 -flto" CFLAGS="-O3 -march=native -mprefer-vector-width=512 -flto"
- Znver3: CXXFLAGS="-O3 -march=znver3 -flto" CFLAGS="-O3 -march=znver3 -flto"
- Znver3 + AVX-512: CXXFLAGS="-O3 -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -flto" CFLAGS="-O3 -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -flto"

Compiler Details
- --disable-multilib --enable-checking=release

Processor Details
- Scaling Governor: amd-pstate performance (Boost: Enabled)
- CPU Microcode: 0xa10110d

Python Details
- Python 3.10.9

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected
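The four configurations differ only in the CFLAGS/CXXFLAGS strings listed under Environment Details. As a minimal sketch of how those strings could be applied when reproducing a build, the snippet below copies them into a dictionary and prepares an environment for a child process. The names `FLAGS` and `build_env` are hypothetical helpers for illustration, not part of the Phoronix Test Suite or any tool used for these results.

```python
import os

# Flag strings copied from the Environment Details of this result file.
AVX512_EXTRAS = (
    "-mavx512f -mavx512cd -mavx512vl -mavx512bw "
    "-mavx512dq -mavx512ifma -mavx512vbmi"
)
FLAGS = {
    "Znver4": "-O3 -march=native -flto",
    "Znver4 + Prefer AVX-512": "-O3 -march=native -mprefer-vector-width=512 -flto",
    "Znver3": "-O3 -march=znver3 -flto",
    "Znver3 + AVX-512": f"-O3 -march=znver3 {AVX512_EXTRAS} -flto",
}


def build_env(config_name):
    """Return a copy of the current environment with CFLAGS and CXXFLAGS
    set for the named configuration (hypothetical helper)."""
    env = dict(os.environ)
    env["CFLAGS"] = FLAGS[config_name]
    env["CXXFLAGS"] = FLAGS[config_name]
    return env


for name in FLAGS:
    print(f"{name}: CFLAGS={build_env(name)['CFLAGS']}")
```

The returned dictionary could be passed as the `env` argument of `subprocess.run` when invoking a build, so that each configuration is compiled with its own flags.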

GCC 13 Compiler Benchmarks AMD EPYC Genoa: Result Overview

[Overview table: one row per test and one column per configuration (Znver4, Znver4 + Prefer AVX-512, Znver3, Znver3 + AVX-512). The individual results are listed per test in the sections that follow.]

Cpuminer-Opt

Algorithm: LBC, LBRY Credits

Cpuminer-Opt 3.20.3, kH/s (more is better):
  Znver3: 497020 (SE +/- 3568.28, N = 3)
  Znver3 + AVX-512: 1067743 (SE +/- 1800.04, N = 3)
  Znver4: 1065487 (SE +/- 829.54, N = 3)
  Znver4 + Prefer AVX-512: 1085827 (SE +/- 12123.53, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp
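For a "more is better" metric such as this one, the configurations can be compared by dividing each result by a chosen baseline. The following is a minimal sketch using the LBC, LBRY Credits figures above with Znver3 as the baseline; the function name `relative_to` is a hypothetical helper introduced here for illustration.

```python
# Throughput in kH/s (more is better), copied from the LBC, LBRY Credits result.
results = {
    "Znver3": 497020,
    "Znver3 + AVX-512": 1067743,
    "Znver4": 1065487,
    "Znver4 + Prefer AVX-512": 1085827,
}


def relative_to(baseline, data):
    """Return each configuration's value as a multiple of the baseline value."""
    base = data[baseline]
    return {name: value / base for name, value in data.items()}


speedups = relative_to("Znver3", results)
for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x")
```

For "fewer is better" metrics (seconds, ms) the ratio would need to be inverted before it can be read as a speedup.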

Cpuminer-Opt

Algorithm: Quad SHA-256, Pyrite

Cpuminer-Opt 3.20.3, kH/s (more is better):
  Znver3: 1378987 (SE +/- 5899.09, N = 3)
  Znver3 + AVX-512: 2264747 (SE +/- 22716.96, N = 3)
  Znver4: 2251067 (SE +/- 10306.20, N = 3)
  Znver4 + Prefer AVX-512: 2323995 (SE +/- 25368.27, N = 4)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp

Cpuminer-Opt

Algorithm: scrypt

Cpuminer-Opt 3.20.3, kH/s (more is better):
  Znver3: 2959.15 (SE +/- 0.45, N = 3)
  Znver3 + AVX-512: 4763.91 (SE +/- 1.99, N = 3)
  Znver4: 4790.11 (SE +/- 0.45, N = 3)
  Znver4 + Prefer AVX-512: 4782.74 (SE +/- 1.22, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp

Cpuminer-Opt

Algorithm: Garlicoin

Cpuminer-Opt 3.20.3, kH/s (more is better):
  Znver3: 49523 (SE +/- 66.92, N = 3)
  Znver3 + AVX-512: 72837 (SE +/- 742.93, N = 3)
  Znver4: 72413 (SE +/- 461.75, N = 3)
  Znver4 + Prefer AVX-512: 72130 (SE +/- 295.35, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp

Cpuminer-Opt

Algorithm: Skeincoin

Cpuminer-Opt 3.20.3, kH/s (more is better):
  Znver3: 1414047 (SE +/- 5094.00, N = 3)
  Znver3 + AVX-512: 2004990 (SE +/- 24878.62, N = 3)
  Znver4: 2014770 (SE +/- 10323.18, N = 3)
  Znver4 + Prefer AVX-512: 2009367 (SE +/- 7108.55, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp

Cpuminer-Opt

Algorithm: x25x

Cpuminer-Opt 3.20.3, kH/s (more is better):
  Znver3: 6116.97 (SE +/- 17.26, N = 3)
  Znver3 + AVX-512: 7941.38 (SE +/- 15.17, N = 3)
  Znver4: 8042.88 (SE +/- 74.42, N = 3)
  Znver4 + Prefer AVX-512: 8217.70 (SE +/- 90.39, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp

GraphicsMagick

Operation: Enhanced

GraphicsMagick 1.3.38, Iterations Per Minute (more is better):
  Znver3: 1837 (SE +/- 19.19, N = 3)
  Znver3 + AVX-512: 2208 (SE +/- 14.95, N = 3)
  Znver4: 2234 (SE +/- 2.00, N = 3)
  Znver4 + Prefer AVX-512: 2150 (SE +/- 5.00, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CC) gcc options: -fopenmp -O3 -flto -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

ASTC Encoder

Preset: Medium

ASTC Encoder 4.0, MT/s (more is better):
  Znver3: 459.69 (SE +/- 6.17, N = 3)
  Znver3 + AVX-512: 511.52 (SE +/- 5.68, N = 3)
  Znver4: 493.22 (SE +/- 3.36, N = 13)
  Znver4 + Prefer AVX-512: 420.70 (SE +/- 0.56, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CXX) g++ options: -O3 -flto -pthread

CP2K Molecular Dynamics

Input: Fayalite-FIST

CP2K Molecular Dynamics 8.2, Seconds (fewer is better):
  Znver3: 1211.13
  Znver3 + AVX-512: 1407.13
  Znver4: 1174.67
  Znver4 + Prefer AVX-512: 1263.71

WebP Image Encode

Encode Settings: Quality 100, Highest Compression

WebP Image Encode 1.2.4, MP/s (more is better):
  Znver3: 3.64 (SE +/- 0.00, N = 3)
  Znver3 + AVX-512: 3.11 (SE +/- 0.00, N = 3)
  Znver4: 3.25 (SE +/- 0.00, N = 3)
  Znver4 + Prefer AVX-512: 3.23 (SE +/- 0.01, N = 3)
Per-configuration flags: Znver3: -march=znver3 -lgif -ltiff; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -ljpeg -ltiff; Znver4: -march=native -ljpeg; Znver4 + Prefer AVX-512: -march=native -ljpeg -ltiff
1. (CC) gcc options: -fvisibility=hidden -O3 -flto -lpng16 -lm

GraphicsMagick

Operation: Swirl

GraphicsMagick 1.3.38, Iterations Per Minute (more is better):
  Znver3: 2563 (SE +/- 36.67, N = 3)
  Znver3 + AVX-512: 2826 (SE +/- 28.99, N = 3)
  Znver4: 2862 (SE +/- 26.19, N = 7)
  Znver4 + Prefer AVX-512: 2681 (SE +/- 8.09, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CC) gcc options: -fopenmp -O3 -flto -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

GraphicsMagick

Operation: Rotate

GraphicsMagick 1.3.38, Iterations Per Minute (more is better):
  Znver3: 645 (SE +/- 1.20, N = 3)
  Znver3 + AVX-512: 605 (SE +/- 3.84, N = 3)
  Znver4: 673 (SE +/- 1.73, N = 3)
  Znver4 + Prefer AVX-512: 656 (SE +/- 1.00, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CC) gcc options: -fopenmp -O3 -flto -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

GraphicsMagick

Operation: HWB Color Space

GraphicsMagick 1.3.38, Iterations Per Minute (more is better):
  Znver3: 1134 (SE +/- 11.42, N = 15)
  Znver3 + AVX-512: 1062 (SE +/- 1.86, N = 3)
  Znver4: 1180 (SE +/- 4.84, N = 3)
  Znver4 + Prefer AVX-512: 1167 (SE +/- 15.31, N = 15)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CC) gcc options: -fopenmp -O3 -flto -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

ASTC Encoder

Preset: Exhaustive

ASTC Encoder 4.0, MT/s (more is better):
  Znver3: 12.22 (SE +/- 0.03, N = 3)
  Znver3 + AVX-512: 13.03 (SE +/- 0.01, N = 3)
  Znver4: 12.94 (SE +/- 0.00, N = 3)
  Znver4 + Prefer AVX-512: 13.09 (SE +/- 0.05, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CXX) g++ options: -O3 -flto -pthread

GROMACS

Implementation: MPI CPU - Input: water_GMX50_bare

GROMACS 2022.1, Ns Per Day (more is better):
  Znver3: 18.23 (SE +/- 0.01, N = 3)
  Znver3 + AVX-512: 19.09 (SE +/- 0.01, N = 3)
  Znver4: 19.49 (SE +/- 0.02, N = 3)
  Znver4 + Prefer AVX-512: 19.44 (SE +/- 0.01, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CXX) g++ options: -O3 -flto

Kripke

Kripke 1.2.4, Throughput FoM (more is better):
  Znver3: 263812847 (SE +/- 2950522.72, N = 15)
  Znver3 + AVX-512: 271735708 (SE +/- 3107958.27, N = 12)
  Znver4: 261562280 (SE +/- 3293132.01, N = 15)
  Znver4 + Prefer AVX-512: 254648533 (SE +/- 2000945.99, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CXX) g++ options: -O3 -flto -fopenmp

simdjson

Throughput Test: DistinctUserID

simdjson 2.0, GB/s (more is better):
  Znver3: 6.46 (SE +/- 0.02, N = 3)
  Znver3 + AVX-512: 6.61 (SE +/- 0.04, N = 3)
  Znver4: 6.53 (SE +/- 0.04, N = 3)
  Znver4 + Prefer AVX-512: 6.86 (SE +/- 0.07, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CXX) g++ options: -O3 -flto

oneDNN

Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0, ms (fewer is better):
  Znver3: 4.52440 (SE +/- 0.04256, N = 15, MIN: 3.03)
  Znver3 + AVX-512: 4.33658 (SE +/- 0.03566, N = 9, MIN: 2.98)
  Znver4: 4.26600 (SE +/- 0.04728, N = 15, MIN: 2.83)
  Znver4 + Prefer AVX-512: 4.33366 (SE +/- 0.04436, N = 15, MIN: 2.77)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
1. (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.38, Iterations Per Minute (more is better):
  Znver3: 1285 (SE +/- 10.73, N = 3)
  Znver3 + AVX-512: 1321 (SE +/- 1.15, N = 3)
  Znver4: 1359 (SE +/- 6.56, N = 3)
  Znver4 + Prefer AVX-512: 1314 (SE +/- 13.17, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CC) gcc options: -fopenmp -O3 -flto -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

JPEG XL libjxl

Input: JPEG - Quality: 100

JPEG XL libjxl 0.7, MP/s (more is better):
  Znver3: 0.74 (SE +/- 0.01, N = 6)
  Znver3 + AVX-512: 0.71 (SE +/- 0.01, N = 9)
  Znver4: 0.73 (SE +/- 0.01, N = 9)
  Znver4 + Prefer AVX-512: 0.75 (SE +/- 0.01, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CXX) g++ options: -O3 -flto -fno-rtti -funwind-tables -O2 -fPIE -pie -latomic

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0, ms (fewer is better):
  Znver3: 0.510939 (SE +/- 0.006281, N = 3, MIN: 0.39)
  Znver3 + AVX-512: 0.504842 (SE +/- 0.006958, N = 3, MIN: 0.39)
  Znver4: 0.500579 (SE +/- 0.004211, N = 3, MIN: 0.39)
  Znver4 + Prefer AVX-512: 0.483979 (SE +/- 0.003087, N = 3, MIN: 0.39)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
1. (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

simdjson

Throughput Test: TopTweet

simdjson 2.0, GB/s (more is better):
  Znver3: 6.61 (SE +/- 0.05, N = 3)
  Znver3 + AVX-512: 6.73 (SE +/- 0.02, N = 3)
  Znver4: 6.96 (SE +/- 0.05, N = 3)
  Znver4 + Prefer AVX-512: 6.91 (SE +/- 0.07, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CXX) g++ options: -O3 -flto

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0, ms (fewer is better):
  Znver3: 2108.46 (SE +/- 26.49, N = 3, MIN: 1945.83)
  Znver3 + AVX-512: 2099.43 (SE +/- 18.84, N = 15, MIN: 1917.53)
  Znver4: 2020.24 (SE +/- 16.34, N = 3, MIN: 1878.54)
  Znver4 + Prefer AVX-512: 2123.35 (SE +/- 26.64, N = 12, MIN: 1873.09)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
1. (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

GPAW

Input: Carbon Nanotube

GPAW 22.1, Seconds (fewer is better):
  Znver3: 21.72 (SE +/- 0.26, N = 4)
  Znver3 + AVX-512: 22.82 (SE +/- 0.07, N = 3)
  Znver4: 22.35 (SE +/- 0.15, N = 3)
  Znver4 + Prefer AVX-512: 22.82 (SE +/- 0.28, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CC) gcc options: -shared -fwrapv -O2 -O3 -flto -lxc -lblas -lmpi

GraphicsMagick

Operation: Noise-Gaussian

GraphicsMagick 1.3.38, Iterations Per Minute (more is better):
  Znver3: 1018 (SE +/- 5.13, N = 3)
  Znver3 + AVX-512: 975 (SE +/- 11.43, N = 15)
  Znver4: 1024 (SE +/- 6.60, N = 15)
  Znver4 + Prefer AVX-512: 1013 (SE +/- 11.39, N = 15)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CC) gcc options: -fopenmp -O3 -flto -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Ultra Fast

Kvazaar 2.1, Frames Per Second (more is better):
  Znver3: 78.01 (SE +/- 0.62, N = 3)
  Znver3 + AVX-512: 76.76 (SE +/- 0.64, N = 15)
  Znver4: 78.69 (SE +/- 1.01, N = 3)
  Znver4 + Prefer AVX-512: 74.98 (SE +/- 0.73, N = 15)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -flto -lpthread -lm -lrt

simdjson

Throughput Test: PartialTweets

simdjson 2.0, GB/s (more is better):
  Znver3: 6.74 (SE +/- 0.03, N = 3)
  Znver3 + AVX-512: 6.68 (SE +/- 0.03, N = 3)
  Znver4: 6.63 (SE +/- 0.03, N = 3)
  Znver4 + Prefer AVX-512: 6.43 (SE +/- 0.05, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CXX) g++ options: -O3 -flto

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0, ms (fewer is better):
  Znver3: 2418.28 (SE +/- 24.11, N = 15, MIN: 2149.67)
  Znver3 + AVX-512: 2531.42 (SE +/- 16.53, N = 3, MIN: 2281.87)
  Znver4: 2479.72 (SE +/- 24.81, N = 3, MIN: 2315.89)
  Znver4 + Prefer AVX-512: 2444.52 (SE +/- 25.52, N = 3, MIN: 2258.45)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
1. (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

JPEG XL Decoding libjxl

CPU Threads: All

JPEG XL Decoding libjxl 0.7, MP/s (more is better):
  Znver3: 269.59 (SE +/- 1.29, N = 3)
  Znver3 + AVX-512: 266.53 (SE +/- 1.24, N = 3)
  Znver4: 277.20 (SE +/- 1.41, N = 3)
  Znver4 + Prefer AVX-512: 272.31 (SE +/- 0.30, N = 3)

libavif avifenc

Encoder Speed: 6

libavif avifenc 0.11, Seconds (fewer is better):
  Znver3: 2.331 (SE +/- 0.010, N = 3)
  Znver3 + AVX-512: 2.406 (SE +/- 0.005, N = 3)
  Znver4: 2.347 (SE +/- 0.014, N = 3)
  Znver4 + Prefer AVX-512: 2.317 (SE +/- 0.002, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CXX) g++ options: -O3 -fPIC -flto -lm

Cpuminer-Opt

Algorithm: Deepcoin

Cpuminer-Opt 3.20.3, kH/s (more is better):
  Znver3: 164993 (SE +/- 218.28, N = 3)
  Znver3 + AVX-512: 160157 (SE +/- 880.18, N = 3)
  Znver4: 159147 (SE +/- 81.72, N = 3)
  Znver4 + Prefer AVX-512: 162242 (SE +/- 1703.12, N = 5)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0, ms (fewer is better):
  Znver3: 2442.35 (SE +/- 32.77, N = 12, MIN: 2123.49)
  Znver3 + AVX-512: 2377.17 (SE +/- 26.39, N = 15, MIN: 2080.2)
  Znver4: 2405.63 (SE +/- 20.65, N = 15, MIN: 2158.4)
  Znver4 + Prefer AVX-512: 2359.29 (SE +/- 31.29, N = 15, MIN: 2066.84)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
1. (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.38, Iterations Per Minute (more is better):
  Znver3: 89 (SE +/- 1.27, N = 15)
  Znver3 + AVX-512: 86 (SE +/- 0.90, N = 15)
  Znver4: 88 (SE +/- 1.00, N = 15)
  Znver4 + Prefer AVX-512: 87 (SE +/- 0.94, N = 15)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CC) gcc options: -fopenmp -O3 -flto -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

JPEG XL Decoding libjxl

CPU Threads: 1

JPEG XL Decoding libjxl 0.7, MP/s (more is better):
  Znver3: 48.22 (SE +/- 0.12, N = 3)
  Znver3 + AVX-512: 47.02 (SE +/- 0.25, N = 3)
  Znver4: 48.64 (SE +/- 0.12, N = 3)
  Znver4 + Prefer AVX-512: 48.21 (SE +/- 0.12, N = 3)

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0, ms (fewer is better):
  Znver3: 0.442748 (SE +/- 0.002906, N = 3, MIN: 0.34)
  Znver3 + AVX-512: 0.431299 (SE +/- 0.004441, N = 15, MIN: 0.33)
  Znver4: 0.443891 (SE +/- 0.001884, N = 3, MIN: 0.37)
  Znver4 + Prefer AVX-512: 0.445532 (SE +/- 0.000431, N = 3, MIN: 0.34)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
1. (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

JPEG XL libjxl

Input: JPEG - Quality: 90

JPEG XL libjxl 0.7, MP/s (more is better):
  Znver3: 9.18 (SE +/- 0.03, N = 3)
  Znver3 + AVX-512: 9.27 (SE +/- 0.13, N = 3)
  Znver4: 9.48 (SE +/- 0.05, N = 3)
  Znver4 + Prefer AVX-512: 9.43 (SE +/- 0.03, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CXX) g++ options: -O3 -flto -fno-rtti -funwind-tables -O2 -fPIE -pie -latomic

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0, ms (fewer is better):
  Znver3: 0.392765 (SE +/- 0.000630, N = 3, MIN: 0.28)
  Znver3 + AVX-512: 0.392875 (SE +/- 0.004423, N = 3, MIN: 0.28)
  Znver4: 0.380486 (SE +/- 0.000892, N = 3, MIN: 0.28)
  Znver4 + Prefer AVX-512: 0.388483 (SE +/- 0.003460, N = 3, MIN: 0.28)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
1. (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

Zstd Compression

Compression Level: 19 - Compression Speed

Zstd Compression 1.5.0, MB/s (more is better):
  Znver3: 104.4 (SE +/- 0.91, N = 15)
  Znver3 + AVX-512: 102.9 (SE +/- 1.27, N = 3)
  Znver4: 102.1 (SE +/- 1.03, N = 6)
  Znver4 + Prefer AVX-512: 105.3 (SE +/- 1.30, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CC) gcc options: -O3 -flto -pthread -lz -llzma

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0, ms (fewer is better):
  Znver3: 2093.75 (SE +/- 27.68, N = 15, MIN: 1826.26)
  Znver3 + AVX-512: 2103.41 (SE +/- 15.62, N = 15, MIN: 1883.21)
  Znver4: 2134.85 (SE +/- 9.82, N = 3, MIN: 2008.7)
  Znver4 + Prefer AVX-512: 2070.13 (SE +/- 16.85, N = 3, MIN: 1938.32)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
1. (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

Stargate Digital Audio Workstation

Sample Rate: 192000 - Buffer Size: 1024

Stargate Digital Audio Workstation 22.11.5, Render Ratio (more is better):
  Znver3: 2.820079 (SE +/- 0.014903, N = 3)
  Znver3 + AVX-512: 2.868373 (SE +/- 0.000776, N = 3)
  Znver4: 2.783717 (SE +/- 0.009628, N = 3)
  Znver4 + Prefer AVX-512: 2.867233 (SE +/- 0.003508, N = 3)
1. (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0, ms (fewer is better):
  Znver3: 0.907941 (SE +/- 0.011617, N = 3, MIN: 0.75)
  Znver3 + AVX-512: 0.902303 (SE +/- 0.002136, N = 3, MIN: 0.75)
  Znver4: 0.886642 (SE +/- 0.004516, N = 3, MIN: 0.76)
  Znver4 + Prefer AVX-512: 0.881303 (SE +/- 0.004696, N = 3, MIN: 0.74)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
1. (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

Coremark

CoreMark Size 666 - Iterations Per Second

Coremark 1.0, Iterations/Sec (more is better):
  Znver3: 7640546.27 (SE +/- 62802.17, N = 3)
  Znver3 + AVX-512: 7871273.93 (SE +/- 2823.14, N = 3)
  Znver4: 7694653.90 (SE +/- 11460.97, N = 3)
  Znver4 + Prefer AVX-512: 7861097.86 (SE +/- 83462.96, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CC) gcc options: -O2 -O3 -flto -lrt" -lrt

libavif avifenc

Encoder Speed: 10, Lossless

libavif avifenc 0.11, Seconds (fewer is better):
  Znver3: 3.581 (SE +/- 0.016, N = 3)
  Znver3 + AVX-512: 3.647 (SE +/- 0.038, N = 3)
  Znver4: 3.541 (SE +/- 0.012, N = 3)
  Znver4 + Prefer AVX-512: 3.572 (SE +/- 0.017, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CXX) g++ options: -O3 -fPIC -flto -lm

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0, ms (fewer is better):
  Znver3: 2.31858 (SE +/- 0.00532, N = 3, MIN: 1.91)
  Znver3 + AVX-512: 2.36166 (SE +/- 0.00584, N = 3, MIN: 1.92)
  Znver4: 2.33363 (SE +/- 0.00839, N = 3, MIN: 1.95)
  Znver4 + Prefer AVX-512: 2.29880 (SE +/- 0.00909, N = 3, MIN: 1.92)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
1. (CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

libavif avifenc

Encoder Speed: 0

libavif avifenc 0.11, Seconds (fewer is better):
  Znver3: 61.66 (SE +/- 0.30, N = 3)
  Znver3 + AVX-512: 62.68 (SE +/- 0.02, N = 3)
  Znver4: 61.12 (SE +/- 0.13, N = 3)
  Znver4 + Prefer AVX-512: 61.07 (SE +/- 0.07, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CXX) g++ options: -O3 -fPIC -flto -lm

libavif avifenc

Encoder Speed: 6, Lossless

libavif avifenc 0.11, Seconds (fewer is better):
  Znver3: 4.437 (SE +/- 0.027, N = 3)
  Znver3 + AVX-512: 4.514 (SE +/- 0.010, N = 3)
  Znver4: 4.398 (SE +/- 0.052, N = 4)
  Znver4 + Prefer AVX-512: 4.462 (SE +/- 0.064, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CXX) g++ options: -O3 -fPIC -flto -lm

JPEG XL libjxl

Input: PNG - Quality: 90

JPEG XL libjxl 0.7, MP/s (more is better):
  Znver3: 9.61 (SE +/- 0.01, N = 3)
  Znver3 + AVX-512: 9.86 (SE +/- 0.07, N = 3)
  Znver4: 9.83 (SE +/- 0.03, N = 3)
  Znver4 + Prefer AVX-512: 9.81 (SE +/- 0.04, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CXX) g++ options: -O3 -flto -fno-rtti -funwind-tables -O2 -fPIE -pie -latomic

ASTC Encoder

Preset: Thorough

ASTC Encoder 4.0, MT/s (more is better):
  Znver3: 116.09 (SE +/- 0.10, N = 3)
  Znver3 + AVX-512: 117.69 (SE +/- 0.06, N = 3)
  Znver4: 118.74 (SE +/- 0.12, N = 3)
  Znver4 + Prefer AVX-512: 118.99 (SE +/- 0.08, N = 3)
Per-configuration flags: Znver3: -march=znver3; Znver3 + AVX-512: -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi; Znver4 and Znver4 + Prefer AVX-512: -march=native
1. (CXX) g++ options: -O3 -flto -pthread

JPEG XL libjxl

Input: PNG - Quality: 100

JPEG XL libjxl 0.7 - Input: PNG - Quality: 100 (MP/s, More Is Better)
- Znver3: 0.82 (SE +/- 0.00, N = 3); -march=znver3
- Znver3 + AVX-512: 0.81 (SE +/- 0.00, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 0.83 (SE +/- 0.00, N = 3); -march=native
- Znver4 + Prefer AVX-512: 0.82 (SE +/- 0.00, N = 3); -march=native
(CXX) g++ options: -O3 -flto -fno-rtti -funwind-tables -O2 -fPIE -pie -latomic

simdjson

Throughput Test: LargeRandom

simdjson 2.0 - Throughput Test: LargeRandom (GB/s, More Is Better)
- Znver3: 1.25 (SE +/- 0.00, N = 3); -march=znver3
- Znver3 + AVX-512: 1.25 (SE +/- 0.00, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 1.27 (SE +/- 0.00, N = 3); -march=native
- Znver4 + Prefer AVX-512: 1.24 (SE +/- 0.00, N = 3); -march=native
(CXX) g++ options: -O3 -flto

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 4K

SVT-AV1 1.4 - Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
- Znver3: 95.12 (SE +/- 0.46, N = 3); -march=znver3
- Znver3 + AVX-512: 92.96 (SE +/- 0.52, N = 3); -march=znver3 -mavx512cd -mavx512vl -mavx512ifma -mavx512vbmi
- Znver4: 93.76 (SE +/- 0.33, N = 3)
- Znver4 + Prefer AVX-512: 94.34 (SE +/- 0.32, N = 3)
(CXX) g++ options: -O3 -flto -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

PJSIP

Method: INVITE

PJSIP 2.11 - Method: INVITE (Responses Per Second, More Is Better)
- Znver3: 5149 (SE +/- 15.00, N = 3); -march=znver3
- Znver3 + AVX-512: 5132 (SE +/- 24.85, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 5200 (SE +/- 19.91, N = 3); -march=native
- Znver4 + Prefer AVX-512: 5084 (SE +/- 51.36, N = 5); -march=native
(CC) gcc options: -lavformat -lavcodec -lswscale -lavutil -lstdc++ -lopus -lssl -lcrypto -lm -lrt -lpthread -lasound -O3 -flto

Cpuminer-Opt

Algorithm: Triple SHA-256, Onecoin

Cpuminer-Opt 3.20.3 - Algorithm: Triple SHA-256, Onecoin (kH/s, More Is Better)
- Znver3: 3255253 (SE +/- 4313.22, N = 3); -march=znver3
- Znver3 + AVX-512: 3323680 (SE +/- 26057.45, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 3306643 (SE +/- 26496.25, N = 3); -march=native
- Znver4 + Prefer AVX-512: 3301217 (SE +/- 22336.69, N = 3); -march=native
(CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
- Znver3: 0.947013 (SE +/- 0.009578, N = 5, MIN: 0.79); -march=znver3
- Znver3 + AVX-512: 0.936374 (SE +/- 0.011238, N = 4, MIN: 0.79); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 0.937164 (SE +/- 0.012915, N = 3, MIN: 0.77)
- Znver4 + Prefer AVX-512: 0.954140 (SE +/- 0.009537, N = 3, MIN: 0.78)
(CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

Stargate Digital Audio Workstation

Sample Rate: 96000 - Buffer Size: 1024

Stargate Digital Audio Workstation 22.11.5 - Sample Rate: 96000 - Buffer Size: 1024 (Render Ratio, More Is Better)
- Znver3: 4.356446 (SE +/- 0.016378, N = 3)
- Znver3 + AVX-512: 4.413379 (SE +/- 0.011844, N = 3)
- Znver4: 4.331230 (SE +/- 0.006484, N = 3)
- Znver4 + Prefer AVX-512: 4.408290 (SE +/- 0.013873, N = 3)
(CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Very Fast

Kvazaar 2.1 - Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better)
- Znver3: 72.01 (SE +/- 0.28, N = 3); -march=znver3
- Znver3 + AVX-512: 70.71 (SE +/- 0.87, N = 4); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 70.71 (SE +/- 0.83, N = 3); -march=native
- Znver4 + Prefer AVX-512: 70.81 (SE +/- 0.92, N = 3); -march=native
(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -flto -lpthread -lm -lrt

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
- Znver3: 0.279655 (SE +/- 0.002168, N = 15, MIN: 0.23); -march=znver3
- Znver3 + AVX-512: 0.276965 (SE +/- 0.003530, N = 3, MIN: 0.24); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 0.274769 (SE +/- 0.001957, N = 12, MIN: 0.24)
- Znver4 + Prefer AVX-512: 0.274662 (SE +/- 0.001864, N = 3, MIN: 0.24)
(CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

WebP Image Encode

Encode Settings: Quality 100, Lossless, Highest Compression

WebP Image Encode 1.2.4 - Encode Settings: Quality 100, Lossless, Highest Compression (MP/s, More Is Better)
- Znver3: 0.58 (SE +/- 0.00, N = 3); -march=znver3 -lgif -ltiff
- Znver3 + AVX-512: 0.58 (SE +/- 0.00, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -ljpeg -ltiff
- Znver4: 0.57 (SE +/- 0.00, N = 3); -march=native -ljpeg
- Znver4 + Prefer AVX-512: 0.58 (SE +/- 0.00, N = 3); -march=native -ljpeg -ltiff
(CC) gcc options: -fvisibility=hidden -O3 -flto -lpng16 -lm

miniBUDE

Implementation: OpenMP - Input Deck: BM1

miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM1 (Billion Interactions/s, More Is Better)
- Znver3: 214.08 (SE +/- 2.52, N = 3)
- Znver3 + AVX-512: 214.63 (SE +/- 2.13, N = 3)
- Znver4: 214.50 (SE +/- 2.07, N = 3)
- Znver4 + Prefer AVX-512: 211.02 (SE +/- 2.51, N = 4)
(CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

miniBUDE

Implementation: OpenMP - Input Deck: BM1

miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM1 (GFInst/s, More Is Better)
- Znver3: 5351.94 (SE +/- 62.89, N = 3)
- Znver3 + AVX-512: 5365.61 (SE +/- 53.26, N = 3)
- Znver4: 5362.54 (SE +/- 51.63, N = 3)
- Znver4 + Prefer AVX-512: 5275.43 (SE +/- 62.65, N = 4)
(CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
- Znver3: 0.656342 (SE +/- 0.006925, N = 4, MIN: 0.54); -march=znver3
- Znver3 + AVX-512: 0.647087 (SE +/- 0.003927, N = 3, MIN: 0.56); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 0.658101 (SE +/- 0.005902, N = 3, MIN: 0.53)
- Znver4 + Prefer AVX-512: 0.650067 (SE +/- 0.001081, N = 3, MIN: 0.53)
(CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

LAMMPS Molecular Dynamics Simulator

Model: Rhodopsin Protein

LAMMPS Molecular Dynamics Simulator 23Jun2022 - Model: Rhodopsin Protein (ns/day, More Is Better)
- Znver3: 51.46 (SE +/- 0.43, N = 3); -march=znver3
- Znver3 + AVX-512: 51.82 (SE +/- 0.40, N = 10); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 51.59 (SE +/- 0.21, N = 3); -march=native
- Znver4 + Prefer AVX-512: 52.29 (SE +/- 0.48, N = 3); -march=native
(CXX) g++ options: -O3 -flto -lm -ldl

Cpuminer-Opt

Algorithm: Magi

Cpuminer-Opt 3.20.3 - Algorithm: Magi (kH/s, More Is Better)
- Znver3: 8490.73 (SE +/- 51.27, N = 3); -march=znver3
- Znver3 + AVX-512: 8355.75 (SE +/- 50.94, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 8440.79 (SE +/- 47.28, N = 3); -march=native
- Znver4 + Prefer AVX-512: 8467.24 (SE +/- 64.85, N = 3); -march=native
(CXX) g++ options: -O3 -flto -lcurl -lz -lpthread -lssl -lcrypto -lgmp

Ngspice

Circuit: C7552

Ngspice 34 - Circuit: C7552 (Seconds, Fewer Is Better)
- Znver3: 92.94 (SE +/- 0.28, N = 3); -march=znver3
- Znver3 + AVX-512: 93.52 (SE +/- 0.28, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 92.10 (SE +/- 1.00, N = 3); -march=native
- Znver4 + Prefer AVX-512: 93.45 (SE +/- 0.14, N = 3); -march=native
(CC) gcc options: -O3 -flto -fopenmp -lm -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lSM -lICE

WebP Image Encode

Encode Settings: Quality 100

WebP Image Encode 1.2.4 - Encode Settings: Quality 100 (MP/s, More Is Better)
- Znver3: 11.48 (SE +/- 0.01, N = 3); -march=znver3 -lgif -ltiff
- Znver3 + AVX-512: 11.38 (SE +/- 0.01, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -ljpeg -ltiff
- Znver4: 11.54 (SE +/- 0.03, N = 3); -march=native -ljpeg
- Znver4 + Prefer AVX-512: 11.54 (SE +/- 0.01, N = 3); -march=native -ljpeg -ltiff
(CC) gcc options: -fvisibility=hidden -O3 -flto -lpng16 -lm

WebP Image Encode

Encode Settings: Quality 100, Lossless

WebP Image Encode 1.2.4 - Encode Settings: Quality 100, Lossless (MP/s, More Is Better)
- Znver3: 1.47 (SE +/- 0.00, N = 3); -march=znver3 -lgif -ltiff
- Znver3 + AVX-512: 1.45 (SE +/- 0.00, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -ljpeg -ltiff
- Znver4: 1.45 (SE +/- 0.00, N = 3); -march=native -ljpeg
- Znver4 + Prefer AVX-512: 1.47 (SE +/- 0.00, N = 3); -march=native -ljpeg -ltiff
(CC) gcc options: -fvisibility=hidden -O3 -flto -lpng16 -lm

SVT-AV1

Encoder Mode: Preset 4 - Input: Bosphorus 4K

SVT-AV1 1.4 - Encoder Mode: Preset 4 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
- Znver3: 5.360 (SE +/- 0.012, N = 3); -march=znver3
- Znver3 + AVX-512: 5.374 (SE +/- 0.037, N = 3); -march=znver3 -mavx512cd -mavx512vl -mavx512ifma -mavx512vbmi
- Znver4: 5.392 (SE +/- 0.015, N = 3)
- Znver4 + Prefer AVX-512: 5.423 (SE +/- 0.043, N = 3)
(CXX) g++ options: -O3 -flto -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

Liquid-DSP

Threads: 384 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 - Threads: 384 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
- Znver3: 11176000000 (SE +/- 10066445.91, N = 3); -march=znver3
- Znver3 + AVX-512: 11301666667 (SE +/- 2403700.85, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 11249333333 (SE +/- 7356025.50, N = 3); -march=native
- Znver4 + Prefer AVX-512: 11270666667 (SE +/- 7218802.61, N = 3); -march=native
(CC) gcc options: -O3 -flto -pthread -lm -lc -lliquid

libavif avifenc

Encoder Speed: 2

libavif avifenc 0.11 - Encoder Speed: 2 (Seconds, Fewer Is Better)
- Znver3: 33.91 (SE +/- 0.25, N = 3); -march=znver3
- Znver3 + AVX-512: 34.23 (SE +/- 0.15, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 34.10 (SE +/- 0.26, N = 3); -march=native
- Znver4 + Prefer AVX-512: 33.90 (SE +/- 0.19, N = 3); -march=native
(CXX) g++ options: -O3 -fPIC -flto -lm

simdjson

Throughput Test: Kostya

simdjson 2.0 - Throughput Test: Kostya (GB/s, More Is Better)
- Znver3: 4.20 (SE +/- 0.01, N = 3); -march=znver3
- Znver3 + AVX-512: 4.16 (SE +/- 0.01, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 4.17 (SE +/- 0.02, N = 3); -march=native
- Znver4 + Prefer AVX-512: 4.19 (SE +/- 0.01, N = 3); -march=native
(CXX) g++ options: -O3 -flto

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Medium

Kvazaar 2.1 - Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better)
- Znver3: 64.27 (SE +/- 0.36, N = 3); -march=znver3
- Znver3 + AVX-512: 63.80 (SE +/- 0.10, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 64.35 (SE +/- 0.52, N = 3); -march=native
- Znver4 + Prefer AVX-512: 64.05 (SE +/- 0.46, N = 3); -march=native
(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -flto -lpthread -lm -lrt

SecureMark

Benchmark: SecureMark-TLS

SecureMark 1.0.4 - Benchmark: SecureMark-TLS (marks, More Is Better)
- Znver3: 294057 (SE +/- 1276.92, N = 3)
- Znver3 + AVX-512: 296575 (SE +/- 380.50, N = 3)
- Znver4: 296548 (SE +/- 383.28, N = 3)
- Znver4 + Prefer AVX-512: 294122 (SE +/- 640.61, N = 3)
(CC) gcc options: -pedantic -O3

Liquid-DSP

Threads: 128 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 - Threads: 128 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
- Znver3: 6940833333 (SE +/- 6548367.06, N = 3); -march=znver3
- Znver3 + AVX-512: 6999233333 (SE +/- 4658445.14, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 6990866667 (SE +/- 3192874.01, N = 3); -march=native
- Znver4 + Prefer AVX-512: 6977633333 (SE +/- 6145549.43, N = 3); -march=native
(CC) gcc options: -O3 -flto -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 256 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 - Threads: 256 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
- Znver3: 9735666667 (SE +/- 24004606.04, N = 3); -march=znver3
- Znver3 + AVX-512: 9813700000 (SE +/- 4697162.26, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 9789500000 (SE +/- 5651843.36, N = 3); -march=native
- Znver4 + Prefer AVX-512: 9809800000 (SE +/- 8533658.85, N = 3); -march=native
(CC) gcc options: -O3 -flto -pthread -lm -lc -lliquid

QuantLib

QuantLib 1.21 (MFLOPS, More Is Better)
- Znver3: 3120.5 (SE +/- 4.52, N = 3)
- Znver3 + AVX-512: 3114.9 (SE +/- 2.74, N = 3)
- Znver4: 3096.9 (SE +/- 1.99, N = 3)
- Znver4 + Prefer AVX-512: 3112.6 (SE +/- 1.71, N = 3)
(CXX) g++ options: -O3 -march=native -rdynamic

miniBUDE

Implementation: OpenMP - Input Deck: BM2

miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM2 (Billion Interactions/s, More Is Better)
- Znver3: 264.28 (SE +/- 0.73, N = 3)
- Znver3 + AVX-512: 266.16 (SE +/- 1.06, N = 3)
- Znver4: 264.15 (SE +/- 1.75, N = 3)
- Znver4 + Prefer AVX-512: 265.52 (SE +/- 0.73, N = 3)
(CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

miniBUDE

Implementation: OpenMP - Input Deck: BM2

miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM2 (GFInst/s, More Is Better)
- Znver3: 6607.09 (SE +/- 18.35, N = 3)
- Znver3 + AVX-512: 6653.86 (SE +/- 26.43, N = 3)
- Znver4: 6603.68 (SE +/- 43.78, N = 3)
- Znver4 + Prefer AVX-512: 6638.01 (SE +/- 18.27, N = 3)
(CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

LAMMPS Molecular Dynamics Simulator

Model: 20k Atoms

LAMMPS Molecular Dynamics Simulator 23Jun2022 - Model: 20k Atoms (ns/day, More Is Better)
- Znver3: 55.72 (SE +/- 0.04, N = 3); -march=znver3
- Znver3 + AVX-512: 56.04 (SE +/- 0.08, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 55.62 (SE +/- 0.15, N = 3); -march=native
- Znver4 + Prefer AVX-512: 55.74 (SE +/- 0.21, N = 3); -march=native
(CXX) g++ options: -O3 -flto -lm -ldl

WebP Image Encode

Encode Settings: Default

WebP Image Encode 1.2.4 - Encode Settings: Default (MP/s, More Is Better)
- Znver3: 18.99 (SE +/- 0.02, N = 3); -march=znver3 -lgif -ltiff
- Znver3 + AVX-512: 18.85 (SE +/- 0.07, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -ljpeg -ltiff
- Znver4: 18.95 (SE +/- 0.00, N = 3); -march=native -ljpeg
- Znver4 + Prefer AVX-512: 18.97 (SE +/- 0.01, N = 3); -march=native -ljpeg -ltiff
(CC) gcc options: -fvisibility=hidden -O3 -flto -lpng16 -lm

PJSIP

Method: OPTIONS, Stateful

PJSIP 2.11 - Method: OPTIONS, Stateful (Responses Per Second, More Is Better)
- Znver3: 9226 (SE +/- 72.95, N = 3); -march=znver3
- Znver3 + AVX-512: 9236 (SE +/- 36.35, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 9237 (SE +/- 23.25, N = 3); -march=native
- Znver4 + Prefer AVX-512: 9288 (SE +/- 32.42, N = 3); -march=native
(CC) gcc options: -lavformat -lavcodec -lswscale -lavutil -lstdc++ -lopus -lssl -lcrypto -lm -lrt -lpthread -lasound -O3 -flto

Zstd Compression

Compression Level: 19, Long Mode - Decompression Speed

Zstd Compression 1.5.0 - Compression Level: 19, Long Mode - Decompression Speed (MB/s, More Is Better)
- Znver3: 3695.5 (SE +/- 11.99, N = 15); -march=znver3
- Znver3 + AVX-512: 3684.7 (SE +/- 12.89, N = 15); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 3708.8 (SE +/- 12.59, N = 15); -march=native
- Znver4 + Prefer AVX-512: 3685.7 (SE +/- 13.90, N = 15); -march=native
(CC) gcc options: -O3 -flto -pthread -lz -llzma

OpenSSL

Algorithm: SHA256

OpenSSL 3.0 - Algorithm: SHA256 (byte/s, More Is Better)
- Znver3: 266464361193 (SE +/- 18811489.95, N = 3); -march=znver3
- Znver3 + AVX-512: 266899089453 (SE +/- 143443542.20, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 265326713587 (SE +/- 150345864.28, N = 3); -march=native
- Znver4 + Prefer AVX-512: 266230124070 (SE +/- 172236238.05, N = 3); -march=native
(CC) gcc options: -pthread -m64 -O3 -flto -lssl -lcrypto -ldl

Ngspice

Circuit: C2670

Ngspice 34 - Circuit: C2670 (Seconds, Fewer Is Better)
- Znver3: 95.18 (SE +/- 0.09, N = 3); -march=znver3
- Znver3 + AVX-512: 95.19 (SE +/- 0.24, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 95.07 (SE +/- 0.76, N = 3); -march=native
- Znver4 + Prefer AVX-512: 95.60 (SE +/- 0.17, N = 3); -march=native
(CC) gcc options: -O3 -flto -fopenmp -lm -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lSM -lICE

Zstd Compression

Compression Level: 19 - Decompression Speed

Zstd Compression 1.5.0 - Compression Level: 19 - Decompression Speed (MB/s, More Is Better)
- Znver3: 3574.9 (SE +/- 12.56, N = 15); -march=znver3
- Znver3 + AVX-512: 3594.8 (SE +/- 32.49, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 3584.1 (SE +/- 17.00, N = 6); -march=native
- Znver4 + Prefer AVX-512: 3581.6 (SE +/- 36.60, N = 3); -march=native
(CC) gcc options: -O3 -flto -pthread -lz -llzma

OpenSSL

Algorithm: RSA4096

OpenSSL 3.0 - Algorithm: RSA4096 (verify/s, More Is Better)
- Znver3: 2938488.5 (SE +/- 228.34, N = 3); -march=znver3
- Znver3 + AVX-512: 2935372.1 (SE +/- 1641.03, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 2939503.5 (SE +/- 140.73, N = 3); -march=native
- Znver4 + Prefer AVX-512: 2924586.4 (SE +/- 10617.65, N = 3); -march=native
(CC) gcc options: -pthread -m64 -O3 -flto -lssl -lcrypto -ldl

ACES DGEMM

Sustained Floating-Point Rate

ACES DGEMM 1.0 - Sustained Floating-Point Rate (GFLOP/s, More Is Better)
- Znver3: 70.38 (SE +/- 0.06, N = 3); -march=znver3
- Znver3 + AVX-512: 70.19 (SE +/- 0.28, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 70.05 (SE +/- 0.10, N = 3)
- Znver4 + Prefer AVX-512: 70.30 (SE +/- 0.07, N = 3)
(CC) gcc options: -O3 -march=native -fopenmp -flto

OpenSSL

Algorithm: RSA4096

OpenSSL 3.0 - Algorithm: RSA4096 (sign/s, More Is Better)
- Znver3: 44435.3 (SE +/- 22.49, N = 3); -march=znver3
- Znver3 + AVX-512: 44499.3 (SE +/- 0.44, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 44490.1 (SE +/- 8.03, N = 3); -march=native
- Znver4 + Prefer AVX-512: 44301.8 (SE +/- 151.57, N = 3); -march=native
(CC) gcc options: -pthread -m64 -O3 -flto -lssl -lcrypto -ldl

SMHasher

Hash: t1ha0_aes_avx2 x86_64

SMHasher 2022-08-22 - Hash: t1ha0_aes_avx2 x86_64 (MiB/sec, More Is Better)
- Znver3: 102351.87 (SE +/- 8.85, N = 3); -march=znver3
- Znver3 + AVX-512: 102403.92 (SE +/- 19.58, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 102354.57 (SE +/- 20.00, N = 3)
- Znver4 + Prefer AVX-512: 102399.74 (SE +/- 11.55, N = 3)
(CXX) g++ options: -O3 -flto -march=native -flto=auto -fno-fat-lto-objects

SMHasher

Hash: MeowHash x86_64 AES-NI

SMHasher 2022-08-22 - Hash: MeowHash x86_64 AES-NI (MiB/sec, More Is Better)
- Znver3: 54281.94 (SE +/- 9.98, N = 3); -march=znver3
- Znver3 + AVX-512: 54284.65 (SE +/- 10.52, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 54272.14 (SE +/- 7.42, N = 3)
- Znver4 + Prefer AVX-512: 54297.04 (SE +/- 9.47, N = 3)
(CXX) g++ options: -O3 -flto -march=native -flto=auto -fno-fat-lto-objects

SMHasher

Hash: FarmHash32 x86_64 AVX

SMHasher 2022-08-22 - Hash: FarmHash32 x86_64 AVX (MiB/sec, More Is Better)
- Znver3: 40565.72 (SE +/- 0.13, N = 3); -march=znver3
- Znver3 + AVX-512: 40559.33 (SE +/- 1.84, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 40565.07 (SE +/- 0.95, N = 3)
- Znver4 + Prefer AVX-512: 40563.96 (SE +/- 1.44, N = 3)
(CXX) g++ options: -O3 -flto -march=native -flto=auto -fno-fat-lto-objects

oneDNN

Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0 - Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
- Znver3: 19.59 (SE +/- 0.82, N = 15, MIN: 9.21); -march=znver3
- Znver3 + AVX-512: 22.94 (SE +/- 0.76, N = 15, MIN: 9.35); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 21.14 (SE +/- 0.95, N = 15, MIN: 9.61)
- Znver4 + Prefer AVX-512: 23.53 (SE +/- 0.56, N = 12, MIN: 10.06)
(CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
- Znver3: 14.62 (SE +/- 0.34, N = 12, MIN: 8.56); -march=znver3
- Znver3 + AVX-512: 14.25 (SE +/- 0.42, N = 12, MIN: 6.05); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 14.82 (SE +/- 0.11, N = 3, MIN: 9.81)
- Znver4 + Prefer AVX-512: 14.29 (SE +/- 0.24, N = 15, MIN: 8.38)
(CXX) g++ options: -O3 -march=native -flto -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

PJSIP

Method: OPTIONS, Stateless

PJSIP 2.11 - Method: OPTIONS, Stateless (Responses Per Second, More Is Better)
- Znver3: 336885 (SE +/- 4433.83, N = 12); -march=znver3
- Znver3 + AVX-512: 336615 (SE +/- 3613.75, N = 15); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 335767 (SE +/- 2531.21, N = 3); -march=native
- Znver4 + Prefer AVX-512: 336791 (SE +/- 5240.59, N = 15); -march=native
(CC) gcc options: -lavformat -lavcodec -lswscale -lavutil -lstdc++ -lopus -lssl -lcrypto -lm -lrt -lpthread -lasound -O3 -flto

SVT-AV1

Encoder Mode: Preset 13 - Input: Bosphorus 4K

SVT-AV1 1.4 - Encoder Mode: Preset 13 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
- Znver3: 219.57 (SE +/- 4.56, N = 12); -march=znver3
- Znver3 + AVX-512: 210.43 (SE +/- 2.81, N = 15); -march=znver3 -mavx512cd -mavx512vl -mavx512ifma -mavx512vbmi
- Znver4: 196.12 (SE +/- 3.33, N = 15)
- Znver4 + Prefer AVX-512: 208.64 (SE +/- 3.20, N = 15)
(CXX) g++ options: -O3 -flto -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 12 - Input: Bosphorus 4K

SVT-AV1 1.4 - Encoder Mode: Preset 12 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
- Znver3: 222.90 (SE +/- 5.39, N = 15); -march=znver3
- Znver3 + AVX-512: 206.68 (SE +/- 3.02, N = 15); -march=znver3 -mavx512cd -mavx512vl -mavx512ifma -mavx512vbmi
- Znver4: 210.39 (SE +/- 4.58, N = 15)
- Znver4 + Prefer AVX-512: 196.80 (SE +/- 1.78, N = 3)
(CXX) g++ options: -O3 -flto -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

Zstd Compression

Compression Level: 19, Long Mode - Compression Speed

Zstd Compression 1.5.0 - Compression Level: 19, Long Mode - Compression Speed (MB/s, More Is Better)
- Znver3: 39.8 (SE +/- 0.40, N = 15); -march=znver3
- Znver3 + AVX-512: 43.9 (SE +/- 0.79, N = 15); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 40.8 (SE +/- 0.40, N = 15); -march=native
- Znver4 + Prefer AVX-512: 42.5 (SE +/- 0.59, N = 15); -march=native
(CC) gcc options: -O3 -flto -pthread -lz -llzma

SMHasher

Hash: MeowHash x86_64 AES-NI

SMHasher 2022-08-22 - Hash: MeowHash x86_64 AES-NI (cycles/hash, Fewer Is Better)
- Znver3: 44.96 (SE +/- 0.01, N = 3); -march=znver3
- Znver3 + AVX-512: 44.97 (SE +/- 0.00, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 44.98 (SE +/- 0.01, N = 3)
- Znver4 + Prefer AVX-512: 44.95 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -flto -march=native -flto=auto -fno-fat-lto-objects

SMHasher

Hash: t1ha0_aes_avx2 x86_64

SMHasher 2022-08-22 - Hash: t1ha0_aes_avx2 x86_64 (cycles/hash, Fewer Is Better)
- Znver3: 20.53 (SE +/- 0.01, N = 3); -march=znver3
- Znver3 + AVX-512: 20.53 (SE +/- 0.01, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 20.52 (SE +/- 0.01, N = 3)
- Znver4 + Prefer AVX-512: 20.81 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -flto -march=native -flto=auto -fno-fat-lto-objects

SMHasher

Hash: FarmHash32 x86_64 AVX

SMHasher 2022-08-22 - Hash: FarmHash32 x86_64 AVX (cycles/hash, Fewer Is Better)
- Znver3: 26.49 (SE +/- 0.00, N = 3); -march=znver3
- Znver3 + AVX-512: 26.49 (SE +/- 0.00, N = 3); -march=znver3 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi
- Znver4: 26.49 (SE +/- 0.00, N = 3)
- Znver4 + Prefer AVX-512: 26.50 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -flto -march=native -flto=auto -fno-fat-lto-objects


Phoronix Test Suite v10.8.5