AOCC 4.0 AMD EPYC 9374F 2P Compiler Benchmarks

Tests for a future article by Michael Larabel.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2212135-NE-AOCC40AMD35

Result Identifier   Date Run           Test Duration
AOCC 4.0            December 12 2022   4 Hours, 11 Minutes
GCC 12.2            December 13 2022   4 Hours, 18 Minutes
Average                                4 Hours, 14 Minutes

HTML result view exported from: https://openbenchmarking.org/result/2212135-NE-AOCC40AMD35&export=pdf&grt&sro.

Processor: 2 x AMD EPYC 9374F 32-Core @ 4.31GHz (64 Cores / 128 Threads)
Motherboard: AMD Titanite_4G (RTI1002E BIOS)
Chipset: AMD Device 14a4
Memory: 1520GB
Disk: 800GB INTEL SSDPF21Q800GB
Graphics: ASPEED
Monitor: VGA HDMI
Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 22.10
Kernel: 5.19.0-26-generic (x86_64)
Desktop: GNOME Shell 43.0
Display Server: X Server 1.21.1.4
Vulkan: 1.3.224
Compiler: Clang 14.0.6 (AOCC 4.0); GCC 12.2.0 (GCC 12.2)
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details
- Transparent Huge Pages: madvise

Environment Details
- CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"

Compiler Details
- AOCC 4.0: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver4
- GCC 12.2: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: amd-pstate performance (Boost: Enabled)
- CPU Microcode: 0xa10110d

Python Details
- Python 3.10.7

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected

AOCC 4.0      GCC 12.2      Test
20.98         17.66         aom-av1: Speed 6 Two-Pass - Bosphorus 4K
37.39         36.32         aom-av1: Speed 9 Realtime - Bosphorus 4K
457.1693      403.5316      astcenc: Medium
64.0015       62.9653       astcenc: Thorough
796.930644    779.239837    cryptopp: Keyed Algorithms
534.930083    448.186823    cryptopp: Unkeyed Algorithms
32.671        35.003        gpaw: Carbon Nanotube
760           732           graphics-magick: Rotate
1150          864           graphics-magick: Sharpen
1393          1407          graphics-magick: Enhanced
60.57         54.96         jpegxl-decode: 1
297.49        280.47        jpegxl-decode: All
340106367     308884107     kripke
34.23         32.42         kvazaar: Bosphorus 4K - Medium
71.17         66.46         kvazaar: Bosphorus 4K - Very Fast
82.71         82.07         kvazaar: Bosphorus 4K - Ultra Fast
43.544        42.170        lammps: 20k Atoms
11900         11110         lczero: BLAS
16982         11707         lczero: Eigen
53.154        57.292        avifenc: 0
29.833        31.974        avifenc: 2
2.421         2.573         avifenc: 6
4.342         4.737         avifenc: 6, Lossless
3.256         3.422         avifenc: 10, Lossless
2884000000    2675133333    liquid-dsp: 32 - 256 - 57
5559600000    5280200000    liquid-dsp: 64 - 256 - 57
5868866667    5451600000    liquid-dsp: 128 - 256 - 57
4638.340      4539.557      minibude: OpenMP - BM2 (GFInst/s)
185.534       181.582       minibude: OpenMP - BM2 (Billion Interactions/s)
0.45944       4.28328       onednn: IP Shapes 1D - bf16bf16bf16 - CPU
0.278467      1.56802       onednn: IP Shapes 3D - bf16bf16bf16 - CPU
0.320302      0.399774      onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU
1.72218       2.03703       onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU
0.587671      0.637696      onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU
524.101       869.007       onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU
292.766       664.985       onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU
0.112382      0.250318      onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU
122861312743  119198709000  openssl: SHA256
19492.8       19467.8       openssl: RSA4096 (sign/s)
1270745.6     1266612.9     openssl: RSA4096 (verify/s)
370262        339503        securemark: SecureMark-TLS
8.54          7.59          simdjson: TopTweet
9.17          7.84          simdjson: PartialTweets
9.55          8.03          simdjson: DistinctUserID
192833333     189500000     srsran: OFDM_Test
476.6         481.0         srsran: 4G PHY_DL_Test 100 PRB MIMO 64-QAM
4.962698      4.477606      stargate: 96000 - 512
3.057509      2.813028      stargate: 192000 - 512
5.425232      4.960055      stargate: 96000 - 1024
3.411187      3.221105      stargate: 192000 - 1024
96.552        92.543        svt-av1: Preset 8 - Bosphorus 4K
220.119       187.192       svt-av1: Preset 13 - Bosphorus 4K
147.93        121.95        svt-hevc: 7 - Bosphorus 4K
157.12        125.58        svt-hevc: 10 - Bosphorus 4K
193.13        173.56        svt-vp9: VMAF Optimized - Bosphorus 4K
189.03        169.47        svt-vp9: PSNR/SSIM Optimized - Bosphorus 4K
160.53        120.03        svt-vp9: Visual Quality Optimized - Bosphorus 4K
343.974       229.793       tnn: CPU - MobileNet v2
54.618        52.410        tnn: CPU - SqueezeNet v2
287.194       226.006       tnn: CPU - SqueezeNet v1.1
7.99          7.89          vpxenc: Speed 0 - Bosphorus 4K
19.51         18.99         vpxenc: Speed 5 - Bosphorus 4K
22.78         21.89         webp: Default
4.86          3.59          webp: Quality 100, Highest Compression
6155.2        6060.7        compress-zstd: 8 - Compression Speed
4722.5        4913.7        compress-zstd: 8 - Decompression Speed
104.6         101.4         compress-zstd: 19 - Compression Speed
4123.1        4203.8        compress-zstd: 19 - Decompression Speed
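
Because the tests mix "More Is Better" and "Fewer Is Better" units, the two runs can only be compared after converting each result to a ratio. The following is a minimal sketch of that arithmetic in Python, assuming a hand-picked subset of five results copied from this file; the names `RESULTS`, `speedup`, and `geometric_mean` are hypothetical and are not part of the Phoronix Test Suite. The figure it prints covers this subset only and is not the overall geometric mean of the result file.

```python
from math import exp, log

# (test, AOCC 4.0 result, GCC 12.2 result, higher_is_better)
# Values copied from this result file; the subset is arbitrary.
RESULTS = [
    ("aom-av1: Speed 6 Two-Pass", 20.98, 17.66, True),
    ("gpaw: Carbon Nanotube", 32.67, 35.00, False),
    ("simdjson: TopTweet", 8.54, 7.59, True),
    ("tnn: MobileNet v2", 343.97, 229.79, False),
    ("compress-zstd: 8 - Decompression Speed", 4722.5, 4913.7, True),
]


def speedup(aocc, gcc, higher_is_better):
    """AOCC performance relative to GCC; above 1.0 means AOCC is faster."""
    return aocc / gcc if higher_is_better else gcc / aocc


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    values = list(values)
    return exp(sum(log(v) for v in values) / len(values))


ratios = {name: speedup(a, g, hib) for name, a, g, hib in RESULTS}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.3f}x")
print(f"geometric mean over this subset: {geometric_mean(ratios.values()):.3f}x")
```

Inverting the ratio for time-based tests keeps "above 1.0" meaning "AOCC 4.0 is faster" for every row, which is what makes the geometric mean meaningful across units.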

AOM AV1

Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 4K

Frames Per Second, More Is Better (AOM AV1 3.5)
AOCC 4.0: 20.98 (SE +/- 0.37, N = 12)
GCC 12.2: 17.66 (SE +/- 0.18, N = 15)
(CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm

AOM AV1

Encoder Mode: Speed 9 Realtime - Input: Bosphorus 4K

Frames Per Second, More Is Better (AOM AV1 3.5)
AOCC 4.0: 37.39 (SE +/- 0.45, N = 3)
GCC 12.2: 36.32 (SE +/- 0.49, N = 15)
(CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm

ASTC Encoder

Preset: Medium

MT/s, More Is Better (ASTC Encoder 4.0)
AOCC 4.0: 457.17 (SE +/- 0.25, N = 3)
GCC 12.2: 403.53 (SE +/- 0.24, N = 3)
(CXX) g++ options: -O3 -march=native -flto -pthread

ASTC Encoder

Preset: Thorough

MT/s, More Is Better (ASTC Encoder 4.0)
AOCC 4.0: 64.00 (SE +/- 0.03, N = 3)
GCC 12.2: 62.97 (SE +/- 0.03, N = 3)
(CXX) g++ options: -O3 -march=native -flto -pthread

Crypto++

Test: Keyed Algorithms

MiB/second, More Is Better (Crypto++ 8.2)
AOCC 4.0: 796.93 (SE +/- 1.21, N = 3)
GCC 12.2: 779.24 (SE +/- 0.35, N = 3)
(CXX) g++ options: -O3 -march=native -fPIC -pthread -pipe

Crypto++

Test: Unkeyed Algorithms

MiB/second, More Is Better (Crypto++ 8.2)
AOCC 4.0: 534.93 (SE +/- 0.08, N = 3)
GCC 12.2: 448.19 (SE +/- 0.18, N = 3)
(CXX) g++ options: -O3 -march=native -fPIC -pthread -pipe

GPAW

Input: Carbon Nanotube

Seconds, Fewer Is Better (GPAW 22.1)
AOCC 4.0: 32.67 (SE +/- 0.03, N = 3)
GCC 12.2: 35.00 (SE +/- 0.11, N = 3)
(CC) gcc options: -shared -fwrapv -O2 -O3 -march=native -lxc -lblas -lmpi

GraphicsMagick

Operation: Rotate

Iterations Per Minute, More Is Better (GraphicsMagick 1.3.38)
AOCC 4.0: 760 (SE +/- 0.67, N = 3)
GCC 12.2: 732 (SE +/- 1.53, N = 3)
(CC) gcc options: -fopenmp -O3 -march=native -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

GraphicsMagick

Operation: Sharpen

Iterations Per Minute, More Is Better (GraphicsMagick 1.3.38)
AOCC 4.0: 1150 (SE +/- 2.65, N = 3)
GCC 12.2: 864 (SE +/- 3.61, N = 3)
(CC) gcc options: -fopenmp -O3 -march=native -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

GraphicsMagick

Operation: Enhanced

Iterations Per Minute, More Is Better (GraphicsMagick 1.3.38)
AOCC 4.0: 1393 (SE +/- 6.33, N = 3)
GCC 12.2: 1407 (SE +/- 1.86, N = 3)
(CC) gcc options: -fopenmp -O3 -march=native -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

JPEG XL Decoding libjxl

CPU Threads: 1

MP/s, More Is Better (JPEG XL Decoding libjxl 0.7)
AOCC 4.0: 60.57 (SE +/- 0.22, N = 3)
GCC 12.2: 54.96 (SE +/- 0.05, N = 3)

JPEG XL Decoding libjxl

CPU Threads: All

MP/s, More Is Better (JPEG XL Decoding libjxl 0.7)
AOCC 4.0: 297.49 (SE +/- 2.41, N = 3)
GCC 12.2: 280.47 (SE +/- 0.55, N = 3)

Kripke

Throughput FoM, More Is Better (Kripke 1.2.4)
AOCC 4.0: 340106367 (SE +/- 1235045.49, N = 3; -fopenmp=libomp)
GCC 12.2: 308884107 (SE +/- 4347432.93, N = 15; -fopenmp)
(CXX) g++ options: -O3 -march=native

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Medium

Frames Per Second, More Is Better (Kvazaar 2.1)
AOCC 4.0: 34.23 (SE +/- 0.00, N = 3)
GCC 12.2: 32.42 (SE +/- 0.10, N = 3)
Option listed for one run only: -lpthread
(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Very Fast

Frames Per Second, More Is Better (Kvazaar 2.1)
AOCC 4.0: 71.17 (SE +/- 0.20, N = 3)
GCC 12.2: 66.46 (SE +/- 0.07, N = 3)
Option listed for one run only: -lpthread
(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Ultra Fast

Frames Per Second, More Is Better (Kvazaar 2.1)
AOCC 4.0: 82.71 (SE +/- 0.41, N = 3)
GCC 12.2: 82.07 (SE +/- 0.55, N = 3)
Option listed for one run only: -lpthread
(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt

LAMMPS Molecular Dynamics Simulator

Model: 20k Atoms

ns/day, More Is Better (LAMMPS Molecular Dynamics Simulator 23Jun2022)
AOCC 4.0: 43.54 (SE +/- 0.06, N = 3)
GCC 12.2: 42.17 (SE +/- 0.09, N = 3)
(CXX) g++ options: -O3 -march=native -lm -ldl

LeelaChessZero

Backend: BLAS

Nodes Per Second, More Is Better (LeelaChessZero 0.28)
AOCC 4.0: 11900 (SE +/- 62.91, N = 3)
GCC 12.2: 11110 (SE +/- 56.50, N = 3)
(CXX) g++ options: -flto -O3 -march=native -pthread

LeelaChessZero

Backend: Eigen

Nodes Per Second, More Is Better (LeelaChessZero 0.28)
AOCC 4.0: 16982 (SE +/- 173.43, N = 9)
GCC 12.2: 11707 (SE +/- 95.82, N = 9)
(CXX) g++ options: -flto -O3 -march=native -pthread

libavif avifenc

Encoder Speed: 0

Seconds, Fewer Is Better (libavif avifenc 0.11)
AOCC 4.0: 53.15 (SE +/- 0.15, N = 3)
GCC 12.2: 57.29 (SE +/- 0.49, N = 3)
(CXX) g++ options: -O3 -fPIC -march=native -lm

libavif avifenc

Encoder Speed: 2

Seconds, Fewer Is Better (libavif avifenc 0.11)
AOCC 4.0: 29.83 (SE +/- 0.28, N = 3)
GCC 12.2: 31.97 (SE +/- 0.21, N = 3)
(CXX) g++ options: -O3 -fPIC -march=native -lm

libavif avifenc

Encoder Speed: 6

Seconds, Fewer Is Better (libavif avifenc 0.11)
AOCC 4.0: 2.421 (SE +/- 0.019, N = 3)
GCC 12.2: 2.573 (SE +/- 0.011, N = 3)
(CXX) g++ options: -O3 -fPIC -march=native -lm

libavif avifenc

Encoder Speed: 6, Lossless

Seconds, Fewer Is Better (libavif avifenc 0.11)
AOCC 4.0: 4.342 (SE +/- 0.008, N = 3)
GCC 12.2: 4.737 (SE +/- 0.028, N = 3)
(CXX) g++ options: -O3 -fPIC -march=native -lm

libavif avifenc

Encoder Speed: 10, Lossless

Seconds, Fewer Is Better (libavif avifenc 0.11)
AOCC 4.0: 3.256 (SE +/- 0.004, N = 3)
GCC 12.2: 3.422 (SE +/- 0.011, N = 3)
(CXX) g++ options: -O3 -fPIC -march=native -lm

Liquid-DSP

Threads: 32 - Buffer Length: 256 - Filter Length: 57

samples/s, More Is Better (Liquid-DSP 2021.01.31)
AOCC 4.0: 2884000000 (SE +/- 4590206.97, N = 3)
GCC 12.2: 2675133333 (SE +/- 12676794.20, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 64 - Buffer Length: 256 - Filter Length: 57

samples/s, More Is Better (Liquid-DSP 2021.01.31)
AOCC 4.0: 5559600000 (SE +/- 7275300.68, N = 3)
GCC 12.2: 5280200000 (SE +/- 18956353.38, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 128 - Buffer Length: 256 - Filter Length: 57

samples/s, More Is Better (Liquid-DSP 2021.01.31)
AOCC 4.0: 5868866667 (SE +/- 2562117.18, N = 3)
GCC 12.2: 5451600000 (SE +/- 5550075.07, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid

miniBUDE

Implementation: OpenMP - Input Deck: BM2

GFInst/s, More Is Better (miniBUDE 20210901)
AOCC 4.0: 4638.34 (SE +/- 4.03, N = 3)
GCC 12.2: 4539.56 (SE +/- 9.63, N = 3)
(CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

miniBUDE

Implementation: OpenMP - Input Deck: BM2

Billion Interactions/s, More Is Better (miniBUDE 20210901)
AOCC 4.0: 185.53 (SE +/- 0.16, N = 3)
GCC 12.2: 181.58 (SE +/- 0.39, N = 3)
(CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

oneDNN

Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
AOCC 4.0: 0.45944 (SE +/- 0.00415, N = 3)
GCC 12.2: 4.28328 (SE +/- 0.05940, N = 15; -fopenmp; MIN: 2.97)
(CXX) g++ options: -O3 -march=native -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
AOCC 4.0: 0.278467 (SE +/- 0.000429, N = 3; -fopenmp=libomp; MIN: 0.24)
GCC 12.2: 1.568020 (SE +/- 0.018545, N = 3; -fopenmp; MIN: 1.2)
(CXX) g++ options: -O3 -march=native -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
AOCC 4.0: 0.320302 (SE +/- 0.000615, N = 3; -fopenmp=libomp; MIN: 0.31)
GCC 12.2: 0.399774 (SE +/- 0.002555, N = 3; -fopenmp; MIN: 0.36)
(CXX) g++ options: -O3 -march=native -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
AOCC 4.0: 1.72218 (SE +/- 0.00581, N = 3; -fopenmp=libomp; MIN: 1.53)
GCC 12.2: 2.03703 (SE +/- 0.00916, N = 3; -fopenmp; MIN: 1.86)
(CXX) g++ options: -O3 -march=native -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
AOCC 4.0: 0.587671 (SE +/- 0.001000, N = 3; -fopenmp=libomp; MIN: 0.54)
GCC 12.2: 0.637696 (SE +/- 0.003124, N = 3; -fopenmp; MIN: 0.61)
(CXX) g++ options: -O3 -march=native -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
AOCC 4.0: 524.10 (SE +/- 0.35, N = 3; -fopenmp=libomp; MIN: 510.54)
GCC 12.2: 869.01 (SE +/- 6.48, N = 3; -fopenmp; MIN: 839.81)
(CXX) g++ options: -O3 -march=native -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
AOCC 4.0: 292.77 (SE +/- 0.18, N = 3; -fopenmp=libomp; MIN: 283.8)
GCC 12.2: 664.99 (SE +/- 4.69, N = 12; -fopenmp; MIN: 635.26)
(CXX) g++ options: -O3 -march=native -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
AOCC 4.0: 0.112382 (SE +/- 0.001257, N = 3; -fopenmp=libomp; MIN: 0.1)
GCC 12.2: 0.250318 (SE +/- 0.002764, N = 4; -fopenmp; MIN: 0.22)
(CXX) g++ options: -O3 -march=native -msse4.1 -fPIC -pie -ldl -lpthread

OpenSSL

Algorithm: SHA256

byte/s, More Is Better (OpenSSL 3.0)
AOCC 4.0: 122861312743 (SE +/- 7229089.97, N = 3)
GCC 12.2: 119198709000 (SE +/- 3427994.13, N = 3)
Option listed for one run only: -Qunused-arguments
(CC) gcc options: -pthread -m64 -O3 -march=native -lssl -lcrypto -ldl

OpenSSL

Algorithm: RSA4096

sign/s, More Is Better (OpenSSL 3.0)
AOCC 4.0: 19492.8 (SE +/- 0.40, N = 3)
GCC 12.2: 19467.8 (SE +/- 18.34, N = 3)
Option listed for one run only: -Qunused-arguments
(CC) gcc options: -pthread -m64 -O3 -march=native -lssl -lcrypto -ldl

OpenSSL

Algorithm: RSA4096

verify/s, More Is Better (OpenSSL 3.0)
AOCC 4.0: 1270745.6 (SE +/- 66.97, N = 3)
GCC 12.2: 1266612.9 (SE +/- 2377.84, N = 3)
Option listed for one run only: -Qunused-arguments
(CC) gcc options: -pthread -m64 -O3 -march=native -lssl -lcrypto -ldl

SecureMark

Benchmark: SecureMark-TLS

marks, More Is Better (SecureMark 1.0.4)
AOCC 4.0: 370262 (SE +/- 398.31, N = 3)
GCC 12.2: 339503 (SE +/- 1573.48, N = 3)
(CC) gcc options: -pedantic -O3

simdjson

Throughput Test: TopTweet

GB/s, More Is Better (simdjson 2.0)
AOCC 4.0: 8.54 (SE +/- 0.02, N = 3)
GCC 12.2: 7.59 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -march=native

simdjson

Throughput Test: PartialTweets

GB/s, More Is Better (simdjson 2.0)
AOCC 4.0: 9.17 (SE +/- 0.05, N = 3)
GCC 12.2: 7.84 (SE +/- 0.06, N = 3)
(CXX) g++ options: -O3 -march=native

simdjson

Throughput Test: DistinctUserID

GB/s, More Is Better (simdjson 2.0)
AOCC 4.0: 9.55 (SE +/- 0.07, N = 3)
GCC 12.2: 8.03 (SE +/- 0.05, N = 3)
(CXX) g++ options: -O3 -march=native

srsRAN

Test: OFDM_Test

Samples / Second, More Is Better (srsRAN 22.04.1)
AOCC 4.0: 192833333 (SE +/- 1299145.02, N = 3)
GCC 12.2: 189500000 (SE +/- 781024.97, N = 3)
Option listed for one run only: -latomic
(CXX) g++ options: -O3 -march=native -std=c++14 -fno-strict-aliasing -mfpmath=sse -mavx2 -fvisibility=hidden -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -ldl -lpthread -lm

srsRAN

Test: 4G PHY_DL_Test 100 PRB MIMO 64-QAM

eNb Mb/s, More Is Better (srsRAN 22.04.1)
AOCC 4.0: 476.6 (SE +/- 0.50, N = 3)
GCC 12.2: 481.0 (SE +/- 1.94, N = 3)
Option listed for one run only: -latomic
(CXX) g++ options: -O3 -march=native -std=c++14 -fno-strict-aliasing -mfpmath=sse -mavx2 -fvisibility=hidden -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -ldl -lpthread -lm

Stargate Digital Audio Workstation

Sample Rate: 96000 - Buffer Size: 512

Render Ratio, More Is Better (Stargate Digital Audio Workstation 22.11.5)
AOCC 4.0: 4.962698 (SE +/- 0.008965, N = 3)
GCC 12.2: 4.477606 (SE +/- 0.016850, N = 3)
(CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

Stargate Digital Audio Workstation

Sample Rate: 192000 - Buffer Size: 512

Render Ratio, More Is Better (Stargate Digital Audio Workstation 22.11.5)
AOCC 4.0: 3.057509 (SE +/- 0.001667, N = 3)
GCC 12.2: 2.813028 (SE +/- 0.012940, N = 3)
(CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

Stargate Digital Audio Workstation

Sample Rate: 96000 - Buffer Size: 1024

Render Ratio, More Is Better (Stargate Digital Audio Workstation 22.11.5)
AOCC 4.0: 5.425232 (SE +/- 0.005282, N = 3)
GCC 12.2: 4.960055 (SE +/- 0.012443, N = 3)
(CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

Stargate Digital Audio Workstation

Sample Rate: 192000 - Buffer Size: 1024

Render Ratio, More Is Better (Stargate Digital Audio Workstation 22.11.5)
AOCC 4.0: 3.411187 (SE +/- 0.006797, N = 3)
GCC 12.2: 3.221105 (SE +/- 0.009147, N = 3)
(CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 4K

Frames Per Second, More Is Better (SVT-AV1 1.4)
AOCC 4.0: 96.55 (SE +/- 0.24, N = 3)
GCC 12.2: 92.54 (SE +/- 1.00, N = 3)
(CXX) g++ options: -O3 -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 13 - Input: Bosphorus 4K

Frames Per Second, More Is Better (SVT-AV1 1.4)
AOCC 4.0: 220.12 (SE +/- 2.79, N = 3)
GCC 12.2: 187.19 (SE +/- 2.73, N = 15)
(CXX) g++ options: -O3 -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-HEVC

Tuning: 7 - Input: Bosphorus 4K

Frames Per Second, More Is Better (SVT-HEVC 1.5.0)
AOCC 4.0: 147.93 (SE +/- 1.90, N = 3)
GCC 12.2: 121.95 (SE +/- 1.03, N = 3)
(CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

SVT-HEVC

Tuning: 10 - Input: Bosphorus 4K

Frames Per Second, More Is Better (SVT-HEVC 1.5.0)
AOCC 4.0: 157.12 (SE +/- 1.46, N = 3)
GCC 12.2: 125.58 (SE +/- 0.86, N = 15)
(CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

SVT-VP9

Tuning: VMAF Optimized - Input: Bosphorus 4K

Frames Per Second, More Is Better (SVT-VP9 0.3)
AOCC 4.0: 193.13 (SE +/- 1.04, N = 3)
GCC 12.2: 173.56 (SE +/- 1.38, N = 15)
(CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-VP9

Tuning: PSNR/SSIM Optimized - Input: Bosphorus 4K

Frames Per Second, More Is Better (SVT-VP9 0.3)
AOCC 4.0: 189.03 (SE +/- 1.57, N = 9)
GCC 12.2: 169.47 (SE +/- 1.94, N = 15)
(CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-VP9

Tuning: Visual Quality Optimized - Input: Bosphorus 4K

Frames Per Second, More Is Better (SVT-VP9 0.3)
AOCC 4.0: 160.53 (SE +/- 1.60, N = 15)
GCC 12.2: 120.03 (SE +/- 0.72, N = 3)
(CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

TNN

Target: CPU - Model: MobileNet v2

ms, Fewer Is Better (TNN 0.3)
AOCC 4.0: 343.97 (SE +/- 4.23, N = 15; -fopenmp=libomp; MIN: 274.57 / MAX: 495.33)
GCC 12.2: 229.79 (SE +/- 0.12, N = 3; -fopenmp; MIN: 227.65 / MAX: 234.53)
(CXX) g++ options: -O3 -march=native -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

TNN

Target: CPU - Model: SqueezeNet v2

ms, Fewer Is Better (TNN 0.3)
AOCC 4.0: 54.62 (SE +/- 0.59, N = 5; -fopenmp=libomp; MIN: 52.11 / MAX: 57.47)
GCC 12.2: 52.41 (SE +/- 0.31, N = 3; -fopenmp; MIN: 51.53 / MAX: 54.74)
(CXX) g++ options: -O3 -march=native -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

TNN

Target: CPU - Model: SqueezeNet v1.1

ms, Fewer Is Better (TNN 0.3)
AOCC 4.0: 287.19 (SE +/- 0.01, N = 3; -fopenmp=libomp; MIN: 286.77 / MAX: 288)
GCC 12.2: 226.01 (SE +/- 0.01, N = 3; -fopenmp; MIN: 225.65 / MAX: 227.03)
(CXX) g++ options: -O3 -march=native -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

VP9 libvpx Encoding

Speed: Speed 0 - Input: Bosphorus 4K

Frames Per Second, More Is Better (VP9 libvpx Encoding 1.10.0)
AOCC 4.0: 7.99 (SE +/- 0.06, N = 12)
GCC 12.2: 7.89 (SE +/- 0.06, N = 3)
(CXX) g++ options: -m64 -lm -lpthread -O3 -march=native -fPIC -U_FORTIFY_SOURCE -std=gnu++11

VP9 libvpx Encoding

Speed: Speed 5 - Input: Bosphorus 4K

Frames Per Second, More Is Better (VP9 libvpx Encoding 1.10.0)
AOCC 4.0: 19.51 (SE +/- 0.17, N = 3)
GCC 12.2: 18.99 (SE +/- 0.23, N = 15)
(CXX) g++ options: -m64 -lm -lpthread -O3 -march=native -fPIC -U_FORTIFY_SOURCE -std=gnu++11

WebP Image Encode

Encode Settings: Default

MP/s, More Is Better (WebP Image Encode 1.2.4)
AOCC 4.0: 22.78 (SE +/- 0.03, N = 3)
GCC 12.2: 21.89 (SE +/- 0.02, N = 3)
(CC) gcc options: -fvisibility=hidden -O3 -march=native -lm

WebP Image Encode

Encode Settings: Quality 100, Highest Compression

MP/s, More Is Better (WebP Image Encode 1.2.4)
AOCC 4.0: 4.86 (SE +/- 0.00, N = 3)
GCC 12.2: 3.59 (SE +/- 0.00, N = 3)
(CC) gcc options: -fvisibility=hidden -O3 -march=native -lm

Zstd Compression

Compression Level: 8 - Compression Speed

MB/s, More Is Better (Zstd Compression 1.5.0)
AOCC 4.0: 6155.2 (SE +/- 60.88, N = 3)
GCC 12.2: 6060.7 (SE +/- 65.79, N = 4)
(CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 8 - Decompression Speed

MB/s, More Is Better (Zstd Compression 1.5.0)
AOCC 4.0: 4722.5 (SE +/- 10.29, N = 3)
GCC 12.2: 4913.7 (SE +/- 29.00, N = 4)
(CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 19 - Compression Speed

MB/s, More Is Better (Zstd Compression 1.5.0)
AOCC 4.0: 104.6 (SE +/- 1.53, N = 12)
GCC 12.2: 101.4 (SE +/- 0.61, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 19 - Decompression Speed

MB/s, More Is Better (Zstd Compression 1.5.0)
AOCC 4.0: 4123.1 (SE +/- 17.78, N = 12)
GCC 12.2: 4203.8 (SE +/- 53.58, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lz -llzma


Phoronix Test Suite v10.8.4