AOCC 4.0 AMD EPYC 9374F 2P Compiler Benchmarks

Tests for a future article by Michael Larabel.

AOCC 4.0:

  Processor: 2 x AMD EPYC 9374F 32-Core @ 4.31GHz (64 Cores / 128 Threads), Motherboard: AMD Titanite_4G (RTI1002E BIOS), Chipset: AMD Device 14a4, Memory: 1520GB, Disk: 800GB INTEL SSDPF21Q800GB, Graphics: ASPEED, Monitor: VGA HDMI, Network: Broadcom NetXtreme BCM5720 PCIe

  OS: Ubuntu 22.10, Kernel: 5.19.0-26-generic (x86_64), Desktop: GNOME Shell 43.0, Display Server: X Server 1.21.1.4, Vulkan: 1.3.224, Compiler: Clang 14.0.6, File-System: ext4, Screen Resolution: 1920x1080

GCC 12.2:

  Processor: 2 x AMD EPYC 9374F 32-Core @ 4.31GHz (64 Cores / 128 Threads), Motherboard: AMD Titanite_4G (RTI1002E BIOS), Chipset: AMD Device 14a4, Memory: 1520GB, Disk: 800GB INTEL SSDPF21Q800GB, Graphics: ASPEED, Monitor: VGA HDMI, Network: Broadcom NetXtreme BCM5720 PCIe

  OS: Ubuntu 22.10, Kernel: 5.19.0-26-generic (x86_64), Desktop: GNOME Shell 43.0, Display Server: X Server 1.21.1.4, Vulkan: 1.3.224, Compiler: GCC 12.2.0, File-System: ext4, Screen Resolution: 1920x1080

Crypto++ 8.2
Test: Keyed Algorithms
MiB/second > Higher Is Better
AOCC 4.0 . 796.93 |============================================================
GCC 12.2 . 779.24 |===========================================================

Crypto++ 8.2
Test: Unkeyed Algorithms
MiB/second > Higher Is Better
AOCC 4.0 . 534.93 |============================================================
GCC 12.2 . 448.19 |==================================================

LeelaChessZero 0.28
Backend: BLAS
Nodes Per Second > Higher Is Better
AOCC 4.0 . 11900 |=============================================================
GCC 12.2 . 11110 |=========================================================

LeelaChessZero 0.28
Backend: Eigen
Nodes Per Second > Higher Is Better
AOCC 4.0 . 16982 |=============================================================
GCC 12.2 . 11707 |==========================================

miniBUDE 20210901
Implementation: OpenMP - Input Deck: BM2
GFInst/s > Higher Is Better
AOCC 4.0 . 4638.34 |===========================================================
GCC 12.2 . 4539.56 |==========================================================

miniBUDE 20210901
Implementation: OpenMP - Input Deck: BM2
Billion Interactions/s > Higher Is Better
AOCC 4.0 . 185.53 |============================================================
GCC 12.2 . 181.58 |===========================================================

LAMMPS Molecular Dynamics Simulator 23Jun2022
Model: 20k Atoms
ns/day > Higher Is Better
AOCC 4.0 . 43.54 |=============================================================
GCC 12.2 . 42.17 |===========================================================

simdjson 2.0
Throughput Test: TopTweet
GB/s > Higher Is Better
AOCC 4.0 . 8.54 |==============================================================
GCC 12.2 . 7.59 |=======================================================

simdjson 2.0
Throughput Test: PartialTweets
GB/s > Higher Is Better
AOCC 4.0 . 9.17 |==============================================================
GCC 12.2 . 7.84 |=====================================================

simdjson 2.0
Throughput Test: DistinctUserID
GB/s > Higher Is Better
AOCC 4.0 . 9.55 |==============================================================
GCC 12.2 . 8.03 |====================================================

Zstd Compression 1.5.0
Compression Level: 8 - Compression Speed
MB/s > Higher Is Better
AOCC 4.0 . 6155.2 |============================================================
GCC 12.2 . 6060.7 |===========================================================

Zstd Compression 1.5.0
Compression Level: 8 - Decompression Speed
MB/s > Higher Is Better
AOCC 4.0 . 4722.5 |==========================================================
GCC 12.2 . 4913.7 |============================================================

Zstd Compression 1.5.0
Compression Level: 19 - Compression Speed
MB/s > Higher Is Better
AOCC 4.0 . 104.6 |=============================================================
GCC 12.2 . 101.4 |===========================================================

Zstd Compression 1.5.0
Compression Level: 19 - Decompression Speed
MB/s > Higher Is Better
AOCC 4.0 . 4123.1 |===========================================================
GCC 12.2 . 4203.8 |============================================================

JPEG XL Decoding libjxl 0.7
CPU Threads: 1
MP/s > Higher Is Better
AOCC 4.0 . 60.57 |=============================================================
GCC 12.2 . 54.96 |=======================================================

JPEG XL Decoding libjxl 0.7
CPU Threads: All
MP/s > Higher Is Better
AOCC 4.0 . 297.49 |============================================================
GCC 12.2 . 280.47 |=========================================================

WebP Image Encode 1.2.4
Encode Settings: Default
MP/s > Higher Is Better
AOCC 4.0 . 22.78 |=============================================================
GCC 12.2 . 21.89 |===========================================================

WebP Image Encode 1.2.4
Encode Settings: Quality 100, Highest Compression
MP/s > Higher Is Better
AOCC 4.0 . 4.86 |==============================================================
GCC 12.2 . 3.59 |==============================================

srsRAN 22.04.1
Test: OFDM_Test
Samples / Second > Higher Is Better
AOCC 4.0 . 192833333 |=========================================================
GCC 12.2 . 189500000 |========================================================

srsRAN 22.04.1
Test: 4G PHY_DL_Test 100 PRB MIMO 64-QAM
eNb Mb/s > Higher Is Better
AOCC 4.0 . 476.6 |============================================================
GCC 12.2 . 481.0 |=============================================================

GraphicsMagick 1.3.38
Operation: Rotate
Iterations Per Minute > Higher Is Better
AOCC 4.0 . 760 |===============================================================
GCC 12.2 . 732 |=============================================================

GraphicsMagick 1.3.38
Operation: Sharpen
Iterations Per Minute > Higher Is Better
AOCC 4.0 . 1150 |==============================================================
GCC 12.2 . 864 |===============================================

GraphicsMagick 1.3.38
Operation: Enhanced
Iterations Per Minute > Higher Is Better
AOCC 4.0 . 1393 |=============================================================
GCC 12.2 . 1407 |==============================================================

AOM AV1 3.5
Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 4.0 . 20.98 |=============================================================
GCC 12.2 . 17.66 |===================================================

AOM AV1 3.5
Encoder Mode: Speed 9 Realtime - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 4.0 . 37.39 |=============================================================
GCC 12.2 . 36.32 |===========================================================

Kvazaar 2.1
Video Input: Bosphorus 4K - Video Preset: Medium
Frames Per Second > Higher Is Better
AOCC 4.0 . 34.23 |=============================================================
GCC 12.2 . 32.42 |==========================================================

Kvazaar 2.1
Video Input: Bosphorus 4K - Video Preset: Very Fast
Frames Per Second > Higher Is Better
AOCC 4.0 . 71.17 |=============================================================
GCC 12.2 . 66.46 |=========================================================

Kvazaar 2.1
Video Input: Bosphorus 4K - Video Preset: Ultra Fast
Frames Per Second > Higher Is Better
AOCC 4.0 . 82.71 |=============================================================
GCC 12.2 . 82.07 |=============================================================

SVT-AV1 1.4
Encoder Mode: Preset 8 - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 4.0 . 96.55 |=============================================================
GCC 12.2 . 92.54 |==========================================================

SVT-AV1 1.4
Encoder Mode: Preset 13 - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 4.0 . 220.12 |============================================================
GCC 12.2 . 187.19 |===================================================

SVT-HEVC 1.5.0
Tuning: 7 - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 4.0 . 147.93 |============================================================
GCC 12.2 . 121.95 |=================================================

SVT-HEVC 1.5.0
Tuning: 10 - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 4.0 . 157.12 |============================================================
GCC 12.2 . 125.58 |================================================

SVT-VP9 0.3
Tuning: VMAF Optimized - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 4.0 . 193.13 |============================================================
GCC 12.2 . 173.56 |======================================================

SVT-VP9 0.3
Tuning: PSNR/SSIM Optimized - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 4.0 . 189.03 |============================================================
GCC 12.2 . 169.47 |======================================================

SVT-VP9 0.3
Tuning: Visual Quality Optimized - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 4.0 . 160.53 |============================================================
GCC 12.2 . 120.03 |=============================================

VP9 libvpx Encoding 1.10.0
Speed: Speed 0 - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 4.0 . 7.99 |==============================================================
GCC 12.2 . 7.89 |=============================================================

VP9 libvpx Encoding 1.10.0
Speed: Speed 5 - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 4.0 . 19.51 |=============================================================
GCC 12.2 . 18.99 |===========================================================

Stargate Digital Audio Workstation 22.11.5
Sample Rate: 96000 - Buffer Size: 512
Render Ratio > Higher Is Better
AOCC 4.0 . 4.962698 |==========================================================
GCC 12.2 . 4.477606 |====================================================

Stargate Digital Audio Workstation 22.11.5
Sample Rate: 192000 - Buffer Size: 512
Render Ratio > Higher Is Better
AOCC 4.0 . 3.057509 |==========================================================
GCC 12.2 . 2.813028 |=====================================================

Stargate Digital Audio Workstation 22.11.5
Sample Rate: 96000 - Buffer Size: 1024
Render Ratio > Higher Is Better
AOCC 4.0 . 5.425232 |==========================================================
GCC 12.2 . 4.960055 |=====================================================

Stargate Digital Audio Workstation 22.11.5
Sample Rate: 192000 - Buffer Size: 1024
Render Ratio > Higher Is Better
AOCC 4.0 . 3.411187 |==========================================================
GCC 12.2 . 3.221105 |=======================================================

libavif avifenc 0.11
Encoder Speed: 0
Seconds < Lower Is Better
AOCC 4.0 . 53.15 |=========================================================
GCC 12.2 . 57.29 |=============================================================

libavif avifenc 0.11
Encoder Speed: 2
Seconds < Lower Is Better
AOCC 4.0 . 29.83 |=========================================================
GCC 12.2 . 31.97 |=============================================================

libavif avifenc 0.11
Encoder Speed: 6
Seconds < Lower Is Better
AOCC 4.0 . 2.421 |=========================================================
GCC 12.2 . 2.573 |=============================================================

libavif avifenc 0.11
Encoder Speed: 6, Lossless
Seconds < Lower Is Better
AOCC 4.0 . 4.342 |========================================================
GCC 12.2 . 4.737 |=============================================================

libavif avifenc 0.11
Encoder Speed: 10, Lossless
Seconds < Lower Is Better
AOCC 4.0 . 3.256 |==========================================================
GCC 12.2 . 3.422 |=============================================================

oneDNN 2.7
Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
AOCC 4.0 . 0.45944 |======
GCC 12.2 . 4.28328 |===========================================================

oneDNN 2.7
Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
AOCC 4.0 . 0.278467 |==========
GCC 12.2 . 1.568020 |==========================================================

oneDNN 2.7
Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
AOCC 4.0 . 0.320302 |==============================================
GCC 12.2 . 0.399774 |==========================================================

oneDNN 2.7
Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
AOCC 4.0 . 1.72218 |==================================================
GCC 12.2 . 2.03703 |===========================================================

oneDNN 2.7
Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
AOCC 4.0 . 0.587671 |=====================================================
GCC 12.2 . 0.637696 |==========================================================

oneDNN 2.7
Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
AOCC 4.0 . 524.10 |====================================
GCC 12.2 . 869.01 |============================================================

oneDNN 2.7
Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
AOCC 4.0 . 292.77 |==========================
GCC 12.2 . 664.99 |============================================================

oneDNN 2.7
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
AOCC 4.0 . 0.112382 |==========================
GCC 12.2 . 0.250318 |==========================================================

SecureMark 1.0.4
Benchmark: SecureMark-TLS
marks > Higher Is Better
AOCC 4.0 . 370262 |============================================================
GCC 12.2 . 339503 |=======================================================

OpenSSL 3.0
Algorithm: SHA256
byte/s > Higher Is Better
AOCC 4.0 . 122861312743 |======================================================
GCC 12.2 . 119198709000 |====================================================

OpenSSL 3.0
Algorithm: RSA4096
sign/s > Higher Is Better
AOCC 4.0 . 19492.8 |===========================================================
GCC 12.2 . 19467.8 |===========================================================

OpenSSL 3.0
Algorithm: RSA4096
verify/s > Higher Is Better
AOCC 4.0 . 1270745.6 |=========================================================
GCC 12.2 . 1266612.9 |=========================================================

Liquid-DSP 2021.01.31
Threads: 32 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
AOCC 4.0 . 2884000000 |========================================================
GCC 12.2 . 2675133333 |====================================================

Liquid-DSP 2021.01.31
Threads: 64 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
AOCC 4.0 . 5559600000 |========================================================
GCC 12.2 . 5280200000 |=====================================================

Liquid-DSP 2021.01.31
Threads: 128 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
AOCC 4.0 . 5868866667 |========================================================
GCC 12.2 . 5451600000 |====================================================

ASTC Encoder 4.0
Preset: Medium
MT/s > Higher Is Better
AOCC 4.0 . 457.17 |============================================================
GCC 12.2 . 403.53 |=====================================================

ASTC Encoder 4.0
Preset: Thorough
MT/s > Higher Is Better
AOCC 4.0 . 64.00 |=============================================================
GCC 12.2 . 62.97 |============================================================

GPAW 22.1
Input: Carbon Nanotube
Seconds < Lower Is Better
AOCC 4.0 . 32.67 |=========================================================
GCC 12.2 . 35.00 |=============================================================

TNN 0.3
Target: CPU - Model: MobileNet v2
ms < Lower Is Better
AOCC 4.0 . 343.97 |============================================================
GCC 12.2 . 229.79 |========================================

TNN 0.3
Target: CPU - Model: SqueezeNet v2
ms < Lower Is Better
AOCC 4.0 . 54.62 |=============================================================
GCC 12.2 . 52.41 |===========================================================

TNN 0.3
Target: CPU - Model: SqueezeNet v1.1
ms < Lower Is Better
AOCC 4.0 . 287.19 |============================================================
GCC 12.2 . 226.01 |===============================================

Kripke 1.2.4
Throughput FoM > Higher Is Better
AOCC 4.0 . 340106367 |=========================================================
GCC 12.2 . 308884107 |====================================================
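
The results above mix "Higher Is Better" and "Lower Is Better" metrics, so the raw numbers cannot be averaged directly. As a rough sketch only (not part of the original result file), one way to compare the two compilers is to turn each pair into a ratio where values above 1.0 favor AOCC 4.0, then take a geometric mean. The helper name `aocc_advantage` and the four-test subset are illustrative choices; the figures themselves are copied from the results above.

```python
from statistics import geometric_mean

# (test, AOCC 4.0 result, GCC 12.2 result, higher_is_better)
# Hand-picked subset for illustration; values copied from the results above.
RESULTS = [
    ("Crypto++ 8.2 - Keyed Algorithms (MiB/second)", 796.93, 779.24, True),
    ("simdjson 2.0 - TopTweet (GB/s)", 8.54, 7.59, True),
    ("libavif avifenc 0.11 - Encoder Speed: 0 (Seconds)", 53.15, 57.29, False),
    ("TNN 0.3 - MobileNet v2 (ms)", 343.97, 229.79, False),
]


def aocc_advantage(aocc, gcc, higher_is_better):
    """Return a ratio where > 1.0 means AOCC 4.0 was faster than GCC 12.2.

    For higher-is-better metrics the ratio is aocc / gcc; for
    lower-is-better metrics (seconds, ms) it is inverted to gcc / aocc.
    """
    return aocc / gcc if higher_is_better else gcc / aocc


ratios = [aocc_advantage(a, g, hib) for _, a, g, hib in RESULTS]

for (name, *_), ratio in zip(RESULTS, ratios):
    print(f"{name}: {ratio:.3f}x")

# The geometric mean is the usual choice for averaging ratios; this covers
# only the subset listed above, not the full result set.
print(f"Geometric mean (subset): {geometric_mean(ratios):.3f}x")
```

Extending `RESULTS` to every test in the file would give an overall figure; the subset here only demonstrates how the direction of each metric is handled.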