AOCC 4.0 AMD EPYC 9374F 2P Compiler Benchmarks

Tests for a future article by Michael Larabel.

AOCC 4.0:

  Processor: 2 x AMD EPYC 9374F 32-Core @ 4.31GHz (64 Cores / 128 Threads), Motherboard: AMD Titanite_4G (RTI1002E BIOS), Chipset: AMD Device 14a4, Memory: 1520GB, Disk: 800GB INTEL SSDPF21Q800GB, Graphics: ASPEED, Monitor: VGA HDMI, Network: Broadcom NetXtreme BCM5720 PCIe

  OS: Ubuntu 22.10, Kernel: 5.19.0-26-generic (x86_64), Desktop: GNOME Shell 43.0, Display Server: X Server 1.21.1.4, Vulkan: 1.3.224, Compiler: Clang 14.0.6, File-System: ext4, Screen Resolution: 1920x1080

GCC 12.2:

  Processor: 2 x AMD EPYC 9374F 32-Core @ 4.31GHz (64 Cores / 128 Threads), Motherboard: AMD Titanite_4G (RTI1002E BIOS), Chipset: AMD Device 14a4, Memory: 1520GB, Disk: 800GB INTEL SSDPF21Q800GB, Graphics: ASPEED, Monitor: VGA HDMI, Network: Broadcom NetXtreme BCM5720 PCIe

  OS: Ubuntu 22.10, Kernel: 5.19.0-26-generic (x86_64), Desktop: GNOME Shell 43.0, Display Server: X Server 1.21.1.4, Vulkan: 1.3.224, Compiler: GCC 12.2.0, File-System: ext4, Screen Resolution: 1920x1080

oneDNN 2.7
Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
AOCC 4.0 . 0.45944 |======
GCC 12.2 . 4.28328 |===========================================================

oneDNN 2.7
Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
AOCC 4.0 . 0.278467 |==========
GCC 12.2 . 1.568020 |==========================================================

oneDNN 2.7
Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
AOCC 4.0 . 292.77 |==========================
GCC 12.2 . 664.99 |============================================================

oneDNN 2.7
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
AOCC 4.0 . 0.112382 |==========================
GCC 12.2 . 0.250318 |==========================================================

oneDNN 2.7
Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
AOCC 4.0 . 524.10 |====================================
GCC 12.2 . 869.01 |============================================================

TNN 0.3
Target: CPU - Model: MobileNet v2
ms < Lower Is Better
AOCC 4.0 . 343.97 |============================================================
GCC 12.2 . 229.79 |========================================

LeelaChessZero 0.28
Backend: Eigen
Nodes Per Second > Higher Is Better
AOCC 4.0 . 16982 |=============================================================
GCC 12.2 . 11707 |==========================================

WebP Image Encode 1.2.4
Encode Settings: Quality 100, Highest Compression
MP/s > Higher Is Better
AOCC 4.0 . 4.86 |==============================================================
GCC 12.2 . 3.59 |==============================================

SVT-VP9 0.3
Tuning: Visual Quality Optimized - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 4.0 . 160.53 |============================================================
GCC 12.2 . 120.03 |=============================================

GraphicsMagick 1.3.38
Operation: Sharpen
Iterations Per Minute > Higher Is Better
AOCC 4.0 . 1150 |==============================================================
GCC 12.2 . 864 |===============================================

TNN 0.3
Target: CPU - Model: SqueezeNet v1.1
ms < Lower Is Better
AOCC 4.0 . 287.19 |============================================================
GCC 12.2 . 226.01 |===============================================

SVT-HEVC 1.5.0
Tuning: 10 - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 4.0 . 157.12 |============================================================
GCC 12.2 . 125.58 |================================================

oneDNN 2.7
Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
AOCC 4.0 . 0.320302 |==============================================
GCC 12.2 . 0.399774 |==========================================================

SVT-HEVC 1.5.0
Tuning: 7 - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 4.0 . 147.93 |============================================================
GCC 12.2 . 121.95 |=================================================

Crypto++ 8.2
Test: Unkeyed Algorithms
MiB/second > Higher Is Better
AOCC 4.0 . 534.93 |============================================================
GCC 12.2 . 448.19 |==================================================

simdjson 2.0
Throughput Test: DistinctUserID
GB/s > Higher Is Better
AOCC 4.0 . 9.55 |==============================================================
GCC 12.2 . 8.03 |====================================================

oneDNN 2.7
Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
AOCC 4.0 . 1.72218 |==================================================
GCC 12.2 . 2.03703 |===========================================================

SVT-AV1 1.4
Encoder Mode: Preset 13 - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 4.0 . 220.12 |============================================================
GCC 12.2 . 187.19 |===================================================

simdjson 2.0
Throughput Test: PartialTweets
GB/s > Higher Is Better
AOCC 4.0 . 9.17 |==============================================================
GCC 12.2 . 7.84 |=====================================================

ASTC Encoder 4.0
Preset: Medium
MT/s > Higher Is Better
AOCC 4.0 . 457.17 |============================================================
GCC 12.2 . 403.53 |=====================================================

simdjson 2.0
Throughput Test: TopTweet
GB/s > Higher Is Better
AOCC 4.0 . 8.54 |==============================================================
GCC 12.2 . 7.59 |=======================================================

SVT-VP9 0.3
Tuning: PSNR/SSIM Optimized - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 4.0 . 189.03 |============================================================
GCC 12.2 . 169.47 |======================================================

SVT-VP9 0.3
Tuning: VMAF Optimized - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 4.0 . 193.13 |============================================================
GCC 12.2 . 173.56 |======================================================

Stargate Digital Audio Workstation 22.11.5
Sample Rate: 96000 - Buffer Size: 512
Render Ratio > Higher Is Better
AOCC 4.0 . 4.962698 |==========================================================
GCC 12.2 . 4.477606 |====================================================

JPEG XL Decoding libjxl 0.7
CPU Threads: 1
MP/s > Higher Is Better
AOCC 4.0 . 60.57 |=============================================================
GCC 12.2 . 54.96 |=======================================================

Kripke 1.2.4
Throughput FoM > Higher Is Better
AOCC 4.0 . 340106367 |=========================================================
GCC 12.2 . 308884107 |====================================================

Stargate Digital Audio Workstation 22.11.5
Sample Rate: 96000 - Buffer Size: 1024
Render Ratio > Higher Is Better
AOCC 4.0 . 5.425232 |==========================================================
GCC 12.2 . 4.960055 |=====================================================

libavif avifenc 0.11
Encoder Speed: 6, Lossless
Seconds < Lower Is Better
AOCC 4.0 . 4.342 |========================================================
GCC 12.2 . 4.737 |=============================================================

SecureMark 1.0.4
Benchmark: SecureMark-TLS
marks > Higher Is Better
AOCC 4.0 . 370262 |============================================================
GCC 12.2 . 339503 |=======================================================

Stargate Digital Audio Workstation 22.11.5
Sample Rate: 192000 - Buffer Size: 512
Render Ratio > Higher Is Better
AOCC 4.0 . 3.057509 |==========================================================
GCC 12.2 . 2.813028 |=====================================================

oneDNN 2.7
Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
AOCC 4.0 . 0.587671 |=====================================================
GCC 12.2 . 0.637696 |==========================================================

Liquid-DSP 2021.01.31
Threads: 32 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
AOCC 4.0 . 2884000000 |========================================================
GCC 12.2 . 2675133333 |====================================================

libavif avifenc 0.11
Encoder Speed: 0
Seconds < Lower Is Better
AOCC 4.0 . 53.15 |=========================================================
GCC 12.2 . 57.29 |=============================================================

Liquid-DSP 2021.01.31
Threads: 128 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
AOCC 4.0 . 5868866667 |========================================================
GCC 12.2 . 5451600000 |====================================================

libavif avifenc 0.11
Encoder Speed: 2
Seconds < Lower Is Better
AOCC 4.0 . 29.83 |=========================================================
GCC 12.2 . 31.97 |=============================================================

GPAW 22.1
Input: Carbon Nanotube
Seconds < Lower Is Better
AOCC 4.0 . 32.67 |=========================================================
GCC 12.2 . 35.00 |=============================================================

LeelaChessZero 0.28
Backend: BLAS
Nodes Per Second > Higher Is Better
AOCC 4.0 . 11900 |=============================================================
GCC 12.2 . 11110 |=========================================================

Kvazaar 2.1
Video Input: Bosphorus 4K - Video Preset: Very Fast
Frames Per Second > Higher Is Better
AOCC 4.0 . 71.17 |=============================================================
GCC 12.2 . 66.46 |=========================================================

libavif avifenc 0.11
Encoder Speed: 6
Seconds < Lower Is Better
AOCC 4.0 . 2.421 |=========================================================
GCC 12.2 . 2.573 |=============================================================

JPEG XL Decoding libjxl 0.7
CPU Threads: All
MP/s > Higher Is Better
AOCC 4.0 . 297.49 |============================================================
GCC 12.2 . 280.47 |=========================================================

Stargate Digital Audio Workstation 22.11.5
Sample Rate: 192000 - Buffer Size: 1024
Render Ratio > Higher Is Better
AOCC 4.0 . 3.411187 |==========================================================
GCC 12.2 . 3.221105 |=======================================================

Kvazaar 2.1
Video Input: Bosphorus 4K - Video Preset: Medium
Frames Per Second > Higher Is Better
AOCC 4.0 . 34.23 |=============================================================
GCC 12.2 . 32.42 |==========================================================

Liquid-DSP 2021.01.31
Threads: 64 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
AOCC 4.0 . 5559600000 |========================================================
GCC 12.2 . 5280200000 |=====================================================

libavif avifenc 0.11
Encoder Speed: 10, Lossless
Seconds < Lower Is Better
AOCC 4.0 . 3.256 |==========================================================
GCC 12.2 . 3.422 |=============================================================

SVT-AV1 1.4
Encoder Mode: Preset 8 - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 4.0 . 96.55 |=============================================================
GCC 12.2 . 92.54 |==========================================================

TNN 0.3
Target: CPU - Model: SqueezeNet v2
ms < Lower Is Better
AOCC 4.0 . 54.62 |=============================================================
GCC 12.2 . 52.41 |===========================================================

WebP Image Encode 1.2.4
Encode Settings: Default
MP/s > Higher Is Better
AOCC 4.0 . 22.78 |=============================================================
GCC 12.2 . 21.89 |===========================================================

Zstd Compression 1.5.0
Compression Level: 8 - Decompression Speed
MB/s > Higher Is Better
AOCC 4.0 . 4722.5 |==========================================================
GCC 12.2 . 4913.7 |============================================================

GraphicsMagick 1.3.38
Operation: Rotate
Iterations Per Minute > Higher Is Better
AOCC 4.0 . 760 |===============================================================
GCC 12.2 . 732 |=============================================================

LAMMPS Molecular Dynamics Simulator 23Jun2022
Model: 20k Atoms
ns/day > Higher Is Better
AOCC 4.0 . 43.54 |=============================================================
GCC 12.2 . 42.17 |===========================================================

Zstd Compression 1.5.0
Compression Level: 19 - Compression Speed
MB/s > Higher Is Better
AOCC 4.0 . 104.6 |=============================================================
GCC 12.2 . 101.4 |===========================================================

OpenSSL 3.0
Algorithm: SHA256
byte/s > Higher Is Better
AOCC 4.0 . 122861312743 |======================================================
GCC 12.2 . 119198709000 |====================================================

AOM AV1 3.5
Encoder Mode: Speed 9 Realtime - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 4.0 . 37.39 |=============================================================
GCC 12.2 . 36.32 |===========================================================

VP9 libvpx Encoding 1.10.0
Speed: Speed 5 - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 4.0 . 19.51 |=============================================================
GCC 12.2 . 18.99 |===========================================================

Crypto++ 8.2
Test: Keyed Algorithms
MiB/second > Higher Is Better
AOCC 4.0 . 796.93 |============================================================
GCC 12.2 . 779.24 |===========================================================

miniBUDE 20210901
Implementation: OpenMP - Input Deck: BM2
Billion Interactions/s > Higher Is Better
AOCC 4.0 . 185.53 |============================================================
GCC 12.2 . 181.58 |===========================================================

miniBUDE 20210901
Implementation: OpenMP - Input Deck: BM2
GFInst/s > Higher Is Better
AOCC 4.0 . 4638.34 |===========================================================
GCC 12.2 . 4539.56 |==========================================================

Zstd Compression 1.5.0
Compression Level: 19 - Decompression Speed
MB/s > Higher Is Better
AOCC 4.0 . 4123.1 |===========================================================
GCC 12.2 . 4203.8 |============================================================

srsRAN 22.04.1
Test: OFDM_Test
Samples / Second > Higher Is Better
AOCC 4.0 . 192833333 |=========================================================
GCC 12.2 . 189500000 |========================================================

ASTC Encoder 4.0
Preset: Thorough
MT/s > Higher Is Better
AOCC 4.0 . 64.00 |=============================================================
GCC 12.2 . 62.97 |============================================================

Zstd Compression 1.5.0
Compression Level: 8 - Compression Speed
MB/s > Higher Is Better
AOCC 4.0 . 6155.2 |============================================================
GCC 12.2 . 6060.7 |===========================================================

VP9 libvpx Encoding 1.10.0
Speed: Speed 0 - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 4.0 . 7.99 |==============================================================
GCC 12.2 . 7.89 |=============================================================

GraphicsMagick 1.3.38
Operation: Enhanced
Iterations Per Minute > Higher Is Better
AOCC 4.0 . 1393 |=============================================================
GCC 12.2 . 1407 |==============================================================

srsRAN 22.04.1
Test: 4G PHY_DL_Test 100 PRB MIMO 64-QAM
eNb Mb/s > Higher Is Better
AOCC 4.0 . 476.6 |============================================================
GCC 12.2 . 481.0 |=============================================================

Kvazaar 2.1
Video Input: Bosphorus 4K - Video Preset: Ultra Fast
Frames Per Second > Higher Is Better
AOCC 4.0 . 82.71 |=============================================================
GCC 12.2 . 82.07 |=============================================================

OpenSSL 3.0
Algorithm: RSA4096
verify/s > Higher Is Better
AOCC 4.0 . 1270745.6 |=========================================================
GCC 12.2 . 1266612.9 |=========================================================

OpenSSL 3.0
Algorithm: RSA4096
sign/s > Higher Is Better
AOCC 4.0 . 19492.8 |===========================================================
GCC 12.2 . 19467.8 |===========================================================

AOM AV1 3.5
Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
AOCC 4.0 . 20.98 |=============================================================
GCC 12.2 . 17.66 |===================================================
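
Because the results above mix "Lower Is Better" and "Higher Is Better" metrics, comparing the two compilers across tests requires normalizing each result first. The following is a minimal Python sketch of one way to do that, not part of the original result file. The helper name `aocc_advantage` and the four-test subset are illustrative assumptions; the values are copied from the results above, and the geometric mean it prints covers only this subset, not the full test set.

```python
from statistics import geometric_mean

# (test, AOCC 4.0 result, GCC 12.2 result, higher_is_better)
# Values copied from the results above; the subset is an arbitrary illustration.
RESULTS = [
    ("oneDNN IP Shapes 1D (ms)", 0.45944, 4.28328, False),
    ("TNN MobileNet v2 (ms)", 343.97, 229.79, False),
    ("LeelaChessZero Eigen (nodes/s)", 16982, 11707, True),
    ("Zstd level 8 decompression (MB/s)", 4722.5, 4913.7, True),
]


def aocc_advantage(aocc, gcc, higher_is_better):
    """Return a ratio where > 1.0 means AOCC was faster and < 1.0 means GCC was."""
    # For time-like metrics the smaller number wins, so the ratio is inverted.
    return aocc / gcc if higher_is_better else gcc / aocc


ratios = {name: aocc_advantage(a, g, hib) for name, a, g, hib in RESULTS}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.3f}x")

# Geometric mean is the usual way to average ratios; here it covers only the subset.
print(f"geometric mean (subset only): {geometric_mean(list(ratios.values())):.3f}x")
```

A ratio above 1.0 favors AOCC 4.0 and a ratio below 1.0 favors GCC 12.2, regardless of which direction the underlying metric runs.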