Tests for a future article by Michael Larabel.
AOCC 4.0
Processor: 2 x AMD EPYC 9374F 32-Core @ 4.31GHz (64 Cores / 128 Threads), Motherboard: AMD Titanite_4G (RTI1002E BIOS), Chipset: AMD Device 14a4, Memory: 1520GB, Disk: 800GB INTEL SSDPF21Q800GB, Graphics: ASPEED, Monitor: VGA HDMI, Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 22.10, Kernel: 5.19.0-26-generic (x86_64), Desktop: GNOME Shell 43.0, Display Server: X Server 1.21.1.4, Vulkan: 1.3.224, Compiler: Clang 14.0.6, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Compiler Notes: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver4
Processor Notes: Scaling Governor: amd-pstate performance (Boost: Enabled) - CPU Microcode: 0xa10110d
Python Notes: Python 3.10.7
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
GCC 12.2
OS: Ubuntu 22.10, Kernel: 5.19.0-26-generic (x86_64), Desktop: GNOME Shell 43.0, Display Server: X Server 1.21.1.4, Vulkan: 1.3.224, Compiler: GCC 12.2.0, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: amd-pstate performance (Boost: Enabled) - CPU Microcode: 0xa10110d
Python Notes: Python 3.10.7
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
AOCC 4.0 AMD EPYC 9374F 2P Compiler Benchmarks
AOCC 4.0 vs. GCC 12.2 Comparison (Phoronix Test Suite chart of the relative lead per test). Largest AOCC 4.0 leads: oneDNN IP Shapes 1D - bf16bf16bf16 - CPU 832.3%, oneDNN IP Shapes 3D - bf16bf16bf16 - CPU 463.1%, oneDNN Recurrent Neural Network Inference - bf16bf16bf16 - CPU 127.1%, oneDNN Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU 122.7%, oneDNN Recurrent Neural Network Training - bf16bf16bf16 - CPU 65.8%. The other AOCC 4.0 leads shown range from 45.1% (LeelaChessZero Eigen) down to 2.2% (miniBUDE OpenMP - BM2). GCC 12.2 leads: TNN CPU - MobileNet v2 49.7%, TNN CPU - SqueezeNet v1.1 27.1%, TNN CPU - SqueezeNet v2 4.2%, Zstd Compression 8 - Decompression Speed 4%, Zstd Compression 19 - Decompression Speed 2%.
Test | AOCC 4.0 | GCC 12.2
minibude: OpenMP - BM2 (Billion Interactions/s) | 185.534 | 181.582
openssl: SHA256 | 122861312743 | 119198709000
srsran: 4G PHY_DL_Test 100 PRB MIMO 64-QAM | 476.6 | 481.0
aom-av1: Speed 6 Two-Pass - Bosphorus 4K | 20.98 | 17.66
aom-av1: Speed 9 Realtime - Bosphorus 4K | 37.39 | 36.32
kvazaar: Bosphorus 4K - Medium | 34.23 | 32.42
kvazaar: Bosphorus 4K - Very Fast | 71.17 | 66.46
kvazaar: Bosphorus 4K - Ultra Fast | 82.71 | 82.07
svt-av1: Preset 8 - Bosphorus 4K | 96.552 | 92.543
svt-av1: Preset 13 - Bosphorus 4K | 220.119 | 187.192
svt-hevc: 7 - Bosphorus 4K | 147.93 | 121.95
svt-hevc: 10 - Bosphorus 4K | 157.12 | 125.58
svt-vp9: VMAF Optimized - Bosphorus 4K | 193.13 | 173.56
svt-vp9: PSNR/SSIM Optimized - Bosphorus 4K | 189.03 | 169.47
svt-vp9: Visual Quality Optimized - Bosphorus 4K | 160.53 | 120.03
vpxenc: Speed 0 - Bosphorus 4K | 7.99 | 7.89
vpxenc: Speed 5 - Bosphorus 4K | 19.51 | 18.99
simdjson: TopTweet | 8.54 | 7.59
simdjson: PartialTweets | 9.17 | 7.84
simdjson: DistinctUserID | 9.55 | 8.03
minibude: OpenMP - BM2 (GFInst/s) | 4638.340 | 4539.557
graphics-magick: Rotate | 760 | 732
graphics-magick: Sharpen | 1150 | 864
graphics-magick: Enhanced | 1393 | 1407
securemark: SecureMark-TLS | 370262 | 339503
compress-zstd: 8 - Compression Speed | 6155.2 | 6060.7
compress-zstd: 8 - Decompression Speed | 4722.5 | 4913.7
compress-zstd: 19 - Compression Speed | 104.6 | 101.4
compress-zstd: 19 - Decompression Speed | 4123.1 | 4203.8
cryptopp: Keyed Algorithms | 796.930644 | 779.239837
cryptopp: Unkeyed Algorithms | 534.930083 | 448.186823
jpegxl-decode: 1 | 60.57 | 54.96
jpegxl-decode: All | 297.49 | 280.47
webp: Default | 22.78 | 21.89
webp: Quality 100, Highest Compression | 4.86 | 3.59
astcenc: Medium | 457.1693 | 403.5316
astcenc: Thorough | 64.0015 | 62.9653
lczero: BLAS | 11900 | 11110
lczero: Eigen | 16982 | 11707
lammps: 20k Atoms | 43.544 | 42.170
stargate: 96000 - 512 | 4.962698 | 4.477606
stargate: 192000 - 512 | 3.057509 | 2.813028
stargate: 96000 - 1024 | 5.425232 | 4.960055
stargate: 192000 - 1024 | 3.411187 | 3.221105
srsran: OFDM_Test | 192833333 | 189500000
liquid-dsp: 32 - 256 - 57 | 2884000000 | 2675133333
liquid-dsp: 64 - 256 - 57 | 5559600000 | 5280200000
liquid-dsp: 128 - 256 - 57 | 5868866667 | 5451600000
openssl: RSA4096 (sign/s) | 19492.8 | 19467.8
kripke: | 340106367 | 308884107
openssl: RSA4096 (verify/s) | 1270745.6 | 1266612.9
onednn: IP Shapes 1D - bf16bf16bf16 - CPU | 0.45944 | 4.28328
onednn: IP Shapes 3D - bf16bf16bf16 - CPU | 0.278467 | 1.56802
onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU | 0.320302 | 0.399774
onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU | 1.72218 | 2.03703
onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU | 0.587671 | 0.637696
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 524.101 | 869.007
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 292.766 | 664.985
onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU | 0.112382 | 0.250318
tnn: CPU - MobileNet v2 | 343.974 | 229.793
tnn: CPU - SqueezeNet v2 | 54.618 | 52.410
tnn: CPU - SqueezeNet v1.1 | 287.194 | 226.006
avifenc: 0 | 53.154 | 57.292
avifenc: 2 | 29.833 | 31.974
avifenc: 6 | 2.421 | 2.573
avifenc: 6, Lossless | 4.342 | 4.737
avifenc: 10, Lossless | 3.256 | 3.422
gpaw: Carbon Nanotube | 32.671 | 35.003
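The percentages in the AOCC 4.0 vs. GCC 12.2 comparison appear to be the ratio of the faster result to the slower one, minus one. The following is a minimal sketch of that calculation under that assumption; the function name `relative_lead` and the `higher_is_better` flag are illustrative and not part of the Phoronix Test Suite. The sample values are taken from the results above.

```python
def relative_lead(aocc, gcc, higher_is_better=True):
    """Return (leading compiler, lead in percent) for one test result pair.

    For "Fewer Is Better" metrics (e.g. milliseconds) the ratio is inverted
    so that a ratio above 1.0 always means AOCC 4.0 is ahead.
    """
    ratio = aocc / gcc if higher_is_better else gcc / aocc
    if ratio >= 1.0:
        return "AOCC 4.0", (ratio - 1.0) * 100.0
    return "GCC 12.2", (1.0 / ratio - 1.0) * 100.0


# (test, AOCC 4.0 result, GCC 12.2 result, higher_is_better)
samples = [
    ("onednn: IP Shapes 1D - bf16bf16bf16 - CPU (ms)", 0.45944, 4.28328, False),
    ("graphics-magick: Sharpen (iterations/min)", 1150, 864, True),
    ("tnn: CPU - MobileNet v2 (ms)", 343.974, 229.793, False),
]

for name, aocc, gcc, higher in samples:
    leader, pct = relative_lead(aocc, gcc, higher)
    print(f"{name}: {leader} ahead by {pct:.1f}%")
```

With these inputs the sketch reproduces the 832.3%, 33.1% and 49.7% figures shown in the comparison, which supports the assumed formula for at least these three tests.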
miniBUDE MiniBUDE is a mini application for the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking. Learn more via the OpenBenchmarking.org test page.
miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM2 (Billion Interactions/s, More Is Better). GCC 12.2: 181.58 (SE +/- 0.39, N = 3); AOCC 4.0: 185.53 (SE +/- 0.16, N = 3). (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm
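Each result is reported with "SE +/- x, N = n". A minimal sketch of how such a figure could be derived, assuming SE means the standard error of the mean over N runs (sample standard deviation divided by the square root of N); the exact formula used by the Phoronix Test Suite is not stated here, and the run values below are hypothetical, not the recorded samples.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate the standard error")
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical per-run results in Billion Interactions/s (illustration only).
runs = [181.0, 181.4, 182.3]

mean = statistics.mean(runs)
se = standard_error(runs)
print(f"{mean:.2f} (SE +/- {se:.2f}, N = {len(runs)})")
```

A smaller SE relative to the mean indicates more consistent runs; tests listed with N = 12 or N = 15 were evidently run more often than the usual three times.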
OpenSSL OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.
OpenSSL 3.0, Algorithm: SHA256 (byte/s, More Is Better). GCC 12.2: 119198709000 (SE +/- 3427994.13, N = 3); AOCC 4.0: 122861312743 (SE +/- 7229089.97, N = 3). Compiler-specific option shown on the chart: -Qunused-arguments. (CC) gcc options: -pthread -m64 -O3 -march=native -lssl -lcrypto -ldl
srsRAN srsRAN is an open-source LTE/5G software radio suite created by Software Radio Systems (SRS). The srsRAN radio suite was formerly known as srsLTE and can be used for building your own software-defined radio (SDR) 4G/5G mobile network. Learn more via the OpenBenchmarking.org test page.
srsRAN 22.04.1, Test: 4G PHY_DL_Test 100 PRB MIMO 64-QAM (eNb Mb/s, More Is Better). GCC 12.2: 481.0 (SE +/- 1.94, N = 3); AOCC 4.0: 476.6 (SE +/- 0.50, N = 3). Compiler-specific option shown on the chart: -latomic. (CXX) g++ options: -O3 -march=native -std=c++14 -fno-strict-aliasing -mfpmath=sse -mavx2 -fvisibility=hidden -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -ldl -lpthread -lm
AOM AV1 This is a test of the AOMedia AV1 encoder (libaom) developed by AOMedia and Google as the AV1 Codec Library. Learn more via the OpenBenchmarking.org test page.
AOM AV1 3.5, Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 4K (Frames Per Second, More Is Better). GCC 12.2: 17.66 (SE +/- 0.18, N = 15); AOCC 4.0: 20.98 (SE +/- 0.37, N = 12). (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm
AOM AV1 3.5, Encoder Mode: Speed 9 Realtime - Input: Bosphorus 4K (Frames Per Second, More Is Better). GCC 12.2: 36.32 (SE +/- 0.49, N = 15); AOCC 4.0: 37.39 (SE +/- 0.45, N = 3). (CXX) g++ options: -O3 -march=native -std=c++11 -U_FORTIFY_SOURCE -lm
Kvazaar This is a test of Kvazaar as a CPU-based H.265/HEVC video encoder written in the C programming language and optimized in Assembly. Kvazaar is the winner of the 2016 ACM Open-Source Software Competition and developed at the Ultra Video Group, Tampere University, Finland. Learn more via the OpenBenchmarking.org test page.
Kvazaar 2.1, Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better). GCC 12.2: 32.42 (SE +/- 0.10, N = 3); AOCC 4.0: 34.23 (SE +/- 0.00, N = 3). Compiler-specific option shown on the chart: -lpthread. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt
Kvazaar 2.1, Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better). GCC 12.2: 66.46 (SE +/- 0.07, N = 3); AOCC 4.0: 71.17 (SE +/- 0.20, N = 3). Compiler-specific option shown on the chart: -lpthread. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt
Kvazaar 2.1, Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, More Is Better). GCC 12.2: 82.07 (SE +/- 0.55, N = 3); AOCC 4.0: 82.71 (SE +/- 0.41, N = 3). Compiler-specific option shown on the chart: -lpthread. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt
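Some gaps between the two compilers are small relative to the reported standard errors. As a rough sketch (not a formal significance test, and not something the Phoronix Test Suite itself reports), the difference between two means can be expressed in units of the combined standard error. The function name is illustrative; the inputs are the Kvazaar results above.

```python
import math


def difference_in_standard_errors(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means divided by their combined SE.

    The combined SE is sqrt(se_a^2 + se_b^2), which assumes the two sets of
    runs are independent. Returns infinity when both SEs are reported as zero
    and the means differ.
    """
    combined = math.sqrt(se_a ** 2 + se_b ** 2)
    diff = abs(mean_a - mean_b)
    if combined == 0.0:
        return math.inf if diff > 0 else 0.0
    return diff / combined


# Kvazaar, Bosphorus 4K: (preset, AOCC mean, AOCC SE, GCC mean, GCC SE)
kvazaar = [
    ("Very Fast", 71.17, 0.20, 66.46, 0.07),
    ("Ultra Fast", 82.71, 0.41, 82.07, 0.55),
]

for preset, aocc, aocc_se, gcc, gcc_se in kvazaar:
    score = difference_in_standard_errors(aocc, aocc_se, gcc, gcc_se)
    print(f"{preset}: difference is {score:.1f} combined standard errors")
```

By this rough measure the Very Fast gap is far larger than the run-to-run noise, while the Ultra Fast gap is under one combined standard error and may not reflect a real difference.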
SVT-AV1 This is a benchmark of the SVT-AV1 open-source video encoder/decoder. SVT-AV1 was originally developed by Intel as part of their Open Visual Cloud / Scalable Video Technology (SVT). Development of SVT-AV1 has since moved to the Alliance for Open Media as part of upstream AV1 development. SVT-AV1 is a CPU-based multi-threaded video encoder for the AV1 video format with a sample YUV video file. Learn more via the OpenBenchmarking.org test page.
SVT-AV1 1.4, Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better). GCC 12.2: 92.54 (SE +/- 1.00, N = 3); AOCC 4.0: 96.55 (SE +/- 0.24, N = 3). (CXX) g++ options: -O3 -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 1.4, Encoder Mode: Preset 13 - Input: Bosphorus 4K (Frames Per Second, More Is Better). GCC 12.2: 187.19 (SE +/- 2.73, N = 15); AOCC 4.0: 220.12 (SE +/- 2.79, N = 3). (CXX) g++ options: -O3 -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-HEVC This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-HEVC CPU-based multi-threaded video encoder for the HEVC / H.265 video format with a sample YUV video file. Learn more via the OpenBenchmarking.org test page.
SVT-HEVC 1.5.0, Tuning: 7 - Input: Bosphorus 4K (Frames Per Second, More Is Better). GCC 12.2: 121.95 (SE +/- 1.03, N = 3); AOCC 4.0: 147.93 (SE +/- 1.90, N = 3). (CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt
SVT-HEVC 1.5.0, Tuning: 10 - Input: Bosphorus 4K (Frames Per Second, More Is Better). GCC 12.2: 125.58 (SE +/- 0.86, N = 15); AOCC 4.0: 157.12 (SE +/- 1.46, N = 3). (CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt
SVT-VP9 This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-VP9 CPU-based multi-threaded video encoder for the VP9 video format with a sample YUV input video file. Learn more via the OpenBenchmarking.org test page.
SVT-VP9 0.3, Tuning: VMAF Optimized - Input: Bosphorus 4K (Frames Per Second, More Is Better). GCC 12.2: 173.56 (SE +/- 1.38, N = 15); AOCC 4.0: 193.13 (SE +/- 1.04, N = 3). (CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
SVT-VP9 0.3, Tuning: PSNR/SSIM Optimized - Input: Bosphorus 4K (Frames Per Second, More Is Better). GCC 12.2: 169.47 (SE +/- 1.94, N = 15); AOCC 4.0: 189.03 (SE +/- 1.57, N = 9). (CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
SVT-VP9 0.3, Tuning: Visual Quality Optimized - Input: Bosphorus 4K (Frames Per Second, More Is Better). GCC 12.2: 120.03 (SE +/- 0.72, N = 3); AOCC 4.0: 160.53 (SE +/- 1.60, N = 15). (CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
VP9 libvpx Encoding 1.10.0, Speed: Speed 5 - Input: Bosphorus 4K (Frames Per Second, More Is Better). GCC 12.2: 18.99 (SE +/- 0.23, N = 15); AOCC 4.0: 19.51 (SE +/- 0.17, N = 3). (CXX) g++ options: -m64 -lm -lpthread -O3 -march=native -fPIC -U_FORTIFY_SOURCE -std=gnu++11
simdjson This is a benchmark of SIMDJSON, a high performance JSON parser. SIMDJSON aims to be the fastest JSON parser and is used by projects like Microsoft FishStore, Yandex ClickHouse, Shopify, and others. Learn more via the OpenBenchmarking.org test page.
simdjson 2.0, Throughput Test: TopTweet (GB/s, More Is Better). GCC 12.2: 7.59 (SE +/- 0.00, N = 3); AOCC 4.0: 8.54 (SE +/- 0.02, N = 3). (CXX) g++ options: -O3 -march=native
simdjson 2.0, Throughput Test: PartialTweets (GB/s, More Is Better). GCC 12.2: 7.84 (SE +/- 0.06, N = 3); AOCC 4.0: 9.17 (SE +/- 0.05, N = 3). (CXX) g++ options: -O3 -march=native
simdjson 2.0, Throughput Test: DistinctUserID (GB/s, More Is Better). GCC 12.2: 8.03 (SE +/- 0.05, N = 3); AOCC 4.0: 9.55 (SE +/- 0.07, N = 3). (CXX) g++ options: -O3 -march=native
miniBUDE MiniBUDE is a mini application for the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking. Learn more via the OpenBenchmarking.org test page.
miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM2 (GFInst/s, More Is Better). GCC 12.2: 4539.56 (SE +/- 9.63, N = 3); AOCC 4.0: 4638.34 (SE +/- 4.03, N = 3). (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.
GraphicsMagick 1.3.38, Operation: Rotate (Iterations Per Minute, More Is Better). GCC 12.2: 732 (SE +/- 1.53, N = 3); AOCC 4.0: 760 (SE +/- 0.67, N = 3). (CC) gcc options: -fopenmp -O3 -march=native -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread
GraphicsMagick 1.3.38, Operation: Sharpen (Iterations Per Minute, More Is Better). GCC 12.2: 864 (SE +/- 3.61, N = 3); AOCC 4.0: 1150 (SE +/- 2.65, N = 3). (CC) gcc options: -fopenmp -O3 -march=native -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread
GraphicsMagick 1.3.38, Operation: Enhanced (Iterations Per Minute, More Is Better). GCC 12.2: 1407 (SE +/- 1.86, N = 3); AOCC 4.0: 1393 (SE +/- 6.33, N = 3). (CC) gcc options: -fopenmp -O3 -march=native -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread
SecureMark SecureMark is an objective, standardized benchmarking framework for measuring the efficiency of cryptographic processing solutions developed by EEMBC. SecureMark-TLS is benchmarking Transport Layer Security performance with a focus on IoT/edge computing. Learn more via the OpenBenchmarking.org test page.
SecureMark 1.0.4, Benchmark: SecureMark-TLS (marks, More Is Better). GCC 12.2: 339503 (SE +/- 1573.48, N = 3); AOCC 4.0: 370262 (SE +/- 398.31, N = 3). (CC) gcc options: -pedantic -O3
Zstd Compression This test measures the time needed to compress/decompress a sample file (a FreeBSD disk image - FreeBSD-12.2-RELEASE-amd64-memstick.img) using Zstd compression with options for different compression levels / settings. Learn more via the OpenBenchmarking.org test page.
Zstd Compression 1.5.0, Compression Level: 8 - Compression Speed (MB/s, More Is Better). GCC 12.2: 6060.7 (SE +/- 65.79, N = 4); AOCC 4.0: 6155.2 (SE +/- 60.88, N = 3). (CC) gcc options: -O3 -march=native -pthread -lz -llzma
Zstd Compression 1.5.0, Compression Level: 8 - Decompression Speed (MB/s, More Is Better). GCC 12.2: 4913.7 (SE +/- 29.00, N = 4); AOCC 4.0: 4722.5 (SE +/- 10.29, N = 3). (CC) gcc options: -O3 -march=native -pthread -lz -llzma
Zstd Compression 1.5.0, Compression Level: 19 - Compression Speed (MB/s, More Is Better). GCC 12.2: 101.4 (SE +/- 0.61, N = 3); AOCC 4.0: 104.6 (SE +/- 1.53, N = 12). (CC) gcc options: -O3 -march=native -pthread -lz -llzma
Zstd Compression 1.5.0, Compression Level: 19 - Decompression Speed (MB/s, More Is Better). GCC 12.2: 4203.8 (SE +/- 53.58, N = 3); AOCC 4.0: 4123.1 (SE +/- 17.78, N = 12). (CC) gcc options: -O3 -march=native -pthread -lz -llzma
Crypto++ 8.2, Test: Unkeyed Algorithms (MiB/second, More Is Better). GCC 12.2: 448.19 (SE +/- 0.18, N = 3); AOCC 4.0: 534.93 (SE +/- 0.08, N = 3). (CXX) g++ options: -O3 -march=native -fPIC -pthread -pipe
JPEG XL Decoding libjxl The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities with JPEG XL offering better image quality and compression over legacy JPEG. This test profile is suited for JPEG XL decode performance testing to PNG output file; the pts/jpegxl test is for encode performance. The JPEG XL encoding/decoding is done using the libjxl codebase. Learn more via the OpenBenchmarking.org test page.
JPEG XL Decoding libjxl 0.7, CPU Threads: 1 (MP/s, More Is Better). GCC 12.2: 54.96 (SE +/- 0.05, N = 3); AOCC 4.0: 60.57 (SE +/- 0.22, N = 3).
WebP Image Encode 1.2.4, Encode Settings: Quality 100, Highest Compression (MP/s, More Is Better). GCC 12.2: 3.59 (SE +/- 0.00, N = 3); AOCC 4.0: 4.86 (SE +/- 0.00, N = 3). (CC) gcc options: -fvisibility=hidden -O3 -march=native -lm
ASTC Encoder ASTC Encoder (astcenc) is for the Adaptive Scalable Texture Compression (ASTC) format commonly used with OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile does a coding test of both compression/decompression. Learn more via the OpenBenchmarking.org test page.
ASTC Encoder 4.0, Preset: Medium (MT/s, More Is Better). GCC 12.2: 403.53 (SE +/- 0.24, N = 3); AOCC 4.0: 457.17 (SE +/- 0.25, N = 3). (CXX) g++ options: -O3 -march=native -flto -pthread
ASTC Encoder 4.0, Preset: Thorough (MT/s, More Is Better). GCC 12.2: 62.97 (SE +/- 0.03, N = 3); AOCC 4.0: 64.00 (SE +/- 0.03, N = 3). (CXX) g++ options: -O3 -march=native -flto -pthread
LeelaChessZero LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.
LeelaChessZero 0.28, Backend: BLAS (Nodes Per Second, More Is Better). GCC 12.2: 11110 (SE +/- 56.50, N = 3); AOCC 4.0: 11900 (SE +/- 62.91, N = 3). (CXX) g++ options: -flto -O3 -march=native -pthread
LeelaChessZero 0.28, Backend: Eigen (Nodes Per Second, More Is Better). GCC 12.2: 11707 (SE +/- 95.82, N = 9); AOCC 4.0: 16982 (SE +/- 173.43, N = 9). (CXX) g++ options: -flto -O3 -march=native -pthread
Stargate Digital Audio Workstation Stargate is an open-source, cross-platform digital audio workstation (DAW) software package with "a unique and carefully curated experience" with scalability from old systems up through modern multi-core systems. Stargate is GPLv3 licensed and makes use of Qt5 (PyQt5) for its user-interface. Learn more via the OpenBenchmarking.org test page.
Stargate Digital Audio Workstation 22.11.5, Sample Rate: 96000 - Buffer Size: 512 (Render Ratio, More Is Better). GCC 12.2: 4.477606 (SE +/- 0.016850, N = 3); AOCC 4.0: 4.962698 (SE +/- 0.008965, N = 3). (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions
Stargate Digital Audio Workstation 22.11.5, Sample Rate: 192000 - Buffer Size: 512 (Render Ratio, More Is Better). GCC 12.2: 2.813028 (SE +/- 0.012940, N = 3); AOCC 4.0: 3.057509 (SE +/- 0.001667, N = 3). (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions
Stargate Digital Audio Workstation 22.11.5, Sample Rate: 96000 - Buffer Size: 1024 (Render Ratio, More Is Better). GCC 12.2: 4.960055 (SE +/- 0.012443, N = 3); AOCC 4.0: 5.425232 (SE +/- 0.005282, N = 3). (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions
Stargate Digital Audio Workstation 22.11.5, Sample Rate: 192000 - Buffer Size: 1024 (Render Ratio, More Is Better). GCC 12.2: 3.221105 (SE +/- 0.009147, N = 3); AOCC 4.0: 3.411187 (SE +/- 0.006797, N = 3). (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions
srsRAN srsRAN is an open-source LTE/5G software radio suite created by Software Radio Systems (SRS). The srsRAN radio suite was formerly known as srsLTE and can be used for building your own software-defined radio (SDR) 4G/5G mobile network. Learn more via the OpenBenchmarking.org test page.
srsRAN 22.04.1, Test: OFDM_Test (Samples / Second, More Is Better). GCC 12.2: 189500000 (SE +/- 781024.97, N = 3); AOCC 4.0: 192833333 (SE +/- 1299145.02, N = 3). Compiler-specific option shown on the chart: -latomic. (CXX) g++ options: -O3 -march=native -std=c++14 -fno-strict-aliasing -mfpmath=sse -mavx2 -fvisibility=hidden -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -ldl -lpthread -lm
Liquid-DSP LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.
Liquid-DSP 2021.01.31, Threads: 32 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better). GCC 12.2: 2675133333 (SE +/- 12676794.20, N = 3); AOCC 4.0: 2884000000 (SE +/- 4590206.97, N = 3). (CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31, Threads: 64 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better). GCC 12.2: 5280200000 (SE +/- 18956353.38, N = 3); AOCC 4.0: 5559600000 (SE +/- 7275300.68, N = 3). (CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid
Liquid-DSP 2021.01.31, Threads: 128 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better). GCC 12.2: 5451600000 (SE +/- 5550075.07, N = 3); AOCC 4.0: 5868866667 (SE +/- 2562117.18, N = 3). (CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid
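The three Liquid-DSP thread counts allow a quick look at how throughput scales on this 64-core / 128-thread system. The sketch below computes speedup and parallel efficiency relative to the 32-thread result; the helper name `scaling` and the choice of 32 threads as the baseline are illustrative assumptions, and the throughput values are the samples/s results above.

```python
def scaling(results, base_threads=32):
    """Map thread count -> (speedup over baseline, efficiency vs. ideal scaling).

    Ideal scaling means throughput grows in proportion to the thread count,
    so efficiency is speedup / (threads / base_threads).
    """
    base = results[base_threads]
    table = {}
    for threads, rate in sorted(results.items()):
        speedup = rate / base
        ideal = threads / base_threads
        table[threads] = (speedup, speedup / ideal)
    return table


# samples/s by thread count (Buffer Length: 256, Filter Length: 57)
aocc = {32: 2884000000, 64: 5559600000, 128: 5868866667}
gcc = {32: 2675133333, 64: 5280200000, 128: 5451600000}

for label, results in (("AOCC 4.0", aocc), ("GCC 12.2", gcc)):
    for threads, (speedup, efficiency) in scaling(results).items():
        print(f"{label} {threads:>3} threads: {speedup:.2f}x, efficiency {efficiency:.0%}")
```

For both compilers, going from 32 to 64 threads nearly doubles throughput, while going from 64 to 128 threads adds only a few percent. That pattern would be consistent with the extra 64 threads being SMT siblings rather than additional physical cores, though the results alone do not establish the cause.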
OpenSSL OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.
OpenSSL 3.0, Algorithm: RSA4096 (sign/s, More Is Better). GCC 12.2: 19467.8 (SE +/- 18.34, N = 3); AOCC 4.0: 19492.8 (SE +/- 0.40, N = 3). Compiler-specific option shown on the chart: -Qunused-arguments. (CC) gcc options: -pthread -m64 -O3 -march=native -lssl -lcrypto -ldl
Kripke Kripke is a simple, scalable, 3D Sn deterministic particle transport code. Its primary purpose is to research how data layout, programming paradigms and architectures affect the implementation and performance of Sn transport. Kripke is developed by LLNL. Learn more via the OpenBenchmarking.org test page.
Kripke 1.2.4 (Throughput FoM, More Is Better). GCC 12.2: 308884107 (SE +/- 4347432.93, N = 15; -fopenmp); AOCC 4.0: 340106367 (SE +/- 1235045.49, N = 3; -fopenmp=libomp). (CXX) g++ options: -O3 -march=native
OpenSSL OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.
OpenSSL 3.0, Algorithm: RSA4096 (verify/s, More Is Better). GCC 12.2: 1266612.9 (SE +/- 2377.84, N = 3); AOCC 4.0: 1270745.6 (SE +/- 66.97, N = 3). Compiler-specific option shown on the chart: -Qunused-arguments. (CC) gcc options: -pthread -m64 -O3 -march=native -lssl -lcrypto -ldl
oneDNN OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.7 Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU GCC 12.2 AOCC 4.0 0.9637 1.9274 2.8911 3.8548 4.8185 SE +/- 0.05940, N = 15 SE +/- 0.00415, N = 3 4.28328 0.45944 -fopenmp - MIN: 2.97 1. (CXX) g++ options: -O3 -march=native -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.7 Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU GCC 12.2 AOCC 4.0 0.3528 0.7056 1.0584 1.4112 1.764 SE +/- 0.018545, N = 3 SE +/- 0.000429, N = 3 1.568020 0.278467 -fopenmp - MIN: 1.2 -fopenmp=libomp - MIN: 0.24 1. (CXX) g++ options: -O3 -march=native -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.7 Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU GCC 12.2 AOCC 4.0 0.0899 0.1798 0.2697 0.3596 0.4495 SE +/- 0.002555, N = 3 SE +/- 0.000615, N = 3 0.399774 0.320302 -fopenmp - MIN: 0.36 -fopenmp=libomp - MIN: 0.31 1. (CXX) g++ options: -O3 -march=native -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.7 Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU GCC 12.2 AOCC 4.0 0.4583 0.9166 1.3749 1.8332 2.2915 SE +/- 0.00916, N = 3 SE +/- 0.00581, N = 3 2.03703 1.72218 -fopenmp - MIN: 1.86 -fopenmp=libomp - MIN: 1.53 1. (CXX) g++ options: -O3 -march=native -msse4.1 -fPIC -pie -ldl -lpthread
OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.7 Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU GCC 12.2 AOCC 4.0 0.1435 0.287 0.4305 0.574 0.7175 SE +/- 0.003124, N = 3 SE +/- 0.001000, N = 3 0.637696 0.587671 -fopenmp - MIN: 0.61 -fopenmp=libomp - MIN: 0.54 1. (CXX) g++ options: -O3 -march=native -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 2.7 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better) - GCC 12.2: 869.01 (SE +/- 6.48, N = 3; MIN: 839.81; -fopenmp); AOCC 4.0: 524.10 (SE +/- 0.35, N = 3; MIN: 510.54; -fopenmp=libomp). 1. (CXX) g++ options: -O3 -march=native -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 2.7 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better) - GCC 12.2: 664.99 (SE +/- 4.69, N = 12; MIN: 635.26; -fopenmp); AOCC 4.0: 292.77 (SE +/- 0.18, N = 3; MIN: 283.8; -fopenmp=libomp). 1. (CXX) g++ options: -O3 -march=native -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 2.7 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better) - GCC 12.2: 0.250318 (SE +/- 0.002764, N = 4; MIN: 0.22; -fopenmp); AOCC 4.0: 0.112382 (SE +/- 0.001257, N = 3; MIN: 0.1; -fopenmp=libomp). 1. (CXX) g++ options: -O3 -march=native -msse4.1 -fPIC -pie -ldl -lpthread
TNN
TNN is an open-source deep learning inference framework developed by Tencent. Learn more via the OpenBenchmarking.org test page.
TNN 0.3 - Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better) - GCC 12.2: 229.79 (SE +/- 0.12, N = 3; MIN: 227.65 / MAX: 234.53; -fopenmp); AOCC 4.0: 343.97 (SE +/- 4.23, N = 15; MIN: 274.57 / MAX: 495.33; -fopenmp=libomp). 1. (CXX) g++ options: -O3 -march=native -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl
TNN 0.3 - Target: CPU - Model: SqueezeNet v2 (ms, Fewer Is Better) - GCC 12.2: 52.41 (SE +/- 0.31, N = 3; MIN: 51.53 / MAX: 54.74; -fopenmp); AOCC 4.0: 54.62 (SE +/- 0.59, N = 5; MIN: 52.11 / MAX: 57.47; -fopenmp=libomp). 1. (CXX) g++ options: -O3 -march=native -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl
TNN 0.3 - Target: CPU - Model: SqueezeNet v1.1 (ms, Fewer Is Better) - GCC 12.2: 226.01 (SE +/- 0.01, N = 3; MIN: 225.65 / MAX: 227.03; -fopenmp); AOCC 4.0: 287.19 (SE +/- 0.01, N = 3; MIN: 286.77 / MAX: 288; -fopenmp=libomp). 1. (CXX) g++ options: -O3 -march=native -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl
libavif avifenc 0.11 - Encoder Speed: 2 (Seconds, Fewer Is Better) - GCC 12.2: 31.97 (SE +/- 0.21, N = 3); AOCC 4.0: 29.83 (SE +/- 0.28, N = 3). 1. (CXX) g++ options: -O3 -fPIC -march=native -lm
libavif avifenc 0.11 - Encoder Speed: 6 (Seconds, Fewer Is Better) - GCC 12.2: 2.573 (SE +/- 0.011, N = 3); AOCC 4.0: 2.421 (SE +/- 0.019, N = 3). 1. (CXX) g++ options: -O3 -fPIC -march=native -lm
libavif avifenc 0.11 - Encoder Speed: 6, Lossless (Seconds, Fewer Is Better) - GCC 12.2: 4.737 (SE +/- 0.028, N = 3); AOCC 4.0: 4.342 (SE +/- 0.008, N = 3). 1. (CXX) g++ options: -O3 -fPIC -march=native -lm
libavif avifenc 0.11 - Encoder Speed: 10, Lossless (Seconds, Fewer Is Better) - GCC 12.2: 3.422 (SE +/- 0.011, N = 3); AOCC 4.0: 3.256 (SE +/- 0.004, N = 3). 1. (CXX) g++ options: -O3 -fPIC -march=native -lm
GPAW
GPAW is a density-functional theory (DFT) Python code based on the projector-augmented wave (PAW) method and the atomic simulation environment (ASE). Learn more via the OpenBenchmarking.org test page.
GPAW 22.1 - Input: Carbon Nanotube (Seconds, Fewer Is Better) - GCC 12.2: 35.00 (SE +/- 0.11, N = 3); AOCC 4.0: 32.67 (SE +/- 0.03, N = 3). 1. (CC) gcc options: -shared -fwrapv -O2 -O3 -march=native -lxc -lblas -lmpi
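Because the results mix "Fewer Is Better" and "More Is Better" metrics, comparing the two compilers requires picking the direction of the ratio for each test. The following is a minimal sketch of that arithmetic, not part of the Phoronix Test Suite; the helper name `relative_advantage` is hypothetical, and the input numbers are copied from the oneDNN, TNN and OpenSSL results reported above.

```python
def relative_advantage(gcc: float, aocc: float, fewer_is_better: bool) -> float:
    """Return the AOCC 4.0 result relative to GCC 12.2.

    A value above 1.0 means AOCC 4.0 was ahead, below 1.0 means GCC 12.2
    was ahead. Hypothetical helper for illustration only.
    """
    if gcc <= 0 or aocc <= 0:
        raise ValueError("benchmark results must be positive")
    # For time-like metrics a smaller number wins, so the ratio is inverted.
    return gcc / aocc if fewer_is_better else aocc / gcc


# oneDNN 2.7, Recurrent Neural Network Inference (ms, Fewer Is Better)
rnn_inference = relative_advantage(664.99, 292.77, fewer_is_better=True)
# TNN 0.3, MobileNet v2 (ms, Fewer Is Better)
mobilenet_v2 = relative_advantage(229.79, 343.97, fewer_is_better=True)
# OpenSSL 3.0, RSA4096 verify (verify/s, More Is Better)
rsa_verify = relative_advantage(1266612.9, 1270745.6, fewer_is_better=False)

for name, ratio in [
    ("oneDNN RNN Inference", rnn_inference),
    ("TNN MobileNet v2", mobilenet_v2),
    ("OpenSSL RSA4096 verify", rsa_verify),
]:
    print(f"{name}: AOCC 4.0 at {ratio:.3f}x of GCC 12.2")
```

A ratio close to 1.0 should be read alongside the reported standard errors before calling either compiler faster on that test.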
AOCC 4.0 Processor: 2 x AMD EPYC 9374F 32-Core @ 4.31GHz (64 Cores / 128 Threads), Motherboard: AMD Titanite_4G (RTI1002E BIOS), Chipset: AMD Device 14a4, Memory: 1520GB, Disk: 800GB INTEL SSDPF21Q800GB, Graphics: ASPEED, Monitor: VGA HDMI, Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 22.10, Kernel: 5.19.0-26-generic (x86_64), Desktop: GNOME Shell 43.0, Display Server: X Server 1.21.1.4, Vulkan: 1.3.224, Compiler: Clang 14.0.6, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Compiler Notes: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver4
Processor Notes: Scaling Governor: amd-pstate performance (Boost: Enabled) - CPU Microcode: 0xa10110d
Python Notes: Python 3.10.7
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 12 December 2022 18:06 by user phoronix.
GCC 12.2 Processor: 2 x AMD EPYC 9374F 32-Core @ 4.31GHz (64 Cores / 128 Threads), Motherboard: AMD Titanite_4G (RTI1002E BIOS), Chipset: AMD Device 14a4, Memory: 1520GB, Disk: 800GB INTEL SSDPF21Q800GB, Graphics: ASPEED, Monitor: VGA HDMI, Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 22.10, Kernel: 5.19.0-26-generic (x86_64), Desktop: GNOME Shell 43.0, Display Server: X Server 1.21.1.4, Vulkan: 1.3.224, Compiler: GCC 12.2.0, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: amd-pstate performance (Boost: Enabled) - CPU Microcode: 0xa10110d
Python Notes: Python 3.10.7
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 13 December 2022 05:53 by user phoronix.