cpuall_v2
AMD EPYC 7413 24-Core testing with a GIGABYTE MZ32-AR0-00 v01000100 (M18 BIOS) and Gigabyte NVIDIA GeForce RTX 4090 on Rocky Linux 9.3 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2403189-NE-2403174NE64&sro&grw .
cpuall_v2 system details

6000_4ch_ll
  Processor: Intel 0000% @ 3.30GHz (48 Cores / 96 Threads)
  Motherboard: ASUS Pro WS W790E-SAGE SE (0215 BIOS)
  Chipset: Intel Alder Lake-S PCH
  Memory: 64GB
  Disk: 4001GB CT4000P3SSD8 + 0GB Virtual HDisk0
  Graphics: ASPEED
  Audio: Realtek ALC1220
  Network: 2 x Intel X710 for 10GBASE-T
  OS: Fedora 39
  Kernel: 6.7.7-200.fc39.x86_64 (x86_64)
  OpenCL: OpenCL 3.0 + OpenCL 1.2 Intel FPGA SDK for OpenCL 20.3 + OpenCL 3.0 LINUX + OpenCL 1.2 Intel FPGA SDK for OpenCL 20.3
  Compiler: GCC 13.2.1 20231205 + Clang 17.0.6 + LLVM 17.0.6
  File-System: xfs
  Screen Resolution: 1920x1200

4090x2
  Processor: AMD EPYC 7413 24-Core @ 2.65GHz (24 Cores)
  Motherboard: GIGABYTE MZ32-AR0-00 v01000100 (M18 BIOS)
  Chipset: AMD Starship/Matisse
  Memory: 6 x 16 GB DDR4-2667MT/s 18ASF2G72PZ-2G6D2
  Disk: 960GB INTEL SSDPE21D960GA + 2 x 1600GB Toshiba KXG50PNV2T04 + 4001GB Nextorage SSD NE1N4TB + 3 x 59GB INTEL SSDPEK1A058GA
  Graphics: Gigabyte NVIDIA GeForce RTX 4090
  Audio: NVIDIA AD102 HD Audio
  Network: Aquantia AQC107 NBase-T/IEEE + Mellanox MT27500
  OS: Rocky Linux 9.3
  Kernel: 5.14.0-362.24.1.el9_3.x86_64 (x86_64)
  Desktop: GNOME Shell 40.10
  Display Server: X Server 1.20.11
  Compiler: GCC 11.4.1 20230605 + Clang 16.0.6 + LLVM 16.0.6 + CUDA 12.3
  Screen Resolution: 1024x768

Kernel Details
  6000_4ch_ll: Transparent Huge Pages: madvise
  4090x2: Transparent Huge Pages: always

Compiler Details
  6000_4ch_ll: --build=x86_64-redhat-linux --disable-libunwind-exceptions --enable-__cxa_atexit --enable-bootstrap --enable-cet --enable-checking=release --enable-gnu-indirect-function --enable-gnu-unique-object --enable-initfini-array --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,m2,lto --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-multilib --enable-offload-defaulted --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-arch_32=i686 --with-build-config=bootstrap-lto --with-gcc-major-version-only --with-libstdcxx-zoneinfo=/usr/share/zoneinfo --with-linker-hash-style=gnu --with-tune=generic --without-cuda-driver
  4090x2: --build=x86_64-redhat-linux --disable-libunwind-exceptions --enable-__cxa_atexit --enable-bootstrap --enable-cet --enable-checking=release --enable-gnu-indirect-function --enable-gnu-unique-object --enable-host-bind-now --enable-host-pie --enable-initfini-array --enable-languages=c,c++,fortran,lto --enable-link-serialization=1 --enable-multilib --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-arch_32=x86-64 --with-arch_64=x86-64-v2 --with-build-config=bootstrap-lto --with-gcc-major-version-only --with-linker-hash-style=gnu --with-tune=generic --without-cuda-driver --without-isl

Disk Details
  NONE / attr2,inode64,logbsize=32k,logbufs=8,noquota,relatime,rw,seclabel / Block Size: 4096

Processor Details
  6000_4ch_ll: Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0xd0004b1
  4090x2: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa0011d1

Security Details
  6000_4ch_ll: SELinux + gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
  4090x2: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: disabled RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
cpuall_v2 result summary

6000_4ch_ll     4090x2          Unit     Test
97387.8         31776.0         MB/s     intel-mlc: Peak Injection Bandwidth - 3:1 Reads-Writes
95035.27        32011.55        MB/s     intel-mlc: Max Bandwidth - Stream-Triad Like
90064.0         32574.7         MB/s     stream: Copy
93920.66        30554.89        MB/s     intel-mlc: Max Bandwidth - 2:1 Reads-Writes
88062.65        29284.50        MB/s     intel-mlc: Max Bandwidth - 1:1 Reads-Writes
95038.5         32055.0         MB/s     intel-mlc: Peak Injection Bandwidth - Stream-Triad Like
98042.64        32025.23        MB/s     intel-mlc: Max Bandwidth - 3:1 Reads-Writes
89640.2         21120.9         MB/s     stream: Scale
114.7           91.5            ns       intel-mlc: Idle Latency
94108.6         23746.5         MB/s     stream: Triad
93836.8         23560.9         MB/s     stream: Add
12582.595354    9190.903357     MB/s     cachebench: Read
85773.382625    51622.472353    MB/s     cachebench: Write
93157.944569    102381.017139   MB/s     cachebench: Read / Modify / Write
92578.4         30510.8         MB/s     intel-mlc: Peak Injection Bandwidth - 2:1 Reads-Writes
113954.65       38822.93        MB/s     intel-mlc: Max Bandwidth - All Reads
87976.7         29207.1         MB/s     intel-mlc: Peak Injection Bandwidth - 1:1 Reads-Writes
113973.3        38812.0         MB/s     intel-mlc: Peak Injection Bandwidth - All Reads
57.7            30.6            MB/s     fio: Rand Read - POSIX AIO - Yes - 4KB - 1 - Default Test Directory
14771           7844            IOPS     fio: Rand Read - POSIX AIO - Yes - 4KB - 1 - Default Test Directory
57.5            42.2            MB/s     fio: Rand Read - POSIX AIO - Yes - 4KB - 32 - Default Test Directory
14720           10798           IOPS     fio: Rand Read - POSIX AIO - Yes - 4KB - 32 - Default Test Directory
389             253             MB/s     fio: Rand Write - POSIX AIO - Yes - 4KB - 1 - Default Test Directory
99667           64667           IOPS     fio: Rand Write - POSIX AIO - Yes - 4KB - 1 - Default Test Directory
389             253             MB/s     fio: Rand Write - POSIX AIO - Yes - 4KB - 32 - Default Test Directory
99620           64967           IOPS     fio: Rand Write - POSIX AIO - Yes - 4KB - 32 - Default Test Directory
1652.16433      1218.07265      Seconds  whisper-cpp: ggml-medium.en - 2016 State of the Union
51.727          72.107          Seconds  build-linux-kernel: defconfig
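As a reading aid, the summary figures can be turned into relative ratios. The following is a minimal Python sketch and not part of the Phoronix Test Suite output: the names `RESULTS`, `advantage`, and `summarize` are hypothetical, and only four rows are copied from the summary above. It assumes bandwidth results are "More Is Better" and latency or time results are "Fewer Is Better", as the per-test entries state.

```python
# Values copied from the result summary: (6000_4ch_ll, 4090x2, higher_is_better).
# Only a small subset is shown; the dictionary layout is an illustrative assumption.
RESULTS = {
    "stream: Copy (MB/s)": (90064.0, 32574.7, True),
    "intel-mlc: Idle Latency (ns)": (114.7, 91.5, False),
    "build-linux-kernel: defconfig (s)": (51.727, 72.107, False),
    "whisper-cpp: ggml-medium.en (s)": (1652.16433, 1218.07265, False),
}


def advantage(a, b, higher_is_better):
    """Return how many times better result a is than result b (>1 means a wins)."""
    return a / b if higher_is_better else b / a


def summarize(results):
    """Map each test name to the advantage of 6000_4ch_ll over 4090x2, rounded to 2 places."""
    return {
        name: round(advantage(a, b, hib), 2)
        for name, (a, b, hib) in results.items()
    }


if __name__ == "__main__":
    for name, ratio in summarize(RESULTS).items():
        print(f"{name}: 6000_4ch_ll / 4090x2 advantage = {ratio}x")
```

A ratio above 1 means 6000_4ch_ll is ahead on that test, and a ratio below 1 means 4090x2 is ahead. Because the two systems differ in CPU, memory, storage, OS, and kernel, these ratios describe the configurations as a whole rather than any single component.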
Intel Memory Latency Checker 3.10, Test: Peak Injection Bandwidth - 3:1 Reads-Writes (MB/s, More Is Better)
  4090x2: 31776.0 (SE +/- 19.44, N = 3)
  6000_4ch_ll: 97387.8 (SE +/- 30.53, N = 3)
Intel Memory Latency Checker 3.10, Test: Max Bandwidth - Stream-Triad Like (MB/s, More Is Better)
  4090x2: 32011.55 (SE +/- 25.09, N = 3)
  6000_4ch_ll: 95035.27 (SE +/- 8.34, N = 3)
Stream 2013-01-17, Type: Copy (MB/s, More Is Better)
  4090x2: 32574.7 (SE +/- 31.85, N = 5)
  6000_4ch_ll: 90064.0 (SE +/- 16.98, N = 5)
  1. (CC) gcc options: -mcmodel=medium -O3 -march=native -fopenmp
Intel Memory Latency Checker 3.10, Test: Max Bandwidth - 2:1 Reads-Writes (MB/s, More Is Better)
  4090x2: 30554.89 (SE +/- 25.39, N = 3)
  6000_4ch_ll: 93920.66 (SE +/- 7.15, N = 3)
Intel Memory Latency Checker 3.10, Test: Max Bandwidth - 1:1 Reads-Writes (MB/s, More Is Better)
  4090x2: 29284.50 (SE +/- 19.96, N = 3)
  6000_4ch_ll: 88062.65 (SE +/- 4.74, N = 3)
Intel Memory Latency Checker 3.10, Test: Peak Injection Bandwidth - Stream-Triad Like (MB/s, More Is Better)
  4090x2: 32055.0 (SE +/- 16.97, N = 3)
  6000_4ch_ll: 95038.5 (SE +/- 8.22, N = 3)
Intel Memory Latency Checker 3.10, Test: Max Bandwidth - 3:1 Reads-Writes (MB/s, More Is Better)
  4090x2: 32025.23 (SE +/- 27.70, N = 3)
  6000_4ch_ll: 98042.64 (SE +/- 47.67, N = 3)
Stream 2013-01-17, Type: Scale (MB/s, More Is Better)
  4090x2: 21120.9 (SE +/- 16.22, N = 5)
  6000_4ch_ll: 89640.2 (SE +/- 63.75, N = 5)
  1. (CC) gcc options: -mcmodel=medium -O3 -march=native -fopenmp
Intel Memory Latency Checker 3.10, Test: Idle Latency (ns, Fewer Is Better)
  4090x2: 91.5 (SE +/- 0.03, N = 3)
  6000_4ch_ll: 114.7 (SE +/- 0.37, N = 3)
Stream 2013-01-17, Type: Triad (MB/s, More Is Better)
  4090x2: 23746.5 (SE +/- 14.55, N = 5)
  6000_4ch_ll: 94108.6 (SE +/- 12.86, N = 5)
  1. (CC) gcc options: -mcmodel=medium -O3 -march=native -fopenmp
Stream 2013-01-17, Type: Add (MB/s, More Is Better)
  4090x2: 23560.9 (SE +/- 17.00, N = 5)
  6000_4ch_ll: 93836.8 (SE +/- 26.62, N = 5)
  1. (CC) gcc options: -mcmodel=medium -O3 -march=native -fopenmp
CacheBench, Test: Read (MB/s, More Is Better)
  4090x2: 9190.90 (SE +/- 6.47, N = 3; MIN: 9160.88 / MAX: 9203.86)
  6000_4ch_ll: 12582.60 (SE +/- 0.15, N = 3; MIN: 12577.1 / MAX: 12583.24)
  1. (CC) gcc options: -O3 -lrt
CacheBench, Test: Write (MB/s, More Is Better)
  4090x2: 51622.47 (SE +/- 41.10, N = 3; MIN: 39519.8 / MAX: 54896.66)
  6000_4ch_ll: 85773.38 (SE +/- 23.70, N = 3; MIN: 51047.3 / MAX: 97997.28)
  1. (CC) gcc options: -O3 -lrt
CacheBench, Test: Read / Modify / Write (MB/s, More Is Better)
  4090x2: 102381.02 (SE +/- 367.10, N = 3; MIN: 77271.95 / MAX: 109243.01)
  6000_4ch_ll: 93157.94 (SE +/- 6.98, N = 3; MIN: 81727.05 / MAX: 98992.88)
  1. (CC) gcc options: -O3 -lrt
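The CacheBench Read / Modify / Write entry is the only MB/s or IOPS result in this export where 4090x2 reports the higher figure. One rough way to judge whether such a gap exceeds run-to-run noise is to divide the difference by the combined standard error. The sketch below assumes that the reported SE values are standard errors of the mean and that the runs on the two systems are independent; the function name `gap_in_standard_errors` is hypothetical.

```python
import math


def gap_in_standard_errors(mean_a, se_a, mean_b, se_b):
    """Difference of two means divided by the combined standard error.

    Assumes both SE values are standard errors of the mean and that the two
    sets of runs are independent, so their variances add.
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return (mean_a - mean_b) / combined_se


if __name__ == "__main__":
    # CacheBench Read / Modify / Write: 4090x2 versus 6000_4ch_ll (MB/s).
    gap = gap_in_standard_errors(102381.02, 367.10, 93157.94, 6.98)
    print(f"gap = {gap:.1f} combined standard errors")
```

With only N = 3 runs per system, this is a coarse indicator rather than a formal significance test.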
Intel Memory Latency Checker 3.10, Test: Peak Injection Bandwidth - 2:1 Reads-Writes (MB/s, More Is Better)
  4090x2: 30510.8 (SE +/- 25.15, N = 3)
  6000_4ch_ll: 92578.4 (SE +/- 22.58, N = 3)
Intel Memory Latency Checker 3.10, Test: Max Bandwidth - All Reads (MB/s, More Is Better)
  4090x2: 38822.93 (SE +/- 94.69, N = 3)
  6000_4ch_ll: 113954.65 (SE +/- 27.22, N = 3)
Intel Memory Latency Checker 3.10, Test: Peak Injection Bandwidth - 1:1 Reads-Writes (MB/s, More Is Better)
  4090x2: 29207.1 (SE +/- 50.27, N = 3)
  6000_4ch_ll: 87976.7 (SE +/- 29.53, N = 3)
Intel Memory Latency Checker 3.10, Test: Peak Injection Bandwidth - All Reads (MB/s, More Is Better)
  4090x2: 38812.0 (SE +/- 88.37, N = 3)
  6000_4ch_ll: 113973.3 (SE +/- 26.33, N = 3)
Flexible IO Tester 3.36, Type: Random Read - Engine: POSIX AIO - Direct: Yes - Block Size: 4KB - Job Count: 1 - Disk Target: Default Test Directory (MB/s, More Is Better)
  4090x2: 30.6 (SE +/- 0.03, N = 3)
  6000_4ch_ll: 57.7 (SE +/- 0.54, N = 7)
  -libverbs -lrdmacm -lcurl -lssl -lcrypto
  1. (CC) gcc options: -rdynamic -lz -lm -laio -lpthread -ldl -std=gnu99 -ffast-math -include -O3 -fcommon -march=native
Flexible IO Tester 3.36, Type: Random Read - Engine: POSIX AIO - Direct: Yes - Block Size: 4KB - Job Count: 1 - Disk Target: Default Test Directory (IOPS, More Is Better)
  4090x2: 7844 (SE +/- 5.78, N = 3)
  6000_4ch_ll: 14771 (SE +/- 137.52, N = 7)
  -libverbs -lrdmacm -lcurl -lssl -lcrypto
  1. (CC) gcc options: -rdynamic -lz -lm -laio -lpthread -ldl -std=gnu99 -ffast-math -include -O3 -fcommon -march=native
Flexible IO Tester 3.36, Type: Random Read - Engine: POSIX AIO - Direct: Yes - Block Size: 4KB - Job Count: 32 - Disk Target: Default Test Directory (MB/s, More Is Better)
  4090x2: 42.2 (SE +/- 1.10, N = 15)
  6000_4ch_ll: 57.5 (SE +/- 0.62, N = 5)
  -libverbs -lrdmacm -lcurl -lssl -lcrypto
  1. (CC) gcc options: -rdynamic -lz -lm -laio -lpthread -ldl -std=gnu99 -ffast-math -include -O3 -fcommon -march=native
Flexible IO Tester 3.36, Type: Random Read - Engine: POSIX AIO - Direct: Yes - Block Size: 4KB - Job Count: 32 - Disk Target: Default Test Directory (IOPS, More Is Better)
  4090x2: 10798 (SE +/- 280.58, N = 15)
  6000_4ch_ll: 14720 (SE +/- 159.37, N = 5)
  -libverbs -lrdmacm -lcurl -lssl -lcrypto
  1. (CC) gcc options: -rdynamic -lz -lm -laio -lpthread -ldl -std=gnu99 -ffast-math -include -O3 -fcommon -march=native
Flexible IO Tester 3.36, Type: Random Write - Engine: POSIX AIO - Direct: Yes - Block Size: 4KB - Job Count: 1 - Disk Target: Default Test Directory (MB/s, More Is Better)
  4090x2: 253 (SE +/- 0.67, N = 3)
  6000_4ch_ll: 389 (SE +/- 7.12, N = 15)
  -libverbs -lrdmacm -lcurl -lssl -lcrypto
  1. (CC) gcc options: -rdynamic -lz -lm -laio -lpthread -ldl -std=gnu99 -ffast-math -include -O3 -fcommon -march=native
Flexible IO Tester 3.36, Type: Random Write - Engine: POSIX AIO - Direct: Yes - Block Size: 4KB - Job Count: 1 - Disk Target: Default Test Directory (IOPS, More Is Better)
  4090x2: 64667 (SE +/- 120.19, N = 3)
  6000_4ch_ll: 99667 (SE +/- 1816.90, N = 15)
  -libverbs -lrdmacm -lcurl -lssl -lcrypto
  1. (CC) gcc options: -rdynamic -lz -lm -laio -lpthread -ldl -std=gnu99 -ffast-math -include -O3 -fcommon -march=native
Flexible IO Tester 3.36, Type: Random Write - Engine: POSIX AIO - Direct: Yes - Block Size: 4KB - Job Count: 32 - Disk Target: Default Test Directory (MB/s, More Is Better)
  4090x2: 253 (SE +/- 0.33, N = 3)
  6000_4ch_ll: 389 (SE +/- 5.69, N = 15)
  -libverbs -lrdmacm -lcurl -lssl -lcrypto
  1. (CC) gcc options: -rdynamic -lz -lm -laio -lpthread -ldl -std=gnu99 -ffast-math -include -O3 -fcommon -march=native
Flexible IO Tester 3.36, Type: Random Write - Engine: POSIX AIO - Direct: Yes - Block Size: 4KB - Job Count: 32 - Disk Target: Default Test Directory (IOPS, More Is Better)
  4090x2: 64967 (SE +/- 66.67, N = 3)
  6000_4ch_ll: 99620 (SE +/- 1450.85, N = 15)
  -libverbs -lrdmacm -lcurl -lssl -lcrypto
  1. (CC) gcc options: -rdynamic -lz -lm -laio -lpthread -ldl -std=gnu99 -ffast-math -include -O3 -fcommon -march=native
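Each Flexible IO Tester configuration above is reported twice, once as throughput and once as IOPS. The two views can be cross-checked by multiplying IOPS by the block size. The Python sketch below assumes that "4KB" means 4096 bytes and that the reported "MB/s" figures are binary mebibytes per second (1,048,576 bytes); both are assumptions, not statements from the export, and the names `iops_to_mib_per_s`, `FIO_ROWS`, and `max_discrepancy` are hypothetical.

```python
BLOCK_SIZE_BYTES = 4096   # assumed meaning of the "4KB" block size
MIB = 1024 * 1024         # assumed unit behind the reported "MB/s"


def iops_to_mib_per_s(iops, block_size=BLOCK_SIZE_BYTES):
    """Convert an IOPS figure to MiB/s for a fixed block size."""
    return iops * block_size / MIB


# (label, reported IOPS, reported MB/s), copied from the fio results above.
FIO_ROWS = [
    ("6000_4ch_ll rand read, job count 1", 14771, 57.7),
    ("4090x2 rand read, job count 1", 7844, 30.6),
    ("6000_4ch_ll rand read, job count 32", 14720, 57.5),
    ("4090x2 rand read, job count 32", 10798, 42.2),
    ("6000_4ch_ll rand write, job count 1", 99667, 389),
    ("4090x2 rand write, job count 1", 64667, 253),
    ("6000_4ch_ll rand write, job count 32", 99620, 389),
    ("4090x2 rand write, job count 32", 64967, 253),
]


def max_discrepancy(rows):
    """Largest absolute gap between derived MiB/s and the reported MB/s."""
    return max(abs(iops_to_mib_per_s(iops) - reported) for _, iops, reported in rows)


if __name__ == "__main__":
    for label, iops, reported in FIO_ROWS:
        print(f"{label}: derived {iops_to_mib_per_s(iops):.1f}, reported {reported}")
```

Under these assumptions the derived and reported throughput agree to within about 1 MB/s on every row, which suggests the two charts for each configuration describe the same runs in different units.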
Whisper.cpp 1.4, Model: ggml-medium.en - Input: 2016 State of the Union (Seconds, Fewer Is Better)
  4090x2: 1218.07 (SE +/- 14.14, N = 9)
  6000_4ch_ll: 1652.16 (SE +/- 10.69, N = 3)
  1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread
Timed Linux Kernel Compilation 6.8, Build: defconfig (Seconds, Fewer Is Better)
  4090x2: 72.11 (SE +/- 0.70, N = 3)
  6000_4ch_ll: 51.73 (SE +/- 0.51, N = 15)
Phoronix Test Suite v10.8.5