AMD EPYC 3255 8-Core Temp testing with a congatec conga-B7E3 (5.13 BIOS) and NVIDIA Quadro P1000 4GB on Ubuntu 20.04 via the Phoronix Test Suite.
Compare your own system(s) to this result file with the Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2108042-IB-SS681289082
HTML result view exported from: https://openbenchmarking.org/result/2108042-IB-SS681289082&export=txt&grt&rdt&rro .
ss6 system details
Result identifiers: sysbench2004Ph10, graphics-magick2004Ph10, ipc-benchmark2004Ph10, amg2004Ph10, ramspeed2004PH1-, npb2004Ph10, scimark2004Ph10, cachebench2004Ph10, onednn2004Ph10, apache2004Ph10, ctx-clock2004Ph10, hackbench1004Ph10, mbw2004Ph10, openssl2004Ph10, perf-bench2004Ph10, stress-ng2004Ph10, schbench2004Ph10
Processor: AMD EPYC 3255 8-Core Temp @ 2.50GHz (8 Cores / 16 Threads)
Motherboard: congatec conga-B7E3 (5.13 BIOS)
Chipset: AMD 17h
Memory: 32GB
Disk: 1920GB ATP NVMe M.2 2280 SED SSD + 2000GB Portable SSD T5
Graphics: NVIDIA Quadro P1000 4GB
Audio: NVIDIA GP107GL HD Audio
Monitor: HP Z24n G2
Network: Intel I210 + Intel I211 + 2 x AMD Device 1458 + 2 x AMD Device 1459
OS: Ubuntu 20.04
Kernel: 5.4.0-65-generic (x86_64)
Display Driver: nouveau
OpenGL: 4.5 Mesa 21.0.3 (LLVM 12.0.0 256 bits)
Vulkan: 1.0.2
Compiler: GCC 9.3.0
File-System: ext4
Screen Resolution: 1920x1200
Kernel Details: Transparent Huge Pages: madvise
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled); CPU Microcode: 0x800126c
Security Details:
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling
  srbds: Not affected
  tsx_async_abort: Not affected
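The names in the Security Details listing match the files that the Linux kernel exposes under /sys/devices/system/cpu/vulnerabilities. The following is a minimal Python sketch for collecting a comparable listing on another Linux machine, assuming that directory is present. The function name `read_vulnerabilities` is hypothetical, the raw kernel strings may be worded slightly differently from this export, and this is not necessarily how the Phoronix Test Suite gathers the data.

```python
from pathlib import Path


def read_vulnerabilities(base="/sys/devices/system/cpu/vulnerabilities"):
    """Return {name: status} for each file in the sysfs vulnerabilities directory.

    Returns an empty dict when the directory does not exist (e.g. non-Linux hosts).
    The default path is the standard Linux sysfs location; the helper itself is
    an illustrative assumption, not part of the Phoronix Test Suite.
    """
    root = Path(base)
    if not root.is_dir():
        return {}
    report = {}
    for entry in sorted(root.iterdir()):
        if entry.is_file():
            report[entry.name] = entry.read_text().strip()
    return report


if __name__ == "__main__":
    for name, status in read_vulnerabilities().items():
        print(f"{name}: {status}")
```

Passing a different `base` path makes the helper usable against a copied directory tree, which is convenient when comparing several machines offline.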
ss6 result overview (grouped by result identifier)
amg2004Ph10
  amg: 178649867
apache2004Ph10
  apache: Static Web Page Serving: 20782.21
cachebench2004Ph10
  cachebench: Read: 2145.005179
  cachebench: Write: 17416.295211
  cachebench: Read / Modify / Write: 30368.705222
ctx-clock2004Ph10
  ctx-clock: Context Switch Time: 175
graphics-magick2004Ph10
  graphics-magick: Swirl: 308
  graphics-magick: Rotate: 510
  graphics-magick: Sharpen: 88
  graphics-magick: Enhanced: 122
  graphics-magick: Resizing: 622
  graphics-magick: Noise-Gaussian: 158
  graphics-magick: HWB Color Space: 715
hackbench1004Ph10
  hackbench: 16 - Thread: 76.578
  hackbench: 16 - Process: 69.547
ipc-benchmark2004Ph10
  ipc-benchmark: TCP Socket - 128: 1887239
  ipc-benchmark: TCP Socket - 1024: 1305400
  ipc-benchmark: Unnamed Pipe - 128: 1966470
  ipc-benchmark: Unnamed Pipe - 1024: 1589778
  ipc-benchmark: FIFO Named Pipe - 128: 1883295
  ipc-benchmark: FIFO Named Pipe - 1024: 1476384
  ipc-benchmark: Unnamed Unix Domain Socket - 128: 1249552
  ipc-benchmark: Unnamed Unix Domain Socket - 1024: 922138
mbw2004Ph10
  mbw: Memory Copy - 1024 MiB: 11376.990
  mbw: Memory Copy, Fixed Block Size - 1024 MiB: 5994.751
npb2004Ph10
  npb: EP.C: 384.06
  npb: EP.D: 382.85
onednn2004Ph10
  onednn: IP Shapes 1D - f32 - CPU: 11.1572
  onednn: IP Shapes 3D - f32 - CPU: 14.0103
  onednn: IP Shapes 1D - u8s8f32 - CPU: 9.73030
  onednn: IP Shapes 3D - u8s8f32 - CPU: 3.58594
  onednn: Convolution Batch Shapes Auto - f32 - CPU: 23.7800
  onednn: Deconvolution Batch shapes_1d - f32 - CPU: 14.3333
  onednn: Deconvolution Batch shapes_3d - f32 - CPU: 21.0820
  onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU: 27.2152
  onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU: 10.0702
  onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU: 13.8834
  onednn: Recurrent Neural Network Training - f32 - CPU: 10707.5
  onednn: Recurrent Neural Network Inference - f32 - CPU: 5916.13
  onednn: Recurrent Neural Network Training - u8s8f32 - CPU: 10785.1
  onednn: Recurrent Neural Network Inference - u8s8f32 - CPU: 5899.04
  onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU: 6.34811
  onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU: 10780.4
  onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU: 5929.13
  onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU: 6.59184
openssl2004Ph10
  openssl: RSA 4096-bit Performance: 1191.5
perf-bench2004Ph10
  perf-bench: Epoll Wait: 61141
  perf-bench: Futex Hash: 3775107
  perf-bench: Memcpy 1MB: 13.475792
  perf-bench: Memset 1MB: 30.002033
  perf-bench: Sched Pipe: 33765
  perf-bench: Futex Lock-Pi: 841
  perf-bench: Syscall Basic: 13007164
ramspeed2004PH1-
  ramspeed: Add - Integer: 20720.60
  ramspeed: Scale - Integer: 17114.71
  ramspeed: Average - Integer: 18296.39
  ramspeed: Add - Floating Point: 21039.85
  ramspeed: Scale - Floating Point: 16118.97
  ramspeed: Average - Floating Point: 18556.23
schbench2004Ph10
  schbench: 8 - 16: 104747
scimark2004Ph10
  scimark2: Composite: 417.27
  scimark2: Monte Carlo: 100.33
  scimark2: Fast Fourier Transform: 135.04
  scimark2: Sparse Matrix Multiply: 451.72
  scimark2: Dense LU Matrix Factorization: 539.42
  scimark2: Jacobi Successive Over-Relaxation: 859.83
stress-ng2004Ph10
  stress-ng: MMAP: 40.45
  stress-ng: NUMA: 129.56
  stress-ng: MEMFD: 459.93
  stress-ng: Atomic: 234591.40
  stress-ng: Crypto: 1367.21
  stress-ng: Malloc: 34500114.19
  stress-ng: Forking: 45987.97
  stress-ng: SENDFILE: 96321.97
  stress-ng: CPU Cache: 18.35
  stress-ng: CPU Stress: 2312.35
  stress-ng: Semaphores: 1396806.20
  stress-ng: Matrix Math: 28675.99
  stress-ng: Vector Math: 49739.06
  stress-ng: Memory Copying: 1037.12
  stress-ng: Socket Activity: 4594.77
  stress-ng: Context Switching: 2715180.95
  stress-ng: Glibc C String Functions: 644776.31
  stress-ng: Glibc Qsort Data Sorting: 94.75
  stress-ng: System V Message Passing: 8365788.25
sysbench2004Ph10
  sysbench: RAM / Memory: 7131.37
  sysbench: CPU: 12224.31
Algebraic Multi-Grid Benchmark 1.2 (amg2004Ph10), Figure Of Merit, More Is Better
  178649867 (SE +/- 69024.21, N = 3)
  1. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -pthread -lmpi
Apache Benchmark 2.4.29 (apache2004Ph10), Requests Per Second, More Is Better
  Static Web Page Serving = 20782.21 (SE +/- 28.82, N = 3)
  1. (CC) gcc options: -shared -fPIC -O2 -pthread
CacheBench (cachebench2004Ph10), MB/s, More Is Better
  Test: Read = 2145.01 (SE +/- 0.01, N = 3; MIN: 2143.44 / MAX: 2145.17)
  Test: Write = 17416.30 (SE +/- 28.41, N = 3; MIN: 14119.7 / MAX: 18996.35)
  Test: Read / Modify / Write = 30368.71 (SE +/- 201.26, N = 3; MIN: 26009.51 / MAX: 36088.45)
  1. (CC) gcc options: -lrt
ctx_clock (ctx-clock2004Ph10), Clocks, Fewer Is Better
  Context Switch Time = 175
GraphicsMagick 1.3.33 (graphics-magick2004Ph10), Iterations Per Minute, More Is Better
  Operation: Swirl = 308 (SE +/- 0.88, N = 3)
  Operation: Rotate = 510 (SE +/- 0.88, N = 3)
  Operation: Sharpen = 88
  Operation: Enhanced = 122
  Operation: Resizing = 622 (SE +/- 0.33, N = 3)
  Operation: Noise-Gaussian = 158 (SE +/- 0.33, N = 3)
  Operation: HWB Color Space = 715 (SE +/- 0.88, N = 3)
  1. (CC) gcc options: -fopenmp -O2 -pthread -ljpeg -lz -lm -lpthread
Hackbench (hackbench1004Ph10), Seconds, Fewer Is Better
  Count: 16 - Type: Thread = 76.58 (SE +/- 0.57, N = 3)
  Count: 16 - Type: Process = 69.55 (SE +/- 1.01, N = 15)
  1. (CC) gcc options: -lpthread
IPC_benchmark (ipc-benchmark2004Ph10), Messages Per Second, More Is Better
  Type: TCP Socket - Message Bytes: 128 = 1887239 (SE +/- 865.47, N = 3)
  Type: TCP Socket - Message Bytes: 1024 = 1305400 (SE +/- 1589.09, N = 3)
  Type: Unnamed Pipe - Message Bytes: 128 = 1966470 (SE +/- 10510.23, N = 3)
  Type: Unnamed Pipe - Message Bytes: 1024 = 1589778 (SE +/- 7171.13, N = 3)
  Type: FIFO Named Pipe - Message Bytes: 128 = 1883295 (SE +/- 20185.32, N = 4)
  Type: FIFO Named Pipe - Message Bytes: 1024 = 1476384 (SE +/- 2419.86, N = 3)
  Type: Unnamed Unix Domain Socket - Message Bytes: 128 = 1249552 (SE +/- 10605.40, N = 3)
  Type: Unnamed Unix Domain Socket - Message Bytes: 1024 = 922138 (SE +/- 10653.07, N = 3)
MBW 2018-09-08 (mbw2004Ph10), MiB/s, More Is Better
  Test: Memory Copy - Array Size: 1024 MiB = 11376.99 (SE +/- 41.51, N = 3)
  Test: Memory Copy, Fixed Block Size - Array Size: 1024 MiB = 5994.75 (SE +/- 10.45, N = 3)
  1. (CC) gcc options: -O3 -march=native
NAS Parallel Benchmarks 3.4 (npb2004Ph10), Total Mop/s, More Is Better
  Test / Class: EP.C = 384.06 (SE +/- 1.59, N = 3)
  Test / Class: EP.D = 382.85 (SE +/- 4.54, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
  2. Open MPI 4.0.3
oneDNN 2.1.2 (onednn2004Ph10), ms, Fewer Is Better
  Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU = 11.16 (SE +/- 0.00, N = 3; MIN: 10.85)
  Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU = 14.01 (SE +/- 0.01, N = 3; MIN: 13.73)
  Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU = 9.73030 (SE +/- 0.00567, N = 3; MIN: 8.37)
  Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU = 3.58594 (SE +/- 0.00038, N = 3; MIN: 3.45)
  Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU = 23.78 (SE +/- 0.02, N = 3; MIN: 23.31)
  Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU = 14.33 (SE +/- 0.05, N = 3; MIN: 13.05)
  Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU = 21.08 (SE +/- 0.29, N = 3; MIN: 19.62)
  Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU = 27.22 (SE +/- 0.08, N = 3; MIN: 25.93)
  Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU = 10.07 (SE +/- 0.01, N = 3; MIN: 9.73)
  Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU = 13.88 (SE +/- 0.09, N = 14; MIN: 13.29)
  Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU = 10707.5 (SE +/- 33.23, N = 3; MIN: 10641.1)
  Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU = 5916.13 (SE +/- 7.60, N = 3; MIN: 5896.18)
  Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU = 10785.1 (SE +/- 12.33, N = 3; MIN: 10763)
  Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU = 5899.04 (SE +/- 10.29, N = 3; MIN: 5875.37)
  Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU = 6.34811 (SE +/- 0.00111, N = 3; MIN: 6.24)
  Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU = 10780.4 (SE +/- 22.39, N = 3; MIN: 10727.9)
  Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU = 5929.13 (SE +/- 8.66, N = 3; MIN: 5906.39)
  Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU = 6.59184 (SE +/- 0.00334, N = 3; MIN: 6.31)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
OpenSSL 1.1.1 (openssl2004Ph10), Signs Per Second, More Is Better
  RSA 4096-bit Performance = 1191.5 (SE +/- 2.38, N = 3)
  1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
perf-bench (perf-bench2004Ph10), More Is Better
  Benchmark: Epoll Wait = 61141 ops/sec (SE +/- 109.01, N = 3)
  Benchmark: Futex Hash = 3775107 ops/sec (SE +/- 1489.17, N = 3)
  Benchmark: Memcpy 1MB = 13.48 GB/sec (SE +/- 0.05, N = 3)
  Benchmark: Memset 1MB = 30.00 GB/sec (SE +/- 0.31, N = 3)
  Benchmark: Sched Pipe = 33765 ops/sec (SE +/- 285.95, N = 12)
  Benchmark: Futex Lock-Pi = 841 ops/sec (SE +/- 2.65, N = 3)
  Benchmark: Syscall Basic = 13007164 ops/sec (SE +/- 117757.00, N = 7)
  1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -Xlinker -lpthread -lrt -lm -ldl -lcrypto -lz -lnuma
RAMspeed SMP 3.5.0 (ramspeed2004PH1-), MB/s, More Is Better
  Type: Add - Benchmark: Integer = 20720.60 (SE +/- 9.28, N = 3)
  Type: Scale - Benchmark: Integer = 17114.71 (SE +/- 132.01, N = 3)
  Type: Average - Benchmark: Integer = 18296.39 (SE +/- 47.75, N = 3)
  Type: Add - Benchmark: Floating Point = 21039.85 (SE +/- 28.29, N = 3)
  Type: Scale - Benchmark: Floating Point = 16118.97 (SE +/- 49.73, N = 3)
  Type: Average - Benchmark: Floating Point = 18556.23 (SE +/- 26.66, N = 3)
  1. (CC) gcc options: -O3 -march=native
Schbench (schbench2004Ph10), usec, 99.9th Latency Percentile, Fewer Is Better
  Message Threads: 8 - Workers Per Message Thread: 16 = 104747 (SE +/- 682.67, N = 3)
  1. (CC) gcc options: -O2 -lpthread
SciMark 2.0 (scimark2004Ph10), Mflops, More Is Better
  Computational Test: Composite = 417.27 (SE +/- 2.85, N = 3)
  Computational Test: Monte Carlo = 100.33 (SE +/- 0.14, N = 3)
  Computational Test: Fast Fourier Transform = 135.04 (SE +/- 0.89, N = 3)
  Computational Test: Sparse Matrix Multiply = 451.72 (SE +/- 1.12, N = 3)
  Computational Test: Dense LU Matrix Factorization = 539.42 (SE +/- 28.52, N = 3)
  Computational Test: Jacobi Successive Over-Relaxation = 859.83 (SE +/- 15.70, N = 3)
  1. (CC) gcc options: -lm
Stress-NG 0.11.07 (stress-ng2004Ph10), Bogo Ops/s, More Is Better
  Test: MMAP = 40.45 (SE +/- 1.00, N = 15)
  Test: NUMA = 129.56 (SE +/- 0.92, N = 3)
  Test: MEMFD = 459.93 (SE +/- 2.39, N = 3)
  Test: Atomic = 234591.40 (SE +/- 100.00, N = 3)
  Test: Crypto = 1367.21 (SE +/- 1.13, N = 3)
  Test: Malloc = 34500114.19 (SE +/- 35888.73, N = 3)
  Test: Forking = 45987.97 (SE +/- 256.35, N = 3)
  Test: SENDFILE = 96321.97 (SE +/- 49.20, N = 3)
  Test: CPU Cache = 18.35 (SE +/- 0.19, N = 3)
  Test: CPU Stress = 2312.35 (SE +/- 0.45, N = 3)
  Test: Semaphores = 1396806.20 (SE +/- 647.20, N = 3)
  Test: Matrix Math = 28675.99 (SE +/- 52.11, N = 3)
  Test: Vector Math = 49739.06 (SE +/- 32.31, N = 3)
  Test: Memory Copying = 1037.12 (SE +/- 0.64, N = 3)
  Test: Socket Activity = 4594.77 (SE +/- 19.14, N = 3)
  Test: Context Switching = 2715180.95 (SE +/- 156049.06, N = 15)
  Test: Glibc C String Functions = 644776.31 (SE +/- 6688.33, N = 3)
  Test: Glibc Qsort Data Sorting = 94.75 (SE +/- 0.48, N = 3)
  Test: System V Message Passing = 8365788.25 (SE +/- 92479.51, N = 3)
  1. (CC) gcc options: -O2 -std=gnu99 -lm -laio -lcrypt -lrt -lz -ldl -lpthread -lc
Sysbench 1.0.20 (sysbench2004Ph10), More Is Better
  Test: RAM / Memory = 7131.37 MiB/sec (SE +/- 45.49, N = 3)
  Test: CPU = 12224.31 Events Per Second (SE +/- 42.84, N = 3)
  1. (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm
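The SE +/- figures are easier to compare across tests when expressed as a fraction of the reported result. The sketch below does this in Python for a handful of (result, SE) pairs copied from the results above. The helper name `relative_se` and the selection of tests are illustrative choices, not part of the Phoronix Test Suite output.

```python
# (reported result, standard error) pairs copied from this result file.
results = {
    "Algebraic Multi-Grid Benchmark": (178649867, 69024.21),
    "Hackbench 16 - Process": (69.55, 1.01),
    "SciMark Dense LU Matrix Factorization": (539.42, 28.52),
    "Stress-NG MMAP": (40.45, 1.00),
    "Stress-NG Context Switching": (2715180.95, 156049.06),
}


def relative_se(mean, se):
    """Standard error as a percentage of the reported result."""
    return 100.0 * se / mean


# Sort from noisiest to most stable.
ranked = sorted(results.items(), key=lambda kv: relative_se(*kv[1]), reverse=True)

for name, (mean, se) in ranked:
    print(f"{name}: {relative_se(mean, se):.2f}%")
```

Among these five entries, Stress-NG Context Switching has the largest relative standard error and the Algebraic Multi-Grid Benchmark the smallest.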
Phoronix Test Suite v10.8.4