ss2 AMD EPYC 3255 8-Core Temp testing with a congatec conga-B7E3 (5.13 BIOS) and MSI NVIDIA GeForce GTX 1050 2GB on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2107011-IB-SS205239152&sro&grs .
ss2: system details
  Processor: AMD EPYC 3255 8-Core Temp @ 2.50GHz (8 Cores / 16 Threads)
  Motherboard: congatec conga-B7E3 (5.13 BIOS)
  Chipset: AMD 17h
  Memory: 32GB
  Disk: 2000GB Samsung SSD 970 EVO 2TB + 2000GB Portable SSD T5
  Graphics: MSI NVIDIA GeForce GTX 1050 2GB
  Audio: NVIDIA GP107GL HD Audio
  Network: Intel I211 + Intel I210 + 2 x AMD Device 1458 + 2 x AMD Device 1459
  OS: Ubuntu 20.04
  Kernel: 5.4.0-77-generic (x86_64)
  Display Driver: nouveau
  OpenGL: 4.5 Mesa 20.2.6 (LLVM 11.0.0 256 bits)
  Compiler: GCC 9.3.0
  File-System: ext4
  Screen Resolution: 1920x1080
Run identifiers (all on the system above): sysbench2004Ph10, graphics-magick2004Ph10, ipc_benchmark2004Ph10, amg2004Ph10, ramspeed2004Ph10, npb2004Ph10, scimark2004Ph10, cachebench2004Ph10, onednn2004Ph10, onednn2-2004Ph10, apache2004Ph10, ctx_clock2004Ph10, hackbench2004Ph10, mbw2004Ph10, openssl2004Ph10, perf-bench2004Ph10, stress-ng2004Ph10
Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details - Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0x8001250
Security Details
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling
  srbds: Not affected
  tsx_async_abort: Not affected
ss2: results overview (grouped by run identifier)
sysbench2004Ph10
  sysbench: CPU: 11968.45
  sysbench: RAM / Memory: 6742.41
graphics-magick2004Ph10
  graphics-magick: HWB Color Space: 679
  graphics-magick: Noise-Gaussian: 154
  graphics-magick: Resizing: 588
  graphics-magick: Enhanced: 119
  graphics-magick: Sharpen: 86
  graphics-magick: Rotate: 467
  graphics-magick: Swirl: 303
ipc_benchmark2004Ph10
  ipc-benchmark: Unnamed Unix Domain Socket - 1024: 887809
  ipc-benchmark: Unnamed Unix Domain Socket - 128: 1203839
  ipc-benchmark: FIFO Named Pipe - 1024: 1434448
  ipc-benchmark: FIFO Named Pipe - 128: 1871487
  ipc-benchmark: Unnamed Pipe - 1024: 1535206
  ipc-benchmark: Unnamed Pipe - 128: 1905581
  ipc-benchmark: TCP Socket - 1024: 1240240
  ipc-benchmark: TCP Socket - 128: 1834833
amg2004Ph10
  amg: 100256967
ramspeed2004Ph10
  ramspeed: Average - Floating Point: 10508.78
  ramspeed: Scale - Floating Point: 9816.48
  ramspeed: Add - Floating Point: 11223.54
  ramspeed: Average - Integer: 10455.04
  ramspeed: Scale - Integer: 9839.16
  ramspeed: Add - Integer: 11233.26
npb2004Ph10
  npb: FT.C: 5785.55
  npb: EP.D: 379.35
  npb: EP.C: 381.58
scimark2004Ph10
  scimark2: Jacobi Successive Over-Relaxation: 846.84
  scimark2: Sparse Matrix Multiply: 435.10
  scimark2: Fast Fourier Transform: 127.68
  scimark2: Monte Carlo: 97.06
  scimark2: Composite: 388.38
  scimark2: Dense LU Matrix Factorization: 399.27
cachebench2004Ph10
  cachebench: Read / Modify / Write: 29344.249300
  cachebench: Write: 16912.989006
  cachebench: Read: 2080.920431
onednn2004Ph10
  onednn: IP Shapes 1D - u8s8f32 - CPU: 9.87869
  onednn: Deconvolution Batch shapes_1d - f32 - CPU: 14.5030
  onednn: Recurrent Neural Network Training - f32 - CPU: 12202.2
  onednn: Deconvolution Batch shapes_3d - f32 - CPU: 21.5017
  onednn: IP Shapes 3D - f32 - CPU: 18.4546
  onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU: 12245.1
  onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU: 10.7451
  onednn: IP Shapes 3D - u8s8f32 - CPU: 4.86169
  onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU: 44.1133
  onednn: Recurrent Neural Network Inference - f32 - CPU: 8534.43
  onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU: 13.7952
  onednn: Recurrent Neural Network Training - u8s8f32 - CPU: 12244.9
  onednn: IP Shapes 1D - f32 - CPU: 17.8303
  onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU: 10.2345
  onednn: Convolution Batch Shapes Auto - f32 - CPU: 38.2856
  onednn: Recurrent Neural Network Inference - u8s8f32 - CPU: 8530.48
onednn2-2004Ph10
  onednn: IP Shapes 1D - u8s8f32 - CPU: 9.78478
  onednn: Deconvolution Batch shapes_1d - f32 - CPU: 14.5557
  onednn: Recurrent Neural Network Training - f32 - CPU: 12229.2
  onednn: Deconvolution Batch shapes_3d - f32 - CPU: 21.5469
  onednn: IP Shapes 3D - f32 - CPU: 18.4170
  onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU: 12229.3
  onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU: 10.7323
  onednn: IP Shapes 3D - u8s8f32 - CPU: 4.85667
  onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU: 44.0765
  onednn: Recurrent Neural Network Inference - f32 - CPU: 8528.44
  onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU: 13.8031
  onednn: Recurrent Neural Network Training - u8s8f32 - CPU: 12238.1
  onednn: IP Shapes 1D - f32 - CPU: 17.8205
  onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU: 10.2389
  onednn: Convolution Batch Shapes Auto - f32 - CPU: 38.3004
  onednn: Recurrent Neural Network Inference - u8s8f32 - CPU: 8527.37
  onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU: 6.91035
  onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU: 8532.56
apache2004Ph10
  apache: Static Web Page Serving: 20070.57
ctx_clock2004Ph10
  ctx-clock: Context Switch Time: 182
hackbench2004Ph10
  hackbench: 16 - Process: 74.576
  hackbench: 16 - Thread: 80.393
mbw2004Ph10
  mbw: Memory Copy, Fixed Block Size - 1024 MiB: 4799.613
  mbw: Memory Copy - 1024 MiB: 8116.186
openssl2004Ph10
  openssl: RSA 4096-bit Performance: 1170.9
perf-bench2004Ph10
  perf-bench: Syscall Basic: 11202227
  perf-bench: Futex Lock-Pi: 808
  perf-bench: Sched Pipe: 35252
  perf-bench: Memset 1MB: 29.136207
  perf-bench: Memcpy 1MB: 13.090956
  perf-bench: Futex Hash: 3596834
  perf-bench: Epoll Wait: 59159
stress-ng2004Ph10
  stress-ng: System V Message Passing: 10328168.09
  stress-ng: Glibc Qsort Data Sorting: 93.21
  stress-ng: Glibc C String Functions: 646433.61
  stress-ng: Context Switching: 2343297.63
  stress-ng: Socket Activity: 3878.66
  stress-ng: Memory Copying: 604.26
  stress-ng: Vector Math: 48824.90
  stress-ng: Matrix Math: 28206.47
  stress-ng: Semaphores: 1358793.01
  stress-ng: CPU Stress: 1882.75
  stress-ng: CPU Cache: 14.65
  stress-ng: SENDFILE: 94315.69
  stress-ng: Forking: 40159.70
  stress-ng: Malloc: 28599372.98
  stress-ng: Crypto: 1338.10
  stress-ng: Atomic: 228226.92
  stress-ng: MEMFD: 312.28
  stress-ng: NUMA: 103.65
  stress-ng: MMAP: 91.28
oneDNN 2.1.2 (ms, Fewer Is Better)
Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU
  onednn2-2004Ph10: 9.78478 (SE +/- 0.01313, N = 3; MIN: 9.28)
  onednn2004Ph10: 9.87869 (SE +/- 0.03098, N = 3; MIN: 9.45)
Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
  onednn2-2004Ph10: 14.56 (SE +/- 0.03, N = 3; MIN: 13.09)
  onednn2004Ph10: 14.50 (SE +/- 0.01, N = 3; MIN: 13.16)
Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
  onednn2-2004Ph10: 12229.2 (SE +/- 34.23, N = 3; MIN: 12157)
  onednn2004Ph10: 12202.2 (SE +/- 17.24, N = 3; MIN: 12171)
Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
  onednn2-2004Ph10: 21.55 (SE +/- 0.06, N = 3; MIN: 19.95)
  onednn2004Ph10: 21.50 (SE +/- 0.14, N = 3; MIN: 19.89)
Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
  onednn2-2004Ph10: 18.42 (SE +/- 0.01, N = 3; MIN: 18.09)
  onednn2004Ph10: 18.45 (SE +/- 0.01, N = 3; MIN: 18.14)
Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
  onednn2-2004Ph10: 12229.3 (SE +/- 19.75, N = 3; MIN: 12179)
  onednn2004Ph10: 12245.1 (SE +/- 5.40, N = 3; MIN: 12224.4)
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
  onednn2-2004Ph10: 10.73 (SE +/- 0.01, N = 3; MIN: 10.36)
  onednn2004Ph10: 10.75 (SE +/- 0.00, N = 3; MIN: 10.37)
Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU
  onednn2-2004Ph10: 4.85667 (SE +/- 0.00513, N = 3; MIN: 4.6)
  onednn2004Ph10: 4.86169 (SE +/- 0.00363, N = 3; MIN: 4.59)
Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
  onednn2-2004Ph10: 44.08 (SE +/- 0.01, N = 3; MIN: 43.61)
  onednn2004Ph10: 44.11 (SE +/- 0.02, N = 3; MIN: 43.63)
Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
  onednn2-2004Ph10: 8528.44 (SE +/- 15.42, N = 3; MIN: 8487.98)
  onednn2004Ph10: 8534.43 (SE +/- 10.03, N = 3; MIN: 8497.26)
Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU
  onednn2-2004Ph10: 13.80 (SE +/- 0.01, N = 3; MIN: 13.39)
  onednn2004Ph10: 13.80 (SE +/- 0.01, N = 3; MIN: 13.39)
Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU
  onednn2-2004Ph10: 12238.1 (SE +/- 3.61, N = 3; MIN: 12215.6)
  onednn2004Ph10: 12244.9 (SE +/- 16.36, N = 3; MIN: 12194.5)
Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU
  onednn2-2004Ph10: 17.82 (SE +/- 0.05, N = 3; MIN: 16.86)
  onednn2004Ph10: 17.83 (SE +/- 0.08, N = 3; MIN: 16.86)
Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
  onednn2-2004Ph10: 10.24 (SE +/- 0.01, N = 3; MIN: 9.62)
  onednn2004Ph10: 10.23 (SE +/- 0.00, N = 3; MIN: 9.59)
Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
  onednn2-2004Ph10: 38.30 (SE +/- 0.00, N = 3; MIN: 37.58)
  onednn2004Ph10: 38.29 (SE +/- 0.01, N = 3; MIN: 37.55)
Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
  onednn2-2004Ph10: 8527.37 (SE +/- 6.76, N = 3; MIN: 8492.67)
  onednn2004Ph10: 8530.48 (SE +/- 8.98, N = 3; MIN: 8496.86)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
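The two oneDNN runs above differ only slightly, so it can help to put a difference next to its reported standard errors. The following is a minimal Python sketch, not part of the Phoronix Test Suite output. It uses the IP Shapes 1D u8s8f32 figures and assumes that the first "SE +/-" value in the report belongs to the first-listed run (onednn2-2004Ph10). The helper name `compare` is hypothetical, and with N = 3 the result is only a rough indication.

```python
import math

# Mean (ms) and standard error per run, copied from the oneDNN
# "IP Shapes 1D - u8s8f32" result. Pairing each SE with its run is an
# assumption based on the order in which the report lists them.
runs = {
    "onednn2004Ph10": (9.87869, 0.03098),
    "onednn2-2004Ph10": (9.78478, 0.01313),
}

def compare(a, b):
    """Return (percent change of b relative to a, change in combined-SE units)."""
    mean_a, se_a = a
    mean_b, se_b = b
    diff = mean_b - mean_a
    # Standard error of a difference of two independent means.
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return 100.0 * diff / mean_a, diff / combined_se

pct, z = compare(runs["onednn2004Ph10"], runs["onednn2-2004Ph10"])
print(f"{pct:+.2f}% ({z:+.1f} combined standard errors)")
```

Since these results are "Fewer Is Better", a negative percentage means the second run was faster. The same function can be applied to any other harness by swapping in its mean and SE values.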
Stress-NG 0.11.07, by test (stress-ng2004Ph10; Bogo Ops/s, More Is Better)
  System V Message Passing: 10328168.09 (SE +/- 158615.33, N = 12)
  Glibc Qsort Data Sorting: 93.21 (SE +/- 0.23, N = 3)
  Glibc C String Functions: 646433.61 (SE +/- 4652.30, N = 3)
  Context Switching: 2343297.63 (SE +/- 28452.37, N = 3)
  Socket Activity: 3878.66 (SE +/- 11.92, N = 3)
  Memory Copying: 604.26 (SE +/- 0.38, N = 3)
  Vector Math: 48824.90 (SE +/- 41.81, N = 3)
  Matrix Math: 28206.47 (SE +/- 29.01, N = 3)
  Semaphores: 1358793.01 (SE +/- 831.38, N = 3)
  CPU Stress: 1882.75 (SE +/- 2.97, N = 3)
  CPU Cache: 14.65 (SE +/- 0.13, N = 15)
  SENDFILE: 94315.69 (SE +/- 91.24, N = 3)
  Forking: 40159.70 (SE +/- 221.43, N = 3)
  Malloc: 28599372.98 (SE +/- 26135.85, N = 3)
  Crypto: 1338.10 (SE +/- 3.44, N = 3)
  Atomic: 228226.92 (SE +/- 206.55, N = 3)
  MEMFD: 312.28 (SE +/- 0.78, N = 3)
  NUMA: 103.65 (SE +/- 0.97, N = 15)
1. (CC) gcc options: -O2 -std=gnu99 -lm -laio -lcrypt -lrt -lz -ldl -lpthread -lc
perf-bench, by benchmark (perf-bench2004Ph10; More Is Better)
  Syscall Basic: 11202227 ops/sec (SE +/- 107215.32, N = 3)
  Futex Lock-Pi: 808 ops/sec (SE +/- 1.73, N = 3)
  Sched Pipe: 35252 ops/sec (SE +/- 374.38, N = 5)
  Memset 1MB: 29.14 GB/sec (SE +/- 0.11, N = 3)
  Memcpy 1MB: 13.09 GB/sec (SE +/- 0.04, N = 3)
  Futex Hash: 3596834 ops/sec (SE +/- 10145.28, N = 3)
  Epoll Wait: 59159 ops/sec (SE +/- 160.94, N = 3)
1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -Xlinker -lpthread -lrt -lm -ldl -lcrypto -lz -lnuma
OpenSSL 1.1.1 (openssl2004Ph10; Signs Per Second, More Is Better)
  RSA 4096-bit Performance: 1170.9 (SE +/- 2.40, N = 3)
1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
MBW 2018-09-08, by test (mbw2004Ph10; MiB/s, More Is Better)
  Memory Copy, Fixed Block Size - Array Size: 1024 MiB: 4799.61 (SE +/- 2.54, N = 3)
  Memory Copy - Array Size: 1024 MiB: 8116.19 (SE +/- 6.56, N = 3)
1. (CC) gcc options: -O3 -march=native
Hackbench (hackbench2004Ph10; Seconds, Fewer Is Better)
  Count: 16 - Type: Process: 74.58 (SE +/- 0.70, N = 3)
  Count: 16 - Type: Thread: 80.39 (SE +/- 1.07, N = 3)
1. (CC) gcc options: -lpthread
Apache Benchmark 2.4.29 (apache2004Ph10; Requests Per Second, More Is Better)
  Static Web Page Serving: 20070.57 (SE +/- 34.80, N = 3)
1. (CC) gcc options: -shared -fPIC -O2 -pthread
oneDNN 2.1.2, harnesses run only as onednn2-2004Ph10 (ms, Fewer Is Better)
  Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU: 6.91035 (SE +/- 0.00437, N = 3; MIN: 6.42)
  Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU: 8532.56 (SE +/- 4.17, N = 3; MIN: 8501.91)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
CacheBench, by test (cachebench2004Ph10; MB/s, More Is Better)
  Read / Modify / Write: 29344.25 (SE +/- 109.87, N = 3; MIN: 25528.91 / MAX: 32707.41)
  Write: 16912.99 (SE +/- 178.90, N = 5; MIN: 12147.97 / MAX: 19456.5)
  Read: 2080.92 (SE +/- 3.24, N = 3; MIN: 2073.92 / MAX: 2084.44)
1. (CC) gcc options: -lrt
SciMark 2.0, by computational test (scimark2004Ph10; Mflops, More Is Better)
  Jacobi Successive Over-Relaxation: 846.84 (SE +/- 0.16, N = 3)
  Sparse Matrix Multiply: 435.10 (SE +/- 0.30, N = 3)
  Fast Fourier Transform: 127.68 (SE +/- 0.41, N = 3)
  Monte Carlo: 97.06 (SE +/- 0.17, N = 3)
  Composite: 388.38 (SE +/- 4.40, N = 15)
1. (CC) gcc options: -lm
NAS Parallel Benchmarks 3.4, by test / class (npb2004Ph10; Total Mop/s, More Is Better)
  FT.C: 5785.55 (SE +/- 11.02, N = 3)
  EP.D: 379.35 (SE +/- 4.09, N = 4)
  EP.C: 381.58 (SE +/- 0.85, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 4.0.3
RAMspeed SMP 3.5.0 (ramspeed2004Ph10; MB/s, More Is Better)
  Type: Average - Benchmark: Floating Point: 10508.78 (SE +/- 6.82, N = 3)
  Type: Scale - Benchmark: Floating Point: 9816.48 (SE +/- 11.87, N = 3)
  Type: Add - Benchmark: Floating Point: 11223.54 (SE +/- 8.00, N = 3)
  Type: Average - Benchmark: Integer: 10455.04 (SE +/- 14.90, N = 3)
  Type: Scale - Benchmark: Integer: 9839.16 (SE +/- 8.14, N = 3)
  Type: Add - Benchmark: Integer: 11233.26 (SE +/- 10.55, N = 3)
1. (CC) gcc options: -O3 -march=native
Algebraic Multi-Grid Benchmark 1.2 (amg2004Ph10; Figure Of Merit, More Is Better)
  100256967 (SE +/- 28448.34, N = 3)
1. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -pthread -lmpi
IPC_benchmark (ipc_benchmark2004Ph10; Messages Per Second, More Is Better)
  Type: Unnamed Unix Domain Socket - Message Bytes: 1024: 887809 (SE +/- 5635.32, N = 3)
  Type: Unnamed Unix Domain Socket - Message Bytes: 128: 1203839 (SE +/- 13076.65, N = 3)
  Type: FIFO Named Pipe - Message Bytes: 1024: 1434448 (SE +/- 10432.31, N = 3)
  Type: FIFO Named Pipe - Message Bytes: 128: 1871487 (SE +/- 22048.60, N = 3)
  Type: Unnamed Pipe - Message Bytes: 1024: 1535206 (SE +/- 15158.00, N = 3)
  Type: Unnamed Pipe - Message Bytes: 128: 1905581 (SE +/- 13255.52, N = 3)
  Type: TCP Socket - Message Bytes: 1024: 1240240 (SE +/- 1928.65, N = 3)
  Type: TCP Socket - Message Bytes: 128: 1834833 (SE +/- 6651.88, N = 3)
GraphicsMagick 1.3.33, by operation (graphics-magick2004Ph10; Iterations Per Minute, More Is Better)
  HWB Color Space: 679 (SE +/- 1.33, N = 3)
  Noise-Gaussian: 154
  Resizing: 588 (SE +/- 0.88, N = 3)
  Enhanced: 119 (SE +/- 0.33, N = 3)
  Sharpen: 86
  Rotate: 467 (SE +/- 0.67, N = 3)
  Swirl: 303 (SE +/- 1.00, N = 3)
1. (CC) gcc options: -fopenmp -O2 -pthread -ljpeg -lz -lm -lpthread
Sysbench 1.0.20, by test (sysbench2004Ph10; More Is Better)
  CPU: 11968.45 Events Per Second (SE +/- 43.11, N = 3)
  RAM / Memory: 6742.41 MiB/sec (SE +/- 6.11, N = 3)
1. (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm
Stress-NG 0.11.07 (stress-ng2004Ph10; Bogo Ops/s, More Is Better)
  MMAP: 91.28 (SE +/- 2.24, N = 12)
1. (CC) gcc options: -O2 -std=gnu99 -lm -laio -lcrypt -lrt -lz -ldl -lpthread -lc
ctx_clock (ctx_clock2004Ph10; Clocks, Fewer Is Better)
  Context Switch Time: 182 (SE +/- 2.95, N = 15)
SciMark 2.0 (scimark2004Ph10; Mflops, More Is Better)
  Dense LU Matrix Factorization: 399.27 (SE +/- 45.04, N = 3)
1. (CC) gcc options: -lm
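Some results in this report carry a much larger standard error relative to their mean than others. The Dense LU result above is one example. A minimal Python sketch for spotting such cases follows. The means and SE values are copied from this report, while the 2% threshold and the helper names `relative_se` and `noisy` are hypothetical choices for illustration, not something the Phoronix Test Suite defines.

```python
# (label, mean, standard error) copied from results in this report.
results = [
    ("SciMark Dense LU Matrix Factorization", 399.27, 45.04),
    ("Stress-NG MMAP", 91.28, 2.24),
    ("ctx_clock Context Switch Time", 182, 2.95),
    ("Sysbench CPU", 11968.45, 43.11),
]

def relative_se(mean, se):
    """Standard error expressed as a percentage of the mean."""
    return 100.0 * se / mean

def noisy(rows, threshold_pct=2.0):
    """Return (label, relative SE %) for rows above the threshold.

    The 2% default is an arbitrary illustrative cut-off.
    """
    flagged = []
    for label, mean, se in rows:
        rel = relative_se(mean, se)
        if rel > threshold_pct:
            flagged.append((label, round(rel, 2)))
    return flagged

for label, rel in noisy(results):
    print(f"{label}: SE is {rel}% of the mean")
```

Results flagged this way may be worth re-running with more samples before drawing conclusions from them.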
Phoronix Test Suite v10.8.5