ss AMD EPYC 3255 8-Core Temp testing with a congatec conga-B7E3 (5.13 BIOS) and MSI NVIDIA GeForce GTX 1050 2GB on Ubuntu 16.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2106294-IB-SS497779426&grt&sro&export=pdf .
ss Processor Motherboard Chipset Memory Disk Graphics Audio Network OS Kernel Display Server Display Driver OpenGL Compiler File-System Screen Resolution sysbench1604Ph10 graphics-magick1604Ph10 ipc-benchmarking1604Ph10 ipc-benchmarking1024-1604Ph10 amg1604Ph10 tensorflow1604PH10 ramspeed1604Ph10 npb1604Ph10 scimark1604Ph10 cachebench1604Ph10 onednn1604Ph10 apache-ctx_clock1604Ph10 AMD EPYC 3255 8-Core Temp @ 2.50GHz (8 Cores / 16 Threads) congatec conga-B7E3 (5.13 BIOS) AMD 17h 32GB 2000GB Samsung SSD 970 EVO 2TB + 2000GB Portable SSD T5 MSI NVIDIA GeForce GTX 1050 2GB NVIDIA GP107GL HD Audio Intel I211 + Intel I210 + 2 x AMD Device 1458 + 2 x AMD Device 1459 Ubuntu 16.04 4.15.0-123-generic (x86_64) X Server NVIDIA 1.4 (2.1 Mesa 10.5.4) GCC 5.5.0 20171010 ext4 800x600 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - sysbench1604Ph10, graphics-magick1604Ph10, ipc-benchmarking1604Ph10, ipc-benchmarking1024-1604Ph10, amg1604Ph10, ramspeed1604Ph10, npb1604Ph10, scimark1604Ph10, cachebench1604Ph10, onednn1604Ph10, apache-ctx_clock1604Ph10: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0x8001250 Security Details - itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + srbds: Not affected + tsx_async_abort: Not affected Python Details - tensorflow1604PH10: sh: 1: /opt/TensorRT/python: Permission denied + Python 3.5.2
ss amg: apache: Static Web Page Serving cachebench: Read cachebench: Write cachebench: Read / Modify / Write graphics-magick: Swirl graphics-magick: Rotate graphics-magick: Sharpen graphics-magick: Enhanced graphics-magick: Resizing graphics-magick: Noise-Gaussian graphics-magick: HWB Color Space ipc-benchmark: TCP Socket - 128 ipc-benchmark: Unnamed Pipe - 128 ipc-benchmark: FIFO Named Pipe - 128 ipc-benchmark: Unnamed Unix Domain Socket - 128 ipc-benchmark: TCP Socket - 1024 ipc-benchmark: Unnamed Pipe - 1024 ipc-benchmark: FIFO Named Pipe - 1024 ipc-benchmark: Unnamed Unix Domain Socket - 1024 npb: EP.C npb: EP.D onednn: IP Shapes 1D - f32 - CPU onednn: IP Shapes 3D - f32 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU onednn: IP Shapes 3D - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: Deconvolution Batch shapes_1d - f32 - CPU onednn: Deconvolution Batch shapes_3d - f32 - CPU onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU onednn: Recurrent Neural Network Training - f32 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU onednn: Recurrent Neural Network Training - u8s8f32 - CPU onednn: Recurrent Neural Network Inference - u8s8f32 - CPU onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU ramspeed: Add - Integer ramspeed: Scale - Integer ramspeed: Average - Integer ramspeed: Add - Floating Point ramspeed: Scale - Floating Point ramspeed: Average - Floating Point scimark2: Composite scimark2: Monte Carlo scimark2: Fast Fourier Transform scimark2: Sparse Matrix Multiply scimark2: Dense LU Matrix Factorization scimark2: Jacobi Successive Over-Relaxation sysbench: CPU sysbench1604Ph10 graphics-magick1604Ph10 ipc-benchmarking1604Ph10 ipc-benchmarking1024-1604Ph10 amg1604Ph10 tensorflow1604PH10 ramspeed1604Ph10 npb1604Ph10 scimark1604Ph10 cachebench1604Ph10 onednn1604Ph10 apache-ctx_clock1604Ph10 12018.18 287 458 82 120 579 120 711 1936803 1926153 1811477 1002040 1476380 1647029 1590154 1498550 97848307 11115.05 9654.84 10366.89 11122.64 9818.92 10488.67 221.32 217.22 382.91 101.47 167.16 500.28 314.80 830.83 2086.961474 10865.062305 21639.888567 17.7590 18.5927 9.77902 4.86345 38.3076 14.6100 20.9738 43.9371 10.2525 13.8261 12254.6 8579.99 12314.2 8577.57 10.8067 12262.6 8572.49 6.41798 19458.70 OpenBenchmarking.org
Algebraic Multi-Grid Benchmark OpenBenchmarking.org Figure Of Merit, More Is Better Algebraic Multi-Grid Benchmark 1.2 amg1604Ph10 20M 40M 60M 80M 100M SE +/- 47423.33, N = 3 97848307 1. (CC) gcc-7 options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -pthread -lmpi
Apache Benchmark Static Web Page Serving OpenBenchmarking.org Requests Per Second, More Is Better Apache Benchmark 2.4.29 Static Web Page Serving apache-ctx_clock1604Ph10 4K 8K 12K 16K 20K SE +/- 173.96, N = 3 19458.70 1. (CC) gcc-7 options: -shared -fPIC -O2 -pthread
CacheBench Test: Read OpenBenchmarking.org MB/s, More Is Better CacheBench Test: Read cachebench1604Ph10 400 800 1200 1600 2000 SE +/- 4.55, N = 3 2086.96 MIN: 2078.1 / MAX: 2093.84 1. (CC) gcc-7 options: -lrt
CacheBench Test: Write OpenBenchmarking.org MB/s, More Is Better CacheBench Test: Write cachebench1604Ph10 2K 4K 6K 8K 10K SE +/- 20.48, N = 3 10865.06 MIN: 9466.53 / MAX: 11482.88 1. (CC) gcc-7 options: -lrt
CacheBench Test: Read / Modify / Write OpenBenchmarking.org MB/s, More Is Better CacheBench Test: Read / Modify / Write cachebench1604Ph10 5K 10K 15K 20K 25K SE +/- 39.55, N = 3 21639.89 MIN: 18656.61 / MAX: 22903.78 1. (CC) gcc-7 options: -lrt
GraphicsMagick Operation: Swirl OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.33 Operation: Swirl graphics-magick1604Ph10 60 120 180 240 300 SE +/- 0.88, N = 3 287 1. (CC) gcc-7 options: -fopenmp -O2 -pthread -ljbig -ltiff -ljasper -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick Operation: Rotate OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.33 Operation: Rotate graphics-magick1604Ph10 100 200 300 400 500 SE +/- 1.86, N = 3 458 1. (CC) gcc-7 options: -fopenmp -O2 -pthread -ljbig -ltiff -ljasper -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick Operation: Sharpen OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.33 Operation: Sharpen graphics-magick1604Ph10 20 40 60 80 100 82 1. (CC) gcc-7 options: -fopenmp -O2 -pthread -ljbig -ltiff -ljasper -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick Operation: Enhanced OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.33 Operation: Enhanced graphics-magick1604Ph10 30 60 90 120 150 SE +/- 0.33, N = 3 120 1. (CC) gcc-7 options: -fopenmp -O2 -pthread -ljbig -ltiff -ljasper -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick Operation: Resizing OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.33 Operation: Resizing graphics-magick1604Ph10 130 260 390 520 650 SE +/- 0.33, N = 3 579 1. (CC) gcc-7 options: -fopenmp -O2 -pthread -ljbig -ltiff -ljasper -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick Operation: Noise-Gaussian OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.33 Operation: Noise-Gaussian graphics-magick1604Ph10 30 60 90 120 150 120 1. (CC) gcc-7 options: -fopenmp -O2 -pthread -ljbig -ltiff -ljasper -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick Operation: HWB Color Space OpenBenchmarking.org Iterations Per Minute, More Is Better GraphicsMagick 1.3.33 Operation: HWB Color Space graphics-magick1604Ph10 150 300 450 600 750 SE +/- 0.67, N = 3 711 1. (CC) gcc-7 options: -fopenmp -O2 -pthread -ljbig -ltiff -ljasper -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
IPC_benchmark Type: TCP Socket - Message Bytes: 128 OpenBenchmarking.org Messages Per Second, More Is Better IPC_benchmark Type: TCP Socket - Message Bytes: 128 ipc-benchmarking1604Ph10 400K 800K 1200K 1600K 2000K SE +/- 4379.21, N = 3 1936803
IPC_benchmark Type: Unnamed Pipe - Message Bytes: 128 OpenBenchmarking.org Messages Per Second, More Is Better IPC_benchmark Type: Unnamed Pipe - Message Bytes: 128 ipc-benchmarking1604Ph10 400K 800K 1200K 1600K 2000K SE +/- 7932.21, N = 3 1926153
IPC_benchmark Type: FIFO Named Pipe - Message Bytes: 128 OpenBenchmarking.org Messages Per Second, More Is Better IPC_benchmark Type: FIFO Named Pipe - Message Bytes: 128 ipc-benchmarking1604Ph10 400K 800K 1200K 1600K 2000K SE +/- 19705.06, N = 3 1811477
IPC_benchmark Type: Unnamed Unix Domain Socket - Message Bytes: 128 OpenBenchmarking.org Messages Per Second, More Is Better IPC_benchmark Type: Unnamed Unix Domain Socket - Message Bytes: 128 ipc-benchmarking1604Ph10 200K 400K 600K 800K 1000K SE +/- 43657.25, N = 15 1002040
IPC_benchmark Type: TCP Socket - Message Bytes: 1024 OpenBenchmarking.org Messages Per Second, More Is Better IPC_benchmark Type: TCP Socket - Message Bytes: 1024 ipc-benchmarking1024-1604Ph10 300K 600K 900K 1200K 1500K SE +/- 5988.24, N = 3 1476380
IPC_benchmark Type: Unnamed Pipe - Message Bytes: 1024 OpenBenchmarking.org Messages Per Second, More Is Better IPC_benchmark Type: Unnamed Pipe - Message Bytes: 1024 ipc-benchmarking1024-1604Ph10 400K 800K 1200K 1600K 2000K SE +/- 13168.11, N = 3 1647029
IPC_benchmark Type: FIFO Named Pipe - Message Bytes: 1024 OpenBenchmarking.org Messages Per Second, More Is Better IPC_benchmark Type: FIFO Named Pipe - Message Bytes: 1024 ipc-benchmarking1024-1604Ph10 300K 600K 900K 1200K 1500K SE +/- 9737.22, N = 3 1590154
IPC_benchmark Type: Unnamed Unix Domain Socket - Message Bytes: 1024 OpenBenchmarking.org Messages Per Second, More Is Better IPC_benchmark Type: Unnamed Unix Domain Socket - Message Bytes: 1024 ipc-benchmarking1024-1604Ph10 300K 600K 900K 1200K 1500K SE +/- 7449.83, N = 3 1498550
NAS Parallel Benchmarks Test / Class: EP.C OpenBenchmarking.org Total Mop/s, More Is Better NAS Parallel Benchmarks 3.4 Test / Class: EP.C npb1604Ph10 50 100 150 200 250 SE +/- 0.38, N = 3 221.32 1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi 2. Open MPI 1.10.2
NAS Parallel Benchmarks Test / Class: EP.D OpenBenchmarking.org Total Mop/s, More Is Better NAS Parallel Benchmarks 3.4 Test / Class: EP.D npb1604Ph10 50 100 150 200 250 SE +/- 0.55, N = 3 217.22 1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi 2. Open MPI 1.10.2
oneDNN Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU onednn1604Ph10 4 8 12 16 20 SE +/- 0.11, N = 3 17.76 MIN: 16.64 1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU onednn1604Ph10 5 10 15 20 25 SE +/- 0.02, N = 3 18.59 MIN: 18.25 1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU onednn1604Ph10 3 6 9 12 15 SE +/- 0.00477, N = 3 9.77902 MIN: 9.18 1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU onednn1604Ph10 1.0943 2.1886 3.2829 4.3772 5.4715 SE +/- 0.00067, N = 3 4.86345 MIN: 4.61 1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU onednn1604Ph10 9 18 27 36 45 SE +/- 0.00, N = 3 38.31 MIN: 37.4 1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU onednn1604Ph10 4 8 12 16 20 SE +/- 0.03, N = 3 14.61 MIN: 13.03 1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU onednn1604Ph10 5 10 15 20 25 SE +/- 0.06, N = 3 20.97 MIN: 19.86 1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU onednn1604Ph10 10 20 30 40 50 SE +/- 0.02, N = 3 43.94 MIN: 43.44 1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU onednn1604Ph10 3 6 9 12 15 SE +/- 0.01, N = 3 10.25 MIN: 9.57 1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU onednn1604Ph10 4 8 12 16 20 SE +/- 0.06, N = 3 13.83 MIN: 13.37 1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU onednn1604Ph10 3K 6K 9K 12K 15K SE +/- 38.73, N = 3 12254.6 MIN: 12140.4 1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU onednn1604Ph10 2K 4K 6K 8K 10K SE +/- 12.76, N = 3 8579.99 MIN: 8534.51 1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU onednn1604Ph10 3K 6K 9K 12K 15K SE +/- 27.82, N = 3 12314.2 MIN: 12240.5 1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU onednn1604Ph10 2K 4K 6K 8K 10K SE +/- 13.01, N = 3 8577.57 MIN: 8526.62 1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU onednn1604Ph10 3 6 9 12 15 SE +/- 0.06, N = 3 10.81 MIN: 10.37 1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU onednn1604Ph10 3K 6K 9K 12K 15K SE +/- 27.97, N = 3 12262.6 MIN: 12196.2 1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU onednn1604Ph10 2K 4K 6K 8K 10K SE +/- 23.27, N = 3 8572.49 MIN: 8526.54 1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.1.2 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU onednn1604Ph10 2 4 6 8 10 SE +/- 0.00458, N = 3 6.41798 MIN: 6.09 1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
RAMspeed SMP Type: Add - Benchmark: Integer OpenBenchmarking.org MB/s, More Is Better RAMspeed SMP 3.5.0 Type: Add - Benchmark: Integer ramspeed1604Ph10 2K 4K 6K 8K 10K SE +/- 7.81, N = 3 11115.05 1. (CC) gcc-7 options: -O3 -march=native
RAMspeed SMP Type: Scale - Benchmark: Integer OpenBenchmarking.org MB/s, More Is Better RAMspeed SMP 3.5.0 Type: Scale - Benchmark: Integer ramspeed1604Ph10 2K 4K 6K 8K 10K SE +/- 35.92, N = 3 9654.84 1. (CC) gcc-7 options: -O3 -march=native
RAMspeed SMP Type: Average - Benchmark: Integer OpenBenchmarking.org MB/s, More Is Better RAMspeed SMP 3.5.0 Type: Average - Benchmark: Integer ramspeed1604Ph10 2K 4K 6K 8K 10K SE +/- 18.60, N = 3 10366.89 1. (CC) gcc-7 options: -O3 -march=native
RAMspeed SMP Type: Add - Benchmark: Floating Point OpenBenchmarking.org MB/s, More Is Better RAMspeed SMP 3.5.0 Type: Add - Benchmark: Floating Point ramspeed1604Ph10 2K 4K 6K 8K 10K SE +/- 8.14, N = 3 11122.64 1. (CC) gcc-7 options: -O3 -march=native
RAMspeed SMP Type: Scale - Benchmark: Floating Point OpenBenchmarking.org MB/s, More Is Better RAMspeed SMP 3.5.0 Type: Scale - Benchmark: Floating Point ramspeed1604Ph10 2K 4K 6K 8K 10K SE +/- 5.81, N = 3 9818.92 1. (CC) gcc-7 options: -O3 -march=native
RAMspeed SMP Type: Average - Benchmark: Floating Point OpenBenchmarking.org MB/s, More Is Better RAMspeed SMP 3.5.0 Type: Average - Benchmark: Floating Point ramspeed1604Ph10 2K 4K 6K 8K 10K SE +/- 5.49, N = 3 10488.67 1. (CC) gcc-7 options: -O3 -march=native
SciMark Computational Test: Composite OpenBenchmarking.org Mflops, More Is Better SciMark 2.0 Computational Test: Composite scimark1604Ph10 80 160 240 320 400 SE +/- 3.97, N = 3 382.91 1. (CC) gcc-7 options: -lm
SciMark Computational Test: Monte Carlo OpenBenchmarking.org Mflops, More Is Better SciMark 2.0 Computational Test: Monte Carlo scimark1604Ph10 20 40 60 80 100 SE +/- 0.31, N = 3 101.47 1. (CC) gcc-7 options: -lm
SciMark Computational Test: Fast Fourier Transform OpenBenchmarking.org Mflops, More Is Better SciMark 2.0 Computational Test: Fast Fourier Transform scimark1604Ph10 40 80 120 160 200 SE +/- 0.58, N = 3 167.16 1. (CC) gcc-7 options: -lm
SciMark Computational Test: Sparse Matrix Multiply OpenBenchmarking.org Mflops, More Is Better SciMark 2.0 Computational Test: Sparse Matrix Multiply scimark1604Ph10 110 220 330 440 550 SE +/- 1.11, N = 3 500.28 1. (CC) gcc-7 options: -lm
SciMark Computational Test: Dense LU Matrix Factorization OpenBenchmarking.org Mflops, More Is Better SciMark 2.0 Computational Test: Dense LU Matrix Factorization scimark1604Ph10 70 140 210 280 350 SE +/- 1.26, N = 3 314.80 1. (CC) gcc-7 options: -lm
SciMark Computational Test: Jacobi Successive Over-Relaxation OpenBenchmarking.org Mflops, More Is Better SciMark 2.0 Computational Test: Jacobi Successive Over-Relaxation scimark1604Ph10 200 400 600 800 1000 SE +/- 22.88, N = 3 830.83 1. (CC) gcc-7 options: -lm
Sysbench Test: CPU OpenBenchmarking.org Events Per Second, More Is Better Sysbench 1.0.20 Test: CPU sysbench1604Ph10 3K 6K 9K 12K 15K SE +/- 54.78, N = 3 12018.18 1. (CC) gcc-7 options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm
Phoronix Test Suite v10.8.5