ss

AMD EPYC 3255 8-Core Temp testing with a congatec conga-B7E3 (5.13 BIOS) and MSI NVIDIA GeForce GTX 1050 2GB on Ubuntu 16.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2106292-IB-SS261289426.

ss: system details

Result identifiers: sysbench1604Ph10, graphics-magick1604Ph10, ipc-benchmarking1604Ph10, ipc-benchmarking1024-1604Ph10, amg1604Ph10, tensorflow1604PH10, ramspeed1604Ph10, npb1604Ph10, scimark1604Ph10, cachebench1604Ph10, onednn1604Ph10, apache-ctx_clock1604Ph10, ctx-clock1604Ph10, hackbenchAll1604Ph10, hackbench1604Ph10, mbw1604Ph10, openssl1604Ph10, perf-bench1604Ph10, schbench8-16-1604Ph10

Processor: AMD EPYC 3255 8-Core Temp @ 2.50GHz (8 Cores / 16 Threads)
Motherboard: congatec conga-B7E3 (5.13 BIOS)
Chipset: AMD 17h
Memory: 32GB
Disk: 2000GB Samsung SSD 970 EVO 2TB + 2000GB Portable SSD T5
Graphics: MSI NVIDIA GeForce GTX 1050 2GB
Audio: NVIDIA GP107GL HD Audio
Network: Intel I211 + Intel I210 + 2 x AMD Device 1458 + 2 x AMD Device 1459
OS: Ubuntu 16.04
Kernel: 4.15.0-123-generic (x86_64)
Display Server: X Server
Display Driver: NVIDIA
OpenGL: 1.4 (2.1 Mesa 10.5.4)
Compiler: GCC 5.5.0 20171010
File-System: ext4
Screen Resolution: 800x600

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- sysbench1604Ph10, graphics-magick1604Ph10, ipc-benchmarking1604Ph10, ipc-benchmarking1024-1604Ph10, amg1604Ph10, ramspeed1604Ph10, npb1604Ph10, scimark1604Ph10, cachebench1604Ph10, onednn1604Ph10, apache-ctx_clock1604Ph10, ctx-clock1604Ph10, hackbenchAll1604Ph10, hackbench1604Ph10, mbw1604Ph10, openssl1604Ph10, perf-bench1604Ph10, schbench8-16-1604Ph10: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled)
- CPU Microcode: 0x8001250

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected

Python Details
- tensorflow1604PH10: sh: 1: /opt/TensorRT/python: Permission denied + Python 3.5.2

ss: result overview

Test | Result
sysbench: CPU | 12018.18
graphics-magick: Swirl | 287
graphics-magick: Rotate | 458
graphics-magick: Sharpen | 82
graphics-magick: Enhanced | 120
graphics-magick: Resizing | 579
graphics-magick: Noise-Gaussian | 120
graphics-magick: HWB Color Space | 711
ipc-benchmark: TCP Socket - 128 | 1936803
ipc-benchmark: Unnamed Pipe - 128 | 1926153
ipc-benchmark: FIFO Named Pipe - 128 | 1811477
ipc-benchmark: Unnamed Unix Domain Socket - 128 | 1002040
ipc-benchmark: TCP Socket - 1024 | 1476380
ipc-benchmark: Unnamed Pipe - 1024 | 1647029
ipc-benchmark: FIFO Named Pipe - 1024 | 1590154
ipc-benchmark: Unnamed Unix Domain Socket - 1024 | 1498550
amg | 97848307
ramspeed: Add - Integer | 11115.05
ramspeed: Scale - Integer | 9654.84
ramspeed: Average - Integer | 10366.89
ramspeed: Add - Floating Point | 11122.64
ramspeed: Scale - Floating Point | 9818.92
ramspeed: Average - Floating Point | 10488.67
npb: EP.C | 221.32
npb: EP.D | 217.22
scimark2: Composite | 382.91
scimark2: Monte Carlo | 101.47
scimark2: Fast Fourier Transform | 167.16
scimark2: Sparse Matrix Multiply | 500.28
scimark2: Dense LU Matrix Factorization | 314.80
scimark2: Jacobi Successive Over-Relaxation | 830.83
cachebench: Read | 2086.961474
cachebench: Write | 10865.062305
cachebench: Read / Modify / Write | 21639.888567
onednn: IP Shapes 1D - f32 - CPU | 17.7590
onednn: IP Shapes 3D - f32 - CPU | 18.5927
onednn: IP Shapes 1D - u8s8f32 - CPU | 9.77902
onednn: IP Shapes 3D - u8s8f32 - CPU | 4.86345
onednn: Convolution Batch Shapes Auto - f32 - CPU | 38.3076
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 14.6100
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 20.9738
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 43.9371
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 10.2525
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 13.8261
onednn: Recurrent Neural Network Training - f32 - CPU | 12254.6
onednn: Recurrent Neural Network Inference - f32 - CPU | 8579.99
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 12314.2
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 8577.57
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 10.8067
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 12262.6
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 8572.49
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 6.41798
apache: Static Web Page Serving | 19458.70
ctx-clock: Context Switch Time | 175
hackbench: 16 - Thread | 54.266
hackbench: 16 - Process | 53.076
mbw: Memory Copy - 1024 MiB | 8058.320
mbw: Memory Copy, Fixed Block Size - 1024 MiB | 4729.510
openssl: RSA 4096-bit Performance | 1177.1
perf-bench: Epoll Wait | 37459
perf-bench: Futex Hash | 3649333
perf-bench: Memcpy 1MB | 14.073058
perf-bench: Memset 1MB | 39.593470
perf-bench: Sched Pipe | 99785
perf-bench: Futex Lock-Pi | 892
perf-bench: Syscall Basic | 14521622
schbench: 8 - 16 | 104686

Sysbench

Test: CPU

Sysbench 1.0.20, Test: CPU (Events Per Second, More Is Better)
sysbench1604Ph10: 12018.18 (SE +/- 54.78, N = 3)
1. (CC) gcc-7 options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm

GraphicsMagick

Operation: Swirl

GraphicsMagick 1.3.33, Operation: Swirl (Iterations Per Minute, More Is Better)
graphics-magick1604Ph10: 287 (SE +/- 0.88, N = 3)
1. (CC) gcc-7 options: -fopenmp -O2 -pthread -ljbig -ltiff -ljasper -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Rotate

GraphicsMagick 1.3.33, Operation: Rotate (Iterations Per Minute, More Is Better)
graphics-magick1604Ph10: 458 (SE +/- 1.86, N = 3)
1. (CC) gcc-7 options: -fopenmp -O2 -pthread -ljbig -ltiff -ljasper -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.33, Operation: Sharpen (Iterations Per Minute, More Is Better)
graphics-magick1604Ph10: 82
1. (CC) gcc-7 options: -fopenmp -O2 -pthread -ljbig -ltiff -ljasper -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Enhanced

GraphicsMagick 1.3.33, Operation: Enhanced (Iterations Per Minute, More Is Better)
graphics-magick1604Ph10: 120 (SE +/- 0.33, N = 3)
1. (CC) gcc-7 options: -fopenmp -O2 -pthread -ljbig -ltiff -ljasper -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.33, Operation: Resizing (Iterations Per Minute, More Is Better)
graphics-magick1604Ph10: 579 (SE +/- 0.33, N = 3)
1. (CC) gcc-7 options: -fopenmp -O2 -pthread -ljbig -ltiff -ljasper -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Noise-Gaussian

GraphicsMagick 1.3.33, Operation: Noise-Gaussian (Iterations Per Minute, More Is Better)
graphics-magick1604Ph10: 120
1. (CC) gcc-7 options: -fopenmp -O2 -pthread -ljbig -ltiff -ljasper -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: HWB Color Space

GraphicsMagick 1.3.33, Operation: HWB Color Space (Iterations Per Minute, More Is Better)
graphics-magick1604Ph10: 711 (SE +/- 0.67, N = 3)
1. (CC) gcc-7 options: -fopenmp -O2 -pthread -ljbig -ltiff -ljasper -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

IPC_benchmark

Type: TCP Socket - Message Bytes: 128

IPC_benchmark, Type: TCP Socket - Message Bytes: 128 (Messages Per Second, More Is Better)
ipc-benchmarking1604Ph10: 1936803 (SE +/- 4379.21, N = 3)

IPC_benchmark

Type: Unnamed Pipe - Message Bytes: 128

IPC_benchmark, Type: Unnamed Pipe - Message Bytes: 128 (Messages Per Second, More Is Better)
ipc-benchmarking1604Ph10: 1926153 (SE +/- 7932.21, N = 3)

IPC_benchmark

Type: FIFO Named Pipe - Message Bytes: 128

IPC_benchmark, Type: FIFO Named Pipe - Message Bytes: 128 (Messages Per Second, More Is Better)
ipc-benchmarking1604Ph10: 1811477 (SE +/- 19705.06, N = 3)

IPC_benchmark

Type: Unnamed Unix Domain Socket - Message Bytes: 128

IPC_benchmark, Type: Unnamed Unix Domain Socket - Message Bytes: 128 (Messages Per Second, More Is Better)
ipc-benchmarking1604Ph10: 1002040 (SE +/- 43657.25, N = 15)

IPC_benchmark

Type: TCP Socket - Message Bytes: 1024

IPC_benchmark, Type: TCP Socket - Message Bytes: 1024 (Messages Per Second, More Is Better)
ipc-benchmarking1024-1604Ph10: 1476380 (SE +/- 5988.24, N = 3)

IPC_benchmark

Type: Unnamed Pipe - Message Bytes: 1024

IPC_benchmark, Type: Unnamed Pipe - Message Bytes: 1024 (Messages Per Second, More Is Better)
ipc-benchmarking1024-1604Ph10: 1647029 (SE +/- 13168.11, N = 3)

IPC_benchmark

Type: FIFO Named Pipe - Message Bytes: 1024

IPC_benchmark, Type: FIFO Named Pipe - Message Bytes: 1024 (Messages Per Second, More Is Better)
ipc-benchmarking1024-1604Ph10: 1590154 (SE +/- 9737.22, N = 3)

IPC_benchmark

Type: Unnamed Unix Domain Socket - Message Bytes: 1024

IPC_benchmark, Type: Unnamed Unix Domain Socket - Message Bytes: 1024 (Messages Per Second, More Is Better)
ipc-benchmarking1024-1604Ph10: 1498550 (SE +/- 7449.83, N = 3)
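
The IPC_benchmark results are reported in messages per second, which makes the 128-byte and 1024-byte runs hard to compare directly. The sketch below converts each reported rate into payload throughput in MiB/s. It assumes that "Message Bytes" is the payload size of each message and ignores any protocol overhead; the names `RESULTS` and `payload_mib_per_s` are illustrative and not part of the benchmark.

```python
# Messages per second as reported above, keyed by (transport, message bytes).
RESULTS = {
    ("TCP Socket", 128): 1936803,
    ("Unnamed Pipe", 128): 1926153,
    ("FIFO Named Pipe", 128): 1811477,
    ("Unnamed Unix Domain Socket", 128): 1002040,
    ("TCP Socket", 1024): 1476380,
    ("Unnamed Pipe", 1024): 1647029,
    ("FIFO Named Pipe", 1024): 1590154,
    ("Unnamed Unix Domain Socket", 1024): 1498550,
}


def payload_mib_per_s(messages_per_s, message_bytes):
    """Payload throughput in MiB/s, assuming every message carries message_bytes."""
    return messages_per_s * message_bytes / (1024 * 1024)


def summarize(results):
    """Return (transport, bytes, messages/s, MiB/s) rows sorted by transport and size."""
    return [
        (transport, size, rate, payload_mib_per_s(rate, size))
        for (transport, size), rate in sorted(results.items())
    ]


if __name__ == "__main__":
    for transport, size, rate, mib in summarize(RESULTS):
        print(f"{transport:28s} {size:5d} B  {rate:9d} msg/s  {mib:8.1f} MiB/s")
```

Under that assumption, every transport moves several times more payload per second at 1024 bytes than at 128 bytes, even where the message rate is lower.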

Algebraic Multi-Grid Benchmark

Algebraic Multi-Grid Benchmark 1.2 (Figure Of Merit, More Is Better)
amg1604Ph10: 97848307 (SE +/- 47423.33, N = 3)
1. (CC) gcc-7 options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -pthread -lmpi

RAMspeed SMP

Type: Add - Benchmark: Integer

RAMspeed SMP 3.5.0, Type: Add - Benchmark: Integer (MB/s, More Is Better)
ramspeed1604Ph10: 11115.05 (SE +/- 7.81, N = 3)
1. (CC) gcc-7 options: -O3 -march=native

RAMspeed SMP

Type: Scale - Benchmark: Integer

RAMspeed SMP 3.5.0, Type: Scale - Benchmark: Integer (MB/s, More Is Better)
ramspeed1604Ph10: 9654.84 (SE +/- 35.92, N = 3)
1. (CC) gcc-7 options: -O3 -march=native

RAMspeed SMP

Type: Average - Benchmark: Integer

RAMspeed SMP 3.5.0, Type: Average - Benchmark: Integer (MB/s, More Is Better)
ramspeed1604Ph10: 10366.89 (SE +/- 18.60, N = 3)
1. (CC) gcc-7 options: -O3 -march=native

RAMspeed SMP

Type: Add - Benchmark: Floating Point

RAMspeed SMP 3.5.0, Type: Add - Benchmark: Floating Point (MB/s, More Is Better)
ramspeed1604Ph10: 11122.64 (SE +/- 8.14, N = 3)
1. (CC) gcc-7 options: -O3 -march=native

RAMspeed SMP

Type: Scale - Benchmark: Floating Point

RAMspeed SMP 3.5.0, Type: Scale - Benchmark: Floating Point (MB/s, More Is Better)
ramspeed1604Ph10: 9818.92 (SE +/- 5.81, N = 3)
1. (CC) gcc-7 options: -O3 -march=native

RAMspeed SMP

Type: Average - Benchmark: Floating Point

RAMspeed SMP 3.5.0, Type: Average - Benchmark: Floating Point (MB/s, More Is Better)
ramspeed1604Ph10: 10488.67 (SE +/- 5.49, N = 3)
1. (CC) gcc-7 options: -O3 -march=native

NAS Parallel Benchmarks

Test / Class: EP.C

NAS Parallel Benchmarks 3.4, Test / Class: EP.C (Total Mop/s, More Is Better)
npb1604Ph10: 221.32 (SE +/- 0.38, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 1.10.2

NAS Parallel Benchmarks

Test / Class: EP.D

NAS Parallel Benchmarks 3.4, Test / Class: EP.D (Total Mop/s, More Is Better)
npb1604Ph10: 217.22 (SE +/- 0.55, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 1.10.2

SciMark

Computational Test: Composite

SciMark 2.0, Computational Test: Composite (Mflops, More Is Better)
scimark1604Ph10: 382.91 (SE +/- 3.97, N = 3)
1. (CC) gcc-7 options: -lm

SciMark

Computational Test: Monte Carlo

SciMark 2.0, Computational Test: Monte Carlo (Mflops, More Is Better)
scimark1604Ph10: 101.47 (SE +/- 0.31, N = 3)
1. (CC) gcc-7 options: -lm

SciMark

Computational Test: Fast Fourier Transform

SciMark 2.0, Computational Test: Fast Fourier Transform (Mflops, More Is Better)
scimark1604Ph10: 167.16 (SE +/- 0.58, N = 3)
1. (CC) gcc-7 options: -lm

SciMark

Computational Test: Sparse Matrix Multiply

SciMark 2.0, Computational Test: Sparse Matrix Multiply (Mflops, More Is Better)
scimark1604Ph10: 500.28 (SE +/- 1.11, N = 3)
1. (CC) gcc-7 options: -lm

SciMark

Computational Test: Dense LU Matrix Factorization

SciMark 2.0, Computational Test: Dense LU Matrix Factorization (Mflops, More Is Better)
scimark1604Ph10: 314.80 (SE +/- 1.26, N = 3)
1. (CC) gcc-7 options: -lm

SciMark

Computational Test: Jacobi Successive Over-Relaxation

SciMark 2.0, Computational Test: Jacobi Successive Over-Relaxation (Mflops, More Is Better)
scimark1604Ph10: 830.83 (SE +/- 22.88, N = 3)
1. (CC) gcc-7 options: -lm
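
The SciMark Composite figure is consistent with a plain arithmetic mean of the five kernel scores reported in this section. The short check below reproduces it from the numbers in this report; the names `KERNEL_MFLOPS` and `composite_mflops` are illustrative only.

```python
# Per-kernel SciMark 2.0 scores (Mflops) as reported in this result file.
KERNEL_MFLOPS = {
    "Monte Carlo": 101.47,
    "Fast Fourier Transform": 167.16,
    "Sparse Matrix Multiply": 500.28,
    "Dense LU Matrix Factorization": 314.80,
    "Jacobi Successive Over-Relaxation": 830.83,
}

REPORTED_COMPOSITE = 382.91


def composite_mflops(kernels):
    """Arithmetic mean of the kernel scores."""
    return sum(kernels.values()) / len(kernels)


if __name__ == "__main__":
    computed = composite_mflops(KERNEL_MFLOPS)
    print(f"computed composite: {computed:.2f} Mflops")
    print(f"reported composite: {REPORTED_COMPOSITE:.2f} Mflops")
```

Because the composite is an unweighted mean, the Jacobi Successive Over-Relaxation score, which is both the largest and the noisiest of the five here, has the most influence on it.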

CacheBench

Test: Read

CacheBench, Test: Read (MB/s, More Is Better)
cachebench1604Ph10: 2086.96 (SE +/- 4.55, N = 3; MIN: 2078.1 / MAX: 2093.84)
1. (CC) gcc-7 options: -lrt

CacheBench

Test: Write

CacheBench, Test: Write (MB/s, More Is Better)
cachebench1604Ph10: 10865.06 (SE +/- 20.48, N = 3; MIN: 9466.53 / MAX: 11482.88)
1. (CC) gcc-7 options: -lrt

CacheBench

Test: Read / Modify / Write

CacheBench, Test: Read / Modify / Write (MB/s, More Is Better)
cachebench1604Ph10: 21639.89 (SE +/- 39.55, N = 3; MIN: 18656.61 / MAX: 22903.78)
1. (CC) gcc-7 options: -lrt

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
onednn1604Ph10: 17.76 (SE +/- 0.11, N = 3; MIN: 16.64)
1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
onednn1604Ph10: 18.59 (SE +/- 0.02, N = 3; MIN: 18.25)
1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
onednn1604Ph10: 9.77902 (SE +/- 0.00477, N = 3; MIN: 9.18)
1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
onednn1604Ph10: 4.86345 (SE +/- 0.00067, N = 3; MIN: 4.61)
1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
onednn1604Ph10: 38.31 (SE +/- 0.00, N = 3; MIN: 37.4)
1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
onednn1604Ph10: 14.61 (SE +/- 0.03, N = 3; MIN: 13.03)
1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
onednn1604Ph10: 20.97 (SE +/- 0.06, N = 3; MIN: 19.86)
1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
onednn1604Ph10: 43.94 (SE +/- 0.02, N = 3; MIN: 43.44)
1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
onednn1604Ph10: 10.25 (SE +/- 0.01, N = 3; MIN: 9.57)
1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
onednn1604Ph10: 13.83 (SE +/- 0.06, N = 3; MIN: 13.37)
1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
onednn1604Ph10: 12254.6 (SE +/- 38.73, N = 3; MIN: 12140.4)
1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
onednn1604Ph10: 8579.99 (SE +/- 12.76, N = 3; MIN: 8534.51)
1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
onednn1604Ph10: 12314.2 (SE +/- 27.82, N = 3; MIN: 12240.5)
1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
onednn1604Ph10: 8577.57 (SE +/- 13.01, N = 3; MIN: 8526.62)
1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

oneDNN 2.1.2, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
onednn1604Ph10: 10.81 (SE +/- 0.06, N = 3; MIN: 10.37)
1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.1.2, Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
onednn1604Ph10: 12262.6 (SE +/- 27.97, N = 3; MIN: 12196.2)
1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.1.2, Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
onednn1604Ph10: 8572.49 (SE +/- 23.27, N = 3; MIN: 8526.54)
1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.1.2, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
onednn1604Ph10: 6.41798 (SE +/- 0.00458, N = 3; MIN: 6.09)
1. (CXX) g++-7 options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

Apache Benchmark

Static Web Page Serving

Apache Benchmark 2.4.29, Static Web Page Serving (Requests Per Second, More Is Better)
apache-ctx_clock1604Ph10: 19458.70 (SE +/- 173.96, N = 3)
1. (CC) gcc-7 options: -shared -fPIC -O2 -pthread

ctx_clock

Context Switch Time

ctx_clock, Context Switch Time (Clocks, Fewer Is Better)
ctx-clock1604Ph10: 175

Hackbench

Count: 16 - Type: Thread

Hackbench, Count: 16 - Type: Thread (Seconds, Fewer Is Better)
hackbench1604Ph10: 54.27 (SE +/- 0.77, N = 3)
1. (CC) gcc-7 options: -lpthread

Hackbench

Count: 16 - Type: Process

Hackbench, Count: 16 - Type: Process (Seconds, Fewer Is Better)
hackbench1604Ph10: 53.08 (SE +/- 0.64, N = 3)
1. (CC) gcc-7 options: -lpthread

MBW

Test: Memory Copy - Array Size: 1024 MiB

MBW 2018-09-08, Test: Memory Copy - Array Size: 1024 MiB (MiB/s, More Is Better)
mbw1604Ph10: 8058.32 (SE +/- 14.77, N = 3)
1. (CC) gcc-7 options: -O3 -march=native

MBW

Test: Memory Copy, Fixed Block Size - Array Size: 1024 MiB

MBW 2018-09-08, Test: Memory Copy, Fixed Block Size - Array Size: 1024 MiB (MiB/s, More Is Better)
mbw1604Ph10: 4729.51 (SE +/- 20.26, N = 3)
1. (CC) gcc-7 options: -O3 -march=native

OpenSSL

RSA 4096-bit Performance

OpenSSL 1.1.1, RSA 4096-bit Performance (Signs Per Second, More Is Better)
openssl1604Ph10: 1177.1 (SE +/- 2.35, N = 3)
1. (CC) gcc-7 options: -pthread -m64 -O3 -lssl -lcrypto -ldl

perf-bench

Benchmark: Epoll Wait

perf-bench, Benchmark: Epoll Wait (ops/sec, More Is Better)
perf-bench1604Ph10: 37459 (SE +/- 36.75, N = 3)
1. (CC) gcc-7 options: -O6 -ggdb3 -funwind-tables -std=gnu99 -Xlinker -export-dynamic -lpthread -lrt -lm -ldl -lcrypto -lperl -lc -lcrypt -lpython2.7 -lutil -lz -llzma -lnuma

perf-bench

Benchmark: Futex Hash

perf-bench, Benchmark: Futex Hash (ops/sec, More Is Better)
perf-bench1604Ph10: 3649333 (SE +/- 8513.42, N = 3)
1. (CC) gcc-7 options: -O6 -ggdb3 -funwind-tables -std=gnu99 -Xlinker -export-dynamic -lpthread -lrt -lm -ldl -lcrypto -lperl -lc -lcrypt -lpython2.7 -lutil -lz -llzma -lnuma

perf-bench

Benchmark: Memcpy 1MB

perf-bench, Benchmark: Memcpy 1MB (GB/sec, More Is Better)
perf-bench1604Ph10: 14.07 (SE +/- 0.09, N = 3)
1. (CC) gcc-7 options: -O6 -ggdb3 -funwind-tables -std=gnu99 -Xlinker -export-dynamic -lpthread -lrt -lm -ldl -lcrypto -lperl -lc -lcrypt -lpython2.7 -lutil -lz -llzma -lnuma

perf-bench

Benchmark: Memset 1MB

perf-bench, Benchmark: Memset 1MB (GB/sec, More Is Better)
perf-bench1604Ph10: 39.59 (SE +/- 0.10, N = 3)
1. (CC) gcc-7 options: -O6 -ggdb3 -funwind-tables -std=gnu99 -Xlinker -export-dynamic -lpthread -lrt -lm -ldl -lcrypto -lperl -lc -lcrypt -lpython2.7 -lutil -lz -llzma -lnuma

perf-bench

Benchmark: Sched Pipe

perf-bench, Benchmark: Sched Pipe (ops/sec, More Is Better)
perf-bench1604Ph10: 99785 (SE +/- 340.78, N = 3)
1. (CC) gcc-7 options: -O6 -ggdb3 -funwind-tables -std=gnu99 -Xlinker -export-dynamic -lpthread -lrt -lm -ldl -lcrypto -lperl -lc -lcrypt -lpython2.7 -lutil -lz -llzma -lnuma

perf-bench

Benchmark: Futex Lock-Pi

perf-bench, Benchmark: Futex Lock-Pi (ops/sec, More Is Better)
perf-bench1604Ph10: 892
1. (CC) gcc-7 options: -O6 -ggdb3 -funwind-tables -std=gnu99 -Xlinker -export-dynamic -lpthread -lrt -lm -ldl -lcrypto -lperl -lc -lcrypt -lpython2.7 -lutil -lz -llzma -lnuma

perf-bench

Benchmark: Syscall Basic

perf-bench, Benchmark: Syscall Basic (ops/sec, More Is Better)
perf-bench1604Ph10: 14521622 (SE +/- 36942.47, N = 3)
1. (CC) gcc-7 options: -O6 -ggdb3 -funwind-tables -std=gnu99 -Xlinker -export-dynamic -lpthread -lrt -lm -ldl -lcrypto -lperl -lc -lcrypt -lpython2.7 -lutil -lz -llzma -lnuma

Schbench

Message Threads: 8 - Workers Per Message Thread: 16

Schbench, Message Threads: 8 - Workers Per Message Thread: 16 (usec, 99.9th Latency Percentile, Fewer Is Better)
schbench8-16-1604Ph10: 104686 (SE +/- 1375.04, N = 7)
1. (CC) gcc-7 options: -O2 -lpthread
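
Each result in this report carries a standard error (SE) and a run count (N), but the absolute SE values are hard to compare across tests with very different magnitudes. The sketch below expresses SE as a percentage of the reported mean for a few results copied from this report, then ranks them from noisiest to steadiest. The selection of tests and the names `REPORTED`, `relative_se_percent` and `rank_by_noise` are illustrative choices, not something the test suite provides.

```python
# (mean, standard error, run count) copied from results in this report.
REPORTED = {
    "sysbench: CPU": (12018.18, 54.78, 3),
    "ipc-benchmark: Unnamed Unix Domain Socket - 128": (1002040, 43657.25, 15),
    "scimark2: Jacobi Successive Over-Relaxation": (830.83, 22.88, 3),
    "schbench: 8 - 16": (104686, 1375.04, 7),
}


def relative_se_percent(mean, se):
    """Standard error expressed as a percentage of the mean."""
    return 100.0 * se / mean


def rank_by_noise(reported):
    """Return (name, relative SE in percent, N) rows, noisiest first."""
    rows = [
        (name, relative_se_percent(mean, se), n)
        for name, (mean, se, n) in reported.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, rel, n in rank_by_noise(REPORTED):
        print(f"{name:50s} {rel:6.2f}%  (N = {n})")
```

Among these four, the 128-byte Unix domain socket result has the largest relative error despite also having the highest run count, so it is probably the one to treat with the most caution when comparing against other systems.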


Phoronix Test Suite v10.8.4