Intel Core i9-7980XE testing with an ASRock X299E-ITX/ac (P1.60 BIOS) and llvmpipe on Ubuntu 20.04 via the Phoronix Test Suite.
Compare your own system(s) to this result file with the Phoronix Test Suite by running the command:

    phoronix-test-suite benchmark 2107062-IB-SS498538552
HTML result view exported from: https://openbenchmarking.org/result/2107062-IB-SS498538552&gru&sro&export=txt .
ss4

Result identifiers: sysbench-i9Ph10, graphics-magick-i9Ph10, ipc-benchmark-i9Ph10, amg-i9Ph10, ramspeed-i9Ph10, ramspeed-i9Ph10-2, npb-i9Ph10, onednn-i9Ph10, scimark-i9Ph10, cachebench-19Ph10, apache-i9Ph10, ctx-clock-i9Ph10

    Processor:          Intel Core i9-7980XE @ 4.40GHz (18 Cores / 36 Threads)
    Motherboard:        ASRock X299E-ITX/ac (P1.60 BIOS)
    Chipset:            Intel Sky Lake-E DMI3 Registers
    Memory:             32GB
    Disk:               512GB Western Digital CL SN520 SDAPNUW-512G-1022
    Graphics:           llvmpipe
    Audio:              Realtek ALC1220
    Network:            Intel I219-V + Intel I211 + 2 x Intel 10-Gigabit X540-AT2 + Intel 8265 / 8275
    OS:                 Ubuntu 20.04
    Kernel:             5.8.0-55-generic (x86_64)
    Desktop:            GNOME Shell 3.36.7
    Display Server:     X Server 1.20.9
    Display Driver:     NVIDIA
    OpenGL:             4.5 Mesa 20.2.6 (LLVM 11.0.0 256 bits)
    Compiler:           GCC 9.3.0
    File-System:        ext4
    Screen Resolution:  3840x1080

Kernel Details - Transparent Huge Pages: madvise

Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details - Scaling Governor: intel_cpufreq ondemand - CPU Microcode: 0x2006b06

Security Details -
    itlb_multihit: KVM: Mitigation of VMX disabled
    l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable
    mds: Mitigation of Clear buffers; SMT vulnerable
    meltdown: Mitigation of PTI
    spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
    spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
    spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling
    srbds: Not affected
    tsx_async_abort: Mitigation of Clear buffers; SMT vulnerable
ss4 result overview (result identifier in brackets)

    sysbench: CPU [sysbench-i9Ph10]: 12520.70
    amg [amg-i9Ph10]: 421663767
    graphics-magick: Swirl [graphics-magick-i9Ph10]: 244
    graphics-magick: Rotate [graphics-magick-i9Ph10]: 758
    graphics-magick: Sharpen [graphics-magick-i9Ph10]: 77
    graphics-magick: Enhanced [graphics-magick-i9Ph10]: 125
    graphics-magick: Resizing [graphics-magick-i9Ph10]: 581
    graphics-magick: Noise-Gaussian [graphics-magick-i9Ph10]: 155
    graphics-magick: HWB Color Space [graphics-magick-i9Ph10]: 711
    ramspeed: Add - Integer [ramspeed-i9Ph10]: 29880.84 [ramspeed-i9Ph10-2]: 29791.07
    ramspeed: Scale - Integer [ramspeed-i9Ph10]: 28949.51 [ramspeed-i9Ph10-2]: 28949.60
    ramspeed: Average - Integer [ramspeed-i9Ph10]: 29409.08 [ramspeed-i9Ph10-2]: 28998.35
    ramspeed: Add - Floating Point [ramspeed-i9Ph10-2]: 29206.73
    ramspeed: Scale - Floating Point [ramspeed-i9Ph10-2]: 28492.59
    ramspeed: Average - Floating Point [ramspeed-i9Ph10-2]: 28905.77
    cachebench: Read [cachebench-19Ph10]: 3549.697231
    cachebench: Write [cachebench-19Ph10]: 29727.115846
    cachebench: Read / Modify / Write [cachebench-19Ph10]: 48323.244029
    ipc-benchmark: TCP Socket - 128 [ipc-benchmark-i9Ph10]: 1618720
    ipc-benchmark: TCP Socket - 1024 [ipc-benchmark-i9Ph10]: 1231428
    ipc-benchmark: Unnamed Pipe - 128 [ipc-benchmark-i9Ph10]: 1736829
    ipc-benchmark: Unnamed Pipe - 1024 [ipc-benchmark-i9Ph10]: 1529046
    ipc-benchmark: FIFO Named Pipe - 128 [ipc-benchmark-i9Ph10]: 1800108
    ipc-benchmark: FIFO Named Pipe - 1024 [ipc-benchmark-i9Ph10]: 1547106
    ipc-benchmark: Unnamed Unix Domain Socket - 128 [ipc-benchmark-i9Ph10]: 1101142
    ipc-benchmark: Unnamed Unix Domain Socket - 1024 [ipc-benchmark-i9Ph10]: 982013
    scimark2: Composite [scimark-i9Ph10]: 667.44
    scimark2: Monte Carlo [scimark-i9Ph10]: 145.14
    scimark2: Fast Fourier Transform [scimark-i9Ph10]: 306.37
    scimark2: Sparse Matrix Multiply [scimark-i9Ph10]: 759.82
    scimark2: Dense LU Matrix Factorization [scimark-i9Ph10]: 938.77
    scimark2: Jacobi Successive Over-Relaxation [scimark-i9Ph10]: 1187.12
    sysbench: RAM / Memory [sysbench-i9Ph10]: 12623.74
    apache: Static Web Page Serving [apache-i9Ph10]: 23584.15
    npb: EP.C [npb-i9Ph10]: 1506.40
    npb: EP.D [npb-i9Ph10]: 2072.04
    ctx-clock: Context Switch Time [ctx-clock-i9Ph10]: 796
    onednn: IP Shapes 1D - f32 - CPU [onednn-i9Ph10]: 29.8943
    onednn: IP Shapes 3D - f32 - CPU [onednn-i9Ph10]: 60.1806
    onednn: IP Shapes 1D - u8s8f32 - CPU [onednn-i9Ph10]: 3.19822
    onednn: IP Shapes 3D - u8s8f32 - CPU [onednn-i9Ph10]: 29.5124
    onednn: IP Shapes 1D - bf16bf16bf16 - CPU [onednn-i9Ph10]: 15.3357
    onednn: IP Shapes 3D - bf16bf16bf16 - CPU [onednn-i9Ph10]: 22.0739
    onednn: Convolution Batch Shapes Auto - f32 - CPU [onednn-i9Ph10]: 9.38504
    onednn: Deconvolution Batch shapes_1d - f32 - CPU [onednn-i9Ph10]: 10.2698
    onednn: Deconvolution Batch shapes_3d - f32 - CPU [onednn-i9Ph10]: 6.22421
    onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU [onednn-i9Ph10]: 8.69230
    onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU [onednn-i9Ph10]: 3.32760
    onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU [onednn-i9Ph10]: 3.75618
    onednn: Recurrent Neural Network Training - f32 - CPU [onednn-i9Ph10]: 4016.91
    onednn: Recurrent Neural Network Inference - f32 - CPU [onednn-i9Ph10]: 2528.57
    onednn: Recurrent Neural Network Training - u8s8f32 - CPU [onednn-i9Ph10]: 4016.07
    onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU [onednn-i9Ph10]: 22.1832
    onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU [onednn-i9Ph10]: 31.2255
    onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU [onednn-i9Ph10]: 26.7428
    onednn: Recurrent Neural Network Inference - u8s8f32 - CPU [onednn-i9Ph10]: 2530.55
    onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU [onednn-i9Ph10]: 2.54261
    onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU [onednn-i9Ph10]: 4015.84
    onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU [onednn-i9Ph10]: 2529.35
    onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU [onednn-i9Ph10]: 1.85819
    onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU [onednn-i9Ph10]: 5.43817
Sysbench 1.0.20 - Test: CPU (Events Per Second, More Is Better)
    sysbench-i9Ph10: 12520.70 (SE +/- 3.16, N = 3)
    1. (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm

Algebraic Multi-Grid Benchmark 1.2 (Figure Of Merit, More Is Better)
    amg-i9Ph10: 421663767 (SE +/- 226558.75, N = 3)
    1. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -pthread -lmpi
GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better), identifier graphics-magick-i9Ph10
    Operation: Swirl: 244
    Operation: Rotate: 758 (SE +/- 8.25, N = 5)
    Operation: Sharpen: 77 (SE +/- 0.33, N = 3)
    Operation: Enhanced: 125
    Operation: Resizing: 581 (SE +/- 0.58, N = 3)
    Operation: Noise-Gaussian: 155
    Operation: HWB Color Space: 711 (SE +/- 1.20, N = 3)
    1. (CC) gcc options: -fopenmp -O2 -pthread -ljpeg -lz -lm -lpthread
RAMspeed SMP 3.5.0 (MB/s, More Is Better)
    Type: Add - Benchmark: Integer
        ramspeed-i9Ph10: 29880.84 (SE +/- 10.07, N = 3)
        ramspeed-i9Ph10-2: 29791.07 (SE +/- 33.36, N = 3)
    Type: Scale - Benchmark: Integer
        ramspeed-i9Ph10: 28949.51 (SE +/- 19.97, N = 3)
        ramspeed-i9Ph10-2: 28949.60 (SE +/- 11.65, N = 3)
    Type: Average - Benchmark: Integer
        ramspeed-i9Ph10: 29409.08 (SE +/- 3.06, N = 3)
        ramspeed-i9Ph10-2: 28998.35 (SE +/- 14.58, N = 3)
    Type: Add - Benchmark: Floating Point
        ramspeed-i9Ph10-2: 29206.73 (SE +/- 156.20, N = 3)
    Type: Scale - Benchmark: Floating Point
        ramspeed-i9Ph10-2: 28492.59 (SE +/- 42.13, N = 3)
    Type: Average - Benchmark: Floating Point
        ramspeed-i9Ph10-2: 28905.77 (SE +/- 20.30, N = 3)
    1. (CC) gcc options: -O3 -march=native
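The RAMspeed integer tests above are the only ones reported for two identifiers, so they can be used for a quick run-to-run consistency check. The following is a minimal Python sketch, not part of the Phoronix Test Suite: it assumes the two runs are independent, that the first SE listed belongs to ramspeed-i9Ph10 and the second to ramspeed-i9Ph10-2, and it uses a hypothetical helper named `compare` to express the gap between runs as a percentage and in units of the combined standard error.

```python
import math

# (mean MB/s, standard error) for ramspeed-i9Ph10 and ramspeed-i9Ph10-2,
# copied from the RAMspeed integer results above.
runs = {
    "Add": ((29880.84, 10.07), (29791.07, 33.36)),
    "Scale": ((28949.51, 19.97), (28949.60, 11.65)),
    "Average": ((29409.08, 3.06), (28998.35, 14.58)),
}


def compare(first, second):
    """Return (percent difference vs. second run, difference in combined-SE units)."""
    (m1, se1), (m2, se2) = first, second
    diff = m1 - m2
    # Combined standard error of a difference, assuming independent runs.
    combined_se = math.hypot(se1, se2)
    return 100.0 * diff / m2, diff / combined_se


results = {name: compare(*pair) for name, pair in runs.items()}

for name, (pct, z) in results.items():
    print(f"{name:8s} {pct:+.2f}%  {z:+.1f} combined SE")
```

Under these assumptions the Add and Scale results differ by well under 1% between the two runs, while the Average result differs by roughly 1.4%, which is large relative to the reported standard errors.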
CacheBench (MB/s, More Is Better), identifier cachebench-19Ph10
    Test: Read: 3549.70 (SE +/- 0.10, N = 3; MIN: 3539.96 / MAX: 3554.41)
    Test: Write: 29727.12 (SE +/- 4.33, N = 3; MIN: 26749.71 / MAX: 31785.28)
    Test: Read / Modify / Write: 48323.24 (SE +/- 5.12, N = 3; MIN: 43906.38 / MAX: 50885.74)
    1. (CC) gcc options: -lrt
IPC_benchmark (Messages Per Second, More Is Better), identifier ipc-benchmark-i9Ph10
    Type: TCP Socket - Message Bytes: 128: 1618720 (SE +/- 1126.58, N = 3)
    Type: TCP Socket - Message Bytes: 1024: 1231428 (SE +/- 7215.07, N = 3)
    Type: Unnamed Pipe - Message Bytes: 128: 1736829 (SE +/- 29706.06, N = 15)
    Type: Unnamed Pipe - Message Bytes: 1024: 1529046 (SE +/- 21356.75, N = 3)
    Type: FIFO Named Pipe - Message Bytes: 128: 1800108 (SE +/- 25575.15, N = 3)
    Type: FIFO Named Pipe - Message Bytes: 1024: 1547106 (SE +/- 5801.12, N = 3)
    Type: Unnamed Unix Domain Socket - Message Bytes: 128: 1101142 (SE +/- 1288.51, N = 3)
    Type: Unnamed Unix Domain Socket - Message Bytes: 1024: 982013 (SE +/- 660.09, N = 3)
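The IPC_benchmark figures above are message rates, which makes the 128-byte and 1024-byte results hard to compare directly. As a rough illustration, the Python sketch below converts each rate to payload throughput by multiplying by the message size. The helper name `payload_mb_per_s` is hypothetical, 1 MB is taken as 10^6 bytes, and the calculation ignores any protocol or framing overhead, so treat the output as an approximation rather than a measured bandwidth.

```python
# Messages per second from the IPC_benchmark results above,
# keyed by (transport type, message size in bytes).
rates = {
    ("TCP Socket", 128): 1618720,
    ("TCP Socket", 1024): 1231428,
    ("Unnamed Pipe", 128): 1736829,
    ("Unnamed Pipe", 1024): 1529046,
    ("FIFO Named Pipe", 128): 1800108,
    ("FIFO Named Pipe", 1024): 1547106,
    ("Unnamed Unix Domain Socket", 128): 1101142,
    ("Unnamed Unix Domain Socket", 1024): 982013,
}


def payload_mb_per_s(messages_per_second, message_bytes):
    """Approximate payload throughput in MB/s (1 MB = 10**6 bytes)."""
    return messages_per_second * message_bytes / 1e6


throughput = {
    key: payload_mb_per_s(rate, key[1]) for key, rate in rates.items()
}

for (transport, size), mb_s in throughput.items():
    print(f"{transport:28s} {size:5d} B  {mb_s:9.1f} MB/s")
```

Every transport carries fewer messages per second at 1024 bytes than at 128 bytes, yet moves several times more payload per second, which is why the two message sizes are best read side by side rather than ranked against each other.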
SciMark 2.0 (Mflops, More Is Better), identifier scimark-i9Ph10
    Computational Test: Composite: 667.44 (SE +/- 0.38, N = 3)
    Computational Test: Monte Carlo: 145.14 (SE +/- 0.02, N = 3)
    Computational Test: Fast Fourier Transform: 306.37 (SE +/- 0.93, N = 3)
    Computational Test: Sparse Matrix Multiply: 759.82 (SE +/- 0.38, N = 3)
    Computational Test: Dense LU Matrix Factorization: 938.77 (SE +/- 0.62, N = 3)
    Computational Test: Jacobi Successive Over-Relaxation: 1187.12 (SE +/- 0.14, N = 3)
    1. (CC) gcc options: -lm
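The SciMark Composite score is consistent with being the arithmetic mean of the five kernel scores listed above. The short Python check below reproduces that arithmetic from the reported values; it is only a sanity check on this result file, and the dictionary name `kernels` is chosen here for illustration.

```python
# Per-kernel SciMark 2.0 results (Mflops) copied from the results above.
kernels = {
    "Monte Carlo": 145.14,
    "Fast Fourier Transform": 306.37,
    "Sparse Matrix Multiply": 759.82,
    "Dense LU Matrix Factorization": 938.77,
    "Jacobi Successive Over-Relaxation": 1187.12,
}

# Arithmetic mean of the five kernels.
composite = sum(kernels.values()) / len(kernels)

reported_composite = 667.44  # Composite value reported above
print(f"computed {composite:.2f} vs reported {reported_composite:.2f}")
```

Because the composite is an unweighted mean, the two fastest kernels (Jacobi Successive Over-Relaxation and Dense LU Matrix Factorization) contribute most of the score, so a change in Monte Carlo performance moves the composite comparatively little.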
Sysbench 1.0.20 - Test: RAM / Memory (MiB/sec, More Is Better)
    sysbench-i9Ph10: 12623.74 (SE +/- 43.24, N = 3)
    1. (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm

Apache Benchmark 2.4.29 - Static Web Page Serving (Requests Per Second, More Is Better)
    apache-i9Ph10: 23584.15 (SE +/- 18.01, N = 3)
    1. (CC) gcc options: -shared -fPIC -O2 -pthread

NAS Parallel Benchmarks 3.4 (Total Mop/s, More Is Better), identifier npb-i9Ph10
    Test / Class: EP.C: 1506.40 (SE +/- 14.95, N = 15)
    Test / Class: EP.D: 2072.04 (SE +/- 11.23, N = 3)
    1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
    2. Open MPI 4.0.3

ctx_clock - Context Switch Time (Clocks, Fewer Is Better)
    ctx-clock-i9Ph10: 796
oneDNN 2.1.2 (ms, Fewer Is Better), Engine: CPU, identifier onednn-i9Ph10
    Harness: IP Shapes 1D - Data Type: f32: 29.89 (SE +/- 0.31, N = 3; MIN: 17.05)
    Harness: IP Shapes 3D - Data Type: f32: 60.18 (SE +/- 0.57, N = 3; MIN: 27.05)
    Harness: IP Shapes 1D - Data Type: u8s8f32: 3.19822 (SE +/- 0.00253, N = 3; MIN: 2.79)
    Harness: IP Shapes 3D - Data Type: u8s8f32: 29.51 (SE +/- 0.24, N = 15; MIN: 4.14)
    Harness: IP Shapes 1D - Data Type: bf16bf16bf16: 15.34 (SE +/- 0.01, N = 3; MIN: 14.73)
    Harness: IP Shapes 3D - Data Type: bf16bf16bf16: 22.07 (SE +/- 0.20, N = 15; MIN: 6.98)
    Harness: Convolution Batch Shapes Auto - Data Type: f32: 9.38504 (SE +/- 0.00136, N = 3; MIN: 9.24)
    Harness: Deconvolution Batch shapes_1d - Data Type: f32: 10.27 (SE +/- 0.04, N = 3; MIN: 7.43)
    Harness: Deconvolution Batch shapes_3d - Data Type: f32: 6.22421 (SE +/- 0.00670, N = 3; MIN: 6.1)
    Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32: 8.69230 (SE +/- 0.00069, N = 3; MIN: 8.5)
    Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32: 3.32760 (SE +/- 0.00103, N = 3; MIN: 2.81)
    Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32: 3.75618 (SE +/- 0.02427, N = 3; MIN: 3.64)
    Harness: Recurrent Neural Network Training - Data Type: f32: 4016.91 (SE +/- 0.94, N = 3; MIN: 3981.58)
    Harness: Recurrent Neural Network Inference - Data Type: f32: 2528.57 (SE +/- 1.65, N = 3; MIN: 2493.37)
    Harness: Recurrent Neural Network Training - Data Type: u8s8f32: 4016.07 (SE +/- 1.46, N = 3; MIN: 3984.54)
    Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16: 22.18 (SE +/- 0.01, N = 3; MIN: 21.95)
    Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16: 31.23 (SE +/- 0.01, N = 3; MIN: 30.35)
    Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16: 26.74 (SE +/- 0.03, N = 3; MIN: 26.27)
    Harness: Recurrent Neural Network Inference - Data Type: u8s8f32: 2530.55 (SE +/- 0.81, N = 3; MIN: 2497.65)
    Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32: 2.54261 (SE +/- 0.00332, N = 3; MIN: 2.15)
    Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16: 4015.84 (SE +/- 1.16, N = 3; MIN: 3980.89)
    Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16: 2529.35 (SE +/- 1.04, N = 3; MIN: 2494.55)
    Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32: 1.85819 (SE +/- 0.00144, N = 3; MIN: 1.49)
    Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16: 5.43817 (SE +/- 0.00321, N = 3; MIN: 4.9)
    1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Phoronix Test Suite v10.8.4