8380 sun: 2 x Intel Xeon Platinum 8380 testing with an Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS) and ASPEED graphics on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from https://openbenchmarking.org/result/2204037-NE-8380SUN3435&grr.
8380 sun - System Details (runs A, B, C, and D used the same configuration):

  Processor:          2 x Intel Xeon Platinum 8380 @ 3.40GHz (80 Cores / 160 Threads)
  Motherboard:        Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS)
  Chipset:            Intel Device 0998
  Memory:             512GB
  Disk:               3841GB Micron_9300_MTFDHAL3T8TDP
  Graphics:           ASPEED
  Monitor:            VE228
  Network:            2 x Intel X710 for 10GBASE-T + 2 x Intel E810-C for QSFP
  OS:                 Ubuntu 20.04
  Kernel:             5.15.11-051511-generic (x86_64)
  Desktop:            GNOME Shell 3.36.9
  Display Server:     X Server 1.20.13
  Vulkan:             1.0.2
  Compiler:           GCC 9.3.0 + Clang 10.0.0-4ubuntu1
  File-System:        ext4
  Screen Resolution:  1920x1080

Kernel Details:    Transparent Huge Pages: madvise
Compiler Details:  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: intel_pstate performance (EPP: performance); CPU Microcode: 0xd0002a0
Java Details:      OpenJDK Runtime Environment (build 11.0.13+8-Ubuntu-0ubuntu1.20.04)
Security Details:  itlb_multihit: Not affected; l1tf: Not affected; mds: Not affected; meltdown: Not affected; spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp; spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization; spectre_v2: Mitigation of Enhanced IBRS, IBPB: conditional, RSB filling; srbds: Not affected; tsx_async_abort: Not affected
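The mitigation status strings above are the kernel's own reporting; on any recent Linux system they can be read back directly from sysfs. A minimal sketch, assuming a kernel that exposes /sys/devices/system/cpu/vulnerabilities (it returns an empty dict elsewhere):

```python
from pathlib import Path

def read_cpu_vulnerabilities():
    """Read the kernel's CPU-vulnerability status strings, as listed in
    the Security Details above (itlb_multihit, l1tf, mds, ...)."""
    vuln_dir = Path("/sys/devices/system/cpu/vulnerabilities")
    if not vuln_dir.is_dir():  # non-Linux or an old kernel without this sysfs dir
        return {}
    return {f.name: f.read_text().strip() for f in sorted(vuln_dir.iterdir())}

for name, status in read_cpu_vulnerabilities().items():
    print(f"{name}: {status}")
```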
8380 sun - Result Overview (java-jmh in Ops/s and perf-bench in ops/sec or GB/sec, higher is better; oneDNN in ms, lower is better):

  Test                                                                  A                 B                 C                 D
  java-jmh: Throughput                                    117974431957.74   117397403601.04   117539633814.56   117793385377.77
  perf-bench: Futex Lock-Pi                                            47                44                49                50
  onednn: RNN Training - u8s8f32 - CPU                            618.676           626.982           618.143           617.792
  onednn: RNN Training - bf16bf16bf16 - CPU                       636.287            621.88           618.823           617.729
  onednn: RNN Training - f32 - CPU                                627.389           625.616           621.734           622.500
  onednn: RNN Inference - bf16bf16bf16 - CPU                      378.959           386.179           379.585           377.234
  onednn: RNN Inference - f32 - CPU                               379.171           379.589           379.634           378.233
  onednn: RNN Inference - u8s8f32 - CPU                           378.855           383.208           379.206           378.641
  perf-bench: Sched Pipe                                           157484            159641            159064            158896
  perf-bench: Epoll Wait                                             3190              3376              3518              3345
  perf-bench: Futex Hash                                          2990938           2990356           2985843           2984853
  perf-bench: Memcpy 1MB (GB/sec)                               16.846117          15.93434         16.267801         15.954044
  onednn: Deconvolution Batch shapes_1d - f32 - CPU               7.24738           7.23701           7.26969           7.29653
  onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU      3.78575           3.79039           3.76245           3.77449
  onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU          0.372274            0.3691          0.364478          0.372693
  perf-bench: Memset 1MB (GB/sec)                               58.850905         53.171926         58.448906         59.788673
  onednn: IP Shapes 1D - f32 - CPU                               0.874369          0.858012          0.863419          0.855213
  onednn: IP Shapes 1D - bf16bf16bf16 - CPU                       2.98315           3.00061           2.99363           2.99408
  onednn: IP Shapes 1D - u8s8f32 - CPU                            1.30484           1.29761           1.28310           1.27539
  onednn: Matrix Multiply Batch Shapes Transformer - bf16 - CPU   2.08241           2.09074           2.05510           2.08466
  onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU   0.230657          0.234187          0.232623          0.232562
  onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32     0.177939          0.170109          0.172350          0.172182
  onednn: IP Shapes 3D - f32 - CPU                                1.27751            1.2813           1.29049           1.28979
  onednn: IP Shapes 3D - bf16bf16bf16 - CPU                       1.81335           1.81013           1.82023           1.82082
  onednn: IP Shapes 3D - u8s8f32 - CPU                           0.444806          0.438798          0.443744          0.445376
  perf-bench: Syscall Basic                                      13955440          13985008          13976056          13925517
  onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU           1.13824           1.12915           1.14503           1.15270
  onednn: Convolution Batch Shapes Auto - f32 - CPU               1.38472           1.39272           1.38829           1.38183
  onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU      2.12309           2.11786           2.10492           2.11735
  onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU      3.62162           3.61967           3.60012           3.60258
  onednn: Deconvolution Batch shapes_3d - f32 - CPU              0.884168          0.888675          0.888288          0.884572
  onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU          0.194293          0.196251          0.194502          0.191428
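Since runs A through D were taken on the same configuration, the overview doubles as a run-to-run variance check. A small sketch computing relative spread, (max - min) / mean, for three representative tests with values taken straight from the table above:

```python
# Per-run results copied from the result overview above.
results = {
    "java-jmh Throughput (Ops/s)": [117974431957.74, 117397403601.04,
                                    117539633814.56, 117793385377.77],
    "perf-bench Futex Lock-Pi (ops/sec)": [47, 44, 49, 50],
    "oneDNN RNN Training f32 (ms)": [627.389, 625.616, 621.734, 622.500],
}

def rel_spread(values):
    """(max - min) / mean, as a fraction of the mean."""
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean

for name, vals in results.items():
    print(f"{name}: {rel_spread(vals):.2%} spread across A-D")
```

The JMH and oneDNN numbers stay within about half a percent of each other, while the low-count Futex Lock-Pi result spreads over ten percent, which matches its large reported standard errors.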
Java JMH: Throughput (Ops/s; more is better)
  A: 117974431957.74   B: 117397403601.04   C: 117539633814.56   D: 117793385377.77

perf-bench, Benchmark: Futex Lock-Pi (ops/sec; more is better)
  A: 47   B: 44   C: 49   D: 50   [SE +/- 1.92, N = 15; SE +/- 2.63, N = 15]
  1. (CC) gcc options: -pthread -shared -lunwind-x86_64 -lunwind -llzma -Xlinker -export-dynamic -O6 -ggdb3 -funwind-tables -std=gnu99 -fPIC -lnuma (the same gcc options apply to all perf-bench results below)
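The detailed results report each value as "SE +/- x, N = y". SE here is presumably the standard error of the mean over the N trial runs (the usual s / sqrt(N)); the per-run samples in this sketch are hypothetical, since the raw trials are not in the exported view:

```python
import math

def standard_error(samples):
    """Standard error of the mean: sample std dev divided by sqrt(N)."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)  # Bessel-corrected
    return math.sqrt(var) / math.sqrt(n)

# Hypothetical ops/sec trials, NOT recovered from the result file:
print(f"SE +/- {standard_error([47, 44, 49, 50, 46]):.2f}")
```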
oneDNN 2.6, Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms; fewer is better)
  A: 618.68 (MIN 594.98)   B: 626.98 (MIN 598.45)   C: 618.14 (MIN 592.21)   D: 617.79 (MIN 594.21)   [SE +/- 1.17, N = 3; SE +/- 0.39, N = 3]
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -lpthread -ldl (the same g++ options apply to all oneDNN results below)

oneDNN 2.6, Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms; fewer is better)
  A: 636.29 (MIN 595.54)   B: 621.88 (MIN 595.66)   C: 618.82 (MIN 594.97)   D: 617.73 (MIN 594.74)   [SE +/- 0.48, N = 3; SE +/- 0.57, N = 3]

oneDNN 2.6, Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms; fewer is better)
  A: 627.39 (MIN 596.88)   B: 625.62 (MIN 597.43)   C: 621.73 (MIN 594.79)   D: 622.50 (MIN 596.58)   [SE +/- 1.66, N = 3; SE +/- 1.46, N = 3]

oneDNN 2.6, Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms; fewer is better)
  A: 378.96 (MIN 363.09)   B: 386.18 (MIN 363.76)   C: 379.59 (MIN 358.8)   D: 377.23 (MIN 359.79)   [SE +/- 1.61, N = 3; SE +/- 1.09, N = 3]

oneDNN 2.6, Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms; fewer is better)
  A: 379.17 (MIN 363.09)   B: 379.59 (MIN 361.95)   C: 379.63 (MIN 361.75)   D: 378.23 (MIN 360.8)   [SE +/- 1.29, N = 3; SE +/- 0.51, N = 3]

oneDNN 2.6, Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms; fewer is better)
  A: 378.86 (MIN 362.36)   B: 383.21 (MIN 366.5)   C: 379.21 (MIN 358.89)   D: 378.64 (MIN 362.01)   [SE +/- 2.30, N = 3; SE +/- 0.30, N = 3]
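The oneDNN numbers are per-iteration latencies in milliseconds, so a throughput figure is simply the reciprocal. For example, run A's 378.86 ms RNN inference (u8s8f32) result works out to roughly 2.64 iterations per second:

```python
def latency_ms_to_per_sec(latency_ms):
    """Convert a per-iteration latency in milliseconds to iterations/sec."""
    return 1000.0 / latency_ms

# Run A, oneDNN RNN Inference u8s8f32, from the results above:
print(f"{latency_ms_to_per_sec(378.86):.2f} iterations/sec")  # prints: 2.64 iterations/sec
```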
perf-bench, Benchmark: Sched Pipe (ops/sec; more is better)
  A: 157484   B: 159641   C: 159064   D: 158896   [SE +/- 545.51, N = 3; SE +/- 451.65, N = 3]

perf-bench, Benchmark: Epoll Wait (ops/sec; more is better)
  A: 3190   B: 3376   C: 3518   D: 3345   [SE +/- 26.58, N = 3; SE +/- 25.11, N = 3]

perf-bench, Benchmark: Futex Hash (ops/sec; more is better)
  A: 2990938   B: 2990356   C: 2985843   D: 2984853   [SE +/- 2180.19, N = 3; SE +/- 4824.86, N = 3]

perf-bench, Benchmark: Memcpy 1MB (GB/sec; more is better)
  A: 16.85   B: 15.93   C: 16.27   D: 15.95   [SE +/- 0.16, N = 3; SE +/- 0.18, N = 3]
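The Memcpy 1MB figures above are raw buffer-copy bandwidth from perf's C benchmark loop. A loose Python analogue of the same idea (timing repeated 1 MB copies) is sketched below; interpreter and allocation overhead mean it will read well below perf-bench's numbers, so it illustrates the methodology rather than reproducing the result:

```python
import time

def copy_bandwidth_gb_s(size_mb=1, iters=200):
    """Rough analogue of perf-bench Memcpy 1MB: time repeated copies
    of a size_mb-MB buffer and report GB/sec."""
    src = bytes(size_mb * 1024 * 1024)
    start = time.perf_counter()
    for _ in range(iters):
        dst = bytearray(src)  # allocates and copies the whole buffer
    elapsed = time.perf_counter() - start
    total_gb = size_mb * iters / 1024  # MB copied, expressed in GB
    return total_gb / elapsed

print(f"{copy_bandwidth_gb_s():.2f} GB/sec")
```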
oneDNN 2.6, Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms; fewer is better)
  A: 7.24738 (MIN 6.76)   B: 7.23701 (MIN 6.73)   C: 7.26969 (MIN 6.69)   D: 7.29653 (MIN 6.67)   [SE +/- 0.00587, N = 3; SE +/- 0.00979, N = 3]

oneDNN 2.6, Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU (ms; fewer is better)
  A: 3.78575 (MIN 3.54)   B: 3.79039 (MIN 3.53)   C: 3.76245 (MIN 3.52)   D: 3.77449 (MIN 3.53)   [SE +/- 0.00243, N = 3; SE +/- 0.00723, N = 3]

oneDNN 2.6, Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms; fewer is better)
  A: 0.372274 (MIN 0.33)   B: 0.369100 (MIN 0.33)   C: 0.364478 (MIN 0.33)   D: 0.372693 (MIN 0.33)   [SE +/- 0.000878, N = 3; SE +/- 0.003422, N = 3]

perf-bench, Benchmark: Memset 1MB (GB/sec; more is better)
  A: 58.85   B: 53.17   C: 58.45   D: 59.79   [SE +/- 0.55, N = 15; SE +/- 1.02, N = 3]
oneDNN 2.6, Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms; fewer is better)
  A: 0.874369 (MIN 0.81)   B: 0.858012 (MIN 0.8)   C: 0.863419 (MIN 0.8)   D: 0.855213 (MIN 0.8)   [SE +/- 0.001707, N = 3; SE +/- 0.002275, N = 3]

oneDNN 2.6, Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU (ms; fewer is better)
  A: 2.98315 (MIN 2.86)   B: 3.00061 (MIN 2.87)   C: 2.99363 (MIN 2.86)   D: 2.99408 (MIN 2.85)   [SE +/- 0.00300, N = 3; SE +/- 0.00450, N = 3]

oneDNN 2.6, Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms; fewer is better)
  A: 1.30484 (MIN 0.94)   B: 1.29761 (MIN 1.02)   C: 1.28310 (MIN 0.93)   D: 1.27539 (MIN 0.89)   [SE +/- 0.02105, N = 3; SE +/- 0.01852, N = 3]

oneDNN 2.6, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU (ms; fewer is better)
  A: 2.08241 (MIN 1.86)   B: 2.09074 (MIN 1.85)   C: 2.05510 (MIN 1.81)   D: 2.08466 (MIN 1.8)   [SE +/- 0.01164, N = 3; SE +/- 0.03558, N = 3]

oneDNN 2.6, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms; fewer is better)
  A: 0.230657 (MIN 0.21)   B: 0.234187 (MIN 0.22)   C: 0.232623 (MIN 0.21)   D: 0.232562 (MIN 0.22)   [SE +/- 0.001971, N = 3; SE +/- 0.000626, N = 3]

oneDNN 2.6, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms; fewer is better)
  A: 0.177939 (MIN 0.16)   B: 0.170109 (MIN 0.15)   C: 0.172350 (MIN 0.16)   D: 0.172182 (MIN 0.15)   [SE +/- 0.000739, N = 3; SE +/- 0.001576, N = 3]

oneDNN 2.6, Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms; fewer is better)
  A: 1.27751 (MIN 1.24)   B: 1.28130 (MIN 1.25)   C: 1.29049 (MIN 1.25)   D: 1.28979 (MIN 1.25)   [SE +/- 0.00528, N = 3; SE +/- 0.00317, N = 3]

oneDNN 2.6, Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU (ms; fewer is better)
  A: 1.81335 (MIN 1.68)   B: 1.81013 (MIN 1.68)   C: 1.82023 (MIN 1.68)   D: 1.82082 (MIN 1.68)   [SE +/- 0.00151, N = 3; SE +/- 0.00600, N = 3]

oneDNN 2.6, Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms; fewer is better)
  A: 0.444806 (MIN 0.41)   B: 0.438798 (MIN 0.41)   C: 0.443744 (MIN 0.41)   D: 0.445376 (MIN 0.41)   [SE +/- 0.001544, N = 3; SE +/- 0.000362, N = 3]

perf-bench, Benchmark: Syscall Basic (ops/sec; more is better)
  A: 13955440   B: 13985008   C: 13976056   D: 13925517   [SE +/- 4397.07, N = 3; SE +/- 37556.12, N = 3]
oneDNN 2.6, Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms; fewer is better)
  A: 1.13824 (MIN 0.95)   B: 1.12915 (MIN 0.96)   C: 1.14503 (MIN 0.97)   D: 1.15270 (MIN 0.97)   [SE +/- 0.00239, N = 3; SE +/- 0.00521, N = 3]

oneDNN 2.6, Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms; fewer is better)
  A: 1.38472 (MIN 1.28)   B: 1.39272 (MIN 1.24)   C: 1.38829 (MIN 1.26)   D: 1.38183 (MIN 1.26)   [SE +/- 0.00571, N = 3; SE +/- 0.00232, N = 3]

oneDNN 2.6, Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU (ms; fewer is better)
  A: 2.12309 (MIN 2.03)   B: 2.11786 (MIN 2.03)   C: 2.10492 (MIN 2.03)   D: 2.11735 (MIN 2.03)   [SE +/- 0.00132, N = 3; SE +/- 0.00242, N = 3]

oneDNN 2.6, Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU (ms; fewer is better)
  A: 3.62162 (MIN 3.51)   B: 3.61967 (MIN 3.51)   C: 3.60012 (MIN 3.51)   D: 3.60258 (MIN 3.51)   [SE +/- 0.00386, N = 3; SE +/- 0.00989, N = 3]

oneDNN 2.6, Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms; fewer is better)
  A: 0.884168 (MIN 0.84)   B: 0.888675 (MIN 0.84)   C: 0.888288 (MIN 0.84)   D: 0.884572 (MIN 0.84)   [SE +/- 0.002781, N = 3; SE +/- 0.001577, N = 3]

oneDNN 2.6, Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms; fewer is better)
  A: 0.194293 (MIN 0.18)   B: 0.196251 (MIN 0.18)   C: 0.194502 (MIN 0.18)   D: 0.191428 (MIN 0.18)   [SE +/- 0.001404, N = 3; SE +/- 0.000487, N = 3]
Phoronix Test Suite v10.8.4