oneDNN 2.0 3900X

AMD Ryzen 9 3900X 12-Core testing with an ASUS TUF GAMING X570-PLUS (WI-FI) (2203 BIOS) and MSI AMD Radeon RX 470/480/570/570X/580/580X/590 8GB on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2012108-PTS-ONEDNN2022&grt&export=txt&sor .

oneDNN 2.0 3900X
Runs: MSI AMD Radeon RX 470, 2, 3

Processor:          AMD Ryzen 9 3900X 12-Core @ 3.80GHz (12 Cores / 24 Threads)
Motherboard:        ASUS TUF GAMING X570-PLUS (WI-FI) (2203 BIOS)
Chipset:            AMD Starship/Matisse
Memory:             16GB
Disk:               Samsung SSD 970 EVO Plus 250GB
Graphics:           MSI AMD Radeon RX 470/480/570/570X/580/580X/590 8GB (1366/2000MHz)
Audio:              AMD Ellesmere HDMI Audio
Monitor:            G237HL
Network:            Realtek RTL8111/8168/8411 + Intel-AC 9260
OS:                 Ubuntu 20.04
Kernel:             5.9.0-050900rc6daily20200922-generic (x86_64) 20200921
Desktop:            GNOME Shell 3.36.4
Display Server:     X Server 1.20.8
Display Driver:     modesetting 1.20.8
OpenGL:             4.6 Mesa 20.2.0-devel (git-64cdc13 2020-07-02 focal-oibaf-ppa) (LLVM 10.0.0)
Vulkan:             1.2.131
Compiler:           GCC 9.3.0
File-System:        ext4
Screen Resolution:  1920x1080

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled)
- CPU Microcode: 0x8701021

Graphics Details
- GLAMOR

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected

oneDNN 2.0 3900X: Result Overview

Test                                                              MSI AMD Radeon RX 470   2          3
betsy: ETC1 - Highest                                             8.353                   8.188      8.180
betsy: ETC2 RGB - Highest                                         9.810                   9.639      9.638
onednn: IP Shapes 1D - f32 - CPU                                  30.1708                 4.76942    4.75794
onednn: IP Shapes 3D - f32 - CPU                                  10.8342                 10.6640    10.8055
onednn: IP Shapes 1D - u8s8f32 - CPU                              3.02821                 1.93713    1.93079
onednn: IP Shapes 3D - u8s8f32 - CPU                              0.890855                0.918892   0.905647
onednn: Convolution Batch Shapes Auto - f32 - CPU                 22.3813                 22.3816    22.3912
onednn: Deconvolution Batch shapes_1d - f32 - CPU                 3.52990                 3.54577    3.54390
onednn: Deconvolution Batch shapes_3d - f32 - CPU                 5.04580                 5.05578    5.05350
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU             24.8781                 24.7929    24.8527
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU             4.23449                 4.23734    4.22621
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU             3.58831                 3.59039    3.59006
onednn: Recurrent Neural Network Training - f32 - CPU             3957.60                 3977.86    3969.01
onednn: Recurrent Neural Network Inference - f32 - CPU            2385.69                 2396.09    2377.20
onednn: Recurrent Neural Network Training - u8s8f32 - CPU         3981.47                 3968.03    3965.63
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU        2400.59                 2392.78    2397.36
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU      0.874190                0.881017   0.880630
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU    3966.95                 3962.15    3982.55
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU   2382.91                 2393.07    2375.03
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU  2.01888                 2.02211    2.02449
phpbench: PHP Benchmark Suite                                     701260                  697067     683719
build-clash: Time To Compile                                      370.200                 370.523    369.536

Betsy GPU Compressor 1.1 Beta
Codec: ETC1 - Quality: Highest
Seconds, Fewer Is Better

Run                     Result     SE +/-     N
3                       8.180      0.015      3
2                       8.188      0.019      3
MSI AMD Radeon RX 470   8.353      0.182      15

1. (CXX) g++ options: -O3 -O2 -lpthread -ldl

Betsy GPU Compressor 1.1 Beta
Codec: ETC2 RGB - Quality: Highest
Seconds, Fewer Is Better

Run                     Result     SE +/-     N
3                       9.638      0.008      3
2                       9.639      0.007      3
MSI AMD Radeon RX 470   9.810      0.188      15

1. (CXX) g++ options: -O3 -O2 -lpthread -ldl

oneDNN 2.0
Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU
ms, Fewer Is Better

Run                     Result     SE +/-     N
3                       4.75794    0.00346    3
2                       4.76942    0.00104    3
MSI AMD Radeon RX 470   30.17080   3.03866    15

MIN: 4.53 (run not identified in the export)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
ms, Fewer Is Better

Run                     Result     SE +/-     N    MIN
2                       10.66      0.01       3    10.49
3                       10.81      0.03       3    10.63
MSI AMD Radeon RX 470   10.83      0.02       3    10.66

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better

Run                     Result     SE +/-     N    MIN
3                       1.93079    0.00531    3    1.89
2                       1.93713    0.00401    3    1.89
MSI AMD Radeon RX 470   3.02821    1.08836    15   1.87

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better

Run                     Result     SE +/-     N    MIN
MSI AMD Radeon RX 470   0.890855   0.002229   3    0.83
3                       0.905647   0.003367   3    0.85
2                       0.918892   0.006713   3    0.85

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
ms, Fewer Is Better

Run                     Result     SE +/-     N    MIN
MSI AMD Radeon RX 470   22.38      0.01       3    21.93
2                       22.38      0.02       3    21.68
3                       22.39      0.01       3    21.84

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
ms, Fewer Is Better

Run                     Result     SE +/-     N    MIN
MSI AMD Radeon RX 470   3.52990    0.00643    3    3.46
3                       3.54390    0.00543    3    3.47
2                       3.54577    0.01224    3    3.45

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
ms, Fewer Is Better

Run                     Result     SE +/-     N    MIN
MSI AMD Radeon RX 470   5.04580    0.01192    3    4.96
3                       5.05350    0.00860    3    4.96
2                       5.05578    0.01382    3    4.97

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better

Run                     Result     SE +/-     N    MIN
2                       24.79      0.01       3    24.32
3                       24.85      0.05       3    24.27
MSI AMD Radeon RX 470   24.88      0.07       3    24.3

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better

Run                     Result     SE +/-     N    MIN
3                       4.22621    0.00768    3    4.05
MSI AMD Radeon RX 470   4.23449    0.00679    3    4.06
2                       4.23734    0.00585    3    4.05

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better

Run                     Result     SE +/-     N    MIN
MSI AMD Radeon RX 470   3.58831    0.00126    3    3.47
3                       3.59006    0.00045    3    3.48
2                       3.59039    0.00067    3    3.47

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
ms, Fewer Is Better

Run                     Result     SE +/-     N    MIN
MSI AMD Radeon RX 470   3957.60    9.55       3    3930.81
3                       3969.01    14.29      3    3933.83
2                       3977.86    12.90      3    3948.18

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
ms, Fewer Is Better

Run                     Result     SE +/-     N    MIN
3                       2377.20    6.32       3    2349.95
MSI AMD Radeon RX 470   2385.69    13.92      3    2338.22
2                       2396.09    11.94      3    2357.37

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better

Run                     Result     SE +/-     N    MIN
3                       3965.63    7.53       3    3944.28
2                       3968.03    2.31       3    3952.97
MSI AMD Radeon RX 470   3981.47    4.60       3    3962.47

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better

Run                     Result     SE +/-     N    MIN
2                       2392.78    4.47       3    2375.64
3                       2397.36    12.67      3    2358.43
MSI AMD Radeon RX 470   2400.59    13.97      3    2361.61

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
ms, Fewer Is Better

Run                     Result     SE +/-     N    MIN
MSI AMD Radeon RX 470   0.874190   0.002122   3    0.84
3                       0.880630   0.001694   3    0.85
2                       0.881017   0.005068   3    0.84

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better

Run                     Result     SE +/-     N    MIN
2                       3962.15    10.99      3    3935.18
MSI AMD Radeon RX 470   3966.95    11.28      3    3935.95
3                       3982.55    5.71       3    3956.56

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better

Run                     Result     SE +/-     N    MIN
3                       2375.03    3.09       3    2359.61
MSI AMD Radeon RX 470   2382.91    16.86      3    2352.53
2                       2393.07    13.42      3    2358.03

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better

Run                     Result     SE +/-     N    MIN
MSI AMD Radeon RX 470   2.01888    0.00079    3    1.97
2                       2.02211    0.00164    3    1.98
3                       2.02449    0.00280    3    1.97

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

PHPBench 0.8.1
PHP Benchmark Suite
Score, More Is Better

Run                     Result     SE +/-     N
MSI AMD Radeon RX 470   701260     4779.16    3
2                       697067     6034.57    3
3                       683719     6543.52    3

Timed Clash Compilation
Time To Compile
Seconds, Fewer Is Better

Run                     Result     SE +/-     N
3                       369.54     0.60       3
MSI AMD Radeon RX 470   370.20     0.93       3
2                       370.52     0.15       3

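The oneDNN IP Shapes 1D f32 result for the MSI AMD Radeon RX 470 run (30.17080 ms, SE +/- 3.03866, N = 15) stands well apart from runs 2 and 3, and its standard error is large relative to its mean. A minimal sketch of how such results could be flagged, assuming the SE values in the export are listed in the same order as the run labels; the helper names and the 5% cutoff are hypothetical choices for illustration, not something the Phoronix Test Suite defines:

```python
# Relative standard error (SE / mean) for oneDNN "IP Shapes 1D - f32 - CPU".
# The (mean_ms, se_ms, n) tuples are copied from the result listing above.
# Assumption: SE values appear in the same order as the run labels.
results = {
    "3": (4.75794, 0.00346, 3),
    "2": (4.76942, 0.00104, 3),
    "MSI AMD Radeon RX 470": (30.17080, 3.03866, 15),
}


def relative_se_percent(mean, se):
    """Return the standard error as a percentage of the mean."""
    return 100.0 * se / mean


def noisy_runs(results, threshold_percent=5.0):
    """Return run names whose relative SE exceeds the (arbitrary) threshold."""
    return [
        name
        for name, (mean, se, _n) in results.items()
        if relative_se_percent(mean, se) > threshold_percent
    ]


if __name__ == "__main__":
    for name, (mean, se, n) in results.items():
        pct = relative_se_percent(mean, se)
        print(f"{name}: {mean} ms, relative SE {pct:.2f}% (N = {n})")
    print("Runs above threshold:", noisy_runs(results))
```

A high relative SE only indicates run-to-run variance within that result; it does not by itself explain the cause.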
Phoronix Test Suite v10.8.5