oneDNN 2.0 3900X

AMD Ryzen 9 3900X 12-Core testing with an ASUS TUF GAMING X570-PLUS (WI-FI) (2203 BIOS) and MSI AMD Radeon RX 470/480/570/570X/580/580X/590 8GB on Ubuntu 20.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2012108-PTS-ONEDNN2022.

Result identifiers: MSI AMD Radeon RX 470, 2, 3

Processor: AMD Ryzen 9 3900X 12-Core @ 3.80GHz (12 Cores / 24 Threads)
Motherboard: ASUS TUF GAMING X570-PLUS (WI-FI) (2203 BIOS)
Chipset: AMD Starship/Matisse
Memory: 16GB
Disk: Samsung SSD 970 EVO Plus 250GB
Graphics: MSI AMD Radeon RX 470/480/570/570X/580/580X/590 8GB (1366/2000MHz)
Audio: AMD Ellesmere HDMI Audio
Monitor: G237HL
Network: Realtek RTL8111/8168/8411 + Intel-AC 9260
OS: Ubuntu 20.04
Kernel: 5.9.0-050900rc6daily20200922-generic (x86_64) 20200921
Desktop: GNOME Shell 3.36.4
Display Server: X Server 1.20.8
Display Driver: modesetting 1.20.8
OpenGL: 4.6 Mesa 20.2.0-devel (git-64cdc13 2020-07-02 focal-oibaf-ppa) (LLVM 10.0.0)
Vulkan: 1.2.131
Compiler: GCC 9.3.0
File-System: ext4
Screen Resolution: 1920x1080

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0x8701021

Graphics Details: GLAMOR

Security Details: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected

Test                                                              MSI AMD Radeon RX 470  2         3
betsy: ETC1 - Highest                                             8.353                  8.188     8.180
betsy: ETC2 RGB - Highest                                         9.810                  9.639     9.638
onednn: IP Shapes 1D - f32 - CPU                                  30.1708                4.76942   4.75794
onednn: IP Shapes 3D - f32 - CPU                                  10.8342                10.6640   10.8055
onednn: IP Shapes 1D - u8s8f32 - CPU                              3.02821                1.93713   1.93079
onednn: IP Shapes 3D - u8s8f32 - CPU                              0.890855               0.918892  0.905647
onednn: Convolution Batch Shapes Auto - f32 - CPU                 22.3813                22.3816   22.3912
onednn: Deconvolution Batch shapes_1d - f32 - CPU                 3.52990                3.54577   3.54390
onednn: Deconvolution Batch shapes_3d - f32 - CPU                 5.04580                5.05578   5.05350
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU             24.8781                24.7929   24.8527
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU             4.23449                4.23734   4.22621
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU             3.58831                3.59039   3.59006
onednn: Recurrent Neural Network Training - f32 - CPU             3957.60                3977.86   3969.01
onednn: Recurrent Neural Network Inference - f32 - CPU            2385.69                2396.09   2377.20
onednn: Recurrent Neural Network Training - u8s8f32 - CPU         3981.47                3968.03   3965.63
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU        2400.59                2392.78   2397.36
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU      0.874190               0.881017  0.880630
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU    3966.95                3962.15   3982.55
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU   2382.91                2393.07   2375.03
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU  2.01888                2.02211   2.02449
build-clash: Time To Compile                                      370.200                370.523   369.536
phpbench: PHP Benchmark Suite                                     701260                 697067    683719
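The spread between the three runs can be put into numbers with a few lines of Python. This is only an illustrative sketch: the helper name `relative_spread` is hypothetical, and the values are copied from the overview above for three of the tests.

```python
def relative_spread(values):
    """Return (max - min) / min of the given results, as a percentage."""
    lowest, highest = min(values), max(values)
    return (highest - lowest) / lowest * 100.0


# Results for runs "MSI AMD Radeon RX 470", "2" and "3", copied from the overview.
results = {
    "onednn: IP Shapes 1D - f32 - CPU": [30.1708, 4.76942, 4.75794],
    "onednn: Convolution Batch Shapes Auto - f32 - CPU": [22.3813, 22.3816, 22.3912],
    "phpbench: PHP Benchmark Suite": [701260, 697067, 683719],
}

for name, runs in results.items():
    print(f"{name}: {relative_spread(runs):.2f}% spread")
```

A spread of several hundred percent, as for IP Shapes 1D f32, suggests an outlier run rather than a real difference between configurations, while most other tests stay within a few percent.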

Betsy GPU Compressor

Codec: ETC1 - Quality: Highest

Seconds, Fewer Is Better (Betsy GPU Compressor 1.1 Beta)
MSI AMD Radeon RX 470: 8.353 (SE +/- 0.182, N = 15)
2: 8.188 (SE +/- 0.019, N = 3)
3: 8.180 (SE +/- 0.015, N = 3)
1. (CXX) g++ options: -O3 -O2 -lpthread -ldl

Betsy GPU Compressor

Codec: ETC2 RGB - Quality: Highest

Seconds, Fewer Is Better (Betsy GPU Compressor 1.1 Beta)
MSI AMD Radeon RX 470: 9.810 (SE +/- 0.188, N = 15)
2: 9.639 (SE +/- 0.007, N = 3)
3: 9.638 (SE +/- 0.008, N = 3)
1. (CXX) g++ options: -O3 -O2 -lpthread -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.0)
MSI AMD Radeon RX 470: 30.17080 (SE +/- 3.03866, N = 15)
2: 4.76942 (SE +/- 0.00104, N = 3)
3: 4.75794 (SE +/- 0.00346, N = 3)
MIN: 4.53 (run not identified in the export)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
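To see how noisy the first run was, the reported SE and N can be converted back into a sample standard deviation. The sketch below assumes that "SE" denotes the standard error of the mean, that is, the standard deviation divided by the square root of N. The function name `stddev_from_se` is hypothetical.

```python
import math


def stddev_from_se(se, n):
    """Recover the sample standard deviation, assuming SE = SD / sqrt(N)."""
    return se * math.sqrt(n)


# (run label, SE, N) as reported for IP Shapes 1D - f32 - CPU.
runs = [
    ("MSI AMD Radeon RX 470", 3.03866, 15),
    ("2", 0.00104, 3),
    ("3", 0.00346, 3),
]

for label, se, n in runs:
    print(f"{label}: SD of about {stddev_from_se(se, n):.4f} ms over {n} samples")
```

Under that assumption the first run's samples vary by several milliseconds around a 30 ms mean, while runs 2 and 3 vary by only a few microseconds, which is consistent with the first run having needed 15 samples instead of 3.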

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.0)
MSI AMD Radeon RX 470: 10.83 (SE +/- 0.02, N = 3; MIN: 10.66)
2: 10.66 (SE +/- 0.01, N = 3; MIN: 10.49)
3: 10.81 (SE +/- 0.03, N = 3; MIN: 10.63)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.0)
MSI AMD Radeon RX 470: 3.02821 (SE +/- 1.08836, N = 15; MIN: 1.87)
2: 1.93713 (SE +/- 0.00401, N = 3; MIN: 1.89)
3: 1.93079 (SE +/- 0.00531, N = 3; MIN: 1.89)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.0)
MSI AMD Radeon RX 470: 0.890855 (SE +/- 0.002229, N = 3; MIN: 0.83)
2: 0.918892 (SE +/- 0.006713, N = 3; MIN: 0.85)
3: 0.905647 (SE +/- 0.003367, N = 3; MIN: 0.85)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.0)
MSI AMD Radeon RX 470: 22.38 (SE +/- 0.01, N = 3; MIN: 21.93)
2: 22.38 (SE +/- 0.02, N = 3; MIN: 21.68)
3: 22.39 (SE +/- 0.01, N = 3; MIN: 21.84)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.0)
MSI AMD Radeon RX 470: 3.52990 (SE +/- 0.00643, N = 3; MIN: 3.46)
2: 3.54577 (SE +/- 0.01224, N = 3; MIN: 3.45)
3: 3.54390 (SE +/- 0.00543, N = 3; MIN: 3.47)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.0)
MSI AMD Radeon RX 470: 5.04580 (SE +/- 0.01192, N = 3; MIN: 4.96)
2: 5.05578 (SE +/- 0.01382, N = 3; MIN: 4.97)
3: 5.05350 (SE +/- 0.00860, N = 3; MIN: 4.96)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.0)
MSI AMD Radeon RX 470: 24.88 (SE +/- 0.07, N = 3; MIN: 24.3)
2: 24.79 (SE +/- 0.01, N = 3; MIN: 24.32)
3: 24.85 (SE +/- 0.05, N = 3; MIN: 24.27)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.0)
MSI AMD Radeon RX 470: 4.23449 (SE +/- 0.00679, N = 3; MIN: 4.06)
2: 4.23734 (SE +/- 0.00585, N = 3; MIN: 4.05)
3: 4.22621 (SE +/- 0.00768, N = 3; MIN: 4.05)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.0)
MSI AMD Radeon RX 470: 3.58831 (SE +/- 0.00126, N = 3; MIN: 3.47)
2: 3.59039 (SE +/- 0.00067, N = 3; MIN: 3.47)
3: 3.59006 (SE +/- 0.00045, N = 3; MIN: 3.48)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.0)
MSI AMD Radeon RX 470: 3957.60 (SE +/- 9.55, N = 3; MIN: 3930.81)
2: 3977.86 (SE +/- 12.90, N = 3; MIN: 3948.18)
3: 3969.01 (SE +/- 14.29, N = 3; MIN: 3933.83)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.0)
MSI AMD Radeon RX 470: 2385.69 (SE +/- 13.92, N = 3; MIN: 2338.22)
2: 2396.09 (SE +/- 11.94, N = 3; MIN: 2357.37)
3: 2377.20 (SE +/- 6.32, N = 3; MIN: 2349.95)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.0)
MSI AMD Radeon RX 470: 3981.47 (SE +/- 4.60, N = 3; MIN: 3962.47)
2: 3968.03 (SE +/- 2.31, N = 3; MIN: 3952.97)
3: 3965.63 (SE +/- 7.53, N = 3; MIN: 3944.28)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.0)
MSI AMD Radeon RX 470: 2400.59 (SE +/- 13.97, N = 3; MIN: 2361.61)
2: 2392.78 (SE +/- 4.47, N = 3; MIN: 2375.64)
3: 2397.36 (SE +/- 12.67, N = 3; MIN: 2358.43)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.0)
MSI AMD Radeon RX 470: 0.874190 (SE +/- 0.002122, N = 3; MIN: 0.84)
2: 0.881017 (SE +/- 0.005068, N = 3; MIN: 0.84)
3: 0.880630 (SE +/- 0.001694, N = 3; MIN: 0.85)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.0)
MSI AMD Radeon RX 470: 3966.95 (SE +/- 11.28, N = 3; MIN: 3935.95)
2: 3962.15 (SE +/- 10.99, N = 3; MIN: 3935.18)
3: 3982.55 (SE +/- 5.71, N = 3; MIN: 3956.56)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.0)
MSI AMD Radeon RX 470: 2382.91 (SE +/- 16.86, N = 3; MIN: 2352.53)
2: 2393.07 (SE +/- 13.42, N = 3; MIN: 2358.03)
3: 2375.03 (SE +/- 3.09, N = 3; MIN: 2359.61)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.0)
MSI AMD Radeon RX 470: 2.01888 (SE +/- 0.00079, N = 3; MIN: 1.97)
2: 2.02211 (SE +/- 0.00164, N = 3; MIN: 1.98)
3: 2.02449 (SE +/- 0.00280, N = 3; MIN: 1.97)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

Timed Clash Compilation

Time To Compile

Seconds, Fewer Is Better (Timed Clash Compilation)
MSI AMD Radeon RX 470: 370.20 (SE +/- 0.93, N = 3)
2: 370.52 (SE +/- 0.15, N = 3)
3: 369.54 (SE +/- 0.60, N = 3)

PHPBench

PHP Benchmark Suite

Score, More Is Better (PHPBench 0.8.1)
MSI AMD Radeon RX 470: 701260 (SE +/- 4779.16, N = 3)
2: 697067 (SE +/- 6034.57, N = 3)
3: 683719 (SE +/- 6543.52, N = 3)


Phoronix Test Suite v10.8.4