3900XT oneDNN

AMD Ryzen 9 3900XT 12-Core testing with an MSI MEG X570 GODLIKE (MS-7C34) v1.0 (1.B3 BIOS) and AMD Radeon RX 56/64 8GB on Ubuntu 20.10 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2103135-PTS-3900XTON26
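
For anyone reproducing this comparison locally, a minimal sketch of the usual Phoronix Test Suite workflow follows. Only the result ID is taken from this file; the package name and the standalone pts/onednn test-profile invocation are assumptions based on common Phoronix Test Suite conventions.

  # Install the Phoronix Test Suite (packaged on Ubuntu; a tarball from
  # phoronix-test-suite.com also works). Assumed install method, adjust as needed.
  sudo apt install phoronix-test-suite

  # Run the same oneDNN tests and merge your numbers into this result file
  # for a side-by-side comparison (this is the command quoted above).
  phoronix-test-suite benchmark 2103135-PTS-3900XTON26

  # Or run the oneDNN test profile on its own, without comparing against this file
  # (pts/onednn is the assumed profile name on OpenBenchmarking.org).
  phoronix-test-suite benchmark pts/onednn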

Result Identifier    Date Run         Test Duration
1                    March 13 2021    35 Minutes
2                    March 13 2021    35 Minutes
3                    March 13 2021    35 Minutes



3900XT oneDNN - System Details (runs 1, 2 and 3 used the same configuration)

Processor: AMD Ryzen 9 3900XT 12-Core @ 3.80GHz (12 Cores / 24 Threads)
Motherboard: MSI MEG X570 GODLIKE (MS-7C34) v1.0 (1.B3 BIOS)
Chipset: AMD Starship/Matisse
Memory: 16GB
Disk: 500GB Seagate FireCuda 520 SSD ZP500GM30002
Graphics: AMD Radeon RX 56/64 8GB (1630/945MHz)
Audio: AMD Vega 10 HDMI Audio
Monitor: ASUS MG28U
Network: Realtek Device 2600 + Realtek Device 3000 + Intel Wi-Fi 6 AX200
OS: Ubuntu 20.10
Kernel: 5.11.0-rc1-phx (x86_64) 20201228
Desktop: GNOME Shell 3.38.1
Display Server: X Server 1.20.9
OpenGL: 4.6 Mesa 20.2.1 (LLVM 11.0.0)
Vulkan: 1.2.131
Compiler: GCC 10.2.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details: Transparent Huge Pages: madvise
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0x8701021
Security Details: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected

[Result Overview chart: relative performance of runs 1, 2 and 3 across all 18 oneDNN harness / data-type combinations; the three runs track each other within roughly 100% to 106%.]

3900XT oneDNN - Results Summary (oneDNN times in ms; fewer is better)

Harness - Data Type - Engine                                Run 1      Run 2      Run 3
IP Shapes 1D - f32 - CPU                                    5.05825    5.02989    5.03957
IP Shapes 3D - f32 - CPU                                    12.5750    12.5822    12.4455
IP Shapes 1D - u8s8f32 - CPU                                2.03252    2.03291    2.02885
IP Shapes 3D - u8s8f32 - CPU                                1.03321    1.07514    1.06094
Convolution Batch Shapes Auto - f32 - CPU                   23.3373    23.5462    23.4111
Deconvolution Batch shapes_1d - f32 - CPU                   5.82071    5.86080    6.14131
Deconvolution Batch shapes_3d - f32 - CPU                   5.60631    5.59703    5.61376
Convolution Batch Shapes Auto - u8s8f32 - CPU               26.5271    26.3697    26.3426
Deconvolution Batch shapes_1d - u8s8f32 - CPU               2.65549    2.65555    2.65949
Deconvolution Batch shapes_3d - u8s8f32 - CPU               3.59371    3.60947    3.61463
Recurrent Neural Network Training - f32 - CPU               4379.15    4384.02    4367.10
Recurrent Neural Network Inference - f32 - CPU              2612.58    2611.52    2611.37
Recurrent Neural Network Training - u8s8f32 - CPU           4388.25    4378.86    4369.05
Recurrent Neural Network Inference - u8s8f32 - CPU          2604.85    2598.64    2605.42
Matrix Multiply Batch Shapes Transformer - f32 - CPU        0.996447   0.993933   0.992515
Recurrent Neural Network Training - bf16bf16bf16 - CPU      4376.33    4366.93    4428.49
Recurrent Neural Network Inference - bf16bf16bf16 - CPU     2615.19    2609.29    2603.72
Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU    2.25310    2.24640    2.24083

oneDNN

This is a test of Intel oneDNN, an Intel-optimized library for deep neural networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and, before that, MKL-DNN, prior to being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.
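
The harness names in the results below map onto benchdnn's drivers and bundled shape files, so the benchmark can also be invoked directly once oneDNN has been built from source with its tests enabled. The following is only a rough sketch: the paths, batch-file names, and flags are assumptions based on the oneDNN 2.x source layout, not commands recorded in this result file.

  # From a oneDNN build tree, run benchdnn in performance mode (--mode=P).
  # Batch files under inputs/ describe the problem shapes; --cfg selects the
  # data-type configuration (f32, u8s8f32, ...). Paths are assumed; adjust to your build.
  cd oneDNN/build/tests/benchdnn
  ./benchdnn --ip --mode=P --cfg=f32 --batch=inputs/ip/shapes_1d            # roughly "IP Shapes 1D - f32"
  ./benchdnn --conv --mode=P --cfg=u8s8f32 --batch=inputs/conv/shapes_auto  # roughly "Convolution Batch Shapes Auto - u8s8f32"

The Phoronix Test Suite profile automates invocations of this kind and reports the total perf time, which is what the figures below show.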

Detailed results, oneDNN 2.1.2 (all values in ms; fewer is better). Each run's value is the average of N samples, followed by its standard error, the min / max spread reported for those samples, and the MIN value from the result graph. All tests were built with: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU
  1: 5.05825  (SE +/- 0.06874, N = 3; min 4.95 / max 5.19; MIN: 4.47)
  2: 5.02989  (SE +/- 0.03983, N = 3; min 4.97 / max 5.1; MIN: 4.5)
  3: 5.03957  (SE +/- 0.03993, N = 3; min 4.97 / max 5.1; MIN: 4.49)

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
  1: 12.58  (SE +/- 0.05, N = 3; min 12.49 / max 12.68; MIN: 11.9)
  2: 12.58  (SE +/- 0.02, N = 3; min 12.56 / max 12.63; MIN: 11.93)
  3: 12.45  (SE +/- 0.01, N = 3; min 12.43 / max 12.47; MIN: 11.83)

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU
  1: 2.03252  (SE +/- 0.01002, N = 3; min 2.01 / max 2.05; MIN: 1.83)
  2: 2.03291  (SE +/- 0.01542, N = 3; min 2 / max 2.05; MIN: 1.83)
  3: 2.02885  (SE +/- 0.00690, N = 3; min 2.02 / max 2.04; MIN: 1.83)

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU
  1: 1.03321  (SE +/- 0.00785, N = 3; min 1.02 / max 1.05; MIN: 0.88)
  2: 1.07514  (SE +/- 0.01015, N = 3; min 1.06 / max 1.09; MIN: 0.92)
  3: 1.06094  (SE +/- 0.01495, N = 3; min 1.03 / max 1.08; MIN: 0.9)

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
  1: 23.34  (SE +/- 0.07, N = 3; min 23.21 / max 23.46; MIN: 21.47)
  2: 23.55  (SE +/- 0.09, N = 3; min 23.38 / max 23.66; MIN: 21.14)
  3: 23.41  (SE +/- 0.03, N = 3; min 23.35 / max 23.46; MIN: 21.21)

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
  1: 5.82071  (SE +/- 0.22838, N = 12; min 5.03 / max 7.35; MIN: 3.86)
  2: 5.86080  (SE +/- 0.27338, N = 12; min 4.81 / max 7.57; MIN: 3.9)
  3: 6.14131  (SE +/- 0.20970, N = 12; min 4.66 / max 7.34; MIN: 3.9)

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
  1: 5.60631  (SE +/- 0.02081, N = 3; min 5.57 / max 5.64; MIN: 5.18)
  2: 5.59703  (SE +/- 0.01011, N = 3; min 5.58 / max 5.61; MIN: 5.19)
  3: 5.61376  (SE +/- 0.04185, N = 3; min 5.54 / max 5.69; MIN: 5.2)

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
  1: 26.53  (SE +/- 0.19, N = 3; min 26.29 / max 26.89; MIN: 23.72)
  2: 26.37  (SE +/- 0.05, N = 3; min 26.29 / max 26.47; MIN: 24.02)
  3: 26.34  (SE +/- 0.09, N = 3; min 26.16 / max 26.46; MIN: 23.91)

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
  1: 2.65549  (SE +/- 0.02833, N = 3; min 2.62 / max 2.71; MIN: 2.41)
  2: 2.65555  (SE +/- 0.01797, N = 3; min 2.62 / max 2.68; MIN: 2.41)
  3: 2.65949  (SE +/- 0.03012, N = 3; min 2.62 / max 2.72; MIN: 2.4)

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU
  1: 3.59371  (SE +/- 0.03392, N = 3; min 3.53 / max 3.64; MIN: 3.27)
  2: 3.60947  (SE +/- 0.00112, N = 3; min 3.61 / max 3.61; MIN: 3.28)
  3: 3.61463  (SE +/- 0.00964, N = 3; min 3.6 / max 3.63; MIN: 3.27)

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
  1: 4379.15  (SE +/- 4.16, N = 3; min 4372.39 / max 4386.72; MIN: 4303.45)
  2: 4384.02  (SE +/- 7.82, N = 3; min 4368.51 / max 4393.47; MIN: 4287.48)
  3: 4367.10  (SE +/- 3.06, N = 3; min 4361.74 / max 4372.34; MIN: 4293.29)

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
  1: 2612.58  (SE +/- 9.13, N = 3; min 2598.07 / max 2629.44; MIN: 2543.92)
  2: 2611.52  (SE +/- 4.39, N = 3; min 2606.76 / max 2620.29; MIN: 2552.33)
  3: 2611.37  (SE +/- 6.06, N = 3; min 2600.01 / max 2620.73; MIN: 2547.49)

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU
  1: 4388.25  (SE +/- 1.15, N = 3; min 4386.02 / max 4389.86; MIN: 4298.02)
  2: 4378.86  (SE +/- 3.43, N = 3; min 4375.43 / max 4385.72; MIN: 4307.68)
  3: 4369.05  (SE +/- 5.02, N = 3; min 4362.56 / max 4378.92; MIN: 4296.44)

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
  1: 2604.85  (SE +/- 8.85, N = 3; min 2587.89 / max 2617.71; MIN: 2542.46)
  2: 2598.64  (SE +/- 1.50, N = 3; min 2595.64 / max 2600.3; MIN: 2535.53)
  3: 2605.42  (SE +/- 7.05, N = 3; min 2595.25 / max 2618.97; MIN: 2544.76)

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
  1: 0.996447  (SE +/- 0.003806, N = 3; min 0.99 / max 1; MIN: 0.87)
  2: 0.993933  (SE +/- 0.009540, N = 3; min 0.98 / max 1.01; MIN: 0.87)
  3: 0.992515  (SE +/- 0.008832, N = 3; min 0.98 / max 1.01; MIN: 0.87)

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
  1: 4376.33  (SE +/- 8.07, N = 3; min 4361.32 / max 4388.98; MIN: 4286.92)
  2: 4366.93  (SE +/- 5.60, N = 3; min 4358.71 / max 4377.63; MIN: 4292.7)
  3: 4428.49  (SE +/- 59.30, N = 3; min 4357.94 / max 4546.32; MIN: 4292.56)

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
  1: 2615.19  (SE +/- 5.09, N = 3; min 2605.17 / max 2621.72; MIN: 2541.78)
  2: 2609.29  (SE +/- 0.99, N = 3; min 2607.31 / max 2610.43; MIN: 2546.55)
  3: 2603.72  (SE +/- 6.10, N = 3; min 2596.07 / max 2615.77; MIN: 2543.95)

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
  1: 2.25310  (SE +/- 0.01438, N = 3; min 2.23 / max 2.28; MIN: 2.03)
  2: 2.24640  (SE +/- 0.00661, N = 3; min 2.23 / max 2.26; MIN: 2.03)
  3: 2.24083  (SE +/- 0.00845, N = 3; min 2.22 / max 2.25; MIN: 2.04)