3300X oneDNN SVT Stuff

AMD Ryzen 3 3300X 4-Core testing with an MSI B350M GAMING PRO (MS-7A39) v1.0 (2.NR BIOS) and AMD FirePro V3800 512MB on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2103158-HA-3300XONED31&grt&sor .
System details (identical for runs 1, 2 and 3)

Processor: AMD Ryzen 3 3300X 4-Core @ 3.80GHz (4 Cores / 8 Threads)
Motherboard: MSI B350M GAMING PRO (MS-7A39) v1.0 (2.NR BIOS)
Chipset: AMD Starship/Matisse
Memory: 8GB
Disk: 256GB INTEL SSDPEKKW256G7
Graphics: AMD FirePro V3800 512MB
Audio: AMD Redwood HDMI Audio
Monitor: DELL S2409W
Network: Realtek RTL8111/8168/8411
OS: Ubuntu 20.04
Kernel: 5.9.0-rc5-14sep-patch (x86_64) 20200914
Desktop: GNOME Shell 3.36.4
Display Server: X Server 1.20.9
OpenGL: 3.3 Mesa 20.0.8 (LLVM 10.0.0)
Compiler: GCC 9.3.0
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled)
- CPU Microcode: 0x8701021

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected
Result overview

Units: onednn results are in ms (fewer is better); svt-hevc and svt-vp9 results are in frames per second, sysbench RAM / Memory in MiB/sec, and sysbench CPU in events per second (more is better).

Run 1     Run 2     Run 3     Test
6.78051   6.75096   6.78600   onednn: IP Shapes 1D - f32 - CPU
10.4017   10.4706   10.2441   onednn: IP Shapes 3D - f32 - CPU
4.91753   4.90848   4.88115   onednn: IP Shapes 1D - u8s8f32 - CPU
2.73839   2.71709   2.68835   onednn: IP Shapes 3D - u8s8f32 - CPU
21.7320   21.8783   21.7457   onednn: Convolution Batch Shapes Auto - f32 - CPU
11.04320  13.0095   13.0919   onednn: Deconvolution Batch shapes_1d - f32 - CPU
11.3315   11.3471   11.2956   onednn: Deconvolution Batch shapes_3d - f32 - CPU
22.6564   22.5345   22.6150   onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU
6.79725   6.78850   6.77109   onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU
9.15795   9.17290   9.15620   onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU
5993.82   5983.99   5981.79   onednn: Recurrent Neural Network Training - f32 - CPU
3109.22   3118.20   3119.40   onednn: Recurrent Neural Network Inference - f32 - CPU
6032.29   6025.13   6008.78   onednn: Recurrent Neural Network Training - u8s8f32 - CPU
3101.94   3107.96   3113.52   onednn: Recurrent Neural Network Inference - u8s8f32 - CPU
5.02827   5.03340   5.01653   onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU
6030.31   6010.23   6015.97   onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU
3113.93   3110.22   3102.45   onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU
6.09889   6.11621   6.12141   onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU
4.03      4.04      4.04      svt-hevc: 1 - Bosphorus 1080p
62.02     61.91     61.92     svt-hevc: 7 - Bosphorus 1080p
133.19    133.04    133.20    svt-hevc: 10 - Bosphorus 1080p
108.00    109.62    109.10    svt-vp9: VMAF Optimized - Bosphorus 1080p
110.67    110.57    111.25    svt-vp9: PSNR/SSIM Optimized - Bosphorus 1080p
87.76     88.03     87.67     svt-vp9: Visual Quality Optimized - Bosphorus 1080p
16393.95  16345.46  16159.08  sysbench: RAM / Memory
8918.82   8918.55   8921.34   sysbench: CPU
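To judge how closely the three runs agree, one option is to compute the relative spread of each test, (max - min) / min, across runs. The following is only a minimal sketch: the helper name `relative_spread` and the choice of metric are assumptions made for illustration and are not part of the Phoronix Test Suite output. The values are copied from the results above for three of the tests.

```python
def relative_spread(values):
    """Return (max - min) / min for a list of per-run results."""
    lowest, highest = min(values), max(values)
    return (highest - lowest) / lowest


# Per-test results for runs 1, 2 and 3, copied from the results above.
runs = {
    "onednn: IP Shapes 1D - f32 - CPU": [6.78051, 6.75096, 6.78600],
    "onednn: Deconvolution Batch shapes_1d - f32 - CPU": [11.04320, 13.0095, 13.0919],
    "sysbench: CPU": [8918.82, 8918.55, 8921.34],
}

for name, values in runs.items():
    # Print the spread as a percentage with two decimals.
    print(f"{name}: {relative_spread(values):.2%}")
```

With these inputs, the Deconvolution Batch shapes_1d f32 test shows a spread of over 18%, while the other two tests stay below 1%.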
oneDNN 2.1.2 (ms, Fewer Is Better)

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU
  Run 2: 6.75096 (SE +/- 0.00381, N = 3; MIN: 6.52)
  Run 1: 6.78051 (SE +/- 0.00993, N = 3; MIN: 6.54)
  Run 3: 6.78600 (SE +/- 0.03383, N = 3; MIN: 6.48)

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
  Run 3: 10.24 (SE +/- 0.04, N = 3; MIN: 10.04)
  Run 1: 10.40 (SE +/- 0.04, N = 3; MIN: 10.22)
  Run 2: 10.47 (SE +/- 0.03, N = 3; MIN: 10.32)

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU
  Run 3: 4.88115 (SE +/- 0.00783, N = 3; MIN: 4.74)
  Run 2: 4.90848 (SE +/- 0.02509, N = 3; MIN: 4.73)
  Run 1: 4.91753 (SE +/- 0.02875, N = 3; MIN: 4.74)

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU
  Run 3: 2.68835 (SE +/- 0.02173, N = 3; MIN: 2.34)
  Run 2: 2.71709 (SE +/- 0.01683, N = 3; MIN: 2.39)
  Run 1: 2.73839 (SE +/- 0.02085, N = 3; MIN: 2.39)

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
  Run 1: 21.73 (SE +/- 0.03, N = 3; MIN: 21.32)
  Run 3: 21.75 (SE +/- 0.03, N = 3; MIN: 21.34)
  Run 2: 21.88 (SE +/- 0.04, N = 3; MIN: 21.34)

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
  Run 1: 11.04 (SE +/- 0.48, N = 15; MIN: 8.76)
  Run 2: 13.01 (SE +/- 0.03, N = 3; MIN: 8.85)
  Run 3: 13.09 (SE +/- 0.05, N = 3; MIN: 8.86)

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
  Run 3: 11.30 (SE +/- 0.06, N = 3; MIN: 11)
  Run 1: 11.33 (SE +/- 0.07, N = 3; MIN: 11.07)
  Run 2: 11.35 (SE +/- 0.07, N = 3; MIN: 11.05)

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
  Run 2: 22.53 (SE +/- 0.10, N = 3; MIN: 21.82)
  Run 3: 22.62 (SE +/- 0.02, N = 3; MIN: 21.87)
  Run 1: 22.66 (SE +/- 0.04, N = 3; MIN: 21.83)

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
  Run 3: 6.77109 (SE +/- 0.01342, N = 3; MIN: 6.54)
  Run 2: 6.78850 (SE +/- 0.02051, N = 3; MIN: 6.55)
  Run 1: 6.79725 (SE +/- 0.01459, N = 3; MIN: 6.54)

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU
  Run 3: 9.15620 (SE +/- 0.05208, N = 3; MIN: 8.83)
  Run 1: 9.15795 (SE +/- 0.04201, N = 3; MIN: 8.8)
  Run 2: 9.17290 (SE +/- 0.02925, N = 3; MIN: 8.82)

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
  Run 3: 5981.79 (SE +/- 15.20, N = 3; MIN: 5900.15)
  Run 2: 5983.99 (SE +/- 10.53, N = 3; MIN: 5903.92)
  Run 1: 5993.82 (SE +/- 24.14, N = 3; MIN: 5896.31)

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
  Run 1: 3109.22 (SE +/- 5.50, N = 3; MIN: 3068.21)
  Run 2: 3118.20 (SE +/- 6.83, N = 3; MIN: 3081.35)
  Run 3: 3119.40 (SE +/- 4.28, N = 3; MIN: 3075.08)

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU
  Run 3: 6008.78 (SE +/- 9.07, N = 3; MIN: 5954.04)
  Run 2: 6025.13 (SE +/- 1.80, N = 3; MIN: 5961.24)
  Run 1: 6032.29 (SE +/- 9.85, N = 3; MIN: 5963.68)

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
  Run 1: 3101.94 (SE +/- 11.59, N = 3; MIN: 3058.41)
  Run 2: 3107.96 (SE +/- 2.08, N = 3; MIN: 3073)
  Run 3: 3113.52 (SE +/- 9.34, N = 3; MIN: 3078.02)

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
  Run 3: 5.01653 (SE +/- 0.00923, N = 3; MIN: 4.88)
  Run 1: 5.02827 (SE +/- 0.01117, N = 3; MIN: 4.89)
  Run 2: 5.03340 (SE +/- 0.01604, N = 3; MIN: 4.9)

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
  Run 2: 6010.23 (SE +/- 7.25, N = 3; MIN: 5940.42)
  Run 3: 6015.97 (SE +/- 17.06, N = 3; MIN: 5932.88)
  Run 1: 6030.31 (SE +/- 2.70, N = 3; MIN: 5964.79)

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
  Run 3: 3102.45 (SE +/- 6.91, N = 3; MIN: 3067.54)
  Run 2: 3110.22 (SE +/- 2.85, N = 3; MIN: 3064.84)
  Run 1: 3113.93 (SE +/- 4.61, N = 3; MIN: 3078.41)

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
  Run 1: 6.09889 (SE +/- 0.02165, N = 3; MIN: 5.9)
  Run 2: 6.11621 (SE +/- 0.02430, N = 3; MIN: 5.92)
  Run 3: 6.12141 (SE +/- 0.02421, N = 3; MIN: 5.92)

All oneDNN tests: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)

Tuning: 1 - Input: Bosphorus 1080p
  Run 3: 4.04 (SE +/- 0.02, N = 3)
  Run 2: 4.04 (SE +/- 0.01, N = 3)
  Run 1: 4.03 (SE +/- 0.01, N = 3)

Tuning: 7 - Input: Bosphorus 1080p
  Run 1: 62.02 (SE +/- 0.02, N = 3)
  Run 3: 61.92 (SE +/- 0.10, N = 3)
  Run 2: 61.91 (SE +/- 0.04, N = 3)

Tuning: 10 - Input: Bosphorus 1080p
  Run 3: 133.20 (SE +/- 0.09, N = 3)
  Run 1: 133.19 (SE +/- 0.16, N = 3)
  Run 2: 133.04 (SE +/- 0.44, N = 3)

All SVT-HEVC tests: (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt
SVT-VP9 0.3 (Frames Per Second, More Is Better)

Tuning: VMAF Optimized - Input: Bosphorus 1080p
  Run 2: 109.62 (SE +/- 0.39, N = 3)
  Run 3: 109.10 (SE +/- 0.22, N = 3)
  Run 1: 108.00 (SE +/- 1.12, N = 3)

Tuning: PSNR/SSIM Optimized - Input: Bosphorus 1080p
  Run 3: 111.25 (SE +/- 0.05, N = 3)
  Run 1: 110.67 (SE +/- 0.17, N = 3)
  Run 2: 110.57 (SE +/- 0.16, N = 3)

Tuning: Visual Quality Optimized - Input: Bosphorus 1080p
  Run 2: 88.03 (SE +/- 0.19, N = 3)
  Run 1: 87.76 (SE +/- 0.12, N = 3)
  Run 3: 87.67 (SE +/- 0.09, N = 3)

All SVT-VP9 tests: (CC) gcc options: -O3 -fcommon -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm
Sysbench 1.0.20

Test: RAM / Memory (MiB/sec, More Is Better)
  Run 1: 16393.95 (SE +/- 34.72, N = 3)
  Run 2: 16345.46 (SE +/- 36.56, N = 3)
  Run 3: 16159.08 (SE +/- 257.28, N = 3)

Test: CPU (Events Per Second, More Is Better)
  Run 3: 8921.34 (SE +/- 0.88, N = 3)
  Run 1: 8918.82 (SE +/- 2.49, N = 3)
  Run 2: 8918.55 (SE +/- 2.00, N = 3)

All Sysbench tests: (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm
Phoronix Test Suite v10.8.5