one one: AMD Ryzen Threadripper 3990X 64-Core testing with a Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS) and AMD Radeon RX 5700 8GB on Ubuntu 23.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2310140-PTS-ONEONE4432&grs&sro.
one one - System Details (identical across runs a, b, c, d)

    Processor: AMD Ryzen Threadripper 3990X 64-Core @ 2.90GHz (64 Cores / 128 Threads)
    Motherboard: Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS)
    Chipset: AMD Starship/Matisse
    Memory: 128GB
    Disk: Samsung SSD 970 EVO Plus 500GB
    Graphics: AMD Radeon RX 5700 8GB (1750/875MHz)
    Audio: AMD Navi 10 HDMI Audio
    Monitor: DELL P2415Q
    Network: Intel I211 + Intel Wi-Fi 6 AX200
    OS: Ubuntu 23.04
    Kernel: 6.2.0-32-generic (x86_64)
    Desktop: GNOME Shell 44.0
    Display Server: X Server + Wayland
    OpenGL: 4.6 Mesa 23.0.2 (LLVM 15.0.7 DRM 3.49)
    Compiler: GCC 12.3.0
    File-System: ext4
    Screen Resolution: 3840x2160

    Kernel Details: Transparent Huge Pages: madvise
    Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-DAPbBt/gcc-12-12.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-DAPbBt/gcc-12-12.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
    Processor Details: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0x830107a
    Security Details: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
one one - Results Overview (values listed as a / b / c / d)

    onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU: 9669.71 / 4010.42 / 4008.22 / 4006.87
    onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU: 1007 / 849.771 / 857.774 / 848.697
    onednn: IP Shapes 1D - f32 - CPU: 2.50584 / 2.36677 / 2.50646 / 2.13116
    onednn: IP Shapes 1D - u8s8f32 - CPU: 2.46367 / 2.28707 / 2.41181 / 2.55858
    onednn: IP Shapes 3D - f32 - CPU: 5.90121 / 6.01717 / 6.32392 / 6.35958
    onednn: Convolution Batch Shapes Auto - f32 - CPU: 1.01363 / 1.02735 / 0.976971 / 1.0375
    embree: Pathtracer - Asian Dragon: 41.9581 / 41.3706 / 40.1833 / 40.6314
    embree: Pathtracer - Crown: 53.6883 / 52.7009 / 51.4653 / 52.1679
    embree: Pathtracer ISPC - Crown: 48.661 / 47.5384 / 46.8266 / 47.184
    openvkl: vklBenchmarkCPU Scalar: 425 / 410 / 410 / 410
    onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU: 1.62713 / 1.68374 / 1.65764 / 1.67016
    embree: Pathtracer ISPC - Asian Dragon Obj: 35.7683 / 35.3972 / 34.8361 / 34.9048
    embree: Pathtracer - Asian Dragon Obj: 36.0793 / 35.6958 / 35.5365 / 35.1487
    embree: Pathtracer ISPC - Asian Dragon: 41.4105 / 40.9626 / 40.5698 / 40.3973
    onednn: Deconvolution Batch shapes_1d - f32 - CPU: 9.68878 / 9.52953 / 9.75211 / 9.62871
    openvkl: vklBenchmarkCPU ISPC: 744 / 731 / 741 / 743
    oidn: RTLightmap.hdr.4096x4096 - CPU-Only: 0.67 / 0.67 / 0.66 / 0.66
    onednn: Deconvolution Batch shapes_3d - f32 - CPU: 2.1401 / 2.1244 / 2.13061 / 2.15552
    onednn: IP Shapes 3D - u8s8f32 - CPU: 1.0828 / 1.08255 / 1.06728 / 1.06927
    onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU: 0.988933 / 1.00008 / 0.988952 / 0.994753
    onednn: Recurrent Neural Network Inference - f32 - CPU: 848.083 / 853.458 / 845.921 / 850.226
    onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU: 6.5002 / 6.50551 / 6.53647 / 6.55404
    onednn: Recurrent Neural Network Inference - u8s8f32 - CPU: 857.633 / 857.243 / 850.832 / 853.262
    oidn: RT.ldr_alb_nrm.3840x2160 - CPU-Only: 1.36 / 1.36 / 1.35 / 1.35
    oidn: RT.hdr_alb_nrm.3840x2160 - CPU-Only: 1.36 / 1.36 / 1.35 / 1.35
    onednn: Recurrent Neural Network Training - f32 - CPU: 4010.95 / 3996.22 / 4012.8 / 3998.33
    onednn: Recurrent Neural Network Training - u8s8f32 - CPU: 4014.4 / 4014.69 / 4014.72 / 4003.11
oneDNN 3.3 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
    a: 9669.71 (MIN: 7787.84)    b: 4010.42 (MIN: 3984.21)    c: 4008.22 (MIN: 3979.27)    d: 4006.87 (MIN: 3982.08)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.3 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
    a: 1007.00 (MIN: 843.56)    b: 849.77 (MIN: 830.81)    c: 857.77 (MIN: 839.35)    d: 848.70 (MIN: 831.76)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.3 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, fewer is better)
    a: 2.50584 (MIN: 2.03)    b: 2.36677 (MIN: 1.89)    c: 2.50646 (MIN: 2.02)    d: 2.13116 (MIN: 1.7)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.3 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
    a: 2.46367 (MIN: 1.96)    b: 2.28707 (MIN: 1.93)    c: 2.41181 (MIN: 1.97)    d: 2.55858 (MIN: 2.05)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.3 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, fewer is better)
    a: 5.90121 (MIN: 5.75)    b: 6.01717 (MIN: 5.87)    c: 6.32392 (MIN: 6.17)    d: 6.35958 (MIN: 6.22)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.3 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, fewer is better)
    a: 1.013630 (MIN: 0.93)    b: 1.027350 (MIN: 0.93)    c: 0.976971 (MIN: 0.9)    d: 1.037500 (MIN: 0.95)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
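Each oneDNN harness above times one primitive class (inner product, convolution, deconvolution, or recurrent cell) on the CPU engine at the listed data type. As a rough illustration only, here is a minimal C++ sketch of an f32 inner-product primitive against the oneDNN 3.x API; the shapes (N, IC, OC) and the build command are placeholder assumptions, not the benchmark's actual "IP Shapes" workloads.

```cpp
// Minimal sketch (not the benchmark harness): a single f32 inner-product
// primitive on the CPU engine with oneDNN 3.x. Tensor sizes are arbitrary
// placeholders. Typical build: g++ -O2 ip_sketch.cpp -ldnnl
#include <oneapi/dnnl/dnnl.hpp>
#include <vector>

int main() {
    using namespace dnnl;

    engine eng(engine::kind::cpu, 0);   // CPU engine, as in the "Engine: CPU" runs
    stream strm(eng);

    const memory::dim N = 32, IC = 256, OC = 512;   // placeholder shape
    memory::desc src_md({N, IC}, memory::data_type::f32, memory::format_tag::nc);
    memory::desc wei_md({OC, IC}, memory::data_type::f32, memory::format_tag::oi);
    memory::desc dst_md({N, OC}, memory::data_type::f32, memory::format_tag::nc);

    // oneDNN 3.x builds the primitive descriptor directly from the engine.
    inner_product_forward::primitive_desc pd(
            eng, prop_kind::forward_inference, src_md, wei_md, dst_md);

    std::vector<float> src(N * IC, 1.0f), wei(OC * IC, 0.5f), dst(N * OC);
    memory src_m(src_md, eng, src.data());
    memory wei_m(wei_md, eng, wei.data());
    memory dst_m(dst_md, eng, dst.data());

    // Executing a primitive like this in a timed loop is, in essence, what the
    // harnesses measure (they report milliseconds per run, fewer is better).
    inner_product_forward(pd).execute(strm, {{DNNL_ARG_SRC, src_m},
                                             {DNNL_ARG_WEIGHTS, wei_m},
                                             {DNNL_ARG_DST, dst_m}});
    strm.wait();
    return 0;
}
```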
Embree 4.3 - Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, more is better)
    a: 41.96 (MIN: 41.59 / MAX: 43.14)    b: 41.37 (MIN: 40.99 / MAX: 42.35)    c: 40.18 (MIN: 39.8 / MAX: 41.46)    d: 40.63 (MIN: 40.27 / MAX: 41.67)
Embree 4.3 - Binary: Pathtracer - Model: Crown (Frames Per Second, more is better)
    a: 53.69 (MIN: 52.88 / MAX: 55.7)    b: 52.70 (MIN: 51.87 / MAX: 55.15)    c: 51.47 (MIN: 50.69 / MAX: 53.73)    d: 52.17 (MIN: 51.39 / MAX: 54.72)
Embree 4.3 - Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, more is better)
    a: 48.66 (MIN: 47.93 / MAX: 50.27)    b: 47.54 (MIN: 46.84 / MAX: 48.82)    c: 46.83 (MIN: 46.12 / MAX: 48.33)    d: 47.18 (MIN: 46.43 / MAX: 49)
OpenVKL 2.0.0 - Benchmark: vklBenchmarkCPU Scalar (Items / Sec, more is better)
    a: 425 (MIN: 32 / MAX: 7630)    b: 410 (MIN: 32 / MAX: 7618)    c: 410 (MIN: 32 / MAX: 7553)    d: 410 (MIN: 32 / MAX: 7568)
oneDNN 3.3 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
    a: 1.62713 (MIN: 1.44)    b: 1.68374 (MIN: 1.47)    c: 1.65764 (MIN: 1.47)    d: 1.67016 (MIN: 1.47)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
Embree 4.3 - Binary: Pathtracer ISPC - Model: Asian Dragon Obj (Frames Per Second, more is better)
    a: 35.77 (MIN: 35.43 / MAX: 37.08)    b: 35.40 (MIN: 35.07 / MAX: 36.62)    c: 34.84 (MIN: 34.51 / MAX: 36.1)    d: 34.90 (MIN: 34.56 / MAX: 35.96)
Embree 4.3 - Binary: Pathtracer - Model: Asian Dragon Obj (Frames Per Second, more is better)
    a: 36.08 (MIN: 35.73 / MAX: 37.21)    b: 35.70 (MIN: 35.37 / MAX: 37.09)    c: 35.54 (MIN: 35.2 / MAX: 36.63)    d: 35.15 (MIN: 34.85 / MAX: 36.21)
Embree 4.3 - Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, more is better)
    a: 41.41 (MIN: 41.01 / MAX: 42.78)    b: 40.96 (MIN: 40.58 / MAX: 42.16)    c: 40.57 (MIN: 40.19 / MAX: 41.97)    d: 40.40 (MIN: 40.03 / MAX: 41.5)
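The Embree results compare the plain and ISPC-built Pathtracer binaries on the Crown and Asian Dragon scenes; what these renders largely stress is BVH construction and ray traversal through Embree's rtcIntersect* calls. The following is a minimal, self-contained sketch of that API under Embree 4, with a single hard-coded triangle and ray that are purely illustrative and unrelated to the benchmark scenes.

```cpp
// Minimal sketch of the Embree 4 ray-query API: one triangle, one ray.
// The benchmark renders full scenes; this only shows the rtcIntersect1
// hot path those renders are built on.
#include <embree4/rtcore.h>
#include <cstdio>
#include <limits>

int main() {
    RTCDevice device = rtcNewDevice(nullptr);
    RTCScene scene = rtcNewScene(device);

    // One triangle in the z = 0 plane.
    RTCGeometry geom = rtcNewGeometry(device, RTC_GEOMETRY_TYPE_TRIANGLE);
    float* vb = static_cast<float*>(rtcSetNewGeometryBuffer(
            geom, RTC_BUFFER_TYPE_VERTEX, 0, RTC_FORMAT_FLOAT3, 3 * sizeof(float), 3));
    vb[0] = 0; vb[1] = 0; vb[2] = 0;   // v0
    vb[3] = 1; vb[4] = 0; vb[5] = 0;   // v1
    vb[6] = 0; vb[7] = 1; vb[8] = 0;   // v2
    unsigned* ib = static_cast<unsigned*>(rtcSetNewGeometryBuffer(
            geom, RTC_BUFFER_TYPE_INDEX, 0, RTC_FORMAT_UINT3, 3 * sizeof(unsigned), 1));
    ib[0] = 0; ib[1] = 1; ib[2] = 2;
    rtcCommitGeometry(geom);
    rtcAttachGeometry(scene, geom);
    rtcReleaseGeometry(geom);
    rtcCommitScene(scene);              // builds the BVH

    // Ray from (0.2, 0.2, -1) toward +z.
    RTCRayHit rayhit{};
    rayhit.ray.org_x = 0.2f; rayhit.ray.org_y = 0.2f; rayhit.ray.org_z = -1.0f;
    rayhit.ray.dir_z = 1.0f;
    rayhit.ray.tfar = std::numeric_limits<float>::infinity();
    rayhit.ray.mask = 0xFFFFFFFFu;
    rayhit.hit.geomID = RTC_INVALID_GEOMETRY_ID;

    rtcIntersect1(scene, &rayhit);      // Embree 4: intersect context argument is optional
    std::printf("hit: %s, t = %f\n",
            rayhit.hit.geomID != RTC_INVALID_GEOMETRY_ID ? "yes" : "no",
            rayhit.ray.tfar);

    rtcReleaseScene(scene);
    rtcReleaseDevice(device);
    return 0;
}
```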
oneDNN 3.3 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, fewer is better)
    a: 9.68878 (MIN: 8.11)    b: 9.52953 (MIN: 8.42)    c: 9.75211 (MIN: 8.72)    d: 9.62871 (MIN: 8.7)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
OpenVKL 2.0.0 - Benchmark: vklBenchmarkCPU ISPC (Items / Sec, more is better)
    a: 744 (MIN: 68 / MAX: 9491)    b: 731 (MIN: 68 / MAX: 9252)    c: 741 (MIN: 68 / MAX: 9128)    d: 743 (MIN: 68 / MAX: 9131)
Intel Open Image Denoise 2.1 - Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only (Images / Sec, more is better)
    a: 0.67    b: 0.67    c: 0.66    d: 0.66
oneDNN 3.3 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, fewer is better)
    a: 2.14010 (MIN: 2.06)    b: 2.12440 (MIN: 2.07)    c: 2.13061 (MIN: 2.07)    d: 2.15552 (MIN: 2.07)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.3 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
    a: 1.08280 (MIN: 0.98)    b: 1.08255 (MIN: 0.99)    c: 1.06728 (MIN: 0.97)    d: 1.06927 (MIN: 0.98)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.3 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
    a: 0.988933 (MIN: 0.93)    b: 1.000080 (MIN: 0.94)    c: 0.988952 (MIN: 0.92)    d: 0.994753 (MIN: 0.93)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.3 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, fewer is better)
    a: 848.08 (MIN: 830.7)    b: 853.46 (MIN: 835.68)    c: 845.92 (MIN: 829.1)    d: 850.23 (MIN: 830.75)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.3 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
    a: 6.50020 (MIN: 6.38)    b: 6.50551 (MIN: 6.4)    c: 6.53647 (MIN: 6.41)    d: 6.55404 (MIN: 6.45)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.3 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
    a: 857.63 (MIN: 838.08)    b: 857.24 (MIN: 839.72)    c: 850.83 (MIN: 833.23)    d: 853.26 (MIN: 835.15)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
Intel Open Image Denoise 2.1 - Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only (Images / Sec, more is better)
    a: 1.36    b: 1.36    c: 1.35    d: 1.35
Intel Open Image Denoise 2.1 - Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only (Images / Sec, more is better)
    a: 1.36    b: 1.36    c: 1.35    d: 1.35
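The Open Image Denoise runs report how many 3840x2160 frames (or 4096x4096 lightmaps in the RTLightmap case) the CPU can denoise per second; as the "alb_nrm" suffix suggests, the RT runs feed the filter albedo and normal buffers alongside the beauty image. The sketch below shows that flow with the OIDN 2 C++ API; the pixel data here is a flat placeholder rather than the benchmark's input images, and the RTLightmap variant would use the "RTLightmap" filter type instead.

```cpp
// Minimal sketch of the Open Image Denoise 2 C++ API on the CPU device.
// Placeholder image contents; one execute() corresponds to one "image"
// of the Images/Sec metric reported above.
#include <OpenImageDenoise/oidn.hpp>
#include <cstdio>
#include <vector>

int main() {
    const int width = 3840, height = 2160;

    oidn::DeviceRef device = oidn::newDevice(oidn::DeviceType::CPU);
    device.commit();

    std::vector<float> color(width * height * 3, 0.5f);   // noisy beauty image (placeholder)
    std::vector<float> albedo(width * height * 3, 0.5f);  // albedo AOV (placeholder)
    std::vector<float> normal(width * height * 3, 0.0f);  // normal AOV (placeholder)
    std::vector<float> output(width * height * 3);

    oidn::FilterRef filter = device.newFilter("RT");       // generic ray-tracing denoiser
    filter.setImage("color",  color.data(),  oidn::Format::Float3, width, height);
    filter.setImage("albedo", albedo.data(), oidn::Format::Float3, width, height);
    filter.setImage("normal", normal.data(), oidn::Format::Float3, width, height);
    filter.setImage("output", output.data(), oidn::Format::Float3, width, height);
    filter.set("hdr", true);                               // matches the RT.hdr_* runs
    filter.commit();
    filter.execute();

    const char* message = nullptr;
    if (device.getError(message) != oidn::Error::None)
        std::fprintf(stderr, "OIDN error: %s\n", message);
    return 0;
}
```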
oneDNN 3.3 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, fewer is better)
    a: 4010.95 (MIN: 3990.21)    b: 3996.22 (MIN: 3974.78)    c: 4012.80 (MIN: 3987.49)    d: 3998.33 (MIN: 3971.44)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.3 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
    a: 4014.40 (MIN: 3987.09)    b: 4014.69 (MIN: 3992.17)    c: 4014.72 (MIN: 3990.65)    d: 4003.11 (MIN: 3978.97)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
Phoronix Test Suite v10.8.5