one one
AMD Ryzen Threadripper 3990X 64-Core testing with a Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS) and AMD Radeon RX 5700 8GB on Ubuntu 23.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2310140-PTS-ONEONE4432&grr&sro .
Runs: a, b, c, d

  Processor:          AMD Ryzen Threadripper 3990X 64-Core @ 2.90GHz (64 Cores / 128 Threads)
  Motherboard:        Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS)
  Chipset:            AMD Starship/Matisse
  Memory:             128GB
  Disk:               Samsung SSD 970 EVO Plus 500GB
  Graphics:           AMD Radeon RX 5700 8GB (1750/875MHz)
  Audio:              AMD Navi 10 HDMI Audio
  Monitor:            DELL P2415Q
  Network:            Intel I211 + Intel Wi-Fi 6 AX200
  OS:                 Ubuntu 23.04
  Kernel:             6.2.0-32-generic (x86_64)
  Desktop:            GNOME Shell 44.0
  Display Server:     X Server + Wayland
  OpenGL:             4.6 Mesa 23.0.2 (LLVM 15.0.7 DRM 3.49)
  Compiler:           GCC 12.3.0
  File-System:        ext4
  Screen Resolution:  3840x2160

Kernel Details
  - Transparent Huge Pages: madvise

Compiler Details
  - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-DAPbBt/gcc-12-12.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-DAPbBt/gcc-12-12.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
  - Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
  - CPU Microcode: 0x830107a

Security Details
  - gather_data_sampling: Not affected
  - itlb_multihit: Not affected
  - l1tf: Not affected
  - mds: Not affected
  - meltdown: Not affected
  - mmio_stale_data: Not affected
  - retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection
  - spec_store_bypass: Mitigation of SSB disabled via prctl
  - spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  - spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
  - srbds: Not affected
  - tsx_async_abort: Not affected
Result summary (one row per test, one column per run):

a         b         c         d         Test
744       731       741       743       openvkl: vklBenchmarkCPU ISPC
425       410       410       410       openvkl: vklBenchmarkCPU Scalar
9669.71   4010.42   4008.22   4006.87   onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU
4014.4    4014.69   4014.72   4003.11   onednn: Recurrent Neural Network Training - u8s8f32 - CPU
4010.95   3996.22   4012.8    3998.33   onednn: Recurrent Neural Network Training - f32 - CPU
1007      849.771   857.774   848.697   onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU
857.633   857.243   850.832   853.262   onednn: Recurrent Neural Network Inference - u8s8f32 - CPU
848.083   853.458   845.921   850.226   onednn: Recurrent Neural Network Inference - f32 - CPU
0.67      0.67      0.66      0.66      oidn: RTLightmap.hdr.4096x4096 - CPU-Only
35.7683   35.3972   34.8361   34.9048   embree: Pathtracer ISPC - Asian Dragon Obj
36.0793   35.6958   35.5365   35.1487   embree: Pathtracer - Asian Dragon Obj
1.36      1.36      1.35      1.35      oidn: RT.hdr_alb_nrm.3840x2160 - CPU-Only
1.36      1.36      1.35      1.35      oidn: RT.ldr_alb_nrm.3840x2160 - CPU-Only
9.68878   9.52953   9.75211   9.62871   onednn: Deconvolution Batch shapes_1d - f32 - CPU
1.62713   1.68374   1.65764   1.67016   onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU
41.4105   40.9626   40.5698   40.3973   embree: Pathtracer ISPC - Asian Dragon
41.9581   41.3706   40.1833   40.6314   embree: Pathtracer - Asian Dragon
48.661    47.5384   46.8266   47.184    embree: Pathtracer ISPC - Crown
2.50584   2.36677   2.50646   2.13116   onednn: IP Shapes 1D - f32 - CPU
2.46367   2.28707   2.41181   2.55858   onednn: IP Shapes 1D - u8s8f32 - CPU
53.6883   52.7009   51.4653   52.1679   embree: Pathtracer - Crown
5.90121   6.01717   6.32392   6.35958   onednn: IP Shapes 3D - f32 - CPU
1.0828    1.08255   1.06728   1.06927   onednn: IP Shapes 3D - u8s8f32 - CPU
1.01363   1.02735   0.976971  1.0375    onednn: Convolution Batch Shapes Auto - f32 - CPU
6.5002    6.50551   6.53647   6.55404   onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU
2.1401    2.1244    2.13061   2.15552   onednn: Deconvolution Batch shapes_3d - f32 - CPU
0.988933  1.00008   0.988952  0.994753  onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU
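Most rows in the summary differ by a few percent between runs, but a couple of values for run a stand apart. A minimal Python sketch of one way to check this is given here: it measures each run's relative deviation from the median of the four runs and flags anything beyond a cut-off. The 10% threshold and the helper names `deviations` and `flag_outliers` are assumptions made for illustration; they are not part of the Phoronix Test Suite. The numbers are copied from four rows of the summary.

```python
from statistics import median

# Values copied from the result summary, in run order a, b, c, d.
results = {
    "onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU": [9669.71, 4010.42, 4008.22, 4006.87],
    "onednn: Recurrent Neural Network Training - f32 - CPU": [4010.95, 3996.22, 4012.8, 3998.33],
    "onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU": [1007, 849.771, 857.774, 848.697],
    "embree: Pathtracer - Crown": [53.6883, 52.7009, 51.4653, 52.1679],
}

# Hypothetical cut-off: flag runs more than 10% away from the median.
THRESHOLD = 0.10


def deviations(values):
    """Return each run's relative deviation from the median of all runs."""
    mid = median(values)
    return [(v - mid) / mid for v in values]


def flag_outliers(values, threshold=THRESHOLD):
    """Return the labels (a-d) of runs whose deviation exceeds the threshold."""
    return [run for run, dev in zip("abcd", deviations(values)) if abs(dev) > threshold]


for name, values in results.items():
    print(f"{name}: {flag_outliers(values)}")
```

With these inputs only run a is flagged, and only for the two bf16bf16bf16 Recurrent Neural Network rows; the f32 training row and the Embree Crown row stay within the threshold. The median is used as the reference because a single large value shifts it far less than it would shift the mean.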
OpenVKL 2.0.0, Benchmark: vklBenchmarkCPU ISPC (Items / Sec, More Is Better)
  a: 744 (MIN: 68 / MAX: 9491)
  b: 731 (MIN: 68 / MAX: 9252)
  c: 741 (MIN: 68 / MAX: 9128)
  d: 743 (MIN: 68 / MAX: 9131)

OpenVKL 2.0.0, Benchmark: vklBenchmarkCPU Scalar (Items / Sec, More Is Better)
  a: 425 (MIN: 32 / MAX: 7630)
  b: 410 (MIN: 32 / MAX: 7618)
  c: 410 (MIN: 32 / MAX: 7553)
  d: 410 (MIN: 32 / MAX: 7568)

oneDNN 3.3, Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  a: 9669.71 (MIN: 7787.84)
  b: 4010.42 (MIN: 3984.21)
  c: 4008.22 (MIN: 3979.27)
  d: 4006.87 (MIN: 3982.08)

oneDNN 3.3, Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  a: 4014.40 (MIN: 3987.09)
  b: 4014.69 (MIN: 3992.17)
  c: 4014.72 (MIN: 3990.65)
  d: 4003.11 (MIN: 3978.97)

oneDNN 3.3, Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  a: 4010.95 (MIN: 3990.21)
  b: 3996.22 (MIN: 3974.78)
  c: 4012.80 (MIN: 3987.49)
  d: 3998.33 (MIN: 3971.44)

oneDNN 3.3, Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  a: 1007.00 (MIN: 843.56)
  b: 849.77 (MIN: 830.81)
  c: 857.77 (MIN: 839.35)
  d: 848.70 (MIN: 831.76)

oneDNN 3.3, Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  a: 857.63 (MIN: 838.08)
  b: 857.24 (MIN: 839.72)
  c: 850.83 (MIN: 833.23)
  d: 853.26 (MIN: 835.15)

oneDNN 3.3, Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  a: 848.08 (MIN: 830.7)
  b: 853.46 (MIN: 835.68)
  c: 845.92 (MIN: 829.1)
  d: 850.23 (MIN: 830.75)

1. (CXX) g++ options (all six Recurrent Neural Network results above): -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

Intel Open Image Denoise 2.1, Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only (Images / Sec, More Is Better)
  a: 0.67
  b: 0.67
  c: 0.66
  d: 0.66

Embree 4.3, Binary: Pathtracer ISPC - Model: Asian Dragon Obj (Frames Per Second, More Is Better)
  a: 35.77 (MIN: 35.43 / MAX: 37.08)
  b: 35.40 (MIN: 35.07 / MAX: 36.62)
  c: 34.84 (MIN: 34.51 / MAX: 36.1)
  d: 34.90 (MIN: 34.56 / MAX: 35.96)

Embree 4.3, Binary: Pathtracer - Model: Asian Dragon Obj (Frames Per Second, More Is Better)
  a: 36.08 (MIN: 35.73 / MAX: 37.21)
  b: 35.70 (MIN: 35.37 / MAX: 37.09)
  c: 35.54 (MIN: 35.2 / MAX: 36.63)
  d: 35.15 (MIN: 34.85 / MAX: 36.21)

Intel Open Image Denoise 2.1, Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only (Images / Sec, More Is Better)
  a: 1.36
  b: 1.36
  c: 1.35
  d: 1.35

Intel Open Image Denoise 2.1, Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only (Images / Sec, More Is Better)
  a: 1.36
  b: 1.36
  c: 1.35
  d: 1.35

oneDNN 3.3, Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  a: 9.68878 (MIN: 8.11)
  b: 9.52953 (MIN: 8.42)
  c: 9.75211 (MIN: 8.72)
  d: 9.62871 (MIN: 8.7)

oneDNN 3.3, Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  a: 1.62713 (MIN: 1.44)
  b: 1.68374 (MIN: 1.47)
  c: 1.65764 (MIN: 1.47)
  d: 1.67016 (MIN: 1.47)

1. (CXX) g++ options (both Deconvolution Batch shapes_1d results above): -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

Embree 4.3, Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better)
  a: 41.41 (MIN: 41.01 / MAX: 42.78)
  b: 40.96 (MIN: 40.58 / MAX: 42.16)
  c: 40.57 (MIN: 40.19 / MAX: 41.97)
  d: 40.40 (MIN: 40.03 / MAX: 41.5)

Embree 4.3, Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, More Is Better)
  a: 41.96 (MIN: 41.59 / MAX: 43.14)
  b: 41.37 (MIN: 40.99 / MAX: 42.35)
  c: 40.18 (MIN: 39.8 / MAX: 41.46)
  d: 40.63 (MIN: 40.27 / MAX: 41.67)

Embree 4.3, Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better)
  a: 48.66 (MIN: 47.93 / MAX: 50.27)
  b: 47.54 (MIN: 46.84 / MAX: 48.82)
  c: 46.83 (MIN: 46.12 / MAX: 48.33)
  d: 47.18 (MIN: 46.43 / MAX: 49)

oneDNN 3.3, Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  a: 2.50584 (MIN: 2.03)
  b: 2.36677 (MIN: 1.89)
  c: 2.50646 (MIN: 2.02)
  d: 2.13116 (MIN: 1.7)

oneDNN 3.3, Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  a: 2.46367 (MIN: 1.96)
  b: 2.28707 (MIN: 1.93)
  c: 2.41181 (MIN: 1.97)
  d: 2.55858 (MIN: 2.05)

1. (CXX) g++ options (both IP Shapes 1D results above): -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

Embree 4.3, Binary: Pathtracer - Model: Crown (Frames Per Second, More Is Better)
  a: 53.69 (MIN: 52.88 / MAX: 55.7)
  b: 52.70 (MIN: 51.87 / MAX: 55.15)
  c: 51.47 (MIN: 50.69 / MAX: 53.73)
  d: 52.17 (MIN: 51.39 / MAX: 54.72)

oneDNN 3.3, Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  a: 5.90121 (MIN: 5.75)
  b: 6.01717 (MIN: 5.87)
  c: 6.32392 (MIN: 6.17)
  d: 6.35958 (MIN: 6.22)

oneDNN 3.3, Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  a: 1.08280 (MIN: 0.98)
  b: 1.08255 (MIN: 0.99)
  c: 1.06728 (MIN: 0.97)
  d: 1.06927 (MIN: 0.98)

oneDNN 3.3, Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  a: 1.013630 (MIN: 0.93)
  b: 1.027350 (MIN: 0.93)
  c: 0.976971 (MIN: 0.9)
  d: 1.037500 (MIN: 0.95)

oneDNN 3.3, Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  a: 6.50020 (MIN: 6.38)
  b: 6.50551 (MIN: 6.4)
  c: 6.53647 (MIN: 6.41)
  d: 6.55404 (MIN: 6.45)

oneDNN 3.3, Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  a: 2.14010 (MIN: 2.06)
  b: 2.12440 (MIN: 2.07)
  c: 2.13061 (MIN: 2.07)
  d: 2.15552 (MIN: 2.07)

oneDNN 3.3, Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  a: 0.988933 (MIN: 0.93)
  b: 1.000080 (MIN: 0.94)
  c: 0.988952 (MIN: 0.92)
  d: 0.994753 (MIN: 0.93)

1. (CXX) g++ options (all six oneDNN results above): -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

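The results mix two directions: OpenVKL, Embree and Open Image Denoise report rates where more is better, while oneDNN reports milliseconds where fewer is better. A minimal Python sketch of one way to put both on the same footing is given here, expressing each run as a ratio to run a so that a value above 1.0 always means better than a. The helper name `relative_to_baseline` and the choice of run a as baseline are assumptions for illustration only; the values are taken from the Embree Crown and oneDNN IP Shapes 1D f32 results.

```python
def relative_to_baseline(values, higher_is_better, baseline="a"):
    """Return each run as a ratio to the baseline run.

    The ratio is oriented so that > 1.0 means "better than baseline"
    for both rate-style and time-style results.
    """
    base = values[baseline]
    if higher_is_better:
        return {run: v / base for run, v in values.items()}
    # For time-style results a smaller value is better, so invert the ratio.
    return {run: base / v for run, v in values.items()}


# Embree 4.3, Pathtracer - Crown: Frames Per Second, More Is Better.
embree_crown = {"a": 53.69, "b": 52.70, "c": 51.47, "d": 52.17}

# oneDNN 3.3, IP Shapes 1D - f32 - CPU: ms, Fewer Is Better.
ip_shapes_1d_f32 = {"a": 2.50584, "b": 2.36677, "c": 2.50646, "d": 2.13116}

for label, data, hib in [
    ("embree crown", embree_crown, True),
    ("onednn ip 1d f32", ip_shapes_1d_f32, False),
]:
    ratios = relative_to_baseline(data, higher_is_better=hib)
    print(label, {run: round(r, 3) for run, r in ratios.items()})
```

Read this way, run d comes out roughly 17.6% ahead of run a on IP Shapes 1D f32, while run c is roughly 4% behind run a on Embree Crown.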
Phoronix Test Suite v10.8.5