Kepler-game-dev-results
AMD Ryzen 9 9950X 16-Core testing with an ASUS PRIME B650M-A II (3201 BIOS) and NVIDIA GeForce RTX 4060 Ti 16GB on Ubuntu 24.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2501308-NE-YIPPEEE9841&grr .
Kepler-game-dev-results | wheeee
Processor | AMD Ryzen 9 9950X 16-Core @ 5.75GHz (16 Cores / 32 Threads)
Motherboard | ASUS PRIME B650M-A II (3201 BIOS)
Chipset | AMD Device 14d8
Memory | 4 x 48GB DDR5-3600MT/s G Skill F5-6800J3446F48G
Disk | 2000GB Samsung SSD 980 PRO 2TB
Graphics | NVIDIA GeForce RTX 4060 Ti 16GB
Audio | NVIDIA Device 22bd
Network | 2 x Intel 10-Gigabit X540-AT2 + Realtek RTL8125 2.5GbE
OS | Ubuntu 24.04
Kernel | 6.8.0-51-generic (x86_64)
Desktop | GNOME Shell 46.0
Display Server | X Server 1.21.1.11
Display Driver | NVIDIA
OpenGL | 4.6.0
OpenCL | OpenCL 3.0 CUDA 12.4.131
Compiler | GCC 13.3.0 + CUDA 12.4
File-System | ext4
Screen Resolution | 1920x1080

- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-fG75Ri/gcc-13-13.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-fG75Ri/gcc-13-13.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance)
- CPU Microcode: 0xb404023
- GLAMOR
- BAR1 / Visible vRAM Size: 16384 MiB
- vBIOS Version: 95.06.34.00.ec
- Python 3.12.3
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Kepler-game-dev-results | wheeee
openvkl: vklBenchmarkCPU ISPC | 680
openvkl: vklBenchmarkCPU Scalar | 277
blender: Barbershop - CPU-Only | 488.79
build-godot: Time To Compile | 167.589
blender: Pabellon Barcelona - CPU-Only | 149.77
blender: Classroom - CPU-Only | 133.60
blender: Barbershop - NVIDIA CUDA | 127.76
toktx: UASTC 4 + Zstd Compression 19 | 107.924
blender: Barbershop - NVIDIA OptiX | 84.47
astcenc: Very Thorough | 3.0631
astcenc: Exhaustive | 1.8755
betsy: ETC1 - Highest | 76.226
betsy: ETC2 RGB - Highest | 76.092
blender: Pabellon Barcelona - NVIDIA CUDA | 68.76
blender: Junkshop - CPU-Only | 67.46
blender: Fishy Cat - CPU-Only | 67.33
oidn: RTLightmap.hdr.4096x4096 - CPU-Only | 0.50
blender: BMW27 - CPU-Only | 47.06
luajit: Composite | 2904.20
blender: Classroom - NVIDIA CUDA | 30.53
blender: Fishy Cat - NVIDIA CUDA | 30.17
oidn: RT.hdr_alb_nrm.3840x2160 - CPU-Only | 1.05
oidn: RT.ldr_alb_nrm.3840x2160 - CPU-Only | 1.05
blender: Junkshop - NVIDIA CUDA | 27.19
astcenc: Thorough | 23.0078
blender: Pabellon Barcelona - NVIDIA OptiX | 21.83
etcpak: Multi-Threaded - ETC2 | 729.918
blender: Classroom - NVIDIA OptiX | 19.79
basis: UASTC Level 3 | 19.467
blender: BMW27 - NVIDIA OptiX | 7.82
blender: Junkshop - NVIDIA OptiX | 17.47
basis: ETC1S | 15.204
blender: BMW27 - NVIDIA CUDA | 14.95
blender: Fishy Cat - NVIDIA OptiX | 14.41
toktx: Zstd Compression 19 | 12.844
astcenc: Fast | 446.8399
basis: UASTC Level 2 | 10.941
toktx: UASTC 3 + Zstd Compression 19 | 9.115
astcenc: Medium | 176.0416
draco: Church Facade | 5579
oidn: RTLightmap.hdr.4096x4096 - NVIDIA CUDA | 5.10
draco: Lion | 4241
toktx: UASTC 3 | 4.605
basis: UASTC Level 0 | 3.910
oidn: RT.hdr_alb_nrm.3840x2160 - NVIDIA CUDA | 10.59
oidn: RT.ldr_alb_nrm.3840x2160 - NVIDIA CUDA | 10.63
luajit: Jacobi Successive Over-Relaxation | 3367.19
luajit: Dense LU Matrix Factorization | 6937.71
luajit: Sparse Matrix Multiply | 2941.09
luajit: Fast Fourier Transform | 509.72
luajit: Monte Carlo | 765.28
OpenVKL 2.0.0, Benchmark: vklBenchmarkCPU ISPC [Items / Sec, More Is Better] wheeee: 680 (SE +/- 1.00, N = 3; MIN: 57 / MAX: 10338)
OpenVKL 2.0.0, Benchmark: vklBenchmarkCPU Scalar [Items / Sec, More Is Better] wheeee: 277 (SE +/- 0.33, N = 3; MIN: 20 / MAX: 4965)
Blender 4.3, Blend File: Barbershop - Compute: CPU-Only [Seconds, Fewer Is Better] wheeee: 488.79 (SE +/- 0.06, N = 3)
Timed Godot Game Engine Compilation 4.0, Time To Compile [Seconds, Fewer Is Better] wheeee: 167.59 (SE +/- 0.21, N = 3)
Blender 4.3, Blend File: Pabellon Barcelona - Compute: CPU-Only [Seconds, Fewer Is Better] wheeee: 149.77 (SE +/- 0.42, N = 3)
Blender 4.3, Blend File: Classroom - Compute: CPU-Only [Seconds, Fewer Is Better] wheeee: 133.60 (SE +/- 0.44, N = 3)
Blender 4.3, Blend File: Barbershop - Compute: NVIDIA CUDA [Seconds, Fewer Is Better] wheeee: 127.76 (SE +/- 0.05, N = 3)
KTX-Software toktx 4.0, Settings: UASTC 4 + Zstd Compression 19 [Seconds, Fewer Is Better] wheeee: 107.92 (SE +/- 0.37, N = 3)
Blender 4.3, Blend File: Barbershop - Compute: NVIDIA OptiX [Seconds, Fewer Is Better] wheeee: 84.47 (SE +/- 0.02, N = 3)
ASTC Encoder 5.0, Preset: Very Thorough [MT/s, More Is Better] wheeee: 3.0631 (SE +/- 0.0070, N = 3); (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder 5.0, Preset: Exhaustive [MT/s, More Is Better] wheeee: 1.8755 (SE +/- 0.0081, N = 3); (CXX) g++ options: -O3 -flto -pthread
Betsy GPU Compressor 1.1 Beta, Codec: ETC1 - Quality: Highest [Seconds, Fewer Is Better] wheeee: 76.23 (SE +/- 0.20, N = 3)
Betsy GPU Compressor 1.1 Beta, Codec: ETC2 RGB - Quality: Highest [Seconds, Fewer Is Better] wheeee: 76.09 (SE +/- 0.06, N = 3)
Blender 4.3, Blend File: Pabellon Barcelona - Compute: NVIDIA CUDA [Seconds, Fewer Is Better] wheeee: 68.76 (SE +/- 0.10, N = 3)
Blender 4.3, Blend File: Junkshop - Compute: CPU-Only [Seconds, Fewer Is Better] wheeee: 67.46 (SE +/- 0.26, N = 3)
Blender 4.3, Blend File: Fishy Cat - Compute: CPU-Only [Seconds, Fewer Is Better] wheeee: 67.33 (SE +/- 0.22, N = 3)
Intel Open Image Denoise 2.3, Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only [Images / Sec, More Is Better] wheeee: 0.50 (SE +/- 0.00, N = 3)
Blender 4.3, Blend File: BMW27 - Compute: CPU-Only [Seconds, Fewer Is Better] wheeee: 47.06 (SE +/- 0.29, N = 3)
LuaJIT 2.1-git, Test: Composite [Mflops, More Is Better] wheeee: 2904.20 (SE +/- 33.88, N = 3); (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -U_FORTIFY_SOURCE -fno-stack-protector
Blender 4.3, Blend File: Classroom - Compute: NVIDIA CUDA [Seconds, Fewer Is Better] wheeee: 30.53 (SE +/- 0.03, N = 3)
Blender 4.3, Blend File: Fishy Cat - Compute: NVIDIA CUDA [Seconds, Fewer Is Better] wheeee: 30.17 (SE +/- 0.01, N = 3)
Intel Open Image Denoise 2.3, Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only [Images / Sec, More Is Better] wheeee: 1.05 (SE +/- 0.00, N = 3)
Intel Open Image Denoise 2.3, Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only [Images / Sec, More Is Better] wheeee: 1.05 (SE +/- 0.00, N = 3)
Blender 4.3, Blend File: Junkshop - Compute: NVIDIA CUDA [Seconds, Fewer Is Better] wheeee: 27.19 (SE +/- 0.04, N = 3)
ASTC Encoder 5.0, Preset: Thorough [MT/s, More Is Better] wheeee: 23.01 (SE +/- 0.04, N = 3); (CXX) g++ options: -O3 -flto -pthread
Blender 4.3, Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX [Seconds, Fewer Is Better] wheeee: 21.83 (SE +/- 0.01, N = 3)
Etcpak 2.0, Benchmark: Multi-Threaded - Configuration: ETC2 [Mpx/s, More Is Better] wheeee: 729.92 (SE +/- 1.43, N = 3); (CXX) g++ options: -flto -pthread
Blender 4.3, Blend File: Classroom - Compute: NVIDIA OptiX [Seconds, Fewer Is Better] wheeee: 19.79 (SE +/- 0.02, N = 3)
Basis Universal 1.13, Settings: UASTC Level 3 [Seconds, Fewer Is Better] wheeee: 19.47 (SE +/- 0.04, N = 3); (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Blender 4.3, Blend File: BMW27 - Compute: NVIDIA OptiX [Seconds, Fewer Is Better] wheeee: 7.82 (SE +/- 0.07, N = 7)
Blender 4.3, Blend File: Junkshop - Compute: NVIDIA OptiX [Seconds, Fewer Is Better] wheeee: 17.47 (SE +/- 0.10, N = 3)
Basis Universal 1.13, Settings: ETC1S [Seconds, Fewer Is Better] wheeee: 15.20 (SE +/- 0.03, N = 3); (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Blender 4.3, Blend File: BMW27 - Compute: NVIDIA CUDA [Seconds, Fewer Is Better] wheeee: 14.95 (SE +/- 0.00, N = 3)
Blender 4.3, Blend File: Fishy Cat - Compute: NVIDIA OptiX [Seconds, Fewer Is Better] wheeee: 14.41 (SE +/- 0.00, N = 3)
KTX-Software toktx 4.0, Settings: Zstd Compression 19 [Seconds, Fewer Is Better] wheeee: 12.84 (SE +/- 0.14, N = 3)
ASTC Encoder 5.0, Preset: Fast [MT/s, More Is Better] wheeee: 446.84 (SE +/- 0.60, N = 3); (CXX) g++ options: -O3 -flto -pthread
Basis Universal 1.13, Settings: UASTC Level 2 [Seconds, Fewer Is Better] wheeee: 10.94 (SE +/- 0.04, N = 3); (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
KTX-Software toktx 4.0, Settings: UASTC 3 + Zstd Compression 19 [Seconds, Fewer Is Better] wheeee: 9.115 (SE +/- 0.018, N = 3)
ASTC Encoder 5.0, Preset: Medium [MT/s, More Is Better] wheeee: 176.04 (SE +/- 0.06, N = 3); (CXX) g++ options: -O3 -flto -pthread
Google Draco 1.5.6, Model: Church Facade [ms, Fewer Is Better] wheeee: 5579 (SE +/- 8.17, N = 3); (CXX) g++ options: -O3
Intel Open Image Denoise 2.3, Run: RTLightmap.hdr.4096x4096 - Device: NVIDIA CUDA [Images / Sec, More Is Better] wheeee: 5.10 (SE +/- 0.00, N = 3)
Google Draco 1.5.6, Model: Lion [ms, Fewer Is Better] wheeee: 4241 (SE +/- 2.08, N = 3); (CXX) g++ options: -O3
KTX-Software toktx 4.0, Settings: UASTC 3 [Seconds, Fewer Is Better] wheeee: 4.605 (SE +/- 0.009, N = 3)
Basis Universal 1.13, Settings: UASTC Level 0 [Seconds, Fewer Is Better] wheeee: 3.910 (SE +/- 0.004, N = 3); (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Intel Open Image Denoise 2.3, Run: RT.hdr_alb_nrm.3840x2160 - Device: NVIDIA CUDA [Images / Sec, More Is Better] wheeee: 10.59 (SE +/- 0.01, N = 3)
Intel Open Image Denoise 2.3, Run: RT.ldr_alb_nrm.3840x2160 - Device: NVIDIA CUDA [Images / Sec, More Is Better] wheeee: 10.63 (SE +/- 0.01, N = 3)
LuaJIT 2.1-git, Test: Jacobi Successive Over-Relaxation [Mflops, More Is Better] wheeee: 3367.19 (SE +/- 50.34, N = 3); (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -U_FORTIFY_SOURCE -fno-stack-protector
LuaJIT 2.1-git, Test: Dense LU Matrix Factorization [Mflops, More Is Better] wheeee: 6937.71 (SE +/- 81.89, N = 3); (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -U_FORTIFY_SOURCE -fno-stack-protector
LuaJIT 2.1-git, Test: Sparse Matrix Multiply [Mflops, More Is Better] wheeee: 2941.09 (SE +/- 29.37, N = 3); (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -U_FORTIFY_SOURCE -fno-stack-protector
LuaJIT 2.1-git, Test: Fast Fourier Transform [Mflops, More Is Better] wheeee: 509.72 (SE +/- 17.60, N = 3); (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -U_FORTIFY_SOURCE -fno-stack-protector
LuaJIT 2.1-git, Test: Monte Carlo [Mflops, More Is Better] wheeee: 765.28 (SE +/- 9.61, N = 3); (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -U_FORTIFY_SOURCE -fno-stack-protector
Phoronix Test Suite v10.8.5