gpuowl cs2 vkfft

AMD Ryzen 9 7950X 16-Core testing with an ASUS ROG STRIX X670E-E GAMING WIFI (1416 BIOS) and NVIDIA GeForce RTX 3080 10GB on Ubuntu 23.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2402242-PTS-GPUOWLCS40&grr.

gpuowl cs2 vkfft

Runs: a, b, c

Processor: AMD Ryzen 9 7950X 16-Core @ 5.88GHz (16 Cores / 32 Threads)
Motherboard: ASUS ROG STRIX X670E-E GAMING WIFI (1416 BIOS)
Chipset: AMD Device 14d8
Memory: 2 x 16GB DRAM-6000MT/s G Skill F5-6000J3038F16G
Disk: 2000GB Samsung SSD 980 PRO 2TB + 4001GB Western Digital WD_BLACK SN850X 4000GB
Graphics: NVIDIA GeForce RTX 3080 10GB
Audio: NVIDIA GA102 HD Audio
Monitor: DELL U2723QE
Network: Intel I225-V + Intel Wi-Fi 6 AX210/AX211/AX411
OS: Ubuntu 23.10
Kernel: 6.7.0-060700-generic (x86_64)
Desktop: GNOME Shell 45.2
Display Server: X Server 1.21.1.7
Display Driver: NVIDIA 550.40.07
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 12.4.74
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: amd-pstate-epp performance (EPP: performance)
- CPU Microcode: 0xa601203

Graphics Details
- BAR1 / Visible vRAM Size: 16384 MiB
- vBIOS Version: 94.02.20.00.07

OpenCL Details
- GPU Compute Cores: 8704

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_rstack_overflow: Vulnerable: Safe RET no microcode
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected

gpuowl cs2 vkfft

Test | a | b | c
vkfft: FFT + iFFT C2C 1D batched in double precision | 25136 | 23693 | 25627
gpuowl: 77936867 | 532.10 | 536.19 | 532.20
gpuowl: 332220523 | 115.73 | 116.65 | 115.78
gpuowl: 57885161 | 723.24 | 729.39 | 728.86
vkfft: FFT + iFFT C2C 1D batched in half precision | 148147 | 143220 | 145097
vkfft: FFT + iFFT C2C Bluestein benchmark in double precision | 3724 | 3763 | 3758
vkfft: FFT + iFFT C2C multidimensional in single precision | 46283 | 47650 | 48385
cs2: 3840 x 2160 | 121.4 | 121.6 | 122.8
vkfft: FFT + iFFT C2C 1D batched in single precision | 113874 | 113952 | 113935
vkfft: FFT + iFFT C2C Bluestein in single precision | 13286 | 13225 | 13358
vkfft: FFT + iFFT C2C 1D batched in single precision, no reshuffling | 116216 | 116227 | 116319
cs2: 2560 x 1440 | 221.9 | 221.3 | 221.1
vkfft: FFT + iFFT R2C / C2R | 51046 | 50803 | 49951
cs2: 1920 x 1200 | 291.7 | 292.7 | 293.7
cs2: 1920 x 1080 | 308.0 | 311.4 | 309.8
opencl-benchmark: Memory Bandwidth Coalesced Write | 721.83 | 721.79 | 721.72
opencl-benchmark: Memory Bandwidth Coalesced Read | 702.72 | 702.78 | 702.84
opencl-benchmark: INT8 Compute | 12.108 | 12.078 | 12.173
opencl-benchmark: INT16 Compute | 14.565 | 14.56 | 14.562
opencl-benchmark: INT32 Compute | 16.861 | 16.666 | 16.921
opencl-benchmark: INT64 Compute | 3.231 | 3.222 | 3.225
opencl-benchmark: FP32 Compute | 32.873 | 32.915 | 32.797
opencl-benchmark: FP64 Compute | 0.528 | 0.531 | 0.527
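To gauge how closely runs a, b and c agree, one option is to compute the relative spread of each result, (max - min) / mean. The following is a minimal sketch and not part of the Phoronix Test Suite: the helper name `spread_percent` and the choice of rows are assumptions made for illustration, and the values are copied from the results overview above.

```python
from statistics import mean

# Values for runs a, b, c, copied from the results overview above.
# The selection of rows is arbitrary and only for illustration.
results = {
    "vkfft: FFT + iFFT C2C 1D batched in double precision": (25136, 23693, 25627),
    "gpuowl: 77936867": (532.10, 536.19, 532.20),
    "cs2: 3840 x 2160": (121.4, 121.6, 122.8),
    "opencl-benchmark: FP64 Compute": (0.528, 0.531, 0.527),
}


def spread_percent(values):
    """Return (max - min) / mean of the values, as a percentage.

    Hypothetical helper; not a Phoronix Test Suite function.
    """
    return (max(values) - min(values)) / mean(values) * 100.0


if __name__ == "__main__":
    for name, runs in results.items():
        print(f"{name}: {spread_percent(runs):.2f}% spread across a/b/c")
```

A small spread suggests the three runs are effectively interchangeable for that test; a larger one is a cue to look at the reported standard error before comparing runs.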

VkFFT

Test: FFT + iFFT C2C 1D batched in double precision

Benchmark Score, More Is Better (VkFFT 1.3.4)
a: 25136, b: 23693, c: 25627
SE +/- 311.90, N = 11
(CXX) g++ options: -O3

GpuOwl

Exponent: 77936867

Iterations / Second, More Is Better (GpuOwl 7.5)
a: 532.10, b: 536.19, c: 532.20
SE +/- 0.09, N = 3
(CXX) g++ options: -O3 -lgmp -lOpenCL

GpuOwl

Exponent: 332220523

Iterations / Second, More Is Better (GpuOwl 7.5)
a: 115.73, b: 116.65, c: 115.78
SE +/- 0.01, N = 3
(CXX) g++ options: -O3 -lgmp -lOpenCL

GpuOwl

Exponent: 57885161

Iterations / Second, More Is Better (GpuOwl 7.5)
a: 723.24, b: 729.39, c: 728.86
SE +/- 0.17, N = 3
(CXX) g++ options: -O3 -lgmp -lOpenCL

VkFFT

Test: FFT + iFFT C2C 1D batched in half precision

Benchmark Score, More Is Better (VkFFT 1.3.4)
a: 148147, b: 143220, c: 145097
SE +/- 1616.73, N = 15
(CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C Bluestein benchmark in double precision

Benchmark Score, More Is Better (VkFFT 1.3.4)
a: 3724, b: 3763, c: 3758
SE +/- 8.17, N = 3
(CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C multidimensional in single precision

Benchmark Score, More Is Better (VkFFT 1.3.4)
a: 46283, b: 47650, c: 48385
SE +/- 368.38, N = 15
(CXX) g++ options: -O3

Counter-Strike 2

Resolution: 3840 x 2160

Frames Per Second, More Is Better (Counter-Strike 2)
a: 121.4, b: 121.6, c: 122.8
SE +/- 0.29, N = 3
MIN: 120.9 / MAX: 121.9

VkFFT

Test: FFT + iFFT C2C 1D batched in single precision

Benchmark Score, More Is Better (VkFFT 1.3.4)
a: 113874, b: 113952, c: 113935
SE +/- 22.70, N = 3
(CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C Bluestein in single precision

Benchmark Score, More Is Better (VkFFT 1.3.4)
a: 13286, b: 13225, c: 13358
SE +/- 60.40, N = 3
(CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling

Benchmark Score, More Is Better (VkFFT 1.3.4)
a: 116216, b: 116227, c: 116319
SE +/- 85.34, N = 3
(CXX) g++ options: -O3

Counter-Strike 2

Resolution: 2560 x 1440

Frames Per Second, More Is Better (Counter-Strike 2)
a: 221.9, b: 221.3, c: 221.1
SE +/- 0.58, N = 3
MIN: 221.3 / MAX: 223.1

VkFFT

Test: FFT + iFFT R2C / C2R

Benchmark Score, More Is Better (VkFFT 1.3.4)
a: 51046, b: 50803, c: 49951
SE +/- 351.55, N = 15
(CXX) g++ options: -O3

Counter-Strike 2

Resolution: 1920 x 1200

Frames Per Second, More Is Better (Counter-Strike 2)
a: 291.7, b: 292.7, c: 293.7
SE +/- 1.27, N = 3
MIN: 289.4 / MAX: 293.8

Counter-Strike 2

Resolution: 1920 x 1080

Frames Per Second, More Is Better (Counter-Strike 2)
a: 308.0, b: 311.4, c: 309.8
SE +/- 0.38, N = 3
MIN: 307.3 / MAX: 308.6

ProjectPhysX OpenCL-Benchmark

Operation: Memory Bandwidth Coalesced Write

GB/s, More Is Better (ProjectPhysX OpenCL-Benchmark 1.2)
a: 721.83, b: 721.79, c: 721.72
SE +/- 0.03, N = 3
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: Memory Bandwidth Coalesced Read

GB/s, More Is Better (ProjectPhysX OpenCL-Benchmark 1.2)
a: 702.72, b: 702.78, c: 702.84
SE +/- 0.00, N = 3
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT8 Compute

TIOPs/s, More Is Better (ProjectPhysX OpenCL-Benchmark 1.2)
a: 12.11, b: 12.08, c: 12.17
SE +/- 0.03, N = 3
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT16 Compute

TIOPs/s, More Is Better (ProjectPhysX OpenCL-Benchmark 1.2)
a: 14.57, b: 14.56, c: 14.56
SE +/- 0.00, N = 3
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT32 Compute

TIOPs/s, More Is Better (ProjectPhysX OpenCL-Benchmark 1.2)
a: 16.86, b: 16.67, c: 16.92
SE +/- 0.04, N = 3
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT64 Compute

TIOPs/s, More Is Better (ProjectPhysX OpenCL-Benchmark 1.2)
a: 3.231, b: 3.222, c: 3.225
SE +/- 0.009, N = 3
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: FP32 Compute

TFLOPs/s, More Is Better (ProjectPhysX OpenCL-Benchmark 1.2)
a: 32.87, b: 32.92, c: 32.80
SE +/- 0.03, N = 3
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: FP64 Compute

TFLOPs/s, More Is Better (ProjectPhysX OpenCL-Benchmark 1.2)
a: 0.528, b: 0.531, c: 0.527
SE +/- 0.001, N = 3
(CXX) g++ options: -std=c++17 -pthread -lOpenCL


Phoronix Test Suite v10.8.5