gpuowl cs2 vkfft

AMD Ryzen 9 7950X 16-Core testing with an ASUS ROG STRIX X670E-E GAMING WIFI (1416 BIOS) and NVIDIA GeForce RTX 3080 10GB on Ubuntu 23.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2402242-PTS-GPUOWLCS40.

gpuowl cs2 vkfft: system details (results a, b, c)

Processor: AMD Ryzen 9 7950X 16-Core @ 5.88GHz (16 Cores / 32 Threads)
Motherboard: ASUS ROG STRIX X670E-E GAMING WIFI (1416 BIOS)
Chipset: AMD Device 14d8
Memory: 2 x 16GB DRAM-6000MT/s G Skill F5-6000J3038F16G
Disk: 2000GB Samsung SSD 980 PRO 2TB + 4001GB Western Digital WD_BLACK SN850X 4000GB
Graphics: NVIDIA GeForce RTX 3080 10GB
Audio: NVIDIA GA102 HD Audio
Monitor: DELL U2723QE
Network: Intel I225-V + Intel Wi-Fi 6 AX210/AX211/AX411
OS: Ubuntu 23.10
Kernel: 6.7.0-060700-generic (x86_64)
Desktop: GNOME Shell 45.2
Display Server: X Server 1.21.1.7
Display Driver: NVIDIA 550.40.07
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 12.4.74
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: amd-pstate-epp performance (EPP: performance)
- CPU Microcode: 0xa601203

Graphics Details
- BAR1 / Visible vRAM Size: 16384 MiB
- vBIOS Version: 94.02.20.00.07

OpenCL Details
- GPU Compute Cores: 8704

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_rstack_overflow: Vulnerable: Safe RET no microcode
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected

gpuowl cs2 vkfft: result summary

Test | a | b | c
vkfft: FFT + iFFT R2C / C2R | 51046 | 50803 | 49951
vkfft: FFT + iFFT C2C 1D batched in half precision | 148147 | 143220 | 145097
vkfft: FFT + iFFT C2C Bluestein in single precision | 13286 | 13225 | 13358
vkfft: FFT + iFFT C2C 1D batched in double precision | 25136 | 23693 | 25627
vkfft: FFT + iFFT C2C 1D batched in single precision | 113874 | 113952 | 113935
vkfft: FFT + iFFT C2C multidimensional in single precision | 46283 | 47650 | 48385
vkfft: FFT + iFFT C2C Bluestein benchmark in double precision | 3724 | 3763 | 3758
vkfft: FFT + iFFT C2C 1D batched in single precision, no reshuffling | 116216 | 116227 | 116319
gpuowl: 57885161 | 723.24 | 729.39 | 728.86
gpuowl: 77936867 | 532.10 | 536.19 | 532.20
gpuowl: 332220523 | 115.73 | 116.65 | 115.78
opencl-benchmark: FP64 Compute | 0.528 | 0.531 | 0.527
opencl-benchmark: FP32 Compute | 32.873 | 32.915 | 32.797
opencl-benchmark: INT64 Compute | 3.231 | 3.222 | 3.225
opencl-benchmark: INT32 Compute | 16.861 | 16.666 | 16.921
opencl-benchmark: INT16 Compute | 14.565 | 14.56 | 14.562
opencl-benchmark: INT8 Compute | 12.108 | 12.078 | 12.173
opencl-benchmark: Memory Bandwidth Coalesced Read | 702.72 | 702.78 | 702.84
opencl-benchmark: Memory Bandwidth Coalesced Write | 721.83 | 721.79 | 721.72
cs2: 1920 x 1080 | 308.0 | 311.4 | 309.8
cs2: 1920 x 1200 | 291.7 | 292.7 | 293.7
cs2: 2560 x 1440 | 221.9 | 221.3 | 221.1
cs2: 3840 x 2160 | 121.4 | 121.6 | 122.8
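Since a, b, and c were recorded on the same system, one way to read the summary is to look at how far the three results drift apart for a given test. The following is a minimal sketch, not part of the Phoronix Test Suite: the helper name spread_percent is hypothetical, the four tests are an arbitrary selection, and the values are copied from the summary above.

```python
# Scores for results a, b, c, copied from the result summary (selection only).
results = {
    "vkfft R2C / C2R": (51046, 50803, 49951),
    "vkfft C2C 1D batched, double precision": (25136, 23693, 25627),
    "gpuowl 57885161": (723.24, 729.39, 728.86),
    "cs2 1920 x 1080": (308.0, 311.4, 309.8),
}


def spread_percent(values):
    """Return (max - min) / mean of the values, as a percentage."""
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean * 100.0


spreads = {name: spread_percent(vals) for name, vals in results.items()}

for name, pct in spreads.items():
    print(f"{name}: {pct:.2f}% spread across a, b, c")
```

A larger spread tends to coincide with the tests that report a larger standard error and a higher run count (N) in the per-test sections.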

VkFFT

Test: FFT + iFFT R2C / C2R

VkFFT 1.3.4, Benchmark Score, More Is Better
a: 51046
b: 50803
c: 49951
SE +/- 351.55, N = 15
(CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in half precision

VkFFT 1.3.4, Benchmark Score, More Is Better
a: 148147
b: 143220
c: 145097
SE +/- 1616.73, N = 15
(CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C Bluestein in single precision

VkFFT 1.3.4, Benchmark Score, More Is Better
a: 13286
b: 13225
c: 13358
SE +/- 60.40, N = 3
(CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in double precision

VkFFT 1.3.4, Benchmark Score, More Is Better
a: 25136
b: 23693
c: 25627
SE +/- 311.90, N = 11
(CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in single precision

VkFFT 1.3.4, Benchmark Score, More Is Better
a: 113874
b: 113952
c: 113935
SE +/- 22.70, N = 3
(CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C multidimensional in single precision

VkFFT 1.3.4, Benchmark Score, More Is Better
a: 46283
b: 47650
c: 48385
SE +/- 368.38, N = 15
(CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C Bluestein benchmark in double precision

VkFFT 1.3.4, Benchmark Score, More Is Better
a: 3724
b: 3763
c: 3758
SE +/- 8.17, N = 3
(CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling

VkFFT 1.3.4, Benchmark Score, More Is Better
a: 116216
b: 116227
c: 116319
SE +/- 85.34, N = 3
(CXX) g++ options: -O3

GpuOwl

Exponent: 57885161

GpuOwl 7.5, Iterations / Second, More Is Better
a: 723.24
b: 729.39
c: 728.86
SE +/- 0.17, N = 3
(CXX) g++ options: -O3 -lgmp -lOpenCL

GpuOwl

Exponent: 77936867

GpuOwl 7.5, Iterations / Second, More Is Better
a: 532.10
b: 536.19
c: 532.20
SE +/- 0.09, N = 3
(CXX) g++ options: -O3 -lgmp -lOpenCL

GpuOwl

Exponent: 332220523

GpuOwl 7.5, Iterations / Second, More Is Better
a: 115.73
b: 116.65
c: 115.78
SE +/- 0.01, N = 3
(CXX) g++ options: -O3 -lgmp -lOpenCL
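To put the iterations-per-second figures in perspective, they can be converted into a rough wall-clock estimate for a complete test of each exponent. This is only a sketch under two assumptions that do not come from this result file: a PRP test of 2^p - 1 needs roughly p squaring iterations, and throughput stays constant for the whole run. The function name estimated_hours is hypothetical; the rates are those of result a.

```python
def estimated_hours(exponent, iterations_per_second):
    """Rough duration of ~exponent iterations at a constant rate (assumption)."""
    return exponent / iterations_per_second / 3600.0


# Exponent -> iterations per second for result "a" in the GpuOwl sections.
rates = {57885161: 723.24, 77936867: 532.10, 332220523: 115.73}

hours = {p: estimated_hours(p, rate) for p, rate in rates.items()}

for p, h in hours.items():
    print(f"exponent {p}: about {h:.1f} hours ({h / 24:.1f} days)")
```

The estimate ignores proof generation, checkpointing, and error checking overhead, so real runs would likely take somewhat longer.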

ProjectPhysX OpenCL-Benchmark

Operation: FP64 Compute

ProjectPhysX OpenCL-Benchmark 1.2, TFLOPs/s, More Is Better
a: 0.528
b: 0.531
c: 0.527
SE +/- 0.001, N = 3
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: FP32 Compute

ProjectPhysX OpenCL-Benchmark 1.2, TFLOPs/s, More Is Better
a: 32.87
b: 32.92
c: 32.80
SE +/- 0.03, N = 3
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT64 Compute

ProjectPhysX OpenCL-Benchmark 1.2, TIOPs/s, More Is Better
a: 3.231
b: 3.222
c: 3.225
SE +/- 0.009, N = 3
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT32 Compute

ProjectPhysX OpenCL-Benchmark 1.2, TIOPs/s, More Is Better
a: 16.86
b: 16.67
c: 16.92
SE +/- 0.04, N = 3
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT16 Compute

ProjectPhysX OpenCL-Benchmark 1.2, TIOPs/s, More Is Better
a: 14.57
b: 14.56
c: 14.56
SE +/- 0.00, N = 3
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT8 Compute

ProjectPhysX OpenCL-Benchmark 1.2, TIOPs/s, More Is Better
a: 12.11
b: 12.08
c: 12.17
SE +/- 0.03, N = 3
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: Memory Bandwidth Coalesced Read

ProjectPhysX OpenCL-Benchmark 1.2, GB/s, More Is Better
a: 702.72
b: 702.78
c: 702.84
SE +/- 0.00, N = 3
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: Memory Bandwidth Coalesced Write

ProjectPhysX OpenCL-Benchmark 1.2, GB/s, More Is Better
a: 721.83
b: 721.79
c: 721.72
SE +/- 0.03, N = 3
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

Counter-Strike 2

Resolution: 1920 x 1080

Counter-Strike 2, Frames Per Second, More Is Better
a: 308.0
b: 311.4
c: 309.8
SE +/- 0.38, N = 3
MIN: 307.3 / MAX: 308.6

Counter-Strike 2

Resolution: 1920 x 1200

Counter-Strike 2, Frames Per Second, More Is Better
a: 291.7
b: 292.7
c: 293.7
SE +/- 1.27, N = 3
MIN: 289.4 / MAX: 293.8

Counter-Strike 2

Resolution: 2560 x 1440

Counter-Strike 2, Frames Per Second, More Is Better
a: 221.9
b: 221.3
c: 221.1
SE +/- 0.58, N = 3
MIN: 221.3 / MAX: 223.1

Counter-Strike 2

Resolution: 3840 x 2160

Counter-Strike 2, Frames Per Second, More Is Better
a: 121.4
b: 121.6
c: 122.8
SE +/- 0.29, N = 3
MIN: 120.9 / MAX: 121.9
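The frame rates above can also be read as average frame times and as pixels rendered per second, which makes the four resolutions easier to compare. The sketch below uses the values of result a; the helper names frame_time_ms and megapixels_per_second are hypothetical and only illustrate the arithmetic.

```python
# Average frames per second for result "a", keyed by (width, height).
fps = {
    (1920, 1080): 308.0,
    (1920, 1200): 291.7,
    (2560, 1440): 221.9,
    (3840, 2160): 121.4,
}


def frame_time_ms(frames_per_second):
    """Average time spent on one frame, in milliseconds."""
    return 1000.0 / frames_per_second


def megapixels_per_second(resolution, frames_per_second):
    """Pixels rendered per second, in millions."""
    width, height = resolution
    return width * height * frames_per_second / 1e6


summary = {
    res: (frame_time_ms(rate), megapixels_per_second(res, rate))
    for res, rate in fps.items()
}

for (width, height), (ms, mpps) in summary.items():
    print(f"{width} x {height}: {ms:.2f} ms/frame, {mpps:.0f} Mpixel/s")
```

Pixel throughput rises with resolution in these numbers, which may indicate that the lower resolutions are limited by something other than per-pixel GPU work; the result file itself does not say which.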


Phoronix Test Suite v10.8.4