Benchmarks for a future article.
Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2402256-PTS-GPUCOMP691
HTML result view exported from: https://openbenchmarking.org/result/2402256-PTS-GPUCOMP691&gru&sor .
gpu comp

Runs: 4080 super a, b, c, d, RTX 4060, e, f, g, 4070 TI SUPER, h, i, 4090, j, k, l, m

System (shared by all runs unless noted):
Processor: AMD Ryzen 9 7950X 16-Core @ 5.88GHz (16 Cores / 32 Threads)
Motherboard: ASUS ROG STRIX X670E-E GAMING WIFI (1416 BIOS)
Chipset: AMD Device 14d8
Memory: 2 x 16GB DRAM-6000MT/s G Skill F5-6000J3038F16G
Disk: 2000GB Samsung SSD 980 PRO 2TB + 4001GB Western Digital WD_BLACK SN850X 4000GB (some runs report the same two drives in the opposite order)
Monitor: DELL U2723QE
Network: Intel I225-V + Intel Wi-Fi 6 AX210/AX211/AX411
OS: Ubuntu 23.10
Kernel: 6.7.0-060700-generic (x86_64)
Desktop: GNOME Shell 45.2
Display Server: X Server 1.21.1.7
Display Driver: NVIDIA 550.40.07
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 12.4.74
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 3840x2160

Graphics and Audio (listed at the run where they change):
4080 super a: NVIDIA GeForce RTX 4080 SUPER 16GB; Audio: NVIDIA Device 22bb
RTX 4060: MSI NVIDIA GeForce RTX 4060 8GB; Audio: NVIDIA Device 22be
4070 TI SUPER: ASUS NVIDIA GeForce RTX 4070 Ti SUPER 16GB; Audio: NVIDIA Device 22bb
4090: NVIDIA GeForce RTX 4090 24GB; Audio: NVIDIA AD102 HD Audio

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- 4080 super a, b, c, d, RTX 4060, e, f, g, 4070 TI SUPER, h, i: Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - CPU Microcode: 0xa601203
- 4090, j, k, l, m: Scaling Governor: amd-pstate-epp performance (EPP: performance) - CPU Microcode: 0xa601203

Graphics Details
- 4080 super a, b, c, d: BAR1 / Visible vRAM Size: 16384 MiB - vBIOS Version: 95.03.44.00.01
- RTX 4060, e, f, g: BAR1 / Visible vRAM Size: 8192 MiB - vBIOS Version: 95.07.31.00.e3
- 4070 TI SUPER, h, i: BAR1 / Visible vRAM Size: 16384 MiB - vBIOS Version: 95.03.45.00.9c
- 4090, j, k, l, m: BAR1 / Visible vRAM Size: 32768 MiB - vBIOS Version: 95.02.20.00.01

OpenCL Details
- 4080 super a, b, c, d: GPU Compute Cores: 10240
- RTX 4060, e, f, g: GPU Compute Cores: 3072
- 4070 TI SUPER, h, i: GPU Compute Cores: 8448
- 4090, j, k, l, m: GPU Compute Cores: 16384

Python Details
- Python 3.11.6

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_rstack_overflow: Vulnerable: Safe RET no microcode
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected
gpu comp: results overview

Each run row below lists its 28 results in this test order:
1. vkfft: FFT + iFFT R2C / C2R
2. vkfft: FFT + iFFT C2C 1D batched in half precision
3. vkfft: FFT + iFFT C2C Bluestein in single precision
4. vkfft: FFT + iFFT C2C 1D batched in double precision
5. vkfft: FFT + iFFT C2C 1D batched in single precision
6. vkfft: FFT + iFFT C2C multidimensional in single precision
7. vkfft: FFT + iFFT C2C Bluestein benchmark in double precision
8. vkfft: FFT + iFFT C2C 1D batched in single precision, no reshuffling
9. libplacebo: deband_heavy
10. libplacebo: polar_nocompute
11. libplacebo: hdr_peakdetect
12. libplacebo: hdr_lut
13. libplacebo: av1_grain_lap
14. libplacebo: gaussian
15. opencl-benchmark: Memory Bandwidth Coalesced Read
16. opencl-benchmark: Memory Bandwidth Coalesced Write
17. gpuowl: 57885161
18. gpuowl: 77936867
19. gpuowl: 332220523
20. fluidx3d: FP32-FP32
21. fluidx3d: FP32-FP16C
22. fluidx3d: FP32-FP16S
23. opencl-benchmark: FP64 Compute
24. opencl-benchmark: FP32 Compute
25. opencl-benchmark: INT64 Compute
26. opencl-benchmark: INT32 Compute
27. opencl-benchmark: INT16 Compute
28. opencl-benchmark: INT8 Compute

4080 super a: 67257 143183 16388 31464 106172 73948 5686 107604 1529.33 1919.91 3002.98 2858.68 3475.58 3407.97 680.51 632.69 1191.89511323 896.06 189.68 3972 8127 8077 0.861 53.504 4.237 27.629 23.984 20.822
b: 63956 146963 16963 31775 106348 75029 5706 107741 1527.65 1918.79 3260.62 2858.87 3476.35 3411.13 680.56 631.65 1190.48 896.06 189.83 3968 8121 8079 0.864 53.489 4.252 27.581 23.978 20.819
c: 63887 139429 16838 34346 106208 68642 5712 107838 1528.48 1918.63 3184.6 2856.35 3476.23 3413.41 680.77 629.88 1191.89511323 894.45 189.61 3967 8145 8079 0.864 53.631 4.239 27.63 23.984 20.825
d: 67647 147059 16831 34356 106300 74674 5687 107754 1530.76 1924.72 3251.95 2860.74 3189.61 3412.08 680.67 629.1 1191.89511323 896.06 189.83 3967 8128 8078 0.864 53.655 4.239 27.63 23.982 20.822
RTX 4060: 35601 82823 10715 12025 42347 37049 2371 43062 514.76 655.96 1570.77 1525.07 1723.15 1512.45 252.9 258.47 376.65 278.55 58.85 1609 3114 3038 0.264 16.505 2.087 8.491 7.414 6.179
e: 35201 82725 10635 12070 42338 38684 2373 43067 517.52 659.34 1546.47 1525.75 1725.79 1512.65 252.89 258.40 378.36 278.55 58.85 1609 3113 3039 0.264 16.506 2.094 8.491 7.417 6.225
f: 35167 82997 10593 12096 42341 38064 2385 43079 514.52 655.72 1506.93 1522.3 1722.45 1511.72 252.89 258.4 376.65 277.32 58.58 1608 3114 3039 0.264 16.505 2.089 8.491 7.417 6.215
g: 35219 83065 10461 12103 42345 38609 2373 43071 514.7 656.2 1503.72 1526.73 1724.71 1511.91 252.89 258.36 376.65 278.55 58.85 1609 3114 3039 0.264 16.505 2.091 8.491 7.419 6.214
4070 TI SUPER: 67558 138716 14512 30677 102606 66454 4850 104271 1323.8 1672.46 2874.75 2789.69 3538.35 3118.52 619.31 607.15 1003.01 744.60 159.82 3831 7420 6678 0.725 45.068 4.28 23.186 20.217 17.135
h: 66849 127806 16243 23199 102692 67587 4999 104294 1324.36 1671.91 2882.56 2789.73 3516.56 3120.65 619.28 606.59 1004.016064257 745.16 159.87 3830 7420 6681 0.725 45.067 4.272 23.207 20.212 17.135
i: 66874 141164 16357 24125 102682 68137 4983 104266 1324.84 1672.08 2872.21 2772.4 3487.01 3120.9 619.29 607.19 1003.01 745.16 159.82 3830 7421 6683 0.725 45.071 4.308 23.202 20.198 17.134
4090: 79195 214165 16151 37812 151611 80489 8183 152245 2294.94 2848.16 3880.87 3542.89 3470.09 4406.25 927.96 896.46 1926.78 1432.664756447 304.51 5736 11063 9906 1.389 85.831 4.346 44.411 38.335 33.234
j: 76372 214418 19398 37573 149674 85349 8211 152220 2270.65 2845.61 3586.11 3546.09 3490.13 4364.76 927.85 902.58 1937.984496124 1440.92 304.79 5733 10882 9785 1.389 85.871 4.344 44.329 38.361 33.224
k: 71755 197629 18205 46461 150527 80257 8223 152066 2293.54 2856.18 3990.18 3545.36 3503.94 4407.68 928.04 902.92 1937.984496124 1434.72 304.51 5737 10985 9806 1.389 85.797 4.347 44.398 38.518 33.22
l: 78780 205813 18369 50684 150378 84200 8210 152986 2295.23 2851.23 3616.57 3536.98 3462.39 4405.21 927.98 903.86 1926.78 1434.72 304.60 5739 10982 9800 1.389 85.849 4.349 44.348 38.361 33.232
m: 75912 206803 18619 45665 151234 74587 8208 152451 2293.81 2855.33 4000.04 3543.51 3453.74 4407.96 928.1 902.16 1934.24 1440.92 305.34 5737 10983 9809 1.389 85.809 4.349 44.33 38.341 33.229
VkFFT 1.3.4, Test: FFT + iFFT R2C / C2R
Benchmark Score, More Is Better
4090: 79195, l: 78780, j: 76372, m: 75912, k: 71755, d: 67647, 4070 TI SUPER: 67558, 4080 super a: 67257, i: 66874, h: 66849, b: 63956, c: 63887, RTX 4060: 35601, g: 35219, e: 35201, f: 35167
SE +/- 24.98, N = 3
1. (CXX) g++ options: -O3
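The "SE +/- ..., N = 3" figures throughout this file report a standard error over three trials. A minimal sketch of that arithmetic, assuming SE here means the standard error of the mean (sample standard deviation divided by the square root of N); the trial scores below are hypothetical and do not come from the result file, and the Phoronix Test Suite's exact calculation is not confirmed by this export:

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical scores for three trials of a single run (illustration only).
trials = [67200.0, 67257.0, 67310.0]
print(f"mean = {statistics.mean(trials):.0f}, "
      f"SE +/- {standard_error(trials):.2f}, N = {len(trials)}")
```

A small SE relative to the mean suggests the three trials of that run agreed closely; it says nothing about differences between separately named runs on the same GPU.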
VkFFT 1.3.4, Test: FFT + iFFT C2C 1D batched in half precision
Benchmark Score, More Is Better
j: 214418, 4090: 214165, m: 206803, l: 205813, k: 197629, d: 147059, b: 146963, 4080 super a: 143183, i: 141164, c: 139429, 4070 TI SUPER: 138716, h: 127806, g: 83065, f: 82997, RTX 4060: 82823, e: 82725
SE +/- 194.61, N = 3
1. (CXX) g++ options: -O3
VkFFT 1.3.4, Test: FFT + iFFT C2C Bluestein in single precision
Benchmark Score, More Is Better
j: 19398, m: 18619, l: 18369, k: 18205, b: 16963, c: 16838, d: 16831, 4080 super a: 16388, i: 16357, h: 16243, 4090: 16151, 4070 TI SUPER: 14512, RTX 4060: 10715, e: 10635, f: 10593, g: 10461
SE +/- 43.25, N = 3
1. (CXX) g++ options: -O3
VkFFT 1.3.4, Test: FFT + iFFT C2C 1D batched in double precision
Benchmark Score, More Is Better
l: 50684, k: 46461, m: 45665, 4090: 37812, j: 37573, d: 34356, c: 34346, b: 31775, 4080 super a: 31464, 4070 TI SUPER: 30677, i: 24125, h: 23199, g: 12103, f: 12096, e: 12070, RTX 4060: 12025
SE +/- 8.95, N = 3
1. (CXX) g++ options: -O3
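The five RTX 4090 runs (4090, j, k, l, m) differ far more from each other in this double-precision test than the single SE figure suggests. A minimal sketch, using the scores listed for those runs, of how the median and relative spread across repeated runs could be summarized; the `summarize` helper is a hypothetical name introduced only for illustration:

```python
import statistics

# VkFFT "C2C 1D batched in double precision" scores for the RTX 4090 runs.
rtx4090_runs = {"4090": 37812, "j": 37573, "k": 46461, "l": 50684, "m": 45665}


def summarize(scores):
    """Return (median, relative spread), where spread = (max - min) / median."""
    values = sorted(scores.values())
    median = statistics.median(values)
    spread = (values[-1] - values[0]) / median
    return median, spread


median, spread = summarize(rtx4090_runs)
print(f"median = {median}, spread = {spread:.1%}")
```

A spread this large between nominally identical runs is a hint to compare medians rather than any single run when ranking GPUs on this test.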
VkFFT 1.3.4, Test: FFT + iFFT C2C 1D batched in single precision
Benchmark Score, More Is Better
4090: 151611, m: 151234, k: 150527, l: 150378, j: 149674, b: 106348, d: 106300, c: 106208, 4080 super a: 106172, h: 102692, i: 102682, 4070 TI SUPER: 102606, RTX 4060: 42347, g: 42345, f: 42341, e: 42338
SE +/- 0.67, N = 3
1. (CXX) g++ options: -O3
VkFFT 1.3.4, Test: FFT + iFFT C2C multidimensional in single precision
Benchmark Score, More Is Better
j: 85349, l: 84200, 4090: 80489, k: 80257, b: 75029, d: 74674, m: 74587, 4080 super a: 73948, c: 68642, i: 68137, h: 67587, 4070 TI SUPER: 66454, e: 38684, g: 38609, f: 38064, RTX 4060: 37049
SE +/- 51.67, N = 3
1. (CXX) g++ options: -O3
VkFFT 1.3.4, Test: FFT + iFFT C2C Bluestein benchmark in double precision
Benchmark Score, More Is Better
k: 8223, j: 8211, l: 8210, m: 8208, 4090: 8183, c: 5712, b: 5706, d: 5687, 4080 super a: 5686, h: 4999, i: 4983, 4070 TI SUPER: 4850, f: 2385, g: 2373, e: 2373, RTX 4060: 2371
SE +/- 2.40, N = 3
1. (CXX) g++ options: -O3
VkFFT 1.3.4, Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling
Benchmark Score, More Is Better
l: 152986, m: 152451, 4090: 152245, j: 152220, k: 152066, c: 107838, d: 107754, b: 107741, 4080 super a: 107604, h: 104294, 4070 TI SUPER: 104271, i: 104266, f: 43079, g: 43071, e: 43067, RTX 4060: 43062
SE +/- 2.03, N = 3
1. (CXX) g++ options: -O3
Libplacebo 6.338.2, Test: deband_heavy
FPS, More Is Better
l: 2295.23, 4090: 2294.94, m: 2293.81, k: 2293.54, j: 2270.65, d: 1530.76, 4080 super a: 1529.33, c: 1528.48, b: 1527.65, i: 1324.84, h: 1324.36, 4070 TI SUPER: 1323.80, e: 517.52, RTX 4060: 514.76, g: 514.70, f: 514.52
SE +/- 0.15, N = 3
1. (CXX) g++ options: -fvisibility=hidden -std=c++20 -O2 -fno-math-errno -fPIC -pthread -MD -MQ -MF
Libplacebo 6.338.2, Test: polar_nocompute
FPS, More Is Better
k: 2856.18, m: 2855.33, l: 2851.23, 4090: 2848.16, j: 2845.61, d: 1924.72, 4080 super a: 1919.91, b: 1918.79, c: 1918.63, 4070 TI SUPER: 1672.46, i: 1672.08, h: 1671.91, e: 659.34, g: 656.20, RTX 4060: 655.96, f: 655.72
SE +/- 0.04, N = 3
1. (CXX) g++ options: -fvisibility=hidden -std=c++20 -O2 -fno-math-errno -fPIC -pthread -MD -MQ -MF
Libplacebo 6.338.2, Test: hdr_peakdetect
FPS, More Is Better
m: 4000.04, k: 3990.18, 4090: 3880.87, l: 3616.57, j: 3586.11, b: 3260.62, d: 3251.95, c: 3184.60, 4080 super a: 3002.98, h: 2882.56, 4070 TI SUPER: 2874.75, i: 2872.21, RTX 4060: 1570.77, e: 1546.47, f: 1506.93, g: 1503.72
SE +/- 19.72, N = 3
1. (CXX) g++ options: -fvisibility=hidden -std=c++20 -O2 -fno-math-errno -fPIC -pthread -MD -MQ -MF
Libplacebo 6.338.2, Test: hdr_lut
FPS, More Is Better
j: 3546.09, k: 3545.36, m: 3543.51, 4090: 3542.89, l: 3536.98, d: 2860.74, b: 2858.87, 4080 super a: 2858.68, c: 2856.35, h: 2789.73, 4070 TI SUPER: 2789.69, i: 2772.40, g: 1526.73, e: 1525.75, RTX 4060: 1525.07, f: 1522.30
SE +/- 0.11, N = 3
1. (CXX) g++ options: -fvisibility=hidden -std=c++20 -O2 -fno-math-errno -fPIC -pthread -MD -MQ -MF
Libplacebo 6.338.2, Test: av1_grain_lap
FPS, More Is Better
4070 TI SUPER: 3538.35, h: 3516.56, k: 3503.94, j: 3490.13, i: 3487.01, b: 3476.35, c: 3476.23, 4080 super a: 3475.58, 4090: 3470.09, l: 3462.39, m: 3453.74, d: 3189.61, e: 1725.79, g: 1724.71, RTX 4060: 1723.15, f: 1722.45
SE +/- 0.40, N = 3
1. (CXX) g++ options: -fvisibility=hidden -std=c++20 -O2 -fno-math-errno -fPIC -pthread -MD -MQ -MF
Libplacebo 6.338.2, Test: gaussian
FPS, More Is Better
m: 4407.96, k: 4407.68, 4090: 4406.25, l: 4405.21, j: 4364.76, c: 3413.41, d: 3412.08, b: 3411.13, 4080 super a: 3407.97, i: 3120.90, h: 3120.65, 4070 TI SUPER: 3118.52, e: 1512.65, RTX 4060: 1512.45, g: 1511.91, f: 1511.72
SE +/- 0.37, N = 3
1. (CXX) g++ options: -fvisibility=hidden -std=c++20 -O2 -fno-math-errno -fPIC -pthread -MD -MQ -MF
ProjectPhysX OpenCL-Benchmark 1.2, Operation: Memory Bandwidth Coalesced Read
GB/s, More Is Better
m: 928.10, k: 928.04, l: 927.98, 4090: 927.96, j: 927.85, c: 680.77, d: 680.67, b: 680.56, 4080 super a: 680.51, 4070 TI SUPER: 619.31, i: 619.29, h: 619.28, RTX 4060: 252.90, g: 252.89, f: 252.89, e: 252.89
SE +/- 0.01, N = 3
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
ProjectPhysX OpenCL-Benchmark 1.2, Operation: Memory Bandwidth Coalesced Write
GB/s, More Is Better
l: 903.86, k: 902.92, j: 902.58, m: 902.16, 4090: 896.46, 4080 super a: 632.69, b: 631.65, c: 629.88, d: 629.10, i: 607.19, 4070 TI SUPER: 607.15, h: 606.59, RTX 4060: 258.47, f: 258.40, e: 258.40, g: 258.36
SE +/- 0.01, N = 3
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
GpuOwl 7.5, Exponent: 57885161
Iterations / Second, More Is Better
k: 1937.98, j: 1937.98, m: 1934.24, l: 1926.78, 4090: 1926.78, d: 1191.90, c: 1191.90, 4080 super a: 1191.90, b: 1190.48, h: 1004.02, i: 1003.01, 4070 TI SUPER: 1003.01, e: 378.36, g: 376.65, f: 376.65, RTX 4060: 376.65
SE +/- 0.00, N = 3
1. (CXX) g++ options: -O3 -lgmp -lOpenCL
GpuOwl 7.5, Exponent: 77936867
Iterations / Second, More Is Better
m: 1440.92, j: 1440.92, l: 1434.72, k: 1434.72, 4090: 1432.66, d: 896.06, b: 896.06, 4080 super a: 896.06, c: 894.45, i: 745.16, h: 745.16, 4070 TI SUPER: 744.60, g: 278.55, e: 278.55, RTX 4060: 278.55, f: 277.32
SE +/- 0.00, N = 3
1. (CXX) g++ options: -O3 -lgmp -lOpenCL
GpuOwl 7.5, Exponent: 332220523
Iterations / Second, More Is Better
m: 305.34, j: 304.79, l: 304.60, k: 304.51, 4090: 304.51, d: 189.83, b: 189.83, 4080 super a: 189.68, c: 189.61, h: 159.87, i: 159.82, 4070 TI SUPER: 159.82, g: 58.85, e: 58.85, RTX 4060: 58.85, f: 58.58
SE +/- 0.00, N = 3
1. (CXX) g++ options: -O3 -lgmp -lOpenCL
FluidX3D 2.9, Test: FP32-FP32
MLUPs/s, More Is Better
l: 5739, m: 5737, k: 5737, 4090: 5736, j: 5733, 4080 super a: 3972, b: 3968, d: 3967, c: 3967, 4070 TI SUPER: 3831, i: 3830, h: 3830, g: 1609, e: 1609, RTX 4060: 1609, f: 1608
SE +/- 0.00, N = 3
FluidX3D 2.9, Test: FP32-FP16C
MLUPs/s, More Is Better
4090: 11063, k: 10985, m: 10983, l: 10982, j: 10882, c: 8145, d: 8128, 4080 super a: 8127, b: 8121, i: 7421, h: 7420, 4070 TI SUPER: 7420, g: 3114, f: 3114, RTX 4060: 3114, e: 3113
SE +/- 0.33, N = 3
FluidX3D 2.9, Test: FP32-FP16S
MLUPs/s, More Is Better
4090: 9906, m: 9809, k: 9806, l: 9800, j: 9785, c: 8079, b: 8079, d: 8078, 4080 super a: 8077, i: 6683, h: 6681, 4070 TI SUPER: 6678, g: 3039, f: 3039, e: 3039, RTX 4060: 3038
SE +/- 0.33, N = 3
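The three FluidX3D modes can be compared per GPU by dividing each FP16 result by the FP32-FP32 result on the same card. A minimal sketch using the first-named run of each GPU from the figures above; the dictionary layout and the `mode_speedups` helper are hypothetical names chosen for illustration:

```python
# FluidX3D throughput in MLUPs/s for the first-named run of each GPU:
# (FP32-FP32, FP32-FP16C, FP32-FP16S)
fluidx3d = {
    "4090": (5736, 11063, 9906),
    "4080 super a": (3972, 8127, 8077),
    "4070 TI SUPER": (3831, 7420, 6678),
    "RTX 4060": (1609, 3114, 3038),
}


def mode_speedups(results):
    """Ratio of each FP16 mode to the FP32-FP32 baseline, per GPU."""
    out = {}
    for gpu, (fp32, fp16c, fp16s) in results.items():
        out[gpu] = {"FP32-FP16C": fp16c / fp32, "FP32-FP16S": fp16s / fp32}
    return out


for gpu, ratios in mode_speedups(fluidx3d).items():
    print(gpu, {mode: round(r, 2) for mode, r in ratios.items()})
```

On these numbers both FP16 modes land at roughly 1.7x to 2.05x the FP32-FP32 throughput on every card.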
ProjectPhysX OpenCL-Benchmark 1.2, Operation: FP64 Compute
TFLOPs/s, More Is Better
m: 1.389, l: 1.389, k: 1.389, j: 1.389, 4090: 1.389, d: 0.864, c: 0.864, b: 0.864, 4080 super a: 0.861, i: 0.725, h: 0.725, 4070 TI SUPER: 0.725, g: 0.264, f: 0.264, e: 0.264, RTX 4060: 0.264
SE +/- 0.000, N = 3
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
ProjectPhysX OpenCL-Benchmark 1.2, Operation: FP32 Compute
TFLOPs/s, More Is Better
j: 85.87, l: 85.85, 4090: 85.83, m: 85.81, k: 85.80, d: 53.66, c: 53.63, 4080 super a: 53.50, b: 53.49, i: 45.07, 4070 TI SUPER: 45.07, h: 45.07, e: 16.51, g: 16.51, f: 16.51, RTX 4060: 16.51
SE +/- 0.00, N = 3
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
ProjectPhysX OpenCL-Benchmark 1.2, Operation: INT64 Compute
TIOPs/s, More Is Better
m: 4.349, l: 4.349, k: 4.347, 4090: 4.346, j: 4.344, i: 4.308, 4070 TI SUPER: 4.280, h: 4.272, b: 4.252, d: 4.239, c: 4.239, 4080 super a: 4.237, e: 2.094, g: 2.091, f: 2.089, RTX 4060: 2.087
SE +/- 0.002, N = 3
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
ProjectPhysX OpenCL-Benchmark 1.2, Operation: INT32 Compute
TIOPs/s, More Is Better
4090: 44.411, k: 44.398, l: 44.348, m: 44.330, j: 44.329, d: 27.630, c: 27.630, 4080 super a: 27.629, b: 27.581, h: 23.207, i: 23.202, 4070 TI SUPER: 23.186, g: 8.491, f: 8.491, e: 8.491, RTX 4060: 8.491
SE +/- 0.000, N = 3
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
ProjectPhysX OpenCL-Benchmark 1.2, Operation: INT16 Compute
TIOPs/s, More Is Better
k: 38.518, l: 38.361, j: 38.361, m: 38.341, 4090: 38.335, c: 23.984, 4080 super a: 23.984, d: 23.982, b: 23.978, 4070 TI SUPER: 20.217, h: 20.212, i: 20.198, g: 7.419, f: 7.417, e: 7.417, RTX 4060: 7.414
SE +/- 0.001, N = 3
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
ProjectPhysX OpenCL-Benchmark 1.2, Operation: INT8 Compute
TIOPs/s, More Is Better
4090: 33.234, l: 33.232, m: 33.229, j: 33.224, k: 33.220, c: 20.825, d: 20.822, 4080 super a: 20.822, b: 20.819, h: 17.135, 4070 TI SUPER: 17.135, i: 17.134, e: 6.225, f: 6.215, g: 6.214, RTX 4060: 6.179
SE +/- 0.012, N = 3
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
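One rough way to read the OpenCL-Benchmark compute results is to normalize each GPU against the RTX 4060. A minimal sketch using the FP32 and INT64 figures of the first-named run of each GPU; the `relative_to` helper and the choice of baseline are assumptions made for illustration, not part of the result file:

```python
# OpenCL-Benchmark results for the first-named run of each GPU.
fp32_tflops = {"4090": 85.83, "4080 super a": 53.50,
               "4070 TI SUPER": 45.07, "RTX 4060": 16.51}
int64_tiops = {"4090": 4.346, "4080 super a": 4.237,
               "4070 TI SUPER": 4.280, "RTX 4060": 2.087}


def relative_to(baseline, results):
    """Divide every result by the baseline GPU's result."""
    base = results[baseline]
    return {gpu: value / base for gpu, value in results.items()}


for name, table in (("FP32", fp32_tflops), ("INT64", int64_tiops)):
    ratios = relative_to("RTX 4060", table)
    print(name, {gpu: round(r, 2) for gpu, r in ratios.items()})
```

Normalized this way, FP32 throughput spreads widely across the four cards, while the INT64 figures for the three larger GPUs sit within a few percent of each other at roughly twice the RTX 4060.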
Phoronix Test Suite v10.8.4