gpu comp

Benchmarks for a future article.

HTML result view exported from: https://openbenchmarking.org/result/2402256-PTS-GPUCOMP691&sor&grw.

Result identifiers: 4080 super a, b, c, d, RTX 4060, e, f, g, 4070 TI SUPER, h, i, 4090, j, k, l, m

System details (first run, 4080 super a):
Processor: AMD Ryzen 9 7950X 16-Core @ 5.88GHz (16 Cores / 32 Threads)
Motherboard: ASUS ROG STRIX X670E-E GAMING WIFI (1416 BIOS)
Chipset: AMD Device 14d8
Memory: 2 x 16GB DRAM-6000MT/s G Skill F5-6000J3038F16G
Disk: 2000GB Samsung SSD 980 PRO 2TB + 4001GB Western Digital WD_BLACK SN850X 4000GB
Graphics: NVIDIA GeForce RTX 4080 SUPER 16GB
Audio: NVIDIA Device 22bb
Monitor: DELL U2723QE
Network: Intel I225-V + Intel Wi-Fi 6 AX210/AX211/AX411
OS: Ubuntu 23.10
Kernel: 6.7.0-060700-generic (x86_64)
Desktop: GNOME Shell 45.2
Display Server: X Server 1.21.1.7
Display Driver: NVIDIA 550.40.07
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 12.4.74
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 3840x2160

Fields that change in later runs, in the order the export lists them:
Graphics: MSI NVIDIA GeForce RTX 4060 8GB
Audio: NVIDIA Device 22be
Disk: 4001GB Western Digital WD_BLACK SN850X 4000GB + 2000GB Samsung SSD 980 PRO 2TB
Graphics: ASUS NVIDIA GeForce RTX 4070 Ti SUPER 16GB
Audio: NVIDIA Device 22bb
Disk: 2000GB Samsung SSD 980 PRO 2TB + 4001GB Western Digital WD_BLACK SN850X 4000GB
Graphics: NVIDIA GeForce RTX 4090 24GB
Audio: NVIDIA AD102 HD Audio

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- 4080 super a, b, c, d, RTX 4060, e, f, g, 4070 TI SUPER, h, i: Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - CPU Microcode: 0xa601203
- 4090, j, k, l, m: Scaling Governor: amd-pstate-epp performance (EPP: performance) - CPU Microcode: 0xa601203

Graphics Details
- 4080 super a, b, c, d: BAR1 / Visible vRAM Size: 16384 MiB - vBIOS Version: 95.03.44.00.01
- RTX 4060, e, f, g: BAR1 / Visible vRAM Size: 8192 MiB - vBIOS Version: 95.07.31.00.e3
- 4070 TI SUPER, h, i: BAR1 / Visible vRAM Size: 16384 MiB - vBIOS Version: 95.03.45.00.9c
- 4090, j, k, l, m: BAR1 / Visible vRAM Size: 32768 MiB - vBIOS Version: 95.02.20.00.01

OpenCL Details
- 4080 super a, b, c, d: GPU Compute Cores: 10240
- RTX 4060, e, f, g: GPU Compute Cores: 3072
- 4070 TI SUPER, h, i: GPU Compute Cores: 8448
- 4090, j, k, l, m: GPU Compute Cores: 16384

Python Details
- Python 3.11.6

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_rstack_overflow: Vulnerable: Safe RET no microcode
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected

Result overview table: one row per result identifier (4080 super a, b, c, d, RTX 4060, e, f, g, 4070 TI SUPER, h, i, 4090, j, k, l, m), one column per test. The individual values are given in the per-test sections that follow. Columns, in order:
libplacebo: deband_heavy
libplacebo: polar_nocompute
libplacebo: hdr_peakdetect
libplacebo: hdr_lut
libplacebo: av1_grain_lap
libplacebo: gaussian
gpuowl: 57885161
vkfft: FFT + iFFT R2C / C2R
opencl-benchmark: Memory Bandwidth Coalesced Read
opencl-benchmark: INT16 Compute
opencl-benchmark: INT8 Compute
opencl-benchmark: INT64 Compute
opencl-benchmark: INT32 Compute
opencl-benchmark: FP64 Compute
gpuowl: 77936867
gpuowl: 332220523
vkfft: FFT + iFFT C2C 1D batched in half precision
vkfft: FFT + iFFT C2C Bluestein in single precision
vkfft: FFT + iFFT C2C 1D batched in double precision
vkfft: FFT + iFFT C2C 1D batched in single precision
vkfft: FFT + iFFT C2C multidimensional in single precision
vkfft: FFT + iFFT C2C Bluestein benchmark in double precision
opencl-benchmark: Memory Bandwidth Coalesced Write
opencl-benchmark: FP32 Compute
vkfft: FFT + iFFT C2C 1D batched in single precision, no reshuffling
fluidx3d: FP32-FP32
fluidx3d: FP32-FP16C
fluidx3d: FP32-FP16S

Libplacebo

Test: deband_heavy

Libplacebo 6.338.2, Test: deband_heavy (FPS, More Is Better)
l: 2295.23; 4090: 2294.94; m: 2293.81; k: 2293.54; j: 2270.65
d: 1530.76; 4080 super a: 1529.33; c: 1528.48; b: 1527.65
i: 1324.84; h: 1324.36; 4070 TI SUPER: 1323.80
e: 517.52; RTX 4060: 514.76; g: 514.70; f: 514.52
SE +/- 0.15, N = 3
1. (CXX) g++ options: -fvisibility=hidden -std=c++20 -O2 -fno-math-errno -fPIC -pthread -MD -MQ -MF

Libplacebo

Test: polar_nocompute

Libplacebo 6.338.2, Test: polar_nocompute (FPS, More Is Better)
k: 2856.18; m: 2855.33; l: 2851.23; 4090: 2848.16; j: 2845.61
d: 1924.72; 4080 super a: 1919.91; b: 1918.79; c: 1918.63
4070 TI SUPER: 1672.46; i: 1672.08; h: 1671.91
e: 659.34; g: 656.20; RTX 4060: 655.96; f: 655.72
SE +/- 0.04, N = 3
1. (CXX) g++ options: -fvisibility=hidden -std=c++20 -O2 -fno-math-errno -fPIC -pthread -MD -MQ -MF

Libplacebo

Test: hdr_peakdetect

Libplacebo 6.338.2, Test: hdr_peakdetect (FPS, More Is Better)
m: 4000.04; k: 3990.18; 4090: 3880.87; l: 3616.57; j: 3586.11
b: 3260.62; d: 3251.95; c: 3184.60; 4080 super a: 3002.98
h: 2882.56; 4070 TI SUPER: 2874.75; i: 2872.21
RTX 4060: 1570.77; e: 1546.47; f: 1506.93; g: 1503.72
SE +/- 19.72, N = 3
1. (CXX) g++ options: -fvisibility=hidden -std=c++20 -O2 -fno-math-errno -fPIC -pthread -MD -MQ -MF

Libplacebo

Test: hdr_lut

Libplacebo 6.338.2, Test: hdr_lut (FPS, More Is Better)
j: 3546.09; k: 3545.36; m: 3543.51; 4090: 3542.89; l: 3536.98
d: 2860.74; b: 2858.87; 4080 super a: 2858.68; c: 2856.35
h: 2789.73; 4070 TI SUPER: 2789.69; i: 2772.40
g: 1526.73; e: 1525.75; RTX 4060: 1525.07; f: 1522.30
SE +/- 0.11, N = 3
1. (CXX) g++ options: -fvisibility=hidden -std=c++20 -O2 -fno-math-errno -fPIC -pthread -MD -MQ -MF

Libplacebo

Test: av1_grain_lap

Libplacebo 6.338.2, Test: av1_grain_lap (FPS, More Is Better)
4070 TI SUPER: 3538.35; h: 3516.56; k: 3503.94; j: 3490.13
i: 3487.01; b: 3476.35; c: 3476.23; 4080 super a: 3475.58
4090: 3470.09; l: 3462.39; m: 3453.74; d: 3189.61
e: 1725.79; g: 1724.71; RTX 4060: 1723.15; f: 1722.45
SE +/- 0.40, N = 3
1. (CXX) g++ options: -fvisibility=hidden -std=c++20 -O2 -fno-math-errno -fPIC -pthread -MD -MQ -MF

Libplacebo

Test: gaussian

Libplacebo 6.338.2, Test: gaussian (FPS, More Is Better)
m: 4407.96; k: 4407.68; 4090: 4406.25; l: 4405.21; j: 4364.76
c: 3413.41; d: 3412.08; b: 3411.13; 4080 super a: 3407.97
i: 3120.90; h: 3120.65; 4070 TI SUPER: 3118.52
e: 1512.65; RTX 4060: 1512.45; g: 1511.91; f: 1511.72
SE +/- 0.37, N = 3
1. (CXX) g++ options: -fvisibility=hidden -std=c++20 -O2 -fno-math-errno -fPIC -pthread -MD -MQ -MF

GpuOwl

Exponent: 57885161

GpuOwl 7.5, Exponent: 57885161 (Iterations / Second, More Is Better)
k: 1937.98; j: 1937.98; m: 1934.24; l: 1926.78; 4090: 1926.78
d: 1191.90; c: 1191.90; 4080 super a: 1191.90; b: 1190.48
h: 1004.02; i: 1003.01; 4070 TI SUPER: 1003.01
e: 378.36; g: 376.65; f: 376.65; RTX 4060: 376.65
SE +/- 0.00, N = 3
1. (CXX) g++ options: -O3 -lgmp -lOpenCL

VkFFT

Test: FFT + iFFT R2C / C2R

VkFFT 1.3.4, Test: FFT + iFFT R2C / C2R (Benchmark Score, More Is Better)
4090: 79195; l: 78780; j: 76372; m: 75912; k: 71755
d: 67647; 4070 TI SUPER: 67558; 4080 super a: 67257; i: 66874
h: 66849; b: 63956; c: 63887
RTX 4060: 35601; g: 35219; e: 35201; f: 35167
SE +/- 24.98, N = 3
1. (CXX) g++ options: -O3

ProjectPhysX OpenCL-Benchmark

Operation: Memory Bandwidth Coalesced Read

ProjectPhysX OpenCL-Benchmark 1.2, Operation: Memory Bandwidth Coalesced Read (GB/s, More Is Better)
m: 928.10; k: 928.04; l: 927.98; 4090: 927.96; j: 927.85
c: 680.77; d: 680.67; b: 680.56; 4080 super a: 680.51
4070 TI SUPER: 619.31; i: 619.29; h: 619.28
RTX 4060: 252.90; g: 252.89; f: 252.89; e: 252.89
SE +/- 0.01, N = 3
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT16 Compute

ProjectPhysX OpenCL-Benchmark 1.2, Operation: INT16 Compute (TIOPs/s, More Is Better)
k: 38.518; l: 38.361; j: 38.361; m: 38.341; 4090: 38.335
c: 23.984; 4080 super a: 23.984; d: 23.982; b: 23.978
4070 TI SUPER: 20.217; h: 20.212; i: 20.198
g: 7.419; f: 7.417; e: 7.417; RTX 4060: 7.414
SE +/- 0.001, N = 3
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT8 Compute

ProjectPhysX OpenCL-Benchmark 1.2, Operation: INT8 Compute (TIOPs/s, More Is Better)
4090: 33.234; l: 33.232; m: 33.229; j: 33.224; k: 33.220
c: 20.825; d: 20.822; 4080 super a: 20.822; b: 20.819
h: 17.135; 4070 TI SUPER: 17.135; i: 17.134
e: 6.225; f: 6.215; g: 6.214; RTX 4060: 6.179
SE +/- 0.012, N = 3
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT64 Compute

ProjectPhysX OpenCL-Benchmark 1.2, Operation: INT64 Compute (TIOPs/s, More Is Better)
m: 4.349; l: 4.349; k: 4.347; 4090: 4.346; j: 4.344
i: 4.308; 4070 TI SUPER: 4.280; h: 4.272
b: 4.252; d: 4.239; c: 4.239; 4080 super a: 4.237
e: 2.094; g: 2.091; f: 2.089; RTX 4060: 2.087
SE +/- 0.002, N = 3
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT32 Compute

ProjectPhysX OpenCL-Benchmark 1.2, Operation: INT32 Compute (TIOPs/s, More Is Better)
4090: 44.411; k: 44.398; l: 44.348; m: 44.330; j: 44.329
d: 27.630; c: 27.630; 4080 super a: 27.629; b: 27.581
h: 23.207; i: 23.202; 4070 TI SUPER: 23.186
g: 8.491; f: 8.491; e: 8.491; RTX 4060: 8.491
SE +/- 0.000, N = 3
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: FP64 Compute

ProjectPhysX OpenCL-Benchmark 1.2, Operation: FP64 Compute (TFLOPs/s, More Is Better)
m: 1.389; l: 1.389; k: 1.389; j: 1.389; 4090: 1.389
d: 0.864; c: 0.864; b: 0.864; 4080 super a: 0.861
i: 0.725; h: 0.725; 4070 TI SUPER: 0.725
g: 0.264; f: 0.264; e: 0.264; RTX 4060: 0.264
SE +/- 0.000, N = 3
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

GpuOwl

Exponent: 77936867

GpuOwl 7.5, Exponent: 77936867 (Iterations / Second, More Is Better)
m: 1440.92; j: 1440.92; l: 1434.72; k: 1434.72; 4090: 1432.66
d: 896.06; b: 896.06; 4080 super a: 896.06; c: 894.45
i: 745.16; h: 745.16; 4070 TI SUPER: 744.60
g: 278.55; e: 278.55; RTX 4060: 278.55; f: 277.32
SE +/- 0.00, N = 3
1. (CXX) g++ options: -O3 -lgmp -lOpenCL

GpuOwl

Exponent: 332220523

GpuOwl 7.5, Exponent: 332220523 (Iterations / Second, More Is Better)
m: 305.34; j: 304.79; l: 304.60; k: 304.51; 4090: 304.51
d: 189.83; b: 189.83; 4080 super a: 189.68; c: 189.61
h: 159.87; i: 159.82; 4070 TI SUPER: 159.82
g: 58.85; e: 58.85; RTX 4060: 58.85; f: 58.58
SE +/- 0.00, N = 3
1. (CXX) g++ options: -O3 -lgmp -lOpenCL

VkFFT

Test: FFT + iFFT C2C 1D batched in half precision

VkFFT 1.3.4, Test: FFT + iFFT C2C 1D batched in half precision (Benchmark Score, More Is Better)
j: 214418; 4090: 214165; m: 206803; l: 205813; k: 197629
d: 147059; b: 146963; 4080 super a: 143183; i: 141164
c: 139429; 4070 TI SUPER: 138716; h: 127806
g: 83065; f: 82997; RTX 4060: 82823; e: 82725
SE +/- 194.61, N = 3
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C Bluestein in single precision

VkFFT 1.3.4, Test: FFT + iFFT C2C Bluestein in single precision (Benchmark Score, More Is Better)
j: 19398; m: 18619; l: 18369; k: 18205
b: 16963; c: 16838; d: 16831; 4080 super a: 16388
i: 16357; h: 16243; 4090: 16151; 4070 TI SUPER: 14512
RTX 4060: 10715; e: 10635; f: 10593; g: 10461
SE +/- 43.25, N = 3
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in double precision

VkFFT 1.3.4, Test: FFT + iFFT C2C 1D batched in double precision (Benchmark Score, More Is Better)
l: 50684; k: 46461; m: 45665; 4090: 37812; j: 37573
d: 34356; c: 34346; b: 31775; 4080 super a: 31464
4070 TI SUPER: 30677; i: 24125; h: 23199
g: 12103; f: 12096; e: 12070; RTX 4060: 12025
SE +/- 8.95, N = 3
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in single precision

VkFFT 1.3.4, Test: FFT + iFFT C2C 1D batched in single precision (Benchmark Score, More Is Better)
4090: 151611; m: 151234; k: 150527; l: 150378; j: 149674
b: 106348; d: 106300; c: 106208; 4080 super a: 106172
h: 102692; i: 102682; 4070 TI SUPER: 102606
RTX 4060: 42347; g: 42345; f: 42341; e: 42338
SE +/- 0.67, N = 3
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C multidimensional in single precision

VkFFT 1.3.4, Test: FFT + iFFT C2C multidimensional in single precision (Benchmark Score, More Is Better)
j: 85349; l: 84200; 4090: 80489; k: 80257
b: 75029; d: 74674; m: 74587; 4080 super a: 73948
c: 68642; i: 68137; h: 67587; 4070 TI SUPER: 66454
e: 38684; g: 38609; f: 38064; RTX 4060: 37049
SE +/- 51.67, N = 3
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C Bluestein benchmark in double precision

VkFFT 1.3.4, Test: FFT + iFFT C2C Bluestein benchmark in double precision (Benchmark Score, More Is Better)
k: 8223; j: 8211; l: 8210; m: 8208; 4090: 8183
c: 5712; b: 5706; d: 5687; 4080 super a: 5686
h: 4999; i: 4983; 4070 TI SUPER: 4850
f: 2385; g: 2373; e: 2373; RTX 4060: 2371
SE +/- 2.40, N = 3
1. (CXX) g++ options: -O3

ProjectPhysX OpenCL-Benchmark

Operation: Memory Bandwidth Coalesced Write

ProjectPhysX OpenCL-Benchmark 1.2, Operation: Memory Bandwidth Coalesced Write (GB/s, More Is Better)
l: 903.86; k: 902.92; j: 902.58; m: 902.16; 4090: 896.46
4080 super a: 632.69; b: 631.65; c: 629.88; d: 629.10
i: 607.19; 4070 TI SUPER: 607.15; h: 606.59
RTX 4060: 258.47; f: 258.40; e: 258.40; g: 258.36
SE +/- 0.01, N = 3
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: FP32 Compute

ProjectPhysX OpenCL-Benchmark 1.2, Operation: FP32 Compute (TFLOPs/s, More Is Better)
j: 85.87; l: 85.85; 4090: 85.83; m: 85.81; k: 85.80
d: 53.66; c: 53.63; 4080 super a: 53.50; b: 53.49
i: 45.07; 4070 TI SUPER: 45.07; h: 45.07
e: 16.51; g: 16.51; f: 16.51; RTX 4060: 16.51
SE +/- 0.00, N = 3
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

VkFFT

Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling

VkFFT 1.3.4, Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling (Benchmark Score, More Is Better)
l: 152986; m: 152451; 4090: 152245; j: 152220; k: 152066
c: 107838; d: 107754; b: 107741; 4080 super a: 107604
h: 104294; 4070 TI SUPER: 104271; i: 104266
f: 43079; g: 43071; e: 43067; RTX 4060: 43062
SE +/- 2.03, N = 3
1. (CXX) g++ options: -O3

FluidX3D

Test: FP32-FP32

FluidX3D 2.9, Test: FP32-FP32 (MLUPs/s, More Is Better)
l: 5739; m: 5737; k: 5737; 4090: 5736; j: 5733
4080 super a: 3972; b: 3968; d: 3967; c: 3967
4070 TI SUPER: 3831; i: 3830; h: 3830
g: 1609; e: 1609; RTX 4060: 1609; f: 1608
SE +/- 0.00, N = 3

FluidX3D

Test: FP32-FP16C

FluidX3D 2.9, Test: FP32-FP16C (MLUPs/s, More Is Better)
4090: 11063; k: 10985; m: 10983; l: 10982; j: 10882
c: 8145; d: 8128; 4080 super a: 8127; b: 8121
i: 7421; h: 7420; 4070 TI SUPER: 7420
g: 3114; f: 3114; RTX 4060: 3114; e: 3113
SE +/- 0.33, N = 3

FluidX3D

Test: FP32-FP16S

FluidX3D 2.9, Test: FP32-FP16S (MLUPs/s, More Is Better)
4090: 9906; m: 9809; k: 9806; l: 9800; j: 9785
c: 8079; b: 8079; d: 8078; 4080 super a: 8077
i: 6683; h: 6681; 4070 TI SUPER: 6678
g: 3039; f: 3039; e: 3039; RTX 4060: 3038
SE +/- 0.33, N = 3
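
Since each GPU was run several times under different result identifiers, a per-GPU summary can make the numbers easier to compare. The Python sketch below is one possible way to do that, not something the Phoronix Test Suite produces: it averages the FluidX3D FP32-FP16S values from this file per GPU and expresses each mean relative to the RTX 4060. The grouping of identifiers follows the Graphics Details section; the names `RESULTS`, `RUNS_BY_GPU`, `gpu_means` and `relative_to` are illustrative assumptions.

```python
from statistics import mean

# FluidX3D 2.9, Test: FP32-FP16S, in MLUPs/s, copied from this result file.
RESULTS = {
    "4090": 9906, "m": 9809, "k": 9806, "l": 9800, "j": 9785,
    "c": 8079, "b": 8079, "d": 8078, "4080 super a": 8077,
    "i": 6683, "h": 6681, "4070 TI SUPER": 6678,
    "g": 3039, "f": 3039, "e": 3039, "RTX 4060": 3038,
}

# Result identifiers grouped by GPU, following the Graphics Details section.
RUNS_BY_GPU = {
    "RTX 4090": ["4090", "j", "k", "l", "m"],
    "RTX 4080 SUPER": ["4080 super a", "b", "c", "d"],
    "RTX 4070 Ti SUPER": ["4070 TI SUPER", "h", "i"],
    "RTX 4060": ["RTX 4060", "e", "f", "g"],
}


def gpu_means(results, runs_by_gpu):
    """Average the per-run values of each GPU."""
    return {
        gpu: mean(results[run] for run in runs)
        for gpu, runs in runs_by_gpu.items()
    }


def relative_to(means, baseline):
    """Express every mean as a multiple of the baseline GPU's mean."""
    base = means[baseline]
    return {gpu: value / base for gpu, value in means.items()}


if __name__ == "__main__":
    means = gpu_means(RESULTS, RUNS_BY_GPU)
    ratios = relative_to(means, "RTX 4060")
    for gpu in RUNS_BY_GPU:
        print(f"{gpu}: {means[gpu]:.2f} MLUPs/s ({ratios[gpu]:.2f}x RTX 4060)")
```

When reading such ratios, keep in mind that the Processor Details list the 4090 runs with the amd-pstate-epp performance governor while all other runs used powersave, so the CPU-side configuration was not identical across GPUs.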


Phoronix Test Suite v10.8.4