RTX 4070 SUPER

sudo apt install vulkan-headers vulkan-tools libvulkan-dev

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2412102-NE-INTELGPU716

Run Management

Result Identifier | Date Run | Test Duration
NVIDIA RTX 4070 SUPER | January 25 | 21 Hours, 7 Minutes
Intel ARC A770 8Gb | December 07 | 11 Hours, 47 Minutes
Intel ARC A750 | December 07 | 1 Day, 7 Hours, 54 Minutes
intel-gpu | December 05 | 10 Minutes
nvidia-gpu | December 05 | 10 Minutes
Intel ARC A580 | December 09 | 23 Hours, 35 Minutes
Average test duration: 14 Hours, 47 Minutes

RTX 4070 SUPER

System Details

Only the values the source table lists for each system are shown; cells left blank or merged with a neighboring system are omitted.

NVIDIA RTX 4070 SUPER
Processor: Intel Core i9-13900K @ 5.50GHz (24 Cores / 32 Threads)
Motherboard: ASUS TUF GAMING Z790-PRO WIFI (1401 BIOS)
Chipset: Intel Device 7a27
Memory: 32GB
Disk: 4001GB Seagate ZP4000GP304001
Graphics: ASUS NVIDIA GeForce RTX 4070 SUPER 12GB
Audio: Realtek ALC1220
Monitor: ARZOPA
Network: Intel I226-V + Intel Device 7a70
OS: EndeavourOS rolling
Kernel: 6.7.1-arch1-1 (x86_64)
Desktop: KDE Plasma 5.27.10
Display Server: X Server 1.21.1.11
Display Driver: NVIDIA 550.40.07
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 12.4.74
Compiler: GCC 13.2.1 20230801
File-System: ext4
Screen Resolution: 1920x1080

Intel ARC A770 8Gb
Processor: Intel Core Ultra 9 285K @ 5.10GHz (24 Cores)
Motherboard: MSI MEG Z890 UNIFY-X (MS-7E20) v1.0 (1.A10 BIOS)
Chipset: Intel Device ae7f
Memory: 2 x 16GB DDR5-6000MT/s Corsair CMH32GX5M2B6000Z30
Disk: 1024GB Wodposit NVMe SSD
Graphics: MSI Intel Arc A770 DG2 8GB
Audio: Intel DG2 Audio
Monitor: PiKVM V3
Network: Realtek Device 5000 + Intel Wi-Fi 7
OS: Ubuntu 24.10
Kernel: 6.12.1-061201-generic (x86_64)
Desktop: GNOME Shell 47.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 24.3.1 kisak-mesa PPA
Compiler: GCC 14.2.0

Intel ARC A750
Graphics: Intel Arc A750 DG2 8GB
OpenCL: OpenCL 3.0

intel-gpu
Processor: Intel Core i5-10300H @ 4.50GHz (4 Cores / 8 Threads)
Motherboard: CML Stonic_CMS (V1.00 BIOS)
Chipset: Intel Comet Lake PCH
Memory: 16GB
Disk: 1000GB CT1000P3SSD8 + 256GB Western Digital PC SN530 SDBPNPZ-256G-1014
Graphics: Intel UHD CML GT2 4GB (1350/6000MHz)
Audio: Intel Comet Lake PCH cAVS
Network: Realtek Killer E2600 GbE + Intel Comet Lake PCH CNVi WiFi
OS: Ubuntu 24.04
Kernel: 6.8.0-49-generic (x86_64)
Desktop: GNOME Shell 46.0
Display Server: X Server 1.20.13
Display Driver: NVIDIA 535.183.01
OpenGL: 4.6 Mesa 24.0.9-0ubuntu0.2
Compiler: GCC 13.2.0

nvidia-gpu
Graphics: NVIDIA GeForce GTX 1650 Ti 4GB
OpenGL: 4.6.0

Intel ARC A580
Processor: Intel Core Ultra 9 285K @ 5.10GHz (24 Cores)
Motherboard: MSI MEG Z890 UNIFY-X (MS-7E20) v1.0 (1.A10 BIOS)
Chipset: Intel Device ae7f
Memory: 2 x 16GB DDR5-6000MT/s Corsair CMH32GX5M2B6000Z30
Disk: 1024GB Wodposit NVMe SSD
Graphics: Intel Arc A580 DG2 8GB
Audio: Intel DG2 Audio
Monitor: PiKVM V3
Network: Realtek Device 5000 + Intel Wi-Fi 7
OS: Ubuntu 24.10
Kernel: 6.12.1-061201-generic (x86_64)
Desktop: GNOME Shell 47.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 24.3.1 kisak-mesa PPA
OpenCL: OpenCL 3.0
Compiler: GCC 14.2.0
Screen Resolution: 1280x720

Kernel Details
- NVIDIA RTX 4070 SUPER: Transparent Huge Pages: always
- Intel ARC A770 8Gb: Transparent Huge Pages: madvise
- Intel ARC A750: Transparent Huge Pages: madvise
- intel-gpu: Transparent Huge Pages: madvise
- nvidia-gpu: Transparent Huge Pages: madvise
- Intel ARC A580: Transparent Huge Pages: madvise

Compiler Details
- NVIDIA RTX 4070 SUPER: --disable-libssp --disable-libstdcxx-pch --disable-werror --enable-__cxa_atexit --enable-bootstrap --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-languages=ada,c,c++,d,fortran,go,lto,objc,obj-c++ --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-build-config=bootstrap-lto --with-linker-hash-style=gnu
- Intel ARC A770 8Gb: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Intel ARC A750: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Intel ARC A580: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- NVIDIA RTX 4070 SUPER: Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x11d
- Intel ARC A770 8Gb: Scaling Governor: intel_pstate powersave (EPP: performance) - CPU Microcode: 0x110 - Thermald 2.5.8
- Intel ARC A750: Scaling Governor: intel_pstate powersave (EPP: performance) - CPU Microcode: 0x110 - Thermald 2.5.8
- intel-gpu: Scaling Governor: intel_pstate powersave (EPP: performance) - CPU Microcode: 0xfc - Thermald 2.5.6
- nvidia-gpu: Scaling Governor: intel_pstate powersave (EPP: performance) - CPU Microcode: 0xfc - Thermald 2.5.6
- Intel ARC A580: Scaling Governor: intel_pstate powersave (EPP: performance) - CPU Microcode: 0x110 - Thermald 2.5.8

Graphics Details
- NVIDIA RTX 4070 SUPER: BAR1 / Visible vRAM Size: 16384 MiB - vBIOS Version: 95.04.69.00.c1
- intel-gpu: BAR1 / Visible vRAM Size: 256 MiB - vBIOS Version: 90.17.4c.00.1d
- nvidia-gpu: BAR1 / Visible vRAM Size: 256 MiB - vBIOS Version: 90.17.4c.00.1d

Security Details
- NVIDIA RTX 4070 SUPER: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
- Intel ARC A770 8Gb: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
- Intel ARC A750: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
- intel-gpu: gather_data_sampling: Mitigation of Microcode + itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable + reg_file_data_sampling: Not affected + retbleed: Mitigation of Enhanced IBRS + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: SW sequence; BHI: SW loop KVM: SW loop + srbds: Mitigation of Microcode + tsx_async_abort: Not affected
- nvidia-gpu: gather_data_sampling: Mitigation of Microcode + itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable + reg_file_data_sampling: Not affected + retbleed: Mitigation of Enhanced IBRS + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: SW sequence; BHI: SW loop KVM: SW loop + srbds: Mitigation of Microcode + tsx_async_abort: Not affected
- Intel ARC A580: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Python Details
- Intel ARC A770 8Gb, Intel ARC A750, Intel ARC A580: Python 3.12.7

Environment Details
- nvidia-gpu: __GLX_VENDOR_LIBRARY_NAME=nvidia

RTX 4070 SUPER: results overview table (columns: NVIDIA RTX 4070 SUPER, Intel ARC A770 8Gb, Intel ARC A750, intel-gpu, nvidia-gpu, Intel ARC A580)

VkFFT

VkFFT is a Fast Fourier Transform (FFT) library that is GPU-accelerated by means of the Vulkan API. The VkFFT benchmark measures FFT performance across many different sizes before returning an overall benchmark score. Learn more via the OpenBenchmarking.org test page.
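The test names in this section pair a forward FFT with an inverse FFT (iFFT). As a rough illustration of what that round trip computes, and not of how VkFFT implements it on the GPU, here is a minimal pure-Python radix-2 sketch. The function names `fft` and `ifft` and the sample signal are made up for this example.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 Cooley-Tukey transform (length must be a power of two)."""
    n = len(x)
    if n == 1:
        return list(x)
    # The inverse transform uses the opposite sign in the twiddle exponent.
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(x):
    """Inverse transform: unnormalized inverse pass, then divide by the length."""
    n = len(x)
    return [value / n for value in fft(x, inverse=True)]


# Hypothetical 8-point real signal; FFT followed by iFFT should reproduce it.
signal = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, -2.0, 3.0]
roundtrip = ifft(fft(signal))
max_error = max(abs(a - b) for a, b in zip(signal, roundtrip))
print(f"max round-trip error: {max_error:.2e}")
```

The round-trip error should sit at floating-point noise level; a GPU library performs the same mathematical operation with batched, size-specialized kernels.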

VkFFT 1.2.31 - Test: FFT + iFFT R2C / C2R (Benchmark Score, More Is Better)
NVIDIA RTX 4070 SUPER: 54794 (SE +/- 702.53, N = 15)
Intel ARC A750: 32544 (SE +/- 57.59, N = 3)
Intel ARC A580: 29941 (SE +/- 257.48, N = 15)
1. (CXX) g++ options: -O3 -lrt

Test: FFT + iFFT R2C / C2R

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found
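To read the R2C / C2R scores above as relative performance, one option is to normalize them to a baseline. The sketch below uses the three scores reported for this test; choosing the Intel ARC A580 as the baseline is an arbitrary assumption for illustration.

```python
# VkFFT 1.2.31 R2C / C2R benchmark scores as reported above (more is better).
scores = {
    "NVIDIA RTX 4070 SUPER": 54794,
    "Intel ARC A750": 32544,
    "Intel ARC A580": 29941,
}

# Baseline choice is an assumption made for this example only.
baseline = "Intel ARC A580"
relative = {name: score / scores[baseline] for name, score in scores.items()}

# Print fastest first, expressed as a multiple of the baseline score.
for name, ratio in sorted(relative.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {ratio:.2f}x of {baseline}")
```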

VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in half precision (Benchmark Score, More Is Better)
NVIDIA RTX 4070 SUPER: 131705 (SE +/- 159.17, N = 3)
Intel ARC A750: 100426 (SE +/- 82.39, N = 3)
Intel ARC A580: 69895 (SE +/- 1784.34, N = 12)
1. (CXX) g++ options: -O3 -lrt

Test: FFT + iFFT C2C 1D batched in half precision

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

VkFFT 1.2.31 - Test: FFT + iFFT C2C Bluestein in single precision (Benchmark Score, More Is Better)
NVIDIA RTX 4070 SUPER: 15166 (SE +/- 102.52, N = 3)
Intel ARC A750: 5573 (SE +/- 3.76, N = 3)
Intel ARC A580: 5093 (SE +/- 66.00, N = 12)
1. (CXX) g++ options: -O3 -lrt

Test: FFT + iFFT C2C Bluestein in single precision

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in double precision (Benchmark Score, More Is Better)
NVIDIA RTX 4070 SUPER: 24317 (SE +/- 146.69, N = 3)
1. (CXX) g++ options: -O3 -lrt

Test: FFT + iFFT C2C 1D batched in double precision

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

Intel ARC A750: The test quit with a non-zero exit status.

Intel ARC A580: The test quit with a non-zero exit status.

VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in single precision (Benchmark Score, More Is Better)
NVIDIA RTX 4070 SUPER: 73929 (SE +/- 7.94, N = 3)
Intel ARC A580: 58986 (SE +/- 54.76, N = 3)
Intel ARC A750: 58800 (SE +/- 38.85, N = 3)
1. (CXX) g++ options: -O3 -lrt

Test: FFT + iFFT C2C 1D batched in single precision

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

VkFFT 1.2.31 - Test: FFT + iFFT C2C multidimensional in single precision (Benchmark Score, More Is Better)
NVIDIA RTX 4070 SUPER: 50299 (SE +/- 407.19, N = 15)
Intel ARC A750: 33011 (SE +/- 17.37, N = 3)
Intel ARC A580: 30196 (SE +/- 408.80, N = 13)
1. (CXX) g++ options: -O3 -lrt

Test: FFT + iFFT C2C multidimensional in single precision

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

VkFFT 1.2.31 - Test: FFT + iFFT C2C Bluestein benchmark in double precision (Benchmark Score, More Is Better)
NVIDIA RTX 4070 SUPER: 4451 (SE +/- 12.55, N = 3)
1. (CXX) g++ options: -O3 -lrt

Test: FFT + iFFT C2C Bluestein benchmark in double precision

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

Intel ARC A750: The test quit with a non-zero exit status.

Intel ARC A580: The test quit with a non-zero exit status.

VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling (Benchmark Score, More Is Better)
NVIDIA RTX 4070 SUPER: 75078 (SE +/- 37.77, N = 3)
Intel ARC A750: 64004 (SE +/- 48.89, N = 3)
Intel ARC A580: 63530 (SE +/- 4.48, N = 3)
1. (CXX) g++ options: -O3 -lrt

Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

VkFFT 1.3.4 - Test: FFT + iFFT R2C / C2R (Benchmark Score, More Is Better)
Intel ARC A750: 31218 (SE +/- 232.00, N = 15)
Intel ARC A580: 30388 (SE +/- 118.37, N = 3)
1. (CXX) g++ options: -O3

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in half precision (Benchmark Score, More Is Better)
Intel ARC A580: 72346 (SE +/- 1441.23, N = 12)
Intel ARC A750: 70571 (SE +/- 2534.09, N = 15)
1. (CXX) g++ options: -O3

VkFFT 1.3.4 - Test: FFT + iFFT C2C Bluestein in single precision (Benchmark Score, More Is Better)
Intel ARC A750: 5425 (SE +/- 58.43, N = 3)
Intel ARC A580: 5134 (SE +/- 18.21, N = 3)
1. (CXX) g++ options: -O3

Test: FFT + iFFT C2C 1D batched in double precision

Intel ARC A750: The test quit with a non-zero exit status.

Intel ARC A580: The test quit with a non-zero exit status.

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in single precision (Benchmark Score, More Is Better)
Intel ARC A750: 58625 (SE +/- 70.72, N = 3)
Intel ARC A580: 58426 (SE +/- 583.21, N = 3)
1. (CXX) g++ options: -O3
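The "SE +/-" figures give a feel for whether a gap between two cards is meaningful. For the VkFFT 1.3.4 single-precision result, the A750 scored 58625 (SE 70.72) and the A580 scored 58426 (SE 583.21). A rough check, assuming the two runs are independent so their standard errors combine in quadrature (an assumption of this sketch, not something the result file states), looks like this:

```python
import math

# (score, standard error) pairs from the VkFFT 1.3.4 single-precision result.
a750_score, a750_se = 58625, 70.72
a580_score, a580_se = 58426, 583.21

# Difference between the two mean scores.
diff = a750_score - a580_score

# Standard error of the difference, assuming independent runs.
se_diff = math.hypot(a750_se, a580_se)

# How many standard errors apart the two scores are.
ratio = diff / se_diff
print(f"difference = {diff}, SE of difference = {se_diff:.1f}, ratio = {ratio:.2f}")
```

A ratio well below 1 suggests the gap is within run-to-run noise, so the two cards are effectively tied on this test.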

VkFFT 1.3.4 - Test: FFT + iFFT C2C multidimensional in single precision (Benchmark Score, More Is Better)
Intel ARC A750: 31532 (SE +/- 251.38, N = 12)
Intel ARC A580: 30521 (SE +/- 20.17, N = 3)
1. (CXX) g++ options: -O3

Test: FFT + iFFT C2C Bluestein benchmark in double precision

Intel ARC A750: The test quit with a non-zero exit status.

Intel ARC A580: The test quit with a non-zero exit status.

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling (Benchmark Score, More Is Better)
Intel ARC A580: 63157 (SE +/- 36.04, N = 3)
Intel ARC A750: 63113 (SE +/- 527.31, N = 3)
1. (CXX) g++ options: -O3

SPECViewPerf 2020

This test runs SPECViewPerf 2020 if available on your system. SPECViewPerf is made up of real-world OpenGL workstation tests such as CATIA and SolidWorks. Learn more via the OpenBenchmarking.org test page.

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: SNX-04 (Composite Score, More Is Better)
Intel ARC A750: 172.03 (SE +/- 0.08, N = 3)
Intel ARC A580: 168.75 (SE +/- 0.33, N = 3)

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: CREO-03 (Composite Score, More Is Better)
Intel ARC A750: 62.65 (SE +/- 0.06, N = 3)
Intel ARC A580: 61.80 (SE +/- 0.04, N = 3)

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: MAYA-06 (Composite Score, More Is Better)
Intel ARC A750: 165.76 (SE +/- 0.32, N = 3)
Intel ARC A580: 159.42 (SE +/- 0.08, N = 3)

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: CATIA-06 (Composite Score, More Is Better)
Intel ARC A750: 47.85 (SE +/- 0.05, N = 3)
Intel ARC A580: 45.84 (SE +/- 0.10, N = 3)

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: ENERGY-03 (Composite Score, More Is Better)
Intel ARC A750: 31.17 (SE +/- 0.00, N = 3)
Intel ARC A580: 27.74 (SE +/- 0.00, N = 3)

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: MEDICAL-O3 (Composite Score, More Is Better)
Intel ARC A750: 40.97 (SE +/- 0.00, N = 3)
Intel ARC A580: 39.13 (SE +/- 0.00, N = 3)

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: SOLIDWORKS-07 (Composite Score, More Is Better)
Intel ARC A750: 73.13 (SE +/- 0.01, N = 3)
Intel ARC A580: 70.72 (SE +/- 0.00, N = 3)

NeatBench

NeatBench is a benchmark of the cross-platform Neat Video software on the CPU and optional GPU (OpenCL / CUDA) support. Learn more via the OpenBenchmarking.org test page.

NeatBench 5 - Acceleration: GPU (FPS, More Is Better)
NVIDIA RTX 4070 SUPER: 4070 (SE +/- 0.00, N = 3)

Acceleration: GPU

Intel ARC A770 8Gb: The test run did not produce a result. E: Failed to load CUDA driver ("/usr/lib64/libcuda.so.1")

Intel ARC A750: The test run did not produce a result.

Intel ARC A580: The test run did not produce a result.

ParaView

ParaView 5.13 - Test: Many Spheres - Resolution: 1920 x 1080 (Frames / Sec, More Is Better)
Intel ARC A750: 72.27 (SE +/- 0.80, N = 5)
Intel ARC A580: 65.03 (SE +/- 0.98, N = 15)

ParaView 5.13 - Test: Wavelet Volume - Resolution: 1920 x 1080 (Frames / Sec, More Is Better)
Intel ARC A750: 243.83 (SE +/- 4.59, N = 15)
Intel ARC A580: 230.75 (SE +/- 2.19, N = 15)

ParaView 5.13 - Test: Wavelet Contour - Resolution: 1920 x 1080 (Frames / Sec, More Is Better)
Intel ARC A750: 278.15 (SE +/- 0.99, N = 3)
Intel ARC A580: 252.66 (SE +/- 2.75, N = 4)

Unigine Valley

This test calculates the average frame-rate within the Valley demo for the Unigine engine, released in February 2013. This engine is extremely demanding on the system's graphics card. Unigine Valley relies upon an OpenGL 3 core profile context. Learn more via the OpenBenchmarking.org test page.

Unigine Valley 1.0 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL (Frames Per Second, More Is Better)
Intel ARC A750: 226.62100 (SE +/- 0.53294, N = 3)
Intel ARC A580: 216.33100 (SE +/- 0.14945, N = 3)
nvidia-gpu: 74.81640 (SE +/- 0.36174, N = 3)
intel-gpu: 9.98304 (SE +/- 0.00191, N = 3)

OpenArena

This is a test of OpenArena, a popular open-source first-person shooter. This game is based upon ioquake3, which in turn uses the GPL version of id Software's Quake 3 engine. Learn more via the OpenBenchmarking.org test page.

OpenArena 0.8.8 - Resolution: 1920 x 1080 (Frames Per Second, More Is Better)
Intel ARC A580: 524.2 (SE +/- 4.78, N = 15; MIN: 1)
Intel ARC A750: 469.5 (SE +/- 6.89, N = 15; MIN: 1)

Unigine Heaven

Unigine Heaven 4.0 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL (Frames Per Second, More Is Better)
Intel ARC A750: 224.52 (SE +/- 0.14, N = 3)
Intel ARC A580: 205.25 (SE +/- 0.23, N = 3)

Xonotic

This is a benchmark of Xonotic, which is a fork of the DarkPlaces-based Nexuiz game. Development began in March of 2010 on the Xonotic game for this open-source first person shooter title. Learn more via the OpenBenchmarking.org test page.

Xonotic 0.8.6 - Resolution: 1920 x 1080 - Effects Quality: Low (Frames Per Second, More Is Better)
Intel ARC A580: 1051.51 (SE +/- 11.78, N = 3; MIN: 704 / MAX: 1770)
Intel ARC A750: 932.79 (SE +/- 7.95, N = 3; MIN: 597 / MAX: 1516)

Xonotic 0.8.6 - Resolution: 1920 x 1080 - Effects Quality: High (Frames Per Second, More Is Better)
Intel ARC A580: 856.72 (SE +/- 6.14, N = 15; MIN: 458 / MAX: 1375)
Intel ARC A750: 777.93 (SE +/- 3.18, N = 3; MIN: 437 / MAX: 1200)

Xonotic 0.8.6 - Resolution: 1920 x 1080 - Effects Quality: Ultra (Frames Per Second, More Is Better)
Intel ARC A580: 809.46 (SE +/- 3.16, N = 3; MIN: 319 / MAX: 1303)
Intel ARC A750: 735.34 (SE +/- 8.54, N = 3; MIN: 308 / MAX: 1171)

Xonotic 0.8.6 - Resolution: 1920 x 1080 - Effects Quality: Ultimate (Frames Per Second, More Is Better)
Intel ARC A580: 575.51 (SE +/- 5.72, N = 3; MIN: 106 / MAX: 1264)
Intel ARC A750: 551.35 (SE +/- 2.85, N = 3; MIN: 110 / MAX: 1221)
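Frame rates this high can be easier to compare as time per frame. The sketch below converts the Xonotic Ultimate figures (average and MIN frames per second) into milliseconds per frame; the helper name `frame_time_ms` is made up for this example.

```python
def frame_time_ms(fps):
    """Convert frames per second into milliseconds spent on one frame."""
    return 1000.0 / fps


# Xonotic 0.8.6, 1920 x 1080, Effects Quality: Ultimate (average and MIN FPS).
results = {
    "Intel ARC A580": {"avg_fps": 575.51, "min_fps": 106},
    "Intel ARC A750": {"avg_fps": 551.35, "min_fps": 110},
}

for name, fps in results.items():
    avg_ms = frame_time_ms(fps["avg_fps"])
    worst_ms = frame_time_ms(fps["min_fps"])
    print(f"{name}: {avg_ms:.2f} ms average, {worst_ms:.2f} ms at the minimum frame rate")
```

Seen this way, the average gap between the two cards is well under a tenth of a millisecond per frame, while the slowest frames take several times longer than the average on both.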

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark provides various OpenCL compute and memory bandwidth micro-benchmarks. Learn more via the OpenBenchmarking.org test page.

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: Memory Bandwidth Coalesced Read (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 464.86 (SE +/- 0.01, N = 3)
Intel ARC A750: 204.23 (SE +/- 0.16, N = 3)
Intel ARC A580: 187.20 (SE +/- 0.63, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: Memory Bandwidth Coalesced Read

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: Memory Bandwidth Coalesced Write (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 455.01 (SE +/- 0.14, N = 3)
Intel ARC A580: 408.90 (SE +/- 0.52, N = 3)
Intel ARC A750: 401.03 (SE +/- 1.01, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: Memory Bandwidth Coalesced Write

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

cl-mem

A basic OpenCL memory benchmark. Learn more via the OpenBenchmarking.org test page.

cl-mem 2017-01-13 - Benchmark: Copy (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 331.8 (SE +/- 0.03, N = 3)
Intel ARC A750: 269.6 (SE +/- 0.13, N = 3)
Intel ARC A580: 259.7 (SE +/- 0.03, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

Benchmark: Copy

Intel ARC A770 8Gb: The test quit with a non-zero exit status.

cl-mem 2017-01-13 - Benchmark: Read (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 446.2 (SE +/- 0.12, N = 3)
Intel ARC A750: 153.7 (SE +/- 0.10, N = 3)
Intel ARC A580: 144.0 (SE +/- 0.03, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

Benchmark: Read

Intel ARC A770 8Gb: The test quit with a non-zero exit status.

cl-mem 2017-01-13 - Benchmark: Write (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 407.5 (SE +/- 1.11, N = 3)
Intel ARC A580: 292.8 (SE +/- 0.03, N = 3)
Intel ARC A750: 280.1 (SE +/- 0.15, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

Benchmark: Write

Intel ARC A770 8Gb: The test quit with a non-zero exit status.

ViennaCL

ViennaCL is an open-source linear algebra library written in C++ and with support for OpenCL and OpenMP. This test profile makes use of ViennaCL's built-in benchmarks. Learn more via the OpenBenchmarking.org test page.

ViennaCL 1.7.1 - Test: CPU BLAS - sCOPY (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 132.0 (SE +/- 1.20, N = 3)
Intel ARC A770 8Gb: 83.9 (SE +/- 0.41, N = 3)
Intel ARC A750: 83.7 (SE +/- 0.20, N = 3)
Intel ARC A580: 83.3 (SE +/- 0.20, N = 3)

ViennaCL 1.7.1 - Test: CPU BLAS - sAXPY (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 156 (SE +/- 2.19, N = 3)
Intel ARC A580: 122 (SE +/- 0.88, N = 3)
Intel ARC A750: 122 (SE +/- 0.33, N = 3)
Intel ARC A770 8Gb: 122 (SE +/- 0.00, N = 3)

ViennaCL 1.7.1 - Test: CPU BLAS - sDOT (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 165.0 (SE +/- 2.73, N = 3)
Intel ARC A580: 81.3 (SE +/- 0.92, N = 3)
Intel ARC A750: 81.3 (SE +/- 0.17, N = 3)
Intel ARC A770 8Gb: 80.6 (SE +/- 0.27, N = 3)

ViennaCL 1.7.1 - Test: CPU BLAS - dCOPY (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 70.8 (SE +/- 0.32, N = 3)
Intel ARC A770 8Gb: 56.8 (SE +/- 0.12, N = 3)
Intel ARC A580: 56.5 (SE +/- 0.07, N = 3)
Intel ARC A750: 56.4 (SE +/- 0.06, N = 3)

ViennaCL 1.7.1 - Test: CPU BLAS - dAXPY (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 87.2 (SE +/- 0.12, N = 3)
Intel ARC A770 8Gb: 78.8 (SE +/- 0.03, N = 3)
Intel ARC A750: 78.7 (SE +/- 0.09, N = 3)
Intel ARC A580: 78.6 (SE +/- 0.03, N = 3)

ViennaCL 1.7.1 - Test: CPU BLAS - dDOT (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 96.8 (SE +/- 0.09, N = 3)
Intel ARC A580: 77.8 (SE +/- 0.27, N = 3)
Intel ARC A770 8Gb: 77.6 (SE +/- 0.53, N = 3)
Intel ARC A750: 76.7 (SE +/- 0.15, N = 3)

ViennaCL 1.7.1 - Test: CPU BLAS - dGEMV-N (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 102 (SE +/- 0.33, N = 3)
Intel ARC A580: 101 (SE +/- 0.00, N = 3)
Intel ARC A750: 101 (SE +/- 0.00, N = 3)
Intel ARC A770 8Gb: 101 (SE +/- 0.33, N = 3)

ViennaCL 1.7.1 - Test: CPU BLAS - dGEMV-T (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 109 (SE +/- 0.33, N = 3)
Intel ARC A770 8Gb: 106 (SE +/- 0.33, N = 3)
Intel ARC A580: 105 (SE +/- 0.33, N = 3)
Intel ARC A750: 105 (SE +/- 0.33, N = 3)

Compiler options for the ViennaCL CPU BLAS results: (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1 - Test: OpenCL BLAS - sCOPY (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 334.0 (SE +/- 0.33, N = 3)
Intel ARC A580: 117.0 (SE +/- 0.88, N = 15)
Intel ARC A750: 98.8 (SE +/- 0.60, N = 3)

ViennaCL 1.7.1 - Test: OpenCL BLAS - sAXPY (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 392.0 (SE +/- 0.00, N = 3)
Intel ARC A580: 98.3 (SE +/- 0.32, N = 3)
Intel ARC A750: 88.5 (SE +/- 0.27, N = 3)

ViennaCL 1.7.1 - Test: OpenCL BLAS - sDOT (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 370 (SE +/- 0.00, N = 3)
Intel ARC A580: 199 (SE +/- 7.33, N = 3)
Intel ARC A750: 136 (SE +/- 1.33, N = 3)

ViennaCL 1.7.1 - Test: OpenCL BLAS - dCOPY (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 423 (SE +/- 0.33, N = 3)

ViennaCL 1.7.1 - Test: OpenCL BLAS - dAXPY (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 437 (SE +/- 0.00, N = 3)

ViennaCL 1.7.1 - Test: OpenCL BLAS - dDOT (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 458 (SE +/- 0.00, N = 3)

ViennaCL 1.7.1 - Test: OpenCL BLAS - dGEMV-N (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 210 (SE +/- 0.33, N = 3)

ViennaCL 1.7.1 - Test: OpenCL BLAS - dGEMV-T (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 389 (SE +/- 0.00, N = 3)

Compiler options for the ViennaCL OpenCL BLAS results: (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
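The BLAS level-1 results above are reported as bandwidth (GB/s) rather than FLOPS, because operations such as AXPY are limited by memory traffic. A minimal sketch of how such a figure can be derived follows, assuming AXPY (y = a*x + y) reads x and y and writes y, so three elements move per index. ViennaCL's own accounting may differ, the helper names are hypothetical, and a pure-Python loop is far slower than any of the results on this page.

```python
import time
from array import array

def axpy(a, x, y):
    """In-place y = a*x + y over two equal-length float arrays."""
    for i in range(len(x)):
        y[i] = a * x[i] + y[i]

def axpy_bandwidth_gbs(n, itemsize, seconds):
    """Effective bandwidth, assuming x and y are read and y is written (3 * n elements)."""
    return 3 * n * itemsize / seconds / 1e9

n = 100_000                      # hypothetical vector length
x = array("f", [1.0]) * n        # single precision, as in sAXPY
y = array("f", [2.0]) * n
start = time.perf_counter()
axpy(0.5, x, y)
elapsed = time.perf_counter() - start
print(f"{axpy_bandwidth_gbs(n, x.itemsize, elapsed):.3f} GB/s (pure Python, illustrative only)")
```

Under this accounting, switching from single to double precision doubles the bytes moved per element.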

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Triad (GB/s, More Is Better)
Intel ARC A750: 18.04 (SE +/- 0.01, N = 3)
Intel ARC A580: 17.77 (SE +/- 0.02, N = 3)

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Reduction (GB/s, More Is Better)
Intel ARC A750: 74.53 (SE +/- 0.04, N = 3)
Intel ARC A580: 69.48 (SE +/- 0.64, N = 7)

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Bus Speed Download (GB/s, More Is Better)
Intel ARC A750: 18.66 (SE +/- 0.00, N = 3)
Intel ARC A580: 18.66 (SE +/- 0.00, N = 3)

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Bus Speed Readback (GB/s, More Is Better)
Intel ARC A750: 22.44 (SE +/- 0.00, N = 3)
Intel ARC A580: 22.43 (SE +/- 0.00, N = 3)

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Texture Read Bandwidth (GB/s, More Is Better)
Intel ARC A750: 883.46 (SE +/- 0.34, N = 3)
Intel ARC A580: 762.70 (SE +/- 0.14, N = 3)

Compiler options for the SHOC results: (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.

clpeak 1.1.2 - OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better)
NVIDIA RTX 4070 SUPER: 437.65 (SE +/- 0.02, N = 3)
Intel ARC A750: 396.72 (SE +/- 0.13, N = 3)
Intel ARC A580: 388.85 (SE +/- 0.13, N = 3)
(CXX) g++ options: -O3

vkpeak

Vkpeak is a Vulkan compute benchmark inspired by OpenCL's clpeak. Vkpeak provides Vulkan compute performance measurements for FP16 / FP32 / FP64 / INT16 / INT32 scalar and vec4 performance. Learn more via the OpenBenchmarking.org test page.

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status (reported three times).

clpeak

clpeak 1.1.2 - OpenCL Test: Single-Precision Float (GFLOPS, More Is Better)
NVIDIA RTX 4070 SUPER: 35492.69 (SE +/- 0.99, N = 3)
Intel ARC A750: 11380.91 (SE +/- 3.31, N = 3)
Intel ARC A580: 9758.86 (SE +/- 1.40, N = 3)
(CXX) g++ options: -O3

clpeak 1.1.2 - OpenCL Test: Double-Precision Double (GFLOPS, More Is Better)
NVIDIA RTX 4070 SUPER: 630.11 (SE +/- 0.98, N = 3)
(CXX) g++ options: -O3

OpenCL Test: Double-Precision Double

Intel ARC A750: The test run did not produce a result.

Intel ARC A580: The test run did not produce a result.

vkpeak

vkpeak 20230730 - fp32-scalar (GFLOPS, More Is Better)
Intel ARC A770 8Gb: 17337.54 (SE +/- 3.56, N = 3)
Intel ARC A750: 15200.62 (SE +/- 0.08, N = 3)
Intel ARC A580: 13055.92 (SE +/- 0.91, N = 3)

vkpeak 20230730 - fp32-vec4 (GFLOPS, More Is Better)
Intel ARC A770 8Gb: 11171.13 (SE +/- 0.27, N = 3)
Intel ARC A750: 9779.71 (SE +/- 0.07, N = 3)
Intel ARC A580: 8389.62 (SE +/- 0.09, N = 3)

vkpeak 20230730 - fp16-scalar (GFLOPS, More Is Better)
Intel ARC A770 8Gb: 24468.27 (SE +/- 0.22, N = 3)
Intel ARC A750: 21413.29 (SE +/- 0.30, N = 3)
Intel ARC A580: 18359.49 (SE +/- 0.10, N = 3)

vkpeak 20230730 - fp16-vec4 (GFLOPS, More Is Better)
Intel ARC A770 8Gb: 38575.11 (SE +/- 0.43, N = 3)
Intel ARC A750: 33769.50 (SE +/- 1.72, N = 3)
Intel ARC A580: 28957.52 (SE +/- 0.72, N = 3)

vkpeak 20240505 - fp32-scalar (GFLOPS, More Is Better)
Intel ARC A750: 16868.93 (SE +/- 26.85, N = 3)
Intel ARC A580: 14354.21 (SE +/- 6.23, N = 3)

vkpeak 20240505 - fp32-vec4 (GFLOPS, More Is Better)
Intel ARC A750: 13898.32 (SE +/- 0.79, N = 3)
Intel ARC A580: 11908.73 (SE +/- 0.40, N = 3)

vkpeak 20240505 - fp16-scalar (GFLOPS, More Is Better)
Intel ARC A750: 21379.22 (SE +/- 1.00, N = 3)
Intel ARC A580: 18324.92 (SE +/- 0.21, N = 3)

vkpeak 20240505 - fp16-vec4 (GFLOPS, More Is Better)
Intel ARC A750: 22840.73 (SE +/- 0.55, N = 3)
Intel ARC A580: 19567.54 (SE +/- 0.47, N = 3)

SHOC Scalable HeterOgeneous Computing

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: S3D (GFLOPS, More Is Better)
Intel ARC A750: 224.72 (SE +/- 0.74, N = 3)
Intel ARC A580: 216.39 (SE +/- 0.20, N = 3)

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: FFT SP (GFLOPS, More Is Better)
Intel ARC A580: 1192.99 (SE +/- 12.21, N = 3)
Intel ARC A750: 1187.36 (SE +/- 9.13, N = 3)

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: GEMM SGEMM_N (GFLOPS, More Is Better)
Intel ARC A750: 2049.20 (SE +/- 15.19, N = 15)
Intel ARC A580: 1777.18 (SE +/- 3.79, N = 3)

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Max SP Flops (GFLOPS, More Is Better)
Intel ARC A750: 2804852 (SE +/- 132865.40, N = 12)
Intel ARC A580: 2345224 (SE +/- 154699.96, N = 15)

Compiler options for the SHOC results: (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

ViennaCL

ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-NN (GFLOPs/s, More Is Better)
Intel ARC A750: 134 (SE +/- 0.33, N = 3)
Intel ARC A770 8Gb: 134 (SE +/- 0.00, N = 3)
Intel ARC A580: 133 (SE +/- 0.67, N = 3)
NVIDIA RTX 4070 SUPER: 119 (SE +/- 4.04, N = 3)

ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-NT (GFLOPs/s, More Is Better)
Intel ARC A580: 127 (SE +/- 1.20, N = 3)
Intel ARC A770 8Gb: 127 (SE +/- 0.33, N = 3)
Intel ARC A750: 125 (SE +/- 1.53, N = 3)
NVIDIA RTX 4070 SUPER: 117 (SE +/- 2.08, N = 3)

ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-TN (GFLOPs/s, More Is Better)
Intel ARC A750: 139 (SE +/- 0.33, N = 3)
Intel ARC A770 8Gb: 138 (SE +/- 0.33, N = 3)
Intel ARC A580: 137 (SE +/- 1.86, N = 3)
NVIDIA RTX 4070 SUPER: 115 (SE +/- 1.00, N = 2)

ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-TT (GFLOPs/s, More Is Better)
Intel ARC A580: 133 (SE +/- 0.33, N = 3)
Intel ARC A750: 133 (SE +/- 0.33, N = 3)
Intel ARC A770 8Gb: 133 (SE +/- 0.58, N = 3)
NVIDIA RTX 4070 SUPER: 122 (SE +/- 2.08, N = 3)

Compiler options for the ViennaCL CPU BLAS results: (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1 - Test: OpenCL BLAS - dGEMM-NN (GFLOPs/s, More Is Better)
NVIDIA RTX 4070 SUPER: 577 (SE +/- 0.00, N = 3)

ViennaCL 1.7.1 - Test: OpenCL BLAS - dGEMM-NT (GFLOPs/s, More Is Better)
NVIDIA RTX 4070 SUPER: 584 (SE +/- 0.00, N = 3)

ViennaCL 1.7.1 - Test: OpenCL BLAS - dGEMM-TN (GFLOPs/s, More Is Better)
NVIDIA RTX 4070 SUPER: 599 (SE +/- 0.00, N = 3)

ViennaCL 1.7.1 - Test: OpenCL BLAS - dGEMM-TT (GFLOPs/s, More Is Better)
NVIDIA RTX 4070 SUPER: 613 (SE +/- 0.00, N = 3)

Compiler options for the ViennaCL OpenCL BLAS results: (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

SHOC Scalable HeterOgeneous Computing

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: MD5 Hash (GHash/s, More Is Better)
Intel ARC A750: 22.72 (SE +/- 0.01, N = 3)
Intel ARC A580: 19.53 (SE +/- 0.03, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

clpeak

clpeak 1.1.2 - OpenCL Test: Integer Compute INT (GIOPS, More Is Better)
NVIDIA RTX 4070 SUPER: 18170.54 (SE +/- 3.14, N = 3)
Intel ARC A750: 4885.36 (SE +/- 2.34, N = 3)
Intel ARC A580: 4133.83 (SE +/- 3.17, N = 3)
(CXX) g++ options: -O3

vkpeak

vkpeak 20230730 - int32-scalar (GIOPS, More Is Better)
Intel ARC A770 8Gb: 4675.06 (SE +/- 0.04, N = 3)
Intel ARC A750: 4082.11 (SE +/- 0.14, N = 3)
Intel ARC A580: 3504.13 (SE +/- 0.11, N = 3)

vkpeak 20230730 - int32-vec4 (GIOPS, More Is Better)
Intel ARC A770 8Gb: 4853.46 (SE +/- 0.03, N = 3)
Intel ARC A750: 4242.30 (SE +/- 0.15, N = 3)
Intel ARC A580: 3636.35 (SE +/- 0.06, N = 3)

vkpeak 20230730 - int16-scalar (GIOPS, More Is Better)
Intel ARC A770 8Gb: 9201.64 (SE +/- 0.19, N = 3)
Intel ARC A750: 8053.11 (SE +/- 0.12, N = 3)
Intel ARC A580: 6905.61 (SE +/- 0.07, N = 3)

vkpeak 20230730 - int16-vec4 (GIOPS, More Is Better)
Intel ARC A770 8Gb: 9624.66 (SE +/- 0.11, N = 3)
Intel ARC A750: 8426.59 (SE +/- 0.15, N = 3)
Intel ARC A580: 7230.54 (SE +/- 0.13, N = 3)

Hashcat

Hashcat is an open-source, advanced password recovery tool supporting GPU acceleration with OpenCL, NVIDIA CUDA, and Radeon ROCm. Learn more via the OpenBenchmarking.org test page.

Hashcat 6.2.4 - Benchmark: MD5 (H/s, More Is Better)
NVIDIA RTX 4070 SUPER: 67583033333 (SE +/- 22430807.19, N = 3)
Intel ARC A750: 31010500000 (SE +/- 260097949.50, N = 3)
Intel ARC A580: 26539200000 (SE +/- 254278908.55, N = 3)

Benchmark: MD5

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./hashcat: 3: ./hashcat.bin: not found

Hashcat 6.2.4 - Benchmark: SHA1 (H/s, More Is Better)
NVIDIA RTX 4070 SUPER: 22132600000 (SE +/- 5140363.15, N = 3)
Intel ARC A750: 5401850000 (SE +/- 65365351.42, N = 4)
Intel ARC A580: 4576025000 (SE +/- 49049760.02, N = 4)

Benchmark: SHA1

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./hashcat: 3: ./hashcat.bin: not found

Hashcat 6.2.4 - Benchmark: 7-Zip (H/s, More Is Better)
NVIDIA RTX 4070 SUPER: 1176467 (SE +/- 1991.93, N = 3)
Intel ARC A580: 249667 (SE +/- 66.67, N = 3)
Intel ARC A750: 246633 (SE +/- 240.37, N = 3)

Benchmark: 7-Zip

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./hashcat: 3: ./hashcat.bin: not found

Hashcat 6.2.4 - Benchmark: SHA-512 (H/s, More Is Better)
NVIDIA RTX 4070 SUPER: 3232733333 (SE +/- 1530068.99, N = 3)
Intel ARC A750: 943466667 (SE +/- 5228554.08, N = 3)
Intel ARC A580: 807266667 (SE +/- 1166666.67, N = 3)

Benchmark: SHA-512

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./hashcat: 3: ./hashcat.bin: not found

Hashcat 6.2.4 - Benchmark: TrueCrypt RIPEMD160 + XTS (H/s, More Is Better)
NVIDIA RTX 4070 SUPER: 802967 (SE +/- 633.33, N = 3)
Intel ARC A750: 328400 (SE +/- 200.00, N = 3)
Intel ARC A580: 277900 (SE +/- 57.74, N = 3)

Benchmark: TrueCrypt RIPEMD160 + XTS

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./hashcat: 3: ./hashcat.bin: not found
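The H/s unit in the Hashcat results counts candidate hashes computed per second. For a sense of what is being counted, here is a minimal single-threaded CPU sketch using Python's `hashlib`. It is not how Hashcat works internally, and the function name and candidate list are hypothetical.

```python
import hashlib
import time

def md5_hash_rate(candidates):
    """Hash every candidate with MD5; return (hashes per second, last hex digest)."""
    start = time.perf_counter()
    digest = ""
    for word in candidates:
        digest = hashlib.md5(word).hexdigest()
    # Guard against a zero reading on very coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return len(candidates) / elapsed, digest

# Hypothetical candidate passwords, encoded as bytes.
candidates = [str(i).encode() for i in range(100_000)]
rate, _ = md5_hash_rate(candidates)
print(f"{rate:,.0f} H/s on one CPU thread")
```

A single interpreter thread typically reaches millions of hashes per second at best, which puts the GPU figures above (tens of billions for MD5) in perspective.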

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.12 - Device: GPU - Batch Size: 1 - Model: VGG-16 (images/sec, More Is Better)
Intel ARC A750: 1.70
Intel ARC A580: 1.69
NVIDIA RTX 4070 SUPER: 1.35
Standard errors shown on the chart: SE +/- 0.01, N = 3; SE +/- 0.00, N = 3

Device: GPU - Batch Size: 1 - Model: VGG-16

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 1 - Model: AlexNet (images/sec, More Is Better)
Intel ARC A750: 15.79 (SE +/- 0.16, N = 3)
Intel ARC A580: 15.78 (SE +/- 0.16, N = 3)
NVIDIA RTX 4070 SUPER: 13.92 (SE +/- 0.22, N = 2)

Device: GPU - Batch Size: 1 - Model: AlexNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 16 - Model: VGG-16 (images/sec, More Is Better)
Intel ARC A580: 2.39 (SE +/- 0.01, N = 3)
Intel ARC A750: 2.39 (SE +/- 0.00, N = 3)
NVIDIA RTX 4070 SUPER: 1.48 (SE +/- 0.00, N = 2)

Device: GPU - Batch Size: 16 - Model: VGG-16

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 32 - Model: VGG-16 (images/sec, More Is Better)
Intel ARC A580: 2.42 (SE +/- 0.01, N = 3)
Intel ARC A750: 2.42 (SE +/- 0.00, N = 3)
NVIDIA RTX 4070 SUPER: 1.50 (SE +/- 0.00, N = 3)

Device: GPU - Batch Size: 32 - Model: VGG-16

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 64 - Model: VGG-16 (images/sec, More Is Better)
Intel ARC A750: 2.43 (SE +/- 0.00, N = 3)

Device: GPU - Batch Size: 64 - Model: VGG-16

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status (reported three times). E: UnboundLocalError: cannot access local variable 'decorators' where it is not associated with a value

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A580: The test quit with a non-zero exit status.

TensorFlow 2.12 - Device: GPU - Batch Size: 16 - Model: AlexNet (images/sec, More Is Better)
Intel ARC A750: 39.83 (SE +/- 0.42, N = 3)
Intel ARC A580: 39.55 (SE +/- 0.40, N = 3)
NVIDIA RTX 4070 SUPER: 31.59 (SE +/- 0.17, N = 3)

Device: GPU - Batch Size: 16 - Model: AlexNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 32 - Model: AlexNet (images/sec, More Is Better)
Intel ARC A580: 42.85 (SE +/- 0.03, N = 3)
Intel ARC A750: 42.79 (SE +/- 0.19, N = 3)
NVIDIA RTX 4070 SUPER: 33.40 (SE +/- 0.15, N = 2)

Device: GPU - Batch Size: 32 - Model: AlexNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 64 - Model: AlexNet (images/sec, More Is Better)
Intel ARC A750: 44.82
NVIDIA RTX 4070 SUPER: 33.97
Standard error shown on the chart: SE +/- 0.18, N = 3

Device: GPU - Batch Size: 64 - Model: AlexNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A580: The test quit with a non-zero exit status.

TensorFlow 2.12 - Device: GPU - Batch Size: 1 - Model: GoogLeNet (images/sec, More Is Better)
Intel ARC A580: 15.71 (SE +/- 0.05, N = 3)
Intel ARC A750: 15.48 (SE +/- 0.13, N = 3)
NVIDIA RTX 4070 SUPER: 12.62 (SE +/- 0.17, N = 2)

Device: GPU - Batch Size: 1 - Model: GoogLeNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 1 - Model: ResNet-50 (images/sec, More Is Better)
Intel ARC A750: 5.27
Intel ARC A580: 5.23
NVIDIA RTX 4070 SUPER: 4.35
Standard errors shown on the chart: SE +/- 0.06, N = 3; SE +/- 0.04, N = 8

Device: GPU - Batch Size: 1 - Model: ResNet-50

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 16 - Model: GoogLeNet (images/sec, More Is Better)
Intel ARC A580: 24.10 (SE +/- 0.20, N = 3)
Intel ARC A750: 24.05 (SE +/- 0.08, N = 3)
NVIDIA RTX 4070 SUPER: 15.67 (SE +/- 0.03, N = 3)

Device: GPU - Batch Size: 16 - Model: GoogLeNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 16 - Model: ResNet-50 (images/sec, More Is Better)
Intel ARC A580: 7.83 (SE +/- 0.02, N = 3)
Intel ARC A750: 7.83 (SE +/- 0.03, N = 3)
NVIDIA RTX 4070 SUPER: 5.46 (SE +/- 0.00, N = 2)

Device: GPU - Batch Size: 16 - Model: ResNet-50

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 32 - Model: GoogLeNet (images/sec, More Is Better)
Intel ARC A750: 26.47 (SE +/- 0.04, N = 3)
Intel ARC A580: 26.46 (SE +/- 0.06, N = 3)
NVIDIA RTX 4070 SUPER: 15.61 (SE +/- 0.01, N = 2)

Device: GPU - Batch Size: 32 - Model: GoogLeNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 32 - Model: ResNet-50 (images/sec, More Is Better)
Intel ARC A750: 8.07 (SE +/- 0.03, N = 3)
NVIDIA RTX 4070 SUPER: 5.51 (SE +/- 0.01, N = 2)

Device: GPU - Batch Size: 32 - Model: ResNet-50

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A580: The test quit with a non-zero exit status.

TensorFlow 2.12 - Device: GPU - Batch Size: 64 - Model: GoogLeNet (images/sec, More Is Better)
Intel ARC A750: 27.84
NVIDIA RTX 4070 SUPER: 15.52
Standard error shown on the chart: SE +/- 0.06, N = 3

Device: GPU - Batch Size: 64 - Model: GoogLeNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A580: The test quit with a non-zero exit status.

TensorFlow 2.12 - Device: GPU - Batch Size: 64 - Model: ResNet-50 (images/sec, More Is Better)
Intel ARC A750: 8.23 (SE +/- 0.01, N = 3)
NVIDIA RTX 4070 SUPER: 5.55 (SE +/- 0.01, N = 2)

Device: GPU - Batch Size: 64 - Model: ResNet-50

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A580: The test quit with a non-zero exit status.

TensorFlow 2.16.1 - Device: GPU - Batch Size: 1 - Model: VGG-16 (images/sec, More Is Better)
Intel ARC A580: 1.73 (SE +/- 0.01, N = 3)
Intel ARC A750: 1.71 (SE +/- 0.01, N = 3)

TensorFlow 2.16.1 - Device: GPU - Batch Size: 1 - Model: AlexNet (images/sec, More Is Better)
Intel ARC A750: 15.93 (SE +/- 0.11, N = 12)
Intel ARC A580: 15.81 (SE +/- 0.15, N = 6)

TensorFlow 2.16.1 - Device: GPU - Batch Size: 16 - Model: VGG-16 (images/sec, More Is Better)
Intel ARC A580: 2.40 (SE +/- 0.00, N = 3)
Intel ARC A750: 2.40 (SE +/- 0.00, N = 3)

TensorFlow 2.16.1 - Device: GPU - Batch Size: 32 - Model: VGG-16 (images/sec, More Is Better)
Intel ARC A750: 2.43 (SE +/- 0.00, N = 3)
Intel ARC A580: 2.42 (SE +/- 0.01, N = 3)

TensorFlow 2.16.1 - Device: GPU - Batch Size: 64 - Model: VGG-16 (images/sec, More Is Better)
Intel ARC A580: 2.43 (SE +/- 0.01, N = 2)
Intel ARC A750: 2.43 (SE +/- 0.01, N = 3)

TensorFlow 2.16.1 - Device: GPU - Batch Size: 16 - Model: AlexNet (images/sec, More Is Better)
Intel ARC A580: 39.74 (SE +/- 0.33, N = 8)
Intel ARC A750: 39.62 (SE +/- 0.26, N = 15)

TensorFlow 2.16.1 - Device: GPU - Batch Size: 32 - Model: AlexNet (images/sec, More Is Better)
Intel ARC A580: 43.09 (SE +/- 0.27, N = 3)
Intel ARC A750: 43.01 (SE +/- 0.23, N = 3)

TensorFlow 2.16.1 - Device: GPU - Batch Size: 64 - Model: AlexNet (images/sec, More Is Better)
Intel ARC A750: 45.57 (SE +/- 0.18, N = 3)
Intel ARC A580: 44.61 (SE +/- 0.12, N = 3)

TensorFlow 2.16.1 - Device: GPU - Batch Size: 1 - Model: GoogLeNet (images/sec, More Is Better)
Intel ARC A750: 15.75 (SE +/- 0.10, N = 3)
Intel ARC A580: 15.68 (SE +/- 0.07, N = 3)

TensorFlow 2.16.1 - Device: GPU - Batch Size: 1 - Model: ResNet-50 (images/sec, More Is Better)
Intel ARC A750: 5.25 (SE +/- 0.05, N = 8)
Intel ARC A580: 5.23 (SE +/- 0.05, N = 4)

TensorFlow 2.16.1 - Device: GPU - Batch Size: 16 - Model: GoogLeNet (images/sec, More Is Better)
Intel ARC A580: 24.13 (SE +/- 0.20, N = 3)
Intel ARC A750: 23.95 (SE +/- 0.03, N = 3)

TensorFlow 2.16.1 - Device: GPU - Batch Size: 16 - Model: ResNet-50 (images/sec, More Is Better)
Intel ARC A580: 7.87 (SE +/- 0.01, N = 3)
Intel ARC A750: 7.87 (SE +/- 0.02, N = 3)

TensorFlow 2.16.1 - Device: GPU - Batch Size: 32 - Model: GoogLeNet (images/sec, More Is Better)
Intel ARC A750: 26.52 (SE +/- 0.03, N = 3)
Intel ARC A580: 26.34 (SE +/- 0.06, N = 3)

TensorFlow 2.16.1 - Device: GPU - Batch Size: 32 - Model: ResNet-50 (images/sec, More Is Better)
Intel ARC A750: 8.08 (SE +/- 0.06, N = 3)

Device: GPU - Batch Size: 32 - Model: ResNet-50

Intel ARC A580: The test quit with a non-zero exit status.

TensorFlow 2.16.1 - Device: GPU - Batch Size: 64 - Model: GoogLeNet (images/sec, More Is Better)
Intel ARC A750: 27.95 (SE +/- 0.01, N = 3)

Device: GPU - Batch Size: 64 - Model: GoogLeNet

Intel ARC A580: The test quit with a non-zero exit status.

TensorFlow 2.16.1 - Device: GPU - Batch Size: 64 - Model: ResNet-50 (images/sec, More Is Better)
Intel ARC A750: 8.20 (SE +/- 0.02, N = 3)

Device: GPU - Batch Size: 64 - Model: ResNet-50

Intel ARC A580: The test quit with a non-zero exit status.
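The images/sec figures in the TensorFlow results are throughput numbers. A minimal sketch of the arithmetic follows, assuming throughput is simply images processed divided by wall-clock time. The function name and sample figures are hypothetical, and tf_cnn_benchmarks.py may exclude warm-up steps or average differently.

```python
def images_per_second(batch_size, num_batches, elapsed_seconds):
    """Throughput = total images processed / wall-clock seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return batch_size * num_batches / elapsed_seconds

# Hypothetical run: 100 batches of 16 images completed in 40 seconds.
print(f"{images_per_second(16, 100, 40.0):.2f} images/sec")
```

Larger batches amortize per-step overhead, which is one plausible reason throughput on this page tends to rise with batch size.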

IndigoBench

This is a test of Indigo Renderer's IndigoBench benchmark. Learn more via the OpenBenchmarking.org test page.

IndigoBench 4.4 - Acceleration: OpenCL GPU - Scene: Bedroom (M samples/s, More Is Better)
NVIDIA RTX 4070 SUPER: 19.801 (SE +/- 0.009, N = 3)
Intel ARC A750: 9.302 (SE +/- 0.011, N = 3)
Intel ARC A580: 9.257 (SE +/- 0.004, N = 3)

IndigoBench 4.4 - Acceleration: OpenCL GPU - Scene: Supercar (M samples/s, More Is Better)
NVIDIA RTX 4070 SUPER: 52.81 (SE +/- 0.03, N = 3)
Intel ARC A580: 20.17 (SE +/- 0.01, N = 3)
Intel ARC A750: 19.71 (SE +/- 0.02, N = 3)

IndigoBench 4.4 - Acceleration: CPU - Scene: Bedroom (M samples/s, More Is Better)
Intel ARC A580: 4.978 (SE +/- 0.015, N = 3)
Intel ARC A750: 4.932 (SE +/- 0.046, N = 3)

IndigoBench 4.4 - Acceleration: CPU - Scene: Supercar (M samples/s, More Is Better)
Intel ARC A580: 12.86 (SE +/- 0.13, N = 3)
Intel ARC A750: 12.53 (SE +/- 0.03, N = 3)

ParaView

ParaView 5.13 - Test: Many Spheres - Resolution: 1920 x 1080 (MiPolys / Sec, More Is Better)
Intel ARC A750: 7245.45 (SE +/- 80.48, N = 5)
Intel ARC A580: 6519.45 (SE +/- 98.62, N = 15)

ParaView 5.13 - Test: Wavelet Contour - Resolution: 1920 x 1080 (MiPolys / Sec, More Is Better)
Intel ARC A750: 2898.67 (SE +/- 10.34, N = 3)
Intel ARC A580: 2633.04 (SE +/- 28.63, N = 4)

ParaView 5.13 - Test: Wavelet Volume - Resolution: 1920 x 1080 (MiVoxels / Sec, More Is Better)
Intel ARC A750: 3901.28 (SE +/- 73.43, N = 15)
Intel ARC A580: 3692.08 (SE +/- 35.11, N = 15)

FAHBench

FAHBench is a Folding@Home benchmark on the GPU. Learn more via the OpenBenchmarking.org test page.

FAHBench 2.3.2 (Ns Per Day, More Is Better)
NVIDIA RTX 4070 SUPER: 366.06 (SE +/- 0.39, N = 3)

Intel ARC A770 8Gb: The test run did not produce a result.

Intel ARC A750: The test run did not produce a result.

Intel ARC A580: The test run did not produce a result.
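
The "SE +/- x, N = n" annotation attached to each result in this file is a standard error over N runs. As a rough sketch only, assuming the conventional definition (sample standard deviation divided by the square root of N; this page does not show the exact formula the Phoronix Test Suite uses), it can be computed as follows. The run values and the helper name `standard_error` are made up for illustration and are not taken from the result file.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical per-run values (NOT taken from the result file).
runs = [19.79, 19.80, 19.82]

mean = statistics.mean(runs)
se = standard_error(runs)
print(f"{mean:.3f} (SE +/- {se:.3f}, N = {len(runs)})")
```

A large SE relative to the average, as on some of the NCNN results for the RTX 4070 SUPER, suggests the individual runs varied widely.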

MandelGPU

MandelGPU is an OpenCL benchmark and this test runs with the OpenCL rendering float4 kernel with a maximum of 4096 iterations. Learn more via the OpenBenchmarking.org test page.
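
For readers unfamiliar with the workload, the per-sample work in a Mandelbrot renderer is the escape-time iteration, capped here at the 4096 iterations quoted above. The following is a plain-Python sketch of that iteration only. It is not MandelGPU's OpenCL float4 kernel, and the function name `escape_time` is illustrative.

```python
MAX_ITERATIONS = 4096  # iteration cap quoted in the test description


def escape_time(c, max_iter=MAX_ITERATIONS):
    """Count iterations of z = z*z + c before |z| exceeds 2.

    Returns max_iter if the point never escapes within the cap.
    """
    z = 0j
    for i in range(max_iter):
        if abs(z) > 2.0:
            return i
        z = z * z + c
    return max_iter


# The origin never escapes, so it costs the full iteration budget;
# a point well outside the set escapes after a couple of iterations.
print(escape_time(0j))
print(escape_time(1 + 1j))
```

Points inside the set are the expensive ones, which is why a high iteration cap makes this a compute-heavy test.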

MandelGPU 1.3pts1 - OpenCL Device: GPU (Samples/sec, More Is Better)
  NVIDIA RTX 4070 SUPER: 587219538.2 (SE +/- 467034.80, N = 3)
  Intel ARC A750: 279683403.2 (SE +/- 5085622.05, N = 15)
  Intel ARC A580: 248630520.1 (SE +/- 600066.13, N = 3)
(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

GLmark2

This is a test of GLmark2, a basic OpenGL and OpenGL ES 2.0 benchmark supporting various windowing/display back-ends. Learn more via the OpenBenchmarking.org test page.

Resolution: $VIDEO_WIDTH x $VIDEO_HEIGHT

Intel ARC A750: The test quit with a non-zero exit status. E: ./glmark2: 2: ./bin/glmark2: not found

Intel ARC A580: The test quit with a non-zero exit status. E: ./glmark2: 2: ./bin/glmark2: not found

LuxMark

LuxMark is a multi-platform OpenCL benchmark using LuxRender. LuxMark supports targeting different OpenCL devices and has multiple scenes available for rendering. LuxMark is a fully open-source OpenCL program with real-world rendering examples. Learn more via the OpenBenchmarking.org test page.

LuxMark 3.1 - OpenCL Device: GPU - Scene: Hotel (Score, More Is Better)
  Intel ARC A750: 13236 (SE +/- 25.67, N = 3)
  Intel ARC A580: 11015 (SE +/- 11.33, N = 3)

LuxMark 3.1 - OpenCL Device: CPU+GPU - Scene: Hotel (Score, More Is Better)
  Intel ARC A750: 13262 (SE +/- 0.33, N = 3)
  Intel ARC A580: 11010 (SE +/- 1.76, N = 3)

LuxMark 3.1 - OpenCL Device: GPU - Scene: Microphone (Score, More Is Better)
  Intel ARC A750: 45988 (SE +/- 191.00, N = 3)
  Intel ARC A580: 38157 (SE +/- 58.92, N = 3)

LuxMark 3.1 - OpenCL Device: GPU - Scene: Luxball HDR (Score, More Is Better)
  Intel ARC A750: 60251 (SE +/- 144.44, N = 3)
  Intel ARC A580: 50588 (SE +/- 167.49, N = 3)

LuxMark 3.1 - OpenCL Device: CPU+GPU - Scene: Microphone (Score, More Is Better)
  Intel ARC A750: 45844 (SE +/- 14.17, N = 3)
  Intel ARC A580: 38224 (SE +/- 9.50, N = 3)

LuxMark 3.1 - OpenCL Device: CPU+GPU - Scene: Luxball HDR (Score, More Is Better)
  Intel ARC A750: 60539 (SE +/- 154.09, N = 3)
  Intel ARC A580: 50788 (SE +/- 62.27, N = 3)

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark provides various OpenCL compute and memory bandwidth micro-benchmarks. Learn more via the OpenBenchmarking.org test page.

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: FP64 Compute (TFLOPs/s, More Is Better)
  NVIDIA RTX 4070 SUPER: 0.621 (SE +/- 0.000, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: FP64 Compute

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: FP32 Compute (TFLOPs/s, More Is Better)
  NVIDIA RTX 4070 SUPER: 38.594 (SE +/- 0.031, N = 3)
  Intel ARC A750: 10.282 (SE +/- 0.077, N = 3)
  Intel ARC A580: 9.002 (SE +/- 0.013, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: FP32 Compute

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |
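
For a "More Is Better" metric such as FP32 throughput, one way to compare devices is to normalize each result to the fastest one. A minimal sketch, using the three FP32 averages as read from the result above; the helper name `relative_to_best` is illustrative and not part of any benchmark tool.

```python
def relative_to_best(results):
    """Map each device to its throughput as a fraction of the best throughput."""
    if not results:
        raise ValueError("no results to compare")
    best = max(results.values())
    return {device: value / best for device, value in results.items()}


# FP32 Compute averages in TFLOPs/s (More Is Better), from the result above.
fp32_tflops = {
    "NVIDIA RTX 4070 SUPER": 38.594,
    "Intel ARC A750": 10.282,
    "Intel ARC A580": 9.002,
}

for device, fraction in relative_to_best(fp32_tflops).items():
    print(f"{device}: {fraction:.1%} of the fastest result")
```

For "Fewer Is Better" metrics (times in ms or seconds), the same idea applies with the minimum as the reference and the ratio inverted.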

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: FP16 Compute (TFLOPs/s, More Is Better)
  Intel ARC A750: 15.66 (SE +/- 0.01, N = 3)
  Intel ARC A580: 13.64 (SE +/- 0.03, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: INT64 Compute (TIOPs/s, More Is Better)
  NVIDIA RTX 4070 SUPER: 4.214 (SE +/- 0.015, N = 3)
  Intel ARC A750: 1.022 (SE +/- 0.075, N = 3)
  Intel ARC A580: 1.013 (SE +/- 0.004, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: INT64 Compute

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: INT32 Compute (TIOPs/s, More Is Better)
  NVIDIA RTX 4070 SUPER: 19.889 (SE +/- 0.002, N = 3)
  Intel ARC A750: 5.160 (SE +/- 0.039, N = 3)
  Intel ARC A580: 4.096 (SE +/- 0.018, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: INT32 Compute

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: INT16 Compute (TIOPs/s, More Is Better)
  Intel ARC A750: 25.39 (SE +/- 1.03, N = 3)
  Intel ARC A580: 24.06 (SE +/- 0.17, N = 3)
  NVIDIA RTX 4070 SUPER: 17.17 (SE +/- 0.00, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: INT16 Compute

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: INT8 Compute (TIOPs/s, More Is Better)
  NVIDIA RTX 4070 SUPER: 14.307 (SE +/- 0.046, N = 3)
  Intel ARC A750: 9.550 (SE +/- 0.058, N = 3)
  Intel ARC A580: 8.701 (SE +/- 0.057, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: INT8 Compute

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

ArrayFire

ArrayFire is a GPU and CPU numeric processing library; this test uses the built-in CPU and OpenCL ArrayFire benchmarks. Learn more via the OpenBenchmarking.org test page.

Test: Conjugate Gradient OpenCL

NVIDIA RTX 4070 SUPER: The test run did not produce a result. E: arrayfire: line 3: ./cg_opencl: No such file or directory

Intel ARC A750: The test run did not produce a result. E: ./arrayfire: 3: ./cg_opencl: not found

Intel ARC A580: The test run did not produce a result. E: ./arrayfire: 3: ./cg_opencl: not found

Caffe

This is a benchmark of the Caffe deep learning framework. It currently supports the AlexNet and GoogleNet models and execution on both CPUs and NVIDIA GPUs. Learn more via the OpenBenchmarking.org test page.

Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 100

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status. E: @ 0x7dd7c6de3450 google::LogMessageFatal::~LogMessageFatal()

Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 200

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status. E: @ 0x7b5ea59be450 google::LogMessageFatal::~LogMessageFatal()

Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 1000

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status. E: @ 0x7670bcda4450 google::LogMessageFatal::~LogMessageFatal()

Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 100

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status. E: @ 0x73552c3e3450 google::LogMessageFatal::~LogMessageFatal()

Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 200

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status. E: @ 0x7d7151816450 google::LogMessageFatal::~LogMessageFatal()

Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 1000

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status. E: @ 0x74746a490450 google::LogMessageFatal::~LogMessageFatal()

OpenArena

OpenArena 0.8.8 - Resolution: 1920 x 1080 - Total Frame Time (Milliseconds, Fewer Is Better)
  Intel ARC A580: Min: 1 / Avg: 1.99 / Max: 14
  Intel ARC A750: Min: 1 / Avg: 2.05 / Max: 14

VkResample

VkResample is a Vulkan-based image upscaling library based on VkFFT. The test upscales a sample 4K image to 8K using Vulkan-based GPU acceleration. Learn more via the OpenBenchmarking.org test page.

VkResample 1.0 - Upscale: 2x - Precision: Double (ms, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 339.59 (SE +/- 0.30, N = 3)
(CXX) g++ options: -O3

Upscale: 2x - Precision: Double

Intel ARC A770 8Gb: The test quit with a non-zero exit status.

Intel ARC A750: The test quit with a non-zero exit status.

Intel ARC A580: The test quit with a non-zero exit status.

VkResample 1.0 - Upscale: 2x - Precision: Single (ms, Fewer Is Better)
  Intel ARC A770 8Gb: 18.20 (SE +/- 0.00, N = 3)
  NVIDIA RTX 4070 SUPER: 18.49 (SE +/- 0.00, N = 3)
  Intel ARC A750: 18.89 (SE +/- 0.04, N = 3)
  Intel ARC A580: 20.41 (SE +/- 0.02, N = 3)
(CXX) g++ options: -O3

FinanceBench

FinanceBench is a collection of financial program benchmarks with support for benchmarking on the GPU via OpenCL and CPU benchmarking with OpenMP. The FinanceBench test cases are focused on the Black-Scholes-Merton process with an analytic European option engine, the QMC (Sobol) Monte-Carlo method (equity option example), bonds (fixed-rate bond with flat forward curve), and repo (securities repurchase agreement). FinanceBench was originally written by the Cavazos Lab at University of Delaware. Learn more via the OpenBenchmarking.org test page.
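
To give a sense of what the Black-Scholes test case prices, here is a minimal sketch of the textbook closed-form price of a European call option. It is not FinanceBench's OpenCL kernel; the function names and the sample inputs are illustrative assumptions.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(spot, strike, rate, volatility, years):
    """Textbook Black-Scholes price of a European call (no dividends)."""
    if min(spot, strike, volatility, years) <= 0:
        raise ValueError("spot, strike, volatility and years must be positive")
    sqrt_t = math.sqrt(years)
    d1 = (math.log(spot / strike)
          + (rate + 0.5 * volatility ** 2) * years) / (volatility * sqrt_t)
    d2 = d1 - volatility * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * years) * norm_cdf(d2)


# Illustrative inputs: at-the-money option, 5% rate, 20% volatility, 1 year.
price = black_scholes_call(spot=100.0, strike=100.0, rate=0.05,
                           volatility=0.2, years=1.0)
print(f"call price: {price:.4f}")
```

A GPU benchmark of this kind evaluates the same formula independently for a large batch of options, which is why it parallelizes well.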

FinanceBench 2016-07-25 - Benchmark: Black-Scholes OpenCL (ms, Fewer Is Better)
  Intel ARC A580: 4.816 (SE +/- 0.279, N = 15)
  Intel ARC A750: 4.842 (SE +/- 0.213, N = 12)
  NVIDIA RTX 4070 SUPER: 5.912 (SE +/- 0.114, N = 15)
(CXX) g++ options: -O3 -march=native -fopenmp

NCNN

NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.

Target: Vulkan GPU

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status. E: ncnn: line 3: ./benchncnn: No such file or directory

NCNN 20230517 - Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 8.62 (SE +/- 0.47, N = 9; MIN: 6.42 / MAX: 1101.3)
  Intel ARC A750: 76.44 (SE +/- 0.84, N = 3; MIN: 9.56 / MAX: 84)
  Intel ARC A770 8Gb: 79.79 (SE +/- 0.36, N = 3; MIN: 21.72 / MAX: 84.6)
  Intel ARC A580: 80.53 (SE +/- 0.40, N = 3; MIN: 12.67 / MAX: 84.4)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 3.03 (SE +/- 0.44, N = 9)
  Intel ARC A750: 38.33 (SE +/- 2.78, N = 3; MIN: 4.08 / MAX: 72.04)
  Intel ARC A580: 51.04 (SE +/- 2.81, N = 3; MIN: 4.14 / MAX: 72.47)
  Intel ARC A770 8Gb: 57.30 (SE +/- 5.70, N = 3; MIN: 4.06 / MAX: 71.75)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 2.25 (SE +/- 0.16, N = 9)
  Intel ARC A750: 9.37 (SE +/- 3.80, N = 3; MIN: 4.43 / MAX: 84.15)
  Intel ARC A770 8Gb: 15.46 (SE +/- 5.47, N = 3; MIN: 4.32 / MAX: 85.45)
  Intel ARC A580: 29.04 (SE +/- 13.05, N = 3; MIN: 4.39 / MAX: 85.51)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 2.31 (SE +/- 0.34, N = 8)
  Intel ARC A750: 4.87 (SE +/- 0.01, N = 3; MIN: 4.71 / MAX: 5.19)
  Intel ARC A770 8Gb: 5.93 (SE +/- 0.64, N = 3; MIN: 4.68 / MAX: 91.15)
  Intel ARC A580: 37.05 (SE +/- 23.69, N = 3; MIN: 4.64 / MAX: 94.33)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 3.85 (SE +/- 1.31, N = 9; MIN: 1.89 / MAX: 1093.29)
  Intel ARC A750: 12.00 (SE +/- 0.84, N = 3; MIN: 4.04 / MAX: 70.29)
  Intel ARC A580: 15.70 (SE +/- 7.50, N = 3; MIN: 3.96 / MAX: 70.57)
  Intel ARC A770 8Gb: 17.29 (SE +/- 4.53, N = 3; MIN: 3.85 / MAX: 70.64)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 5.07 (SE +/- 0.97, N = 9)
  Intel ARC A750: 54.56 (SE +/- 2.82, N = 3; MIN: 6.64 / MAX: 119.78)
  Intel ARC A770 8Gb: 95.55 (SE +/- 1.81, N = 3; MIN: 6.7 / MAX: 121.28)
  Intel ARC A580: 106.56 (SE +/- 3.44, N = 3; MIN: 6.77 / MAX: 121.55)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 0.84 (SE +/- 0.04, N = 9)
  Intel ARC A750: 7.71 (SE +/- 2.78, N = 3; MIN: 2.58 / MAX: 55.58)
  Intel ARC A580: 24.06 (SE +/- 9.84, N = 3; MIN: 2.55 / MAX: 57)
  Intel ARC A770 8Gb: 27.01 (SE +/- 13.01, N = 3; MIN: 2.51 / MAX: 56.02)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 11.04 (SE +/- 1.21, N = 9; MIN: 5.28 / MAX: 1769.19)
  Intel ARC A750: 93.10 (SE +/- 2.20, N = 3; MIN: 8.42 / MAX: 113.98)
  Intel ARC A770 8Gb: 105.35 (SE +/- 0.15, N = 3; MIN: 8.44 / MAX: 114.62)
  Intel ARC A580: 106.32 (SE +/- 1.35, N = 3; MIN: 8.23 / MAX: 114.52)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better)
  Intel ARC A750: 45.54 (SE +/- 0.12, N = 3; MIN: 23.39 / MAX: 48.99)
  Intel ARC A770 8Gb: 46.04 (SE +/- 0.14, N = 3; MIN: 28.48 / MAX: 48.53)
  Intel ARC A580: 46.17 (SE +/- 0.28, N = 3; MIN: 29.78 / MAX: 48.34)
  NVIDIA RTX 4070 SUPER: 117.81 (SE +/- 29.60, N = 9; MIN: 17.16 / MAX: 647.67)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 8.97 (SE +/- 3.49, N = 9; MIN: 3.94 / MAX: 922.04)
  Intel ARC A750: 43.81 (SE +/- 0.87, N = 3; MIN: 5.02 / MAX: 51.96)
  Intel ARC A770 8Gb: 47.11 (SE +/- 0.89, N = 3; MIN: 4.95 / MAX: 51.24)
  Intel ARC A580: 47.58 (SE +/- 0.19, N = 3; MIN: 5.09 / MAX: 52.05)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 16.17 (SE +/- 5.86, N = 9; MIN: 3.52 / MAX: 436.52)
  Intel ARC A750: 22.00 (SE +/- 0.16, N = 3; MIN: 3.54 / MAX: 25.29)
  Intel ARC A770 8Gb: 23.30 (SE +/- 0.18, N = 3; MIN: 3.6 / MAX: 25.53)
  Intel ARC A580: 23.41 (SE +/- 0.05, N = 3; MIN: 3.58 / MAX: 25.27)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 46.26 (SE +/- 14.70, N = 9; MIN: 7.71 / MAX: 1829.99)
  Intel ARC A750: 87.56 (SE +/- 1.03, N = 3; MIN: 10.81 / MAX: 101.73)
  Intel ARC A770 8Gb: 93.18 (SE +/- 1.06, N = 3; MIN: 10.54 / MAX: 100.7)
  Intel ARC A580: 93.48 (SE +/- 0.83, N = 3; MIN: 10.45 / MAX: 101.32)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better)
  Intel ARC A750: 48.77 (SE +/- 0.09, N = 3; MIN: 16.94 / MAX: 52.55)
  Intel ARC A770 8Gb: 49.15 (SE +/- 0.10, N = 3; MIN: 20.41 / MAX: 52.43)
  Intel ARC A580: 49.56 (SE +/- 0.02, N = 3; MIN: 18.28 / MAX: 52.1)
  NVIDIA RTX 4070 SUPER: 63.82 (SE +/- 10.56, N = 9; MIN: 10.28 / MAX: 858.44)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 6.86 (SE +/- 1.76, N = 9)
  Intel ARC A750: 89.54 (SE +/- 1.33, N = 3; MIN: 7.64 / MAX: 108.48)
  Intel ARC A770 8Gb: 100.40 (SE +/- 0.61, N = 3; MIN: 7.63 / MAX: 107.86)
  Intel ARC A580: 101.57 (SE +/- 0.58, N = 3; MIN: 7.63 / MAX: 108.58)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 11.11 (SE +/- 3.28, N = 9)
  Intel ARC A750: 243.66 (SE +/- 13.66, N = 3; MIN: 23.78 / MAX: 525.86)
  Intel ARC A770 8Gb: 453.80 (SE +/- 11.18, N = 3; MIN: 23.74 / MAX: 528.85)
  Intel ARC A580: 462.80 (SE +/- 2.21, N = 3; MIN: 23.85 / MAX: 530.99)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: vision_transformer (ms, Fewer Is Better)
  Intel ARC A750: 116.35 (SE +/- 0.39, N = 3)
  Intel ARC A580: 119.00 (SE +/- 0.39, N = 3)
  Intel ARC A770 8Gb: 119.05 (SE +/- 0.24, N = 3)
  NVIDIA RTX 4070 SUPER: 844.61 (SE +/- 87.53, N = 9; MIN: 46.34 / MAX: 1866.93)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: FastestDet (ms, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 2.86 (SE +/- 0.29, N = 9)
  Intel ARC A750: 61.47 (SE +/- 4.25, N = 3; MIN: 5.39 / MAX: 102)
  Intel ARC A770 8Gb: 88.99 (SE +/- 2.65, N = 3; MIN: 5.39 / MAX: 101.65)
  Intel ARC A580: 93.54 (SE +/- 1.06, N = 3; MIN: 5.44 / MAX: 102.43)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3 (ms, Fewer Is Better)
  Intel ARC A750: 76.44 (SE +/- 0.84, N = 3; MIN: 9.56 / MAX: 84)
  Intel ARC A770 8Gb: 79.79 (SE +/- 0.36, N = 3; MIN: 21.72 / MAX: 84.6)
  Intel ARC A580: 80.53 (SE +/- 0.40, N = 3; MIN: 12.67 / MAX: 84.4)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

RealSR-NCNN

RealSR-NCNN is an NCNN neural network implementation of the RealSR project and accelerated using the Vulkan API. RealSR is the Real-World Super Resolution via Kernel Estimation and Noise Injection. NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. This test profile times how long it takes to increase the resolution of a sample image by a scale of 4x with Vulkan. Learn more via the OpenBenchmarking.org test page.

RealSR-NCNN 20200818 - Scale: 4x - TAA: No (Seconds, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 6.323 (SE +/- 0.150, N = 15)
  Intel ARC A770 8Gb: 9.808 (SE +/- 0.018, N = 3)
  Intel ARC A750: 10.356 (SE +/- 0.008, N = 3)
  Intel ARC A580: 11.099 (SE +/- 0.021, N = 3)

RealSR-NCNN 20200818 - Scale: 4x - TAA: Yes (Seconds, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 34.89 (SE +/- 0.02, N = 3)
  Intel ARC A770 8Gb: 61.86 (SE +/- 0.02, N = 3)
  Intel ARC A750: 66.07 (SE +/- 0.01, N = 3)
  Intel ARC A580: 72.28 (SE +/- 0.01, N = 3)

Waifu2x-NCNN Vulkan

Waifu2x-NCNN is an NCNN neural network implementation of the Waifu2x converter project and accelerated using the Vulkan API. NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. This test profile times how long it takes to increase the resolution of a sample image with Vulkan. Learn more via the OpenBenchmarking.org test page.

Scale: 2x - Denoise: 3 - TAA: No

NVIDIA RTX 4070 SUPER: The test run did not produce a result.

Intel ARC A770 8Gb: The test run did not produce a result.

Intel ARC A750: The test run did not produce a result.

Intel ARC A580: The test run did not produce a result.

Waifu2x-NCNN Vulkan 20200818 - Scale: 2x - Denoise: 3 - TAA: Yes (Seconds, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 2.855 (SE +/- 0.014, N = 3)
  Intel ARC A770 8Gb: 5.146 (SE +/- 0.019, N = 3)
  Intel ARC A750: 5.291 (SE +/- 0.006, N = 3)
  Intel ARC A580: 5.594 (SE +/- 0.011, N = 3)

Blender

Blender is an open-source 3D creation and modeling software project. This test is of Blender's Cycles performance with various sample files. GPU computing via NVIDIA OptiX and NVIDIA CUDA is currently supported as well as HIP for AMD Radeon GPUs and Intel oneAPI for Intel Graphics. Learn more via the OpenBenchmarking.org test page.

Blender 4.0 - Blend File: BMW27 - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 5.57 (SE +/- 0.06, N = 13)

Blender 4.0 - Blend File: Classroom - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 12.60 (SE +/- 0.00, N = 3)

Blender 4.0 - Blend File: Fishy Cat - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 9.45 (SE +/- 0.06, N = 13)

Blender 4.0 - Blend File: Barbershop - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 51.30 (SE +/- 0.10, N = 3)

Blender 4.0 - Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 14.29 (SE +/- 0.03, N = 3)

Betsy GPU Compressor

Betsy is an open-source GPU compressor of various GPU compression techniques. Betsy is written in GLSL for Vulkan/OpenGL (compute shader) support for GPU-based texture compression. Learn more via the OpenBenchmarking.org test page.

Codec: ETC1 - Quality: Highest

Intel ARC A750: The test quit with a non-zero exit status. E: ./betsy: 3: ./betsy: not found

Intel ARC A580: The test quit with a non-zero exit status. E: ./betsy: 3: ./betsy: not found

Codec: ETC2 RGB - Quality: Highest

Intel ARC A750: The test quit with a non-zero exit status. E: ./betsy: 3: ./betsy: not found

Intel ARC A580: The test quit with a non-zero exit status. E: ./betsy: 3: ./betsy: not found

Darktable

Darktable is an open-source photography / workflow application. This test will use any system-installed Darktable program or, on Windows, will automatically download the pre-built binary from the project. Learn more via the OpenBenchmarking.org test page.

Darktable 4.8.1 - Test: Boat - Acceleration: OpenCL (Seconds, Fewer Is Better)
  Intel ARC A750: 2.900 (SE +/- 0.011, N = 3)
  Intel ARC A580: 2.916 (SE +/- 0.004, N = 3)

Darktable 4.8.1 - Test: Boat - Acceleration: CPU-only (Seconds, Fewer Is Better)
  Intel ARC A750: 2.905 (SE +/- 0.012, N = 3)
  Intel ARC A580: 2.920 (SE +/- 0.005, N = 3)

Darktable 4.8.1 - Test: Masskrug - Acceleration: OpenCL (Seconds, Fewer Is Better)
  Intel ARC A580: 1.680 (SE +/- 0.006, N = 3)
  Intel ARC A750: 1.690 (SE +/- 0.005, N = 3)

Darktable 4.8.1 - Test: Masskrug - Acceleration: CPU-only (Seconds, Fewer Is Better)
  Intel ARC A580: 1.688 (SE +/- 0.005, N = 3)
  Intel ARC A750: 1.689 (SE +/- 0.009, N = 3)

Darktable 4.8.1 - Test: Server Rack - Acceleration: OpenCL (Seconds, Fewer Is Better)
  Intel ARC A750: 0.172 (SE +/- 0.000, N = 3)
  Intel ARC A580: 0.176 (SE +/- 0.002, N = 3)

Darktable 4.8.1 - Test: Server Room - Acceleration: OpenCL (Seconds, Fewer Is Better)
  Intel ARC A750: 1.362 (SE +/- 0.001, N = 3)
  Intel ARC A580: 1.385 (SE +/- 0.005, N = 3)

Darktable 4.8.1 - Test: Server Rack - Acceleration: CPU-only (Seconds, Fewer Is Better)
  Intel ARC A580: 0.172 (SE +/- 0.001, N = 3)
  Intel ARC A750: 0.173 (SE +/- 0.001, N = 3)

Darktable 4.8.1 - Test: Server Room - Acceleration: CPU-only (Seconds, Fewer Is Better)
  Intel ARC A580: 1.373 (SE +/- 0.004, N = 3)
  Intel ARC A750: 1.381 (SE +/- 0.008, N = 3)

184 Results Shown

VkFFT:
  FFT + iFFT R2C / C2R
  FFT + iFFT C2C 1D batched in half precision
  FFT + iFFT C2C Bluestein in single precision
  FFT + iFFT C2C 1D batched in double precision
  FFT + iFFT C2C 1D batched in single precision
  FFT + iFFT C2C multidimensional in single precision
  FFT + iFFT C2C Bluestein benchmark in double precision
  FFT + iFFT C2C 1D batched in single precision, no reshuffling
  FFT + iFFT R2C / C2R
  FFT + iFFT C2C 1D batched in half precision
  FFT + iFFT C2C Bluestein in single precision
  FFT + iFFT C2C 1D batched in single precision
  FFT + iFFT C2C multidimensional in single precision
  FFT + iFFT C2C 1D batched in single precision, no reshuffling
SPECViewPerf 2020:
  1920 x 1080 - SNX-04
  1920 x 1080 - CREO-03
  1920 x 1080 - MAYA-06
  1920 x 1080 - CATIA-06
  1920 x 1080 - ENERGY-03
  1920 x 1080 - MEDICAL-O3
  1920 x 1080 - SOLIDWORKS-07
NeatBench
ParaView:
  Many Spheres - 1920 x 1080
  Wavelet Volume - 1920 x 1080
  Wavelet Contour - 1920 x 1080
Unigine Valley
OpenArena
Unigine Heaven
Xonotic:
  1920 x 1080 - Low
  1920 x 1080 - High
  1920 x 1080 - Ultra
  1920 x 1080 - Ultimate
ProjectPhysX OpenCL-Benchmark:
  Memory Bandwidth Coalesced Read
  Memory Bandwidth Coalesced Write
cl-mem:
  Copy
  Read
  Write
ViennaCL:
  CPU BLAS - sCOPY
  CPU BLAS - sAXPY
  CPU BLAS - sDOT
  CPU BLAS - dCOPY
  CPU BLAS - dAXPY
  CPU BLAS - dDOT
  CPU BLAS - dGEMV-N
  CPU BLAS - dGEMV-T
  OpenCL BLAS - sCOPY
  OpenCL BLAS - sAXPY
  OpenCL BLAS - sDOT
  OpenCL BLAS - dCOPY
  OpenCL BLAS - dAXPY
  OpenCL BLAS - dDOT
  OpenCL BLAS - dGEMV-N
  OpenCL BLAS - dGEMV-T
SHOC Scalable HeterOgeneous Computing:
  OpenCL - Triad
  OpenCL - Reduction
  OpenCL - Bus Speed Download
  OpenCL - Bus Speed Readback
  OpenCL - Texture Read Bandwidth
clpeak:
  Global Memory Bandwidth
  Single-Precision Float
  Double-Precision Double
vkpeak:
  fp32-scalar
  fp32-vec4
  fp16-scalar
  fp16-vec4
vkpeak:
  fp32-scalar
  fp32-vec4
  fp16-scalar
  fp16-vec4
SHOC Scalable HeterOgeneous Computing:
  OpenCL - S3D
  OpenCL - FFT SP
  OpenCL - GEMM SGEMM_N
  OpenCL - Max SP Flops
ViennaCL:
  CPU BLAS - dGEMM-NN
  CPU BLAS - dGEMM-NT
  CPU BLAS - dGEMM-TN
  CPU BLAS - dGEMM-TT
  OpenCL BLAS - dGEMM-NN
  OpenCL BLAS - dGEMM-NT
  OpenCL BLAS - dGEMM-TN
  OpenCL BLAS - dGEMM-TT
SHOC Scalable HeterOgeneous Computing
clpeak
vkpeak:
  int32-scalar
  int32-vec4
  int16-scalar
  int16-vec4
Hashcat:
  MD5
  SHA1
  7-Zip
  SHA-512
  TrueCrypt RIPEMD160 + XTS
TensorFlow:
  GPU - 1 - VGG-16
  GPU - 1 - AlexNet
  GPU - 16 - VGG-16
  GPU - 32 - VGG-16
  GPU - 64 - VGG-16
  GPU - 16 - AlexNet
  GPU - 32 - AlexNet
  GPU - 64 - AlexNet
  GPU - 1 - GoogLeNet
  GPU - 1 - ResNet-50
  GPU - 16 - GoogLeNet
  GPU - 16 - ResNet-50
  GPU - 32 - GoogLeNet
  GPU - 32 - ResNet-50
  GPU - 64 - GoogLeNet
  GPU - 64 - ResNet-50
  GPU - 1 - VGG-16
  GPU - 1 - AlexNet
  GPU - 16 - VGG-16
  GPU - 32 - VGG-16
  GPU - 64 - VGG-16
  GPU - 16 - AlexNet
  GPU - 32 - AlexNet
  GPU - 64 - AlexNet
  GPU - 1 - GoogLeNet
  GPU - 1 - ResNet-50
  GPU - 16 - GoogLeNet
  GPU - 16 - ResNet-50
  GPU - 32 - GoogLeNet
  GPU - 32 - ResNet-50
  GPU - 64 - GoogLeNet
  GPU - 64 - ResNet-50
IndigoBench:
  OpenCL GPU - Bedroom
  OpenCL GPU - Supercar
  CPU - Bedroom
  CPU - Supercar
ParaView:
  Many Spheres - 1920 x 1080
  Wavelet Contour - 1920 x 1080
  Wavelet Volume - 1920 x 1080
FAHBench
MandelGPU
LuxMark:
  GPU - Hotel
  CPU+GPU - Hotel
  GPU - Microphone
  GPU - Luxball HDR
  CPU+GPU - Microphone
  CPU+GPU - Luxball HDR
ProjectPhysX OpenCL-Benchmark:
  FP64 Compute
  FP32 Compute
  FP16 Compute
  INT64 Compute
  INT32 Compute
  INT16 Compute
  INT8 Compute
OpenArena
VkResample:
  2x - Double
  2x - Single
FinanceBench
NCNN:
  Vulkan GPU - mobilenet
  Vulkan GPU-v2-v2 - mobilenet-v2
  Vulkan GPU-v3-v3 - mobilenet-v3
  Vulkan GPU - shufflenet-v2
  Vulkan GPU - mnasnet
  Vulkan GPU - efficientnet-b0
  Vulkan GPU - blazeface
  Vulkan GPU - googlenet
  Vulkan GPU - vgg16
  Vulkan GPU - resnet18
  Vulkan GPU - alexnet
  Vulkan GPU - resnet50
  Vulkan GPU - yolov4-tiny
  Vulkan GPU - squeezenet_ssd
  Vulkan GPU - regnety_400m
  Vulkan GPU - vision_transformer
  Vulkan GPU - FastestDet
  Vulkan GPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3
RealSR-NCNN:
  4x - No
  4x - Yes
Waifu2x-NCNN Vulkan
Blender:
  BMW27 - NVIDIA OptiX
  Classroom - NVIDIA OptiX
  Fishy Cat - NVIDIA OptiX
  Barbershop - NVIDIA OptiX
  Pabellon Barcelona - NVIDIA OptiX
Darktable:
  Boat - OpenCL
  Boat - CPU-only
  Masskrug - OpenCL
  Masskrug - CPU-only
  Server Rack - OpenCL
  Server Room - OpenCL
  Server Rack - CPU-only
  Server Room - CPU-only