RTX 4070 SUPER

sudo apt install vulkan-headers vulkan-tools libvulkan-dev

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2412092-NE-INTELGPU196

Run Management

Result Identifier        Date Run       Test Duration
NVIDIA RTX 4070 SUPER    January 25     22 Hours, 46 Minutes
Intel ARC A770 8Gb       December 07    11 Hours, 47 Minutes
Intel ARC A750           December 07    1 Day, 8 Hours, 31 Minutes
intel-gpu                December 05    10 Minutes
nvidia-gpu               December 05    10 Minutes
Average                                 13 Hours, 29 Minutes


RTX 4070 SUPER

NVIDIA RTX 4070 SUPER
- Processor: Intel Core i9-13900K @ 5.50GHz (24 Cores / 32 Threads)
- Motherboard: ASUS TUF GAMING Z790-PRO WIFI (1401 BIOS)
- Chipset: Intel Device 7a27
- Memory: 32GB
- Disk: 4001GB Seagate ZP4000GP304001
- Graphics: ASUS NVIDIA GeForce RTX 4070 SUPER 12GB
- Audio: Realtek ALC1220
- Monitor: ARZOPA
- Network: Intel I226-V + Intel Device 7a70
- OS: EndeavourOS rolling
- Kernel: 6.7.1-arch1-1 (x86_64)
- Desktop: KDE Plasma 5.27.10
- Display Server: X Server 1.21.1.11
- Display Driver: NVIDIA 550.40.07
- OpenGL: 4.6.0
- OpenCL: OpenCL 3.0 CUDA 12.4.74
- Compiler: GCC 13.2.1 20230801
- File-System: ext4
- Screen Resolution: 1920x1080

intel-gpu
- Processor: Intel Core i5-10300H @ 4.50GHz (4 Cores / 8 Threads)
- Motherboard: CML Stonic_CMS (V1.00 BIOS)
- Chipset: Intel Comet Lake PCH
- Memory: 16GB
- Disk: 1000GB CT1000P3SSD8 + 256GB Western Digital PC SN530 SDBPNPZ-256G-1014
- Graphics: Intel UHD CML GT2 4GB (1350/6000MHz)
- Audio: Intel Comet Lake PCH cAVS
- Network: Realtek Killer E2600 GbE + Intel Comet Lake PCH CNVi WiFi
- OS: Ubuntu 24.04
- Kernel: 6.8.0-49-generic (x86_64)
- Desktop: GNOME Shell 46.0
- Display Server: X Server 1.20.13
- Display Driver: NVIDIA 535.183.01
- OpenGL: 4.6 Mesa 24.0.9-0ubuntu0.2
- Compiler: GCC 13.2.0

nvidia-gpu (the source lists only these cells for this run)
- Graphics: NVIDIA GeForce GTX 1650 Ti 4GB
- OpenGL: 4.6.0

Intel ARC A770 8Gb
- Processor: Intel Core Ultra 9 285K @ 5.10GHz (24 Cores)
- Motherboard: MSI MEG Z890 UNIFY-X (MS-7E20) v1.0 (1.A10 BIOS)
- Chipset: Intel Device ae7f
- Memory: 2 x 16GB DDR5-6000MT/s Corsair CMH32GX5M2B6000Z30
- Disk: 1024GB Wodposit NVMe SSD
- Graphics: MSI Intel Arc A770 DG2 8GB
- Audio: Intel DG2 Audio
- Monitor: PiKVM V3
- Network: Realtek Device 5000 + Intel Wi-Fi 7
- OS: Ubuntu 24.10
- Kernel: 6.12.1-061201-generic (x86_64)
- Desktop: GNOME Shell 47.0
- Display Server: X Server + Wayland
- OpenGL: 4.6 Mesa 24.3.1 kisak-mesa PPA
- Compiler: GCC 14.2.0

Intel ARC A750 (the source lists only these cells for this run)
- Graphics: Intel Arc A750 DG2 8GB
- OpenCL: OpenCL 3.0

Kernel Details
- NVIDIA RTX 4070 SUPER: Transparent Huge Pages: always
- intel-gpu, nvidia-gpu, Intel ARC A770 8Gb, Intel ARC A750: Transparent Huge Pages: madvise

Compiler Details
- NVIDIA RTX 4070 SUPER: --disable-libssp --disable-libstdcxx-pch --disable-werror --enable-__cxa_atexit --enable-bootstrap --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-languages=ada,c,c++,d,fortran,go,lto,objc,obj-c++ --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-build-config=bootstrap-lto --with-linker-hash-style=gnu
- Intel ARC A770 8Gb, Intel ARC A750: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- NVIDIA RTX 4070 SUPER: Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x11d
- intel-gpu, nvidia-gpu: Scaling Governor: intel_pstate powersave (EPP: performance) - CPU Microcode: 0xfc - Thermald 2.5.6
- Intel ARC A770 8Gb, Intel ARC A750: Scaling Governor: intel_pstate powersave (EPP: performance) - CPU Microcode: 0x110 - Thermald 2.5.8

Graphics Details
- NVIDIA RTX 4070 SUPER: BAR1 / Visible vRAM Size: 16384 MiB - vBIOS Version: 95.04.69.00.c1
- intel-gpu, nvidia-gpu: BAR1 / Visible vRAM Size: 256 MiB - vBIOS Version: 90.17.4c.00.1d

Security Details
- NVIDIA RTX 4070 SUPER: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
- intel-gpu, nvidia-gpu: gather_data_sampling: Mitigation of Microcode + itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable + reg_file_data_sampling: Not affected + retbleed: Mitigation of Enhanced IBRS + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: SW sequence; BHI: SW loop KVM: SW loop + srbds: Mitigation of Microcode + tsx_async_abort: Not affected
- Intel ARC A770 8Gb, Intel ARC A750: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Environment Details
- nvidia-gpu: __GLX_VENDOR_LIBRARY_NAME=nvidia

Python Details
- Intel ARC A770 8Gb, Intel ARC A750: Python 3.12.7

RTX 4070 SUPER result overview table: one row per test, one column per run (NVIDIA RTX 4070 SUPER, intel-gpu, nvidia-gpu, Intel ARC A770 8Gb, Intel ARC A750). The individual results are listed per test below.

NCNN

NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. Learn more via the OpenBenchmarking.org test page.

NCNN 20230517, Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better)
- Intel ARC A750: 106.07 (SE +/- 0.29, N = 3; MIN: 8.51 / MAX: 115.08)
- NVIDIA RTX 4070 SUPER: 11.04 (SE +/- 1.21, N = 9; MIN: 5.28 / MAX: 1769.19)
- Intel ARC A770 8Gb: 105.35 (SE +/- 0.15, N = 3; MIN: 8.44 / MAX: 114.62)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517, Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better)
- Intel ARC A750: 80.11 (SE +/- 0.38, N = 3; MIN: 9.8 / MAX: 84.4)
- NVIDIA RTX 4070 SUPER: 8.62 (SE +/- 0.47, N = 9; MIN: 6.42 / MAX: 1101.3)
- Intel ARC A770 8Gb: 79.79 (SE +/- 0.36, N = 3; MIN: 21.72 / MAX: 84.6)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517, Target: Vulkan GPU - Model: vision_transformer (ms, Fewer Is Better)
- Intel ARC A750: 118.97 (SE +/- 0.44, N = 3)
- NVIDIA RTX 4070 SUPER: 844.61 (SE +/- 87.53, N = 9)
- Intel ARC A770 8Gb: 119.05 (SE +/- 0.24, N = 3)
MIN: 46.34 / MAX: 1866.93 (the source shows a single MIN/MAX pair for this graph)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517, Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better)
- Intel ARC A750: 47.30 (SE +/- 0.13, N = 3; MIN: 4.97 / MAX: 51.26)
- NVIDIA RTX 4070 SUPER: 8.97 (SE +/- 3.49, N = 9; MIN: 3.94 / MAX: 922.04)
- Intel ARC A770 8Gb: 47.11 (SE +/- 0.89, N = 3; MIN: 4.95 / MAX: 51.24)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517, Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better)
- Intel ARC A750: 454.67 (SE +/- 4.65, N = 3; MIN: 23.93 / MAX: 530.68)
- NVIDIA RTX 4070 SUPER: 11.11 (SE +/- 3.28, N = 9)
- Intel ARC A770 8Gb: 453.80 (SE +/- 11.18, N = 3; MIN: 23.74 / MAX: 528.85)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
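
Each NCNN result above is a mean reported with "SE +/- x, N = n". As a rough sketch of how such a figure can be derived, the standard error of the mean is the sample standard deviation divided by the square root of the number of runs. The per-run values below are hypothetical (the result file only publishes aggregates), and it is an assumption that the Phoronix Test Suite uses exactly this estimator.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two runs")
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical per-run latencies in ms (not taken from the result file).
runs_ms = [10.0, 11.0, 12.0]
mean_ms = statistics.mean(runs_ms)
se_ms = standard_error(runs_ms)
print(f"{mean_ms:.2f} ms, SE +/- {se_ms:.2f}, N = {len(runs_ms)}")
```

A large SE relative to the mean, as in the NVIDIA RTX 4070 SUPER vision_transformer run (844.61 with SE +/- 87.53), suggests the individual runs varied widely and the mean should be read with caution.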

Hashcat

Hashcat is an open-source, advanced password recovery tool supporting GPU acceleration with OpenCL, NVIDIA CUDA, and Radeon ROCm. Learn more via the OpenBenchmarking.org test page.

Hashcat 6.2.4, Benchmark: 7-Zip (H/s, More Is Better)
- NVIDIA RTX 4070 SUPER: 1176467 (SE +/- 1991.93, N = 3)
- Intel ARC A750: 246633 (SE +/- 240.37, N = 3)

Benchmark: 7-Zip

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./hashcat: 3: ./hashcat.bin: not found
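
To compare two runs across tests with different units, it can help to express each result as a relative speedup. The sketch below is illustrative only: the helper name `relative_performance` is hypothetical, and the inputs are the Hashcat 7-Zip and NCNN googlenet means reported in this result file. For "More Is Better" metrics the ratio is value / baseline; for "Fewer Is Better" metrics it is inverted.

```python
def relative_performance(value, baseline, higher_is_better=True):
    """Return how many times faster `value` is than `baseline`."""
    if value <= 0 or baseline <= 0:
        raise ValueError("results must be positive")
    return value / baseline if higher_is_better else baseline / value


# Hashcat 7-Zip in H/s, more is better: RTX 4070 SUPER vs. Arc A750.
hashcat_speedup = relative_performance(1176467, 246633)

# NCNN googlenet in ms, fewer is better: RTX 4070 SUPER vs. Arc A750.
ncnn_speedup = relative_performance(11.04, 106.07, higher_is_better=False)

print(f"Hashcat 7-Zip:  {hashcat_speedup:.2f}x")
print(f"NCNN googlenet: {ncnn_speedup:.2f}x")
```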

ViennaCL

ViennaCL is an open-source linear algebra library written in C++ with support for OpenCL and OpenMP. This test profile uses ViennaCL's built-in benchmarks. Learn more via the OpenBenchmarking.org test page.

ViennaCL 1.7.1, Test: OpenCL BLAS - sAXPY (GB/s, More Is Better)
- Intel ARC A750: 88.0 (SE +/- 0.12, N = 3)
- NVIDIA RTX 4070 SUPER: 392.0 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
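
For context on why sAXPY is reported in GB/s: the operation y = a*x + y in single precision does very little arithmetic per element, so it is typically limited by memory traffic. The sketch below is a plain CPU illustration in Python, not ViennaCL code. It assumes three 4-byte transfers per element (read x, read y, write y); whether ViennaCL's benchmark counts bytes the same way is an assumption.

```python
import time
from array import array


def saxpy(a, x, y):
    """In-place single-precision y[i] = a * x[i] + y[i]."""
    for i in range(len(x)):
        y[i] = a * x[i] + y[i]


def effective_bandwidth_gbs(n_elements, seconds, bytes_per_element=4):
    """GB/s assuming two reads (x, y) and one write (y) per element."""
    return 3 * n_elements * bytes_per_element / seconds / 1e9


n = 100_000
x = array("f", [1.0] * n)  # "f" = 32-bit float storage
y = array("f", [2.0] * n)

start = time.perf_counter()
saxpy(2.0, x, y)
elapsed = time.perf_counter() - start

print(f"{effective_bandwidth_gbs(n, elapsed):.4f} GB/s (interpreted Python, CPU)")
```

The absolute number from this sketch is not comparable to the GPU results above; it only shows how a byte count and a wall-clock time turn into a bandwidth figure.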

Hashcat

Hashcat 6.2.4, Benchmark: SHA1 (H/s, More Is Better)
- NVIDIA RTX 4070 SUPER: 22132600000 (SE +/- 5140363.15, N = 3)
- Intel ARC A750: 5401850000 (SE +/- 65365351.42, N = 4)

Benchmark: SHA1

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./hashcat: 3: ./hashcat.bin: not found

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark provides various OpenCL compute and memory bandwidth micro-benchmarks. Learn more via the OpenBenchmarking.org test page.

ProjectPhysX OpenCL-Benchmark 1.2, Operation: INT32 Compute (TIOPs/s, More Is Better)
- Intel ARC A750: 5.082 (SE +/- 0.016, N = 3)
- NVIDIA RTX 4070 SUPER: 19.889 (SE +/- 0.002, N = 3)
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: INT32 Compute

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

ProjectPhysX OpenCL-Benchmark 1.2, Operation: FP32 Compute (TFLOPs/s, More Is Better)
- Intel ARC A750: 10.26 (SE +/- 0.06, N = 3)
- NVIDIA RTX 4070 SUPER: 38.59 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: FP32 Compute

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.

clpeak 1.1.2, OpenCL Test: Integer Compute INT (GIOPS, More Is Better)
- NVIDIA RTX 4070 SUPER: 18170.54 (SE +/- 3.14, N = 3)
- Intel ARC A750: 4885.36 (SE +/- 2.34, N = 3)
1. (CXX) g++ options: -O3

Hashcat

Hashcat 6.2.4, Benchmark: SHA-512 (H/s, More Is Better)
- NVIDIA RTX 4070 SUPER: 3232733333 (SE +/- 1530068.99, N = 3)
- Intel ARC A750: 943466667 (SE +/- 5228554.08, N = 3)

Benchmark: SHA-512

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./hashcat: 3: ./hashcat.bin: not found

ViennaCL

ViennaCL 1.7.1, Test: OpenCL BLAS - sCOPY (GB/s, More Is Better)
- Intel ARC A750: 98.2 (SE +/- 0.35, N = 3)
- NVIDIA RTX 4070 SUPER: 334.0 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

clpeak

clpeak 1.1.2, OpenCL Test: Single-Precision Float (GFLOPS, More Is Better)
- NVIDIA RTX 4070 SUPER: 35492.69 (SE +/- 0.99, N = 3)
- Intel ARC A750: 11380.91 (SE +/- 3.31, N = 3)
1. (CXX) g++ options: -O3

Unigine Valley

This test calculates the average frame-rate within the Valley demo for the Unigine engine, released in February 2013. This engine is extremely demanding on the system's graphics card. Unigine Valley relies upon an OpenGL 3 core profile context. Learn more via the OpenBenchmarking.org test page.

Unigine Valley 1.0, Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL (Frames Per Second, More Is Better)
- Intel ARC A750: 226.62100 (SE +/- 0.53294, N = 3)
- intel-gpu: 9.98304 (SE +/- 0.00191, N = 3)
- nvidia-gpu: 74.81640 (SE +/- 0.36174, N = 3)

cl-mem

A basic OpenCL memory benchmark. Learn more via the OpenBenchmarking.org test page.

cl-mem 2017-01-13, Benchmark: Read (GB/s, More Is Better)
- NVIDIA RTX 4070 SUPER: 446.2 (SE +/- 0.12, N = 3)
- Intel ARC A750: 153.7 (SE +/- 0.10, N = 3)
1. (CC) gcc options: -O2 -flto -lOpenCL

Benchmark: Read

Intel ARC A770 8Gb: The test quit with a non-zero exit status.

ViennaCL

ViennaCL 1.7.1, Test: OpenCL BLAS - sDOT (GB/s, More Is Better)
- Intel ARC A750: 134 (SE +/- 1.33, N = 3)
- NVIDIA RTX 4070 SUPER: 370 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

VkFFT

VkFFT is a Fast Fourier Transform (FFT) library that is GPU-accelerated by means of the Vulkan API. The VkFFT benchmark measures FFT performance across many different sizes before returning an overall benchmark score. Learn more via the OpenBenchmarking.org test page.

VkFFT 1.2.31, Test: FFT + iFFT C2C Bluestein in single precision (Benchmark Score, More Is Better)
- NVIDIA RTX 4070 SUPER: 15166 (SE +/- 102.52, N = 3)
- Intel ARC A750: 5573 (SE +/- 3.76, N = 3)
Annotation shown once in the graph: -lrt
1. (CXX) g++ options: -O3

Test: FFT + iFFT C2C Bluestein in single precision

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found
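
The "FFT + iFFT C2C" tests time a forward complex-to-complex transform followed by its inverse. As a minimal illustration of that round trip, here is a plain-Python recursive radix-2 FFT. This is a swapped-in textbook algorithm for power-of-two lengths, not VkFFT's implementation and not the Bluestein algorithm named in the test title; the function names are hypothetical.

```python
import cmath


def fft(values, inverse=False):
    """Unnormalized recursive radix-2 FFT for power-of-two lengths."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(values):
    """Inverse transform: conjugate twiddles, then scale by 1/N once."""
    n = len(values)
    return [v / n for v in fft(values, inverse=True)]


signal = [complex(i, 0) for i in range(8)]
roundtrip = ifft(fft(signal))
max_error = max(abs(a - b) for a, b in zip(signal, roundtrip))
print(f"max round-trip error: {max_error:.2e}")
```

A correct forward/inverse pair should reproduce the input up to floating-point rounding, which is what the final error check demonstrates.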

IndigoBench

This is a test of Indigo Renderer's IndigoBench benchmark. Learn more via the OpenBenchmarking.org test page.

IndigoBench 4.4, Acceleration: OpenCL GPU - Scene: Supercar (M samples/s, More Is Better)
- NVIDIA RTX 4070 SUPER: 52.81 (SE +/- 0.03, N = 3)
- Intel ARC A750: 19.71 (SE +/- 0.02, N = 3)

Hashcat

Hashcat 6.2.4, Benchmark: TrueCrypt RIPEMD160 + XTS (H/s, More Is Better)
- NVIDIA RTX 4070 SUPER: 802967 (SE +/- 633.33, N = 3)
- Intel ARC A750: 328400 (SE +/- 200.00, N = 3)

Benchmark: TrueCrypt RIPEMD160 + XTS

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./hashcat: 3: ./hashcat.bin: not found

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark 1.2, Operation: Memory Bandwidth Coalesced Read (GB/s, More Is Better)
- Intel ARC A750: 202.92 (SE +/- 0.74, N = 3)
- NVIDIA RTX 4070 SUPER: 464.86 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: Memory Bandwidth Coalesced Read

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

Hashcat

Hashcat 6.2.4, Benchmark: MD5 (H/s, More Is Better)
- NVIDIA RTX 4070 SUPER: 67583033333 (SE +/- 22430807.19, N = 3)
- Intel ARC A750: 31010500000 (SE +/- 260097949.50, N = 3)

Benchmark: MD5

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./hashcat: 3: ./hashcat.bin: not found
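
The Hashcat results are expressed in H/s (hashes per second). To make that unit concrete, the sketch below times MD5 over a list of candidate strings on the CPU with Python's standard `hashlib`. It is not how Hashcat works internally; the helper name and the candidate list are hypothetical, and the resulting rate will be many orders of magnitude below the GPU figures above.

```python
import hashlib
import time


def md5_hashes_per_second(candidates):
    """Hash every candidate once and return the achieved rate in H/s."""
    start = time.perf_counter()
    for candidate in candidates:
        hashlib.md5(candidate).digest()
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse timers.
    return len(candidates) / max(elapsed, 1e-9)


# Hypothetical candidate list: 50,000 short byte strings.
candidates = [f"candidate{i}".encode() for i in range(50_000)]
rate = md5_hashes_per_second(candidates)
print(f"{rate:,.0f} H/s (single CPU thread, interpreted Python)")
```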

IndigoBench

IndigoBench 4.4, Acceleration: OpenCL GPU - Scene: Bedroom (M samples/s, More Is Better)
- NVIDIA RTX 4070 SUPER: 19.801 (SE +/- 0.009, N = 3)
- Intel ARC A750: 9.302 (SE +/- 0.011, N = 3)

ViennaCL

ViennaCL 1.7.1, Test: CPU BLAS - sDOT (GB/s, More Is Better)
- NVIDIA RTX 4070 SUPER: 165.0 (SE +/- 2.73, N = 3)
- Intel ARC A770 8Gb: 80.6 (SE +/- 0.27, N = 3)
- Intel ARC A750: 81.3 (SE +/- 0.17, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

NCNN

NCNN 20230517, Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better)
- Intel ARC A750: 92.83 (SE +/- 0.98, N = 3; MIN: 10.52 / MAX: 101.84)
- NVIDIA RTX 4070 SUPER: 46.26 (SE +/- 14.70, N = 9; MIN: 7.71 / MAX: 1829.99)
- Intel ARC A770 8Gb: 93.18 (SE +/- 1.06, N = 3; MIN: 10.54 / MAX: 100.7)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517, Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better)
- Intel ARC A750: 100.62 (SE +/- 0.48, N = 3; MIN: 7.79 / MAX: 107.95)
- NVIDIA RTX 4070 SUPER: 6.86 (SE +/- 1.76, N = 9)
- Intel ARC A770 8Gb: 100.40 (SE +/- 0.61, N = 3; MIN: 7.63 / MAX: 107.86)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

OpenArena

OpenArena 0.8.8, Resolution: 1920 x 1080 - Total Frame Time (Milliseconds, Fewer Is Better)
- Intel ARC A750: Min: 1 / Avg: 2.05 / Max: 14
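
Frame times in milliseconds can be converted to frames per second as 1000 / frame time. The short sketch below applies that to the Intel ARC A750 OpenArena figures above (Min 1, Avg 2.05, Max 14 ms); the function name is hypothetical. Note that the shortest frame time corresponds to the highest instantaneous frame rate, and the longest to the lowest.

```python
def frames_per_second(frame_time_ms):
    """Convert a frame time in milliseconds to frames per second."""
    if frame_time_ms <= 0:
        raise ValueError("frame time must be positive")
    return 1000.0 / frame_time_ms


# Frame times reported for the Intel ARC A750 run, in milliseconds.
for label, ms in (("Min", 1), ("Avg", 2.05), ("Max", 14)):
    print(f"{label} frame time {ms} ms = {frames_per_second(ms):.1f} FPS")

avg_fps = frames_per_second(2.05)
```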

RealSR-NCNN

RealSR-NCNN is an NCNN neural network implementation of the RealSR project, accelerated using the Vulkan API. RealSR stands for Real-World Super-Resolution via Kernel Estimation and Noise Injection. NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. This test profile times how long it takes to increase the resolution of a sample image by a scale of 4x with Vulkan. Learn more via the OpenBenchmarking.org test page.

RealSR-NCNN 20200818, Scale: 4x - TAA: Yes (Seconds, Fewer Is Better)
- NVIDIA RTX 4070 SUPER: 34.89 (SE +/- 0.02, N = 3)
- Intel ARC A770 8Gb: 61.86 (SE +/- 0.02, N = 3)
- Intel ARC A750: 66.07 (SE +/- 0.01, N = 3)

Waifu2x-NCNN Vulkan

Waifu2x-NCNN is an NCNN neural network implementation of the Waifu2x converter project, accelerated using the Vulkan API. NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. This test profile times how long it takes to increase the resolution of a sample image with Vulkan. Learn more via the OpenBenchmarking.org test page.

Waifu2x-NCNN Vulkan 20200818, Scale: 2x - Denoise: 3 - TAA: Yes (Seconds, Fewer Is Better)
- NVIDIA RTX 4070 SUPER: 2.855 (SE +/- 0.014, N = 3)
- Intel ARC A770 8Gb: 5.146 (SE +/- 0.019, N = 3)
- Intel ARC A750: 5.291 (SE +/- 0.006, N = 3)

VkFFT

VkFFT 1.2.31, Test: FFT + iFFT R2C / C2R (Benchmark Score, More Is Better)
- NVIDIA RTX 4070 SUPER: 54794 (SE +/- 702.53, N = 15)
- Intel ARC A750: 32544 (SE +/- 57.59, N = 3)
Annotation shown once in the graph: -lrt
1. (CXX) g++ options: -O3

Test: FFT + iFFT R2C / C2R

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

ViennaCL

ViennaCL is an open-source linear algebra library written in C++ with support for OpenCL and OpenMP. This test profile makes use of ViennaCL's built-in benchmarks.

ViennaCL 1.7.1, Test: CPU BLAS - sCOPY (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 132.0 (SE +/- 1.20, N = 3)
Intel ARC A770 8Gb: 83.9 (SE +/- 0.41, N = 3)
Intel ARC A750: 83.7 (SE +/- 0.20, N = 3)
Compiler: (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

VkFFT

VkFFT 1.2.31, Test: FFT + iFFT C2C multidimensional in single precision (Benchmark Score, More Is Better)
NVIDIA RTX 4070 SUPER: 50299 (SE +/- 407.19, N = 15)
Intel ARC A750: 33011 (SE +/- 17.37, N = 3)
Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found
Compiler: (CXX) g++ options: -O3 (the chart also lists a per-result flag, -lrt)

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark provides various OpenCL compute and memory bandwidth micro-benchmarks.

ProjectPhysX OpenCL-Benchmark 1.2, Operation: INT8 Compute (TIOPs/s, More Is Better)
Intel ARC A750: 9.548 (SE +/- 0.031, N = 3)
NVIDIA RTX 4070 SUPER: 14.307 (SE +/- 0.046, N = 3)
Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |
Compiler: (CXX) g++ options: -std=c++17 -pthread -lOpenCL

NCNN

NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms.

NCNN 20230517, Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better)
Intel ARC A750: 23.42 (SE +/- 0.07, N = 3; MIN: 3.57 / MAX: 25.48)
NVIDIA RTX 4070 SUPER: 16.17 (SE +/- 5.86, N = 9; MIN: 3.52 / MAX: 436.52)
Intel ARC A770 8Gb: 23.30 (SE +/- 0.18, N = 3; MIN: 3.6 / MAX: 25.53)
Compiler: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
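The NVIDIA alexnet result pairs a mean of 16.17 ms with SE +/- 5.86 over N = 9 runs and a maximum of 436.52 ms, which suggests that a small number of slow runs dominate the spread. The sketch below uses hypothetical sample values (not taken from the result file) and assumes the conventional definition of standard error, the sample standard deviation divided by the square root of N, to show how a single outlier inflates both the mean and the SE.

```python
import math
import statistics


def standard_error(samples):
    """Sample standard deviation divided by sqrt(N)."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical per-run averages in ms; illustrative only.
steady = [10.1, 10.3, 9.9, 10.2, 10.0, 10.1, 9.8, 10.2, 10.0]
with_outlier = steady[:-1] + [60.0]  # replace the last run with one slow run

for label, runs in (("steady", steady), ("with outlier", with_outlier)):
    print(f"{label}: mean = {statistics.mean(runs):.2f} ms, "
          f"SE +/- {standard_error(runs):.2f}, N = {len(runs)}")
```

A large SE next to a low minimum is therefore a hint to look at the MIN / MAX range before comparing means across cards.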

cl-mem

A basic OpenCL memory benchmark.

cl-mem 2017-01-13, Benchmark: Write (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 407.5 (SE +/- 1.11, N = 3)
Intel ARC A750: 280.1 (SE +/- 0.15, N = 3)
Intel ARC A770 8Gb: The test quit with a non-zero exit status.
Compiler: (CC) gcc options: -O2 -flto -lOpenCL

VkFFT

VkFFT 1.2.31, Test: FFT + iFFT C2C 1D batched in half precision (Benchmark Score, More Is Better)
NVIDIA RTX 4070 SUPER: 131705 (SE +/- 159.17, N = 3)
Intel ARC A750: 100426 (SE +/- 82.39, N = 3)
Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found
Compiler: (CXX) g++ options: -O3 (the chart also lists a per-result flag, -lrt)

NCNN

NCNN 20230517, Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better)
Intel ARC A750: 49.20 (SE +/- 0.22, N = 3; MIN: 22.47 / MAX: 52.7)
NVIDIA RTX 4070 SUPER: 63.82 (SE +/- 10.56, N = 9; MIN: 10.28 / MAX: 858.44)
Intel ARC A770 8Gb: 49.15 (SE +/- 0.10, N = 3; MIN: 20.41 / MAX: 52.43)
Compiler: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

ViennaCL

ViennaCL 1.7.1, Test: CPU BLAS - sAXPY (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 156 (SE +/- 2.19, N = 3)
Intel ARC A770 8Gb: 122 (SE +/- 0.00, N = 3)
Intel ARC A750: 122 (SE +/- 0.33, N = 3)

ViennaCL 1.7.1, Test: CPU BLAS - dDOT (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 96.8 (SE +/- 0.09, N = 3)
Intel ARC A770 8Gb: 77.6 (SE +/- 0.53, N = 3)
Intel ARC A750: 76.7 (SE +/- 0.15, N = 3)
Compiler: (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
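Level-1 BLAS results such as sAXPY and dDOT are reported in GB/s because these kernels are typically limited by memory traffic rather than arithmetic. The sketch below shows one common way to account for that traffic, assuming that AXPY (y = a*x + y) reads two vectors and writes one while DOT reads two. ViennaCL's exact byte counting is not stated in this result file, and the function name, vector length, and timings here are hypothetical.

```python
def effective_bandwidth_gb_s(n, bytes_per_element, streams, seconds):
    """Bytes moved per kernel call divided by elapsed time, in GB/s (1e9 bytes)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return n * bytes_per_element * streams / seconds / 1e9


n = 10_000_000  # hypothetical vector length

# sAXPY: float32 elements; read x, read y, write y -> 3 memory streams.
saxpy = effective_bandwidth_gb_s(n, 4, 3, seconds=1.0e-3)

# dDOT: float64 elements; read x, read y -> 2 memory streams.
ddot = effective_bandwidth_gb_s(n, 8, 2, seconds=2.0e-3)

print(f"sAXPY: {saxpy:.1f} GB/s, dDOT: {ddot:.1f} GB/s")
```

Under this accounting, a GB/s figure can be compared directly against the platform's memory bandwidth to judge how close a kernel is to the hardware limit.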

VkFFT

VkFFT 1.2.31, Test: FFT + iFFT C2C 1D batched in single precision (Benchmark Score, More Is Better)
NVIDIA RTX 4070 SUPER: 73929 (SE +/- 7.94, N = 3)
Intel ARC A750: 58800 (SE +/- 38.85, N = 3)
Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found
Compiler: (CXX) g++ options: -O3 (the chart also lists a per-result flag, -lrt)

ViennaCL

ViennaCL 1.7.1, Test: CPU BLAS - dCOPY (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 70.8 (SE +/- 0.32, N = 3)
Intel ARC A770 8Gb: 56.8 (SE +/- 0.12, N = 3)
Intel ARC A750: 56.4 (SE +/- 0.06, N = 3)
Compiler: (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

cl-mem

cl-mem 2017-01-13, Benchmark: Copy (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 331.8 (SE +/- 0.03, N = 3)
Intel ARC A750: 269.6 (SE +/- 0.13, N = 3)
Intel ARC A770 8Gb: The test quit with a non-zero exit status.
Compiler: (CC) gcc options: -O2 -flto -lOpenCL

ViennaCL

ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-TN (GFLOPs/s, More Is Better)
NVIDIA RTX 4070 SUPER: 115 (SE +/- 1.00, N = 2)
Intel ARC A770 8Gb: 138 (SE +/- 0.33, N = 3)
Intel ARC A750: 139 (SE +/- 0.33, N = 3)
Compiler: (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

VkFFT

VkFFT 1.2.31, Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling (Benchmark Score, More Is Better)
NVIDIA RTX 4070 SUPER: 75078 (SE +/- 37.77, N = 3)
Intel ARC A750: 64004 (SE +/- 48.89, N = 3)
Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found
Compiler: (CXX) g++ options: -O3 (the chart also lists a per-result flag, -lrt)

vkpeak

Vkpeak is a Vulkan compute benchmark inspired by OpenCL's clpeak. Vkpeak measures Vulkan compute performance for FP16 / FP32 / FP64 / INT16 / INT32 in scalar and vec4 form.

vkpeak 20230730, int32-scalar (GIOPS, More Is Better)
Intel ARC A750: 4081.97 (SE +/- 0.07, N = 3)
Intel ARC A770 8Gb: 4675.06 (SE +/- 0.04, N = 3)

vkpeak 20230730, int32-vec4 (GIOPS, More Is Better)
Intel ARC A750: 4242.14 (SE +/- 0.17, N = 3)
Intel ARC A770 8Gb: 4853.46 (SE +/- 0.03, N = 3)

vkpeak 20230730, fp32-vec4 (GFLOPS, More Is Better)
Intel ARC A750: 9779.36 (SE +/- 0.13, N = 3)
Intel ARC A770 8Gb: 11171.13 (SE +/- 0.27, N = 3)

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark 1.2, Operation: Memory Bandwidth Coalesced Write (GB/s, More Is Better)
Intel ARC A750: 400.34 (SE +/- 1.65, N = 3)
NVIDIA RTX 4070 SUPER: 455.01 (SE +/- 0.14, N = 3)
Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |
Compiler: (CXX) g++ options: -std=c++17 -pthread -lOpenCL

vkpeak

vkpeak 20230730, fp16-scalar (GFLOPS, More Is Better)
Intel ARC A750: 21412.36 (SE +/- 0.35, N = 3)
Intel ARC A770 8Gb: 24468.27 (SE +/- 0.22, N = 3)

vkpeak 20230730, int16-scalar (GIOPS, More Is Better)
Intel ARC A750: 8053.11 (SE +/- 0.12, N = 3)
Intel ARC A770 8Gb: 9201.64 (SE +/- 0.19, N = 3)

vkpeak 20230730, fp16-vec4 (GFLOPS, More Is Better)
Intel ARC A750: 33768.52 (SE +/- 0.50, N = 3)
Intel ARC A770 8Gb: 38575.11 (SE +/- 0.43, N = 3)

vkpeak 20230730, int16-vec4 (GIOPS, More Is Better)
Intel ARC A750: 8426.52 (SE +/- 0.19, N = 3)
Intel ARC A770 8Gb: 9624.66 (SE +/- 0.11, N = 3)

vkpeak 20230730, fp32-scalar (GFLOPS, More Is Better)
Intel ARC A750: 15200.59 (SE +/- 0.69, N = 3)
Intel ARC A770 8Gb: 17337.54 (SE +/- 3.56, N = 3)

ViennaCL

ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-NN (GFLOPs/s, More Is Better)
NVIDIA RTX 4070 SUPER: 119 (SE +/- 4.04, N = 3)
Intel ARC A770 8Gb: 134 (SE +/- 0.00, N = 3)
Intel ARC A750: 134 (SE +/- 0.33, N = 3)

ViennaCL 1.7.1, Test: CPU BLAS - dAXPY (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 87.2 (SE +/- 0.12, N = 3)
Intel ARC A770 8Gb: 78.8 (SE +/- 0.03, N = 3)
Intel ARC A750: 78.7 (SE +/- 0.09, N = 3)
Compiler: (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices.

clpeak 1.1.2, OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better)
NVIDIA RTX 4070 SUPER: 437.65 (SE +/- 0.02, N = 3)
Intel ARC A750: 396.72 (SE +/- 0.13, N = 3)
Compiler: (CXX) g++ options: -O3

ViennaCL

ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-TT (GFLOPs/s, More Is Better)
NVIDIA RTX 4070 SUPER: 122 (SE +/- 2.08, N = 3)
Intel ARC A770 8Gb: 133 (SE +/- 0.58, N = 3)
Intel ARC A750: 133 (SE +/- 0.33, N = 3)

ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-NT (GFLOPs/s, More Is Better)
NVIDIA RTX 4070 SUPER: 117 (SE +/- 2.08, N = 3)
Intel ARC A770 8Gb: 127 (SE +/- 0.33, N = 3)
Intel ARC A750: 125 (SE +/- 1.53, N = 3)
Compiler: (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

NCNN

NCNN 20230517, Target: Vulkan GPU - Model: mobilenetv2-yolov3 (ms, Fewer Is Better)
Intel ARC A750: 80.11 (SE +/- 0.38, N = 3; MIN: 9.8 / MAX: 84.4)
Intel ARC A770 8Gb: 79.79 (SE +/- 0.36, N = 3; MIN: 21.72 / MAX: 84.6)
Compiler: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

VkResample

VkResample is a Vulkan-based image upscaling library built on VkFFT. The sample workload upscales a 4K image to 8K using Vulkan-based GPU acceleration.

VkResample 1.0, Upscale: 2x - Precision: Single (ms, Fewer Is Better)
NVIDIA RTX 4070 SUPER: 18.49 (SE +/- 0.00, N = 3)
Intel ARC A770 8Gb: 18.20 (SE +/- 0.00, N = 3)
Intel ARC A750: 18.89 (SE +/- 0.04, N = 3)
Compiler: (CXX) g++ options: -O3

ViennaCL

ViennaCL 1.7.1, Test: CPU BLAS - dGEMV-T (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 109 (SE +/- 0.33, N = 3)
Intel ARC A770 8Gb: 106 (SE +/- 0.33, N = 3)
Intel ARC A750: 105 (SE +/- 0.33, N = 3)

ViennaCL 1.7.1, Test: CPU BLAS - dGEMV-N (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 102 (SE +/- 0.33, N = 3)
Intel ARC A770 8Gb: 101 (SE +/- 0.33, N = 3)
Intel ARC A750: 101 (SE +/- 0.00, N = 3)
Compiler: (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark 1.2, Operation: FP16 Compute (TFLOPs/s, More Is Better)
Intel ARC A750: 15.60 (SE +/- 0.03, N = 3)
Compiler: (CXX) g++ options: -std=c++17 -pthread -lOpenCL

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices.

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL (More Is Better), Intel ARC A750:
Texture Read Bandwidth: 883.46 GB/s (SE +/- 0.34, N = 3)
Bus Speed Readback: 22.44 GB/s (SE +/- 0.00, N = 3)
Bus Speed Download: 18.66 GB/s (SE +/- 0.00, N = 3)
GEMM SGEMM_N: 2049.20 GFLOPS (SE +/- 15.19, N = 15)
Reduction: 74.53 GB/s (SE +/- 0.04, N = 3)
MD5 Hash: 22.72 GHash/s (SE +/- 0.01, N = 3)
FFT SP: 1187.36 GFLOPS (SE +/- 9.13, N = 3)
Triad: 18.04 GB/s (SE +/- 0.01, N = 3)
S3D: 224.72 GFLOPS (SE +/- 0.74, N = 3)
Compiler: (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

Darktable

Darktable is an open-source photography / workflow application. This test uses any system-installed Darktable program or, on Windows, automatically downloads the pre-built binary from the project.

Darktable 4.8.1 (Seconds, Fewer Is Better), Intel ARC A750:
Test: Server Room - Acceleration: CPU-only: 1.381 (SE +/- 0.008, N = 3)
Test: Server Rack - Acceleration: CPU-only: 0.173 (SE +/- 0.001, N = 3)
Test: Server Room - Acceleration: OpenCL: 1.362 (SE +/- 0.001, N = 3)
Test: Server Rack - Acceleration: OpenCL: 0.172 (SE +/- 0.000, N = 3)
Test: Masskrug - Acceleration: CPU-only: 1.689 (SE +/- 0.009, N = 3)
Test: Masskrug - Acceleration: OpenCL: 1.690 (SE +/- 0.005, N = 3)
Test: Boat - Acceleration: CPU-only: 2.905 (SE +/- 0.012, N = 3)
Test: Boat - Acceleration: OpenCL: 2.900 (SE +/- 0.011, N = 3)

SPECViewPerf 2020

This test runs SPECViewPerf 2020 if available on your system. SPECViewPerf is made up of real-world OpenGL workstation tests such as CATIA and SolidWorks.

SPECViewPerf 2020 3.0, Resolution: 1920 x 1080 (Composite Score, More Is Better), Intel ARC A750:
Viewset: SOLIDWORKS-07: 73.13 (SE +/- 0.01, N = 3)
Viewset: MEDICAL-03: 40.97 (SE +/- 0.00, N = 3)
Viewset: ENERGY-03: 31.17 (SE +/- 0.00, N = 3)
Viewset: CATIA-06: 47.85 (SE +/- 0.05, N = 3)
Viewset: MAYA-06: 165.76 (SE +/- 0.32, N = 3)
Viewset: CREO-03: 62.65 (SE +/- 0.06, N = 3)
Viewset: SNX-04: 172.03 (SE +/- 0.08, N = 3)

LuxMark

LuxMark is a multi-platform OpenCL benchmark using LuxRender. LuxMark supports targeting different OpenCL devices and has multiple scenes available for rendering. LuxMark is a fully open-source OpenCL program with real-world rendering examples.

LuxMark 3.1 (Score, More Is Better), Intel ARC A750:
OpenCL Device: CPU+GPU - Scene: Luxball HDR: 60539 (SE +/- 154.09, N = 3)
OpenCL Device: CPU+GPU - Scene: Microphone: 45844 (SE +/- 14.17, N = 3)
OpenCL Device: GPU - Scene: Luxball HDR: 60251 (SE +/- 144.44, N = 3)
OpenCL Device: GPU - Scene: Microphone: 45988 (SE +/- 191.00, N = 3)
OpenCL Device: CPU+GPU - Scene: Hotel: 13262 (SE +/- 0.33, N = 3)
OpenCL Device: GPU - Scene: Hotel: 13236 (SE +/- 25.67, N = 3)

IndigoBench

This is a test of Indigo Renderer's IndigoBench benchmark.

IndigoBench 4.4, Acceleration: CPU (M samples/s, More Is Better), Intel ARC A750:
Scene: Supercar: 12.53 (SE +/- 0.03, N = 3)
Scene: Bedroom: 4.932 (SE +/- 0.046, N = 3)

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). The Phoronix Test Suite also provides pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if complementary metrics are desired.

TensorFlow 2.16.1, Device: GPU (images/sec, More Is Better), Intel ARC A750:
Batch Size: 512 - Model: GoogLeNet: 29.43 (SE +/- 0.01, N = 3)
Batch Size: 256 - Model: ResNet-50: 8.33 (SE +/- 0.04, N = 3)
Batch Size: 256 - Model: GoogLeNet: 29.32 (SE +/- 0.01, N = 3)
Batch Size: 64 - Model: ResNet-50: 8.20 (SE +/- 0.02, N = 3)
Batch Size: 64 - Model: GoogLeNet: 27.95 (SE +/- 0.01, N = 3)
Batch Size: 32 - Model: ResNet-50: 8.08 (SE +/- 0.06, N = 3)
Batch Size: 32 - Model: GoogLeNet: 26.52 (SE +/- 0.03, N = 3)
Batch Size: 16 - Model: ResNet-50: 7.87 (SE +/- 0.02, N = 3)
Batch Size: 16 - Model: GoogLeNet: 23.95 (SE +/- 0.03, N = 3)
Batch Size: 512 - Model: AlexNet: 47.77 (SE +/- 0.01, N = 3)
Batch Size: 256 - Model: AlexNet: 46.35 (SE +/- 0.09, N = 3)
Batch Size: 1 - Model: ResNet-50: 5.25 (SE +/- 0.05, N = 8)
Batch Size: 1 - Model: GoogLeNet: 15.75 (SE +/- 0.10, N = 3)
Batch Size: 64 - Model: AlexNet: 45.57 (SE +/- 0.18, N = 3)
Batch Size: 32 - Model: AlexNet: 43.01 (SE +/- 0.23, N = 3)
Batch Size: 16 - Model: AlexNet: 39.62 (SE +/- 0.26, N = 15)
Batch Size: 64 - Model: VGG-16: 2.43 (SE +/- 0.01, N = 3)
Batch Size: 32 - Model: VGG-16: 2.43 (SE +/- 0.00, N = 3)
Batch Size: 16 - Model: VGG-16: 2.40 (SE +/- 0.00, N = 3)
Batch Size: 1 - Model: AlexNet: 15.93 (SE +/- 0.11, N = 12)
Batch Size: 1 - Model: VGG-16: 1.71 (SE +/- 0.01, N = 3)

ParaView

ParaView 5.13, Resolution: 1920 x 1080 (Frames / Sec, More Is Better), Intel ARC A750:
Test: Wavelet Contour: 278.15 (SE +/- 0.99, N = 3)
Test: Many Spheres: 72.27 (SE +/- 0.80, N = 5)

Xonotic

This is a benchmark of Xonotic, an open-source first-person shooter that is a fork of the DarkPlaces-based Nexuiz game. Development of Xonotic began in March 2010.

Xonotic 0.8.6, Resolution: 1920 x 1080 (Frames Per Second, More Is Better), Intel ARC A750:
Effects Quality: Ultimate: 551.35 (SE +/- 2.85, N = 3; MIN: 110 / MAX: 1221)
Effects Quality: Ultra: 735.34 (SE +/- 8.54, N = 3; MIN: 308 / MAX: 1171)
Effects Quality: High: 777.93 (SE +/- 3.18, N = 3; MIN: 437 / MAX: 1200)
Effects Quality: Low: 932.79 (SE +/- 7.95, N = 3; MIN: 597 / MAX: 1516)

Unigine Heaven

Unigine Heaven 4.0, Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL (Frames Per Second, More Is Better)
Intel ARC A750: 224.52 (SE +/- 0.14, N = 3)

OpenArena

This is a test of OpenArena, a popular open-source first-person shooter. This game is based upon ioquake3, which in turn uses the GPL version of id Software's Quake 3 engine.

OpenArena 0.8.8, Resolution: 1920 x 1080 (Frames Per Second, More Is Better)
Intel ARC A750: 469.5 (SE +/- 6.89, N = 15; MIN: 1)

VkFFT

VkFFT 1.3.4 (Benchmark Score, More Is Better), Intel ARC A750:
Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling: 63113 (SE +/- 527.31, N = 3)
Test: FFT + iFFT C2C multidimensional in single precision: 31532 (SE +/- 251.38, N = 12)
Test: FFT + iFFT C2C 1D batched in single precision: 58625 (SE +/- 70.72, N = 3)
Test: FFT + iFFT C2C Bluestein in single precision: 5425 (SE +/- 58.43, N = 3)
Test: FFT + iFFT R2C / C2R: 31218 (SE +/- 232.00, N = 15)
Compiler: (CXX) g++ options: -O3

vkpeak

vkpeak 20240505 (GFLOPS, More Is Better), Intel ARC A750:
fp16-vec4: 22840.73 (SE +/- 0.55, N = 3)
fp16-scalar: 21379.22 (SE +/- 1.00, N = 3)
fp32-vec4: 13898.32 (SE +/- 0.79, N = 3)
fp32-scalar: 16868.93 (SE +/- 26.85, N = 3)

TensorFlow

TensorFlow 2.12, Device: GPU - Batch Size: 64 - Model: ResNet-50 (images/sec, More Is Better)
NVIDIA RTX 4070 SUPER: 5.55 (SE +/- 0.01, N = 2)
Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'
Intel ARC A750: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'
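The `ModuleNotFoundError` reported for the Intel runs means the Python interpreter running the benchmark could not import `absl`. The following is a minimal sketch for checking this before a run; the `absl` module is distributed on PyPI as `absl-py`, and the helper name `missing_modules` is hypothetical rather than part of the Phoronix Test Suite.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(["absl"])
if missing:
    print("Missing modules:", ", ".join(missing))
    print("For absl, installing the absl-py package may resolve this.")
else:
    print("absl is importable in this interpreter.")
```

The check only applies to the interpreter it runs in, so it should be run with the same Python environment the test profile uses.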

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: GPU - Batch Size: 64 - Model: GoogLeNetNVIDIA RTX 4070 SUPER4812162015.52

Device: GPU - Batch Size: 64 - Model: GoogLeNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A750: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: GPU - Batch Size: 32 - Model: ResNet-50NVIDIA RTX 4070 SUPER1.23982.47963.71944.95926.199SE +/- 0.01, N = 25.51

Device: GPU - Batch Size: 32 - Model: ResNet-50

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A750: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: GPU - Batch Size: 32 - Model: GoogLeNetNVIDIA RTX 4070 SUPER48121620SE +/- 0.01, N = 215.61

Device: GPU - Batch Size: 32 - Model: GoogLeNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A750: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: GPU - Batch Size: 16 - Model: ResNet-50NVIDIA RTX 4070 SUPER1.22852.4573.68554.9146.1425SE +/- 0.00, N = 25.46

Device: GPU - Batch Size: 16 - Model: ResNet-50

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A750: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: GPU - Batch Size: 16 - Model: GoogLeNetNVIDIA RTX 4070 SUPER48121620SE +/- 0.03, N = 315.67

Device: GPU - Batch Size: 16 - Model: GoogLeNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A750: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: GPU - Batch Size: 512 - Model: AlexNetNVIDIA RTX 4070 SUPER816243240SE +/- 0.02, N = 235.10

Device: GPU - Batch Size: 512 - Model: AlexNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A750: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 256 - Model: AlexNet (images/sec, more is better)
NVIDIA RTX 4070 SUPER: 34.16 (SE +/- 0.01, N = 3)

Device: GPU - Batch Size: 256 - Model: AlexNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A750: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 1 - Model: ResNet-50 (images/sec, more is better)
NVIDIA RTX 4070 SUPER: 4.35

Device: GPU - Batch Size: 1 - Model: ResNet-50

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A750: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 1 - Model: GoogLeNet (images/sec, more is better)
NVIDIA RTX 4070 SUPER: 12.62 (SE +/- 0.17, N = 2)

Device: GPU - Batch Size: 1 - Model: GoogLeNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A750: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 64 - Model: AlexNet (images/sec, more is better)
NVIDIA RTX 4070 SUPER: 33.97

Device: GPU - Batch Size: 64 - Model: AlexNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A750: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 32 - Model: AlexNet (images/sec, more is better)
NVIDIA RTX 4070 SUPER: 33.4 (SE +/- 0.15, N = 2)

Device: GPU - Batch Size: 32 - Model: AlexNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A750: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 16 - Model: AlexNet (images/sec, more is better)
NVIDIA RTX 4070 SUPER: 31.59 (SE +/- 0.17, N = 3)

Device: GPU - Batch Size: 16 - Model: AlexNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A750: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 32 - Model: VGG-16 (images/sec, more is better)
NVIDIA RTX 4070 SUPER: 1.50 (SE +/- 0.00, N = 3)

Device: GPU - Batch Size: 32 - Model: VGG-16

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A750: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 16 - Model: VGG-16 (images/sec, more is better)
NVIDIA RTX 4070 SUPER: 1.48 (SE +/- 0.00, N = 2)

Device: GPU - Batch Size: 16 - Model: VGG-16

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A750: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 1 - Model: AlexNet (images/sec, more is better)
NVIDIA RTX 4070 SUPER: 13.92 (SE +/- 0.22, N = 2)

Device: GPU - Batch Size: 1 - Model: AlexNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A750: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 1 - Model: VGG-16 (images/sec, more is better)
NVIDIA RTX 4070 SUPER: 1.35

Device: GPU - Batch Size: 1 - Model: VGG-16

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A750: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'
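
The repeated ModuleNotFoundError entries suggest that the Python environment on the two Intel Arc systems could not import the absl module (distributed on PyPI as absl-py). As a minimal sketch, assuming the benchmark runs under the same interpreter you invoke here, the following check reports which modules cannot be located. The helper name missing_modules and the extra "tensorflow" entry are illustrative assumptions, not part of the test profile.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that the current interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "absl" comes from the error text above; "tensorflow" is an assumed companion check.
    missing = missing_modules(["absl", "tensorflow"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All checked modules are importable.")
```

If absl is reported as missing, installing the absl-py package into that environment would be the usual remedy; whether that alone lets the TensorFlow test run on these systems is not something the results above can confirm.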

NeatBench

NeatBench is a benchmark of the cross-platform Neat Video software on the CPU, with optional GPU (OpenCL / CUDA) support. Learn more via the OpenBenchmarking.org test page.

NeatBench 5 - Acceleration: GPU (FPS, more is better)
NVIDIA RTX 4070 SUPER: 4070 (SE +/- 0.00, N = 3)

Acceleration: GPU

Intel ARC A770 8Gb: The test run did not produce a result. E: Failed to load CUDA driver ("/usr/lib64/libcuda.so.1")

Intel ARC A750: The test run did not produce a result.

Blender

Blender is an open-source 3D creation and modeling software project. This test is of Blender's Cycles performance with various sample files. GPU computing via NVIDIA OptiX and NVIDIA CUDA is currently supported as well as HIP for AMD Radeon GPUs and Intel oneAPI for Intel Graphics. Learn more via the OpenBenchmarking.org test page.

Blender 4.0 - Compute: NVIDIA OptiX (seconds, fewer is better), NVIDIA RTX 4070 SUPER
Blend File: Pabellon Barcelona: 14.29 (SE +/- 0.03, N = 3)
Blend File: Barbershop: 51.30 (SE +/- 0.10, N = 3)
Blend File: Fishy Cat: 9.45 (SE +/- 0.06, N = 13)
Blend File: Classroom: 12.60 (SE +/- 0.00, N = 3)
Blend File: BMW27: 5.57 (SE +/- 0.06, N = 13)

ViennaCL

ViennaCL is an open-source linear algebra library written in C++ and with support for OpenCL and OpenMP. This test profile makes use of ViennaCL's built-in benchmarks. Learn more via the OpenBenchmarking.org test page.

ViennaCL 1.7.1 - Test: OpenCL BLAS (more is better), NVIDIA RTX 4070 SUPER
dGEMM-TT: 613 GFLOPs/s (SE +/- 0.00, N = 3)
dGEMM-TN: 599 GFLOPs/s (SE +/- 0.00, N = 3)
dGEMM-NT: 584 GFLOPs/s (SE +/- 0.00, N = 3)
dGEMM-NN: 577 GFLOPs/s (SE +/- 0.00, N = 3)
dGEMV-T: 389 GB/s (SE +/- 0.00, N = 3)
dGEMV-N: 210 GB/s (SE +/- 0.33, N = 3)
dDOT: 458 GB/s (SE +/- 0.00, N = 3)
dAXPY: 437 GB/s (SE +/- 0.00, N = 3)
dCOPY: 423 GB/s (SE +/- 0.33, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.

clpeak 1.1.2 - OpenCL Test: Double-Precision Double (GFLOPS, more is better)
NVIDIA RTX 4070 SUPER: 630.11 (SE +/- 0.98, N = 3)
(CXX) g++ options: -O3

OpenCL Test: Double-Precision Double

Intel ARC A750: The test run did not produce a result.

FAHBench

FAHBench is a Folding@Home benchmark on the GPU. Learn more via the OpenBenchmarking.org test page.

FAHBench 2.3.2 (Ns Per Day, more is better)
NVIDIA RTX 4070 SUPER: 366.06 (SE +/- 0.39, N = 3)

Intel ARC A770 8Gb: The test run did not produce a result.

Intel ARC A750: The test run did not produce a result.

VkResample

VkResample is a Vulkan-based image upscaling library based on VkFFT. The sample input file is upscaling a 4K image to 8K using Vulkan-based GPU acceleration. Learn more via the OpenBenchmarking.org test page.

VkResample 1.0 - Upscale: 2x - Precision: Double (ms, fewer is better)
NVIDIA RTX 4070 SUPER: 339.59 (SE +/- 0.30, N = 3)
(CXX) g++ options: -O3

Upscale: 2x - Precision: Double

Intel ARC A770 8Gb: The test quit with a non-zero exit status.

Intel ARC A750: The test quit with a non-zero exit status.

VkFFT

VkFFT is a Fast Fourier Transform (FFT) library that is GPU-accelerated by means of the Vulkan API. The VkFFT benchmark measures FFT performance across many different sizes before returning an overall benchmark score. Learn more via the OpenBenchmarking.org test page.

VkFFT 1.2.31 - Test: FFT + iFFT C2C Bluestein benchmark in double precision (Benchmark Score, more is better)
NVIDIA RTX 4070 SUPER: 4451 (SE +/- 12.55, N = 3)
(CXX) g++ options: -O3 -lrt

Test: FFT + iFFT C2C Bluestein benchmark in double precision

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

Intel ARC A750: The test quit with a non-zero exit status.

VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in double precision (Benchmark Score, more is better)
NVIDIA RTX 4070 SUPER: 24317 (SE +/- 146.69, N = 3)
(CXX) g++ options: -O3 -lrt

Test: FFT + iFFT C2C 1D batched in double precision

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

Intel ARC A750: The test quit with a non-zero exit status.
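
The VkFFT tests time a forward FFT followed by an inverse FFT on complex data (C2C). To illustrate what that round trip computes, here is a minimal CPU sketch in plain Python using a recursive radix-2 algorithm. It is not VkFFT's API or implementation, the function names fft and ifft are illustrative, and it only accepts power-of-two lengths.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 complex-to-complex FFT (unnormalized)."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1  # the inverse transform conjugates the twiddle factors
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(values):
    """Inverse FFT: run the conjugate transform, then divide by the length."""
    n = len(values)
    return [v / n for v in fft(values, inverse=True)]


if __name__ == "__main__":
    signal = [1, 2, 3, 4, 0, 0, 0, 0]
    restored = ifft(fft(signal))
    # The round trip should reproduce the input up to floating-point error.
    print(max(abs(a - b) for a, b in zip(signal, restored)))
```

A GPU library such as VkFFT runs many such transforms in parallel and in batches, which is why the benchmark reports an aggregate score rather than a single timing.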

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark provides various OpenCL compute and memory bandwidth micro-benchmarks. Learn more via the OpenBenchmarking.org test page.

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: FP64 Compute (TFLOPs/s, more is better)
NVIDIA RTX 4070 SUPER: 0.621 (SE +/- 0.000, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: FP64 Compute

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Max SP Flops (GFLOPS, more is better)
Intel ARC A750: 2804852 (SE +/- 132865.40, N = 12)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.

Device: GPU - Batch Size: 512 - Model: ResNet-50

Intel ARC A750: The test quit with a non-zero exit status. E: Fatal Python error: Segmentation fault

Device: GPU - Batch Size: 512 - Model: VGG-16

Intel ARC A750: The test quit with a non-zero exit status.

Device: GPU - Batch Size: 256 - Model: VGG-16

Intel ARC A750: The test quit with a non-zero exit status.

ParaView

ParaView 5.13 - Resolution: 1920 x 1080 (more is better), Intel ARC A750
Test: Wavelet Contour: 2898.67 MiPolys / Sec (SE +/- 10.34, N = 3)
Test: Wavelet Volume: 3901.28 MiVoxels / Sec (SE +/- 73.43, N = 15)
Test: Wavelet Volume: 243.83 Frames / Sec (SE +/- 4.59, N = 15)
Test: Many Spheres: 7245.45 MiPolys / Sec (SE +/- 80.48, N = 5)

GLmark2

This is a test of GLmark2, a basic OpenGL and OpenGL ES 2.0 benchmark supporting various windowing/display back-ends. Learn more via the OpenBenchmarking.org test page.

Resolution: $VIDEO_WIDTH x $VIDEO_HEIGHT

Intel ARC A750: The test quit with a non-zero exit status. E: ./glmark2: 2: ./bin/glmark2: not found

Betsy GPU Compressor

Betsy is an open-source GPU compressor of various GPU compression techniques. Betsy is written in GLSL for Vulkan/OpenGL (compute shader) support for GPU-based texture compression. Learn more via the OpenBenchmarking.org test page.

Codec: ETC2 RGB - Quality: Highest

Intel ARC A750: The test quit with a non-zero exit status. E: ./betsy: 3: ./betsy: not found

Codec: ETC1 - Quality: Highest

Intel ARC A750: The test quit with a non-zero exit status. E: ./betsy: 3: ./betsy: not found

VkFFT

VkFFT is a Fast Fourier Transform (FFT) library that is GPU-accelerated by means of the Vulkan API. The VkFFT benchmark measures FFT performance across many different sizes before returning an overall benchmark score. Learn more via the OpenBenchmarking.org test page.

Test: FFT + iFFT C2C Bluestein benchmark in double precision

Intel ARC A750: The test quit with a non-zero exit status.

Test: FFT + iFFT C2C 1D batched in double precision

Intel ARC A750: The test quit with a non-zero exit status.

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in half precision (Benchmark Score, more is better)
Intel ARC A750: 70571 (SE +/- 2534.09, N = 15)
(CXX) g++ options: -O3

NCNN

NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.

NCNN 20230517 (ms, fewer is better)

Target: Vulkan GPU - Model: FastestDet
Intel ARC A750: 92.94 (SE +/- 4.31, N = 3; MIN: 5.37 / MAX: 102.7)
NVIDIA RTX 4070 SUPER: 2.86 (SE +/- 0.29, N = 9)
Intel ARC A770 8Gb: 88.99 (SE +/- 2.65, N = 3; MIN: 5.39 / MAX: 101.65)

Target: Vulkan GPU - Model: vgg16
Intel ARC A750: 46.28 (SE +/- 0.17, N = 3; MIN: 27.5 / MAX: 48.94)
NVIDIA RTX 4070 SUPER: 117.81 (SE +/- 29.60, N = 9; MIN: 17.16 / MAX: 647.67)
Intel ARC A770 8Gb: 46.04 (SE +/- 0.14, N = 3; MIN: 28.48 / MAX: 48.53)

Target: Vulkan GPU - Model: blazeface
Intel ARC A750: 7.71 (SE +/- 2.78, N = 3; MIN: 2.58 / MAX: 55.58)
NVIDIA RTX 4070 SUPER: 0.84 (SE +/- 0.04, N = 9)
Intel ARC A770 8Gb: 27.01 (SE +/- 13.01, N = 3; MIN: 2.51 / MAX: 56.02)

Target: Vulkan GPU - Model: efficientnet-b0
Intel ARC A750: 88.85 (SE +/- 3.68, N = 3; MIN: 6.65 / MAX: 121.5)
NVIDIA RTX 4070 SUPER: 5.07 (SE +/- 0.97, N = 9)
Intel ARC A770 8Gb: 95.55 (SE +/- 1.81, N = 3; MIN: 6.7 / MAX: 121.28)

Target: Vulkan GPU - Model: mnasnet
Intel ARC A750: 28.58 (SE +/- 1.57, N = 3; MIN: 3.92 / MAX: 70.92)
NVIDIA RTX 4070 SUPER: 3.85 (SE +/- 1.31, N = 9; MIN: 1.89 / MAX: 1093.29)
Intel ARC A770 8Gb: 17.29 (SE +/- 4.53, N = 3; MIN: 3.85 / MAX: 70.64)

Target: Vulkan GPU - Model: shufflenet-v2
Intel ARC A750: 7.64 (SE +/- 2.08, N = 3; MIN: 4.66 / MAX: 92.69)
NVIDIA RTX 4070 SUPER: 2.31 (SE +/- 0.34, N = 8; MIN: 1.76 / MAX: 421.42)
Intel ARC A770 8Gb: 5.93 (SE +/- 0.64, N = 3; MIN: 4.68 / MAX: 91.15)

Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3
Intel ARC A750: 27.22 (SE +/- 16.89, N = 3; MIN: 4.4 / MAX: 85.17)
NVIDIA RTX 4070 SUPER: 2.25 (SE +/- 0.16, N = 9)
Intel ARC A770 8Gb: 15.46 (SE +/- 5.47, N = 3; MIN: 4.32 / MAX: 85.45)

Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2
Intel ARC A750: 59.99 (SE +/- 1.62, N = 3; MIN: 4.11 / MAX: 72.49)
NVIDIA RTX 4070 SUPER: 3.03 (SE +/- 0.44, N = 9)
Intel ARC A770 8Gb: 57.30 (SE +/- 5.70, N = 3; MIN: 4.06 / MAX: 71.75)

(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.

Device: GPU - Batch Size: 512 - Model: VGG-16

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status (reported three times). E: Fatal Python error: Segmentation fault

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A750: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 256 - Model: VGG-16

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status (reported three times).

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A750: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 64 - Model: VGG-16

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status (reported three times). E: UnboundLocalError: cannot access local variable 'decorators' where it is not associated with a value

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A750: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

MandelGPU

MandelGPU is an OpenCL benchmark and this test runs with the OpenCL rendering float4 kernel with a maximum of 4096 iterations. Learn more via the OpenBenchmarking.org test page.

MandelGPU 1.3pts1 - OpenCL Device: GPU (Samples/sec, more is better)
NVIDIA RTX 4070 SUPER: 587219538.2 (SE +/- 467034.80, N = 3)
Intel ARC A750: 279683403.2 (SE +/- 5085622.05, N = 15)
(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
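
For context on what each MandelGPU sample involves, the following is a minimal scalar sketch of escape-time Mandelbrot iteration with the 4096-iteration cap named in the test description. It is plain Python rather than the benchmark's float4 OpenCL kernel, and the function name escape_iterations is illustrative.

```python
MAX_ITERATIONS = 4096  # iteration cap taken from the test description


def escape_iterations(c, max_iter=MAX_ITERATIONS):
    """Count iterations of z -> z*z + c before |z| exceeds 2, capped at max_iter."""
    z = 0j
    for i in range(max_iter):
        if abs(z) > 2.0:
            return i
        z = z * z + c
    return max_iter


if __name__ == "__main__":
    # Points inside the set run to the cap; points far outside escape almost immediately.
    for point in (0j, -1 + 0j, 0.5 + 0.5j, 2 + 2j):
        print(point, escape_iterations(point))
```

Points inside the set cost the full 4096 iterations, so the throughput a GPU achieves depends on how much of the rendered region lies inside the set.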

NCNN

NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.

Target: Vulkan GPU

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status (reported three times). E: ncnn: line 3: ./benchncnn: No such file or directory

Caffe

This is a benchmark of the Caffe deep learning framework. It currently supports the AlexNet and GoogleNet models and execution on both CPUs and NVIDIA GPUs. Learn more via the OpenBenchmarking.org test page.

Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 1000

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status (reported three times). E: @ 0x74746a490450 google::LogMessageFatal::~LogMessageFatal()

Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 200

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status (reported three times). E: @ 0x7d7151816450 google::LogMessageFatal::~LogMessageFatal()

Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 100

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status (reported three times). E: @ 0x73552c3e3450 google::LogMessageFatal::~LogMessageFatal()

Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 1000

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status (reported three times). E: @ 0x7670bcda4450 google::LogMessageFatal::~LogMessageFatal()

Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 200

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status (reported three times). E: @ 0x7b5ea59be450 google::LogMessageFatal::~LogMessageFatal()

Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 100

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status (reported three times). E: @ 0x7dd7c6de3450 google::LogMessageFatal::~LogMessageFatal()

FinanceBench

FinanceBench is a collection of financial program benchmarks with support for benchmarking on the GPU via OpenCL and CPU benchmarking with OpenMP. The FinanceBench test cases are focused on Black-Scholes-Merton Process with Analytic European Option engine, QMC (Sobol) Monte-Carlo method (Equity Option Example), Bonds Fixed-rate bond with flat forward curve, and Repo Securities repurchase agreement. FinanceBench was originally written by the Cavazos Lab at University of Delaware. Learn more via the OpenBenchmarking.org test page.

FinanceBench 2016-07-25 - Benchmark: Black-Scholes OpenCL (ms, fewer is better)
NVIDIA RTX 4070 SUPER: 5.912 (SE +/- 0.114, N = 15)
Intel ARC A750: 4.842 (SE +/- 0.213, N = 12)
(CXX) g++ options: -O3 -march=native -fopenmp
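
As a rough illustration of the calculation behind the Black-Scholes test, here is a minimal Python sketch of the textbook closed-form price of a European call option. It is not FinanceBench's OpenCL kernel, and the input values (spot 100, strike 100, 5% rate, 20% volatility, one year) are illustrative assumptions rather than the benchmark's data set.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(spot, strike, rate, vol, years):
    """Closed-form Black-Scholes price of a European call (no dividends)."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / (vol * math.sqrt(years))
    d2 = d1 - vol * math.sqrt(years)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * years) * norm_cdf(d2)


if __name__ == "__main__":
    # Illustrative inputs only; the benchmark prices a large batch of options.
    print(round(black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0), 4))
```

Each option is priced independently of the others, which makes the workload a natural fit for OpenCL on a GPU.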

ArrayFire

ArrayFire is a GPU and CPU numeric processing library; this test uses the built-in CPU and OpenCL ArrayFire benchmarks. Learn more via the OpenBenchmarking.org test page.

Test: Conjugate Gradient OpenCL

NVIDIA RTX 4070 SUPER: The test run did not produce a result (reported three times). E: arrayfire: line 3: ./cg_opencl: No such file or directory

Intel ARC A750: The test run did not produce a result. E: ./arrayfire: 3: ./cg_opencl: not found

Waifu2x-NCNN Vulkan

Waifu2x-NCNN is an NCNN neural network implementation of the Waifu2x converter project and accelerated using the Vulkan API. NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. This test profile times how long it takes to increase the resolution of a sample image with Vulkan. Learn more via the OpenBenchmarking.org test page.

Scale: 2x - Denoise: 3 - TAA: No

NVIDIA RTX 4070 SUPER: The test run did not produce a result (reported three times).

Intel ARC A770 8Gb: The test run did not produce a result.

Intel ARC A750: The test run did not produce a result.

RealSR-NCNN

RealSR-NCNN is an NCNN neural network implementation of the RealSR project and accelerated using the Vulkan API. RealSR is the Real-World Super Resolution via Kernel Estimation and Noise Injection. NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. This test profile times how long it takes to increase the resolution of a sample image by a scale of 4x with Vulkan. Learn more via the OpenBenchmarking.org test page.

RealSR-NCNN 20200818 - Scale: 4x - TAA: No (seconds, fewer is better)
NVIDIA RTX 4070 SUPER: 6.323 (SE +/- 0.150, N = 15)
Intel ARC A770 8Gb: 9.808 (SE +/- 0.018, N = 3)
Intel ARC A750: 10.356 (SE +/- 0.008, N = 3)
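
RealSR-NCNN is one of the few tests here with a result for all three cards, so it is a convenient place to work out relative performance. The following minimal sketch divides each reported time by the fastest one; the helper name relative_slowdown is illustrative, and the times are the RealSR-NCNN results reported above.

```python
def relative_slowdown(seconds_by_system):
    """Express each run time as a multiple of the fastest run time (1.0 = fastest)."""
    fastest = min(seconds_by_system.values())
    return {name: seconds / fastest for name, seconds in seconds_by_system.items()}


# RealSR-NCNN 20200818, Scale: 4x - TAA: No (seconds, fewer is better)
results = {
    "NVIDIA RTX 4070 SUPER": 6.323,
    "Intel ARC A770 8Gb": 9.808,
    "Intel ARC A750": 10.356,
}

if __name__ == "__main__":
    for name, ratio in relative_slowdown(results).items():
        print(f"{name}: {ratio:.2f}x the fastest time")
```

On these numbers the A770 and A750 take roughly 1.55 and 1.64 times as long as the RTX 4070 SUPER. The three systems differ in CPU, kernel, and driver stack, so the ratio does not isolate the GPU alone.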

vkpeak

Vkpeak is a Vulkan compute benchmark inspired by OpenCL's clpeak. Vkpeak provides Vulkan compute performance measurements for FP16 / FP32 / FP64 / INT16 / INT32 scalar and vec4 performance. Learn more via the OpenBenchmarking.org test page.

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status (reported three times).

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark provides various OpenCL compute and memory bandwidth micro-benchmarks. Learn more via the OpenBenchmarking.org test page.

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: INT16 Compute (TIOPs/s, more is better)
Intel ARC A750: 24.45 (SE +/- 0.20, N = 3)
NVIDIA RTX 4070 SUPER: 17.17 (SE +/- 0.00, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: INT16 Compute

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: INT64 Compute (TIOPs/s, more is better)
Intel ARC A750: 0.958 (SE +/- 0.037, N = 3)
NVIDIA RTX 4070 SUPER: 4.214 (SE +/- 0.015, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: INT64 Compute

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

190 Results Shown
