RTX 4070 SUPER

sudo apt install vulkan-headers vulkan-tools libvulkan-dev

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2412102-NE-INTELGPU716
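For scripted runs, the comparison command above can be wrapped in a small helper. This is only a sketch: the function names are hypothetical, and it assumes the `phoronix-test-suite` executable is installed and on `PATH`. Nothing is executed until `run_compare` is called explicitly.

```python
import shutil
import subprocess

# Result identifier quoted in the text above.
RESULT_ID = "2412102-NE-INTELGPU716"


def build_compare_command(result_id: str) -> list[str]:
    """Return the argv list for comparing the local system against a result file."""
    return ["phoronix-test-suite", "benchmark", result_id]


def run_compare(result_id: str) -> int:
    """Run the comparison and return the exit status (hypothetical helper)."""
    if shutil.which("phoronix-test-suite") is None:
        raise FileNotFoundError("phoronix-test-suite is not on PATH")
    return subprocess.run(build_compare_command(result_id), check=False).returncode


# Only print the command here; call run_compare(RESULT_ID) to actually start it.
print(" ".join(build_compare_command(RESULT_ID)))
```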

Run Management

Result Identifier        Date Run       Test Duration
NVIDIA RTX 4070 SUPER    January 25     21 Hours, 7 Minutes
Intel ARC A770 8Gb       December 07    11 Hours, 47 Minutes
Intel ARC A750           December 07    1 Day, 7 Hours, 54 Minutes
intel-gpu                December 05    10 Minutes
nvidia-gpu               December 05    10 Minutes
Intel ARC A580           December 09    23 Hours, 35 Minutes
Average                                 14 Hours, 47 Minutes
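The final duration in the run list (14 Hours, 47 Minutes) is consistent with the arithmetic mean of the six run durations. A quick check, assuming that is how the figure was derived:

```python
from datetime import timedelta

# Test durations as listed in the run table above.
durations = {
    "NVIDIA RTX 4070 SUPER": timedelta(hours=21, minutes=7),
    "Intel ARC A770 8Gb": timedelta(hours=11, minutes=47),
    "Intel ARC A750": timedelta(days=1, hours=7, minutes=54),
    "intel-gpu": timedelta(minutes=10),
    "nvidia-gpu": timedelta(minutes=10),
    "Intel ARC A580": timedelta(hours=23, minutes=35),
}

# Arithmetic mean of the six durations, truncated to whole minutes.
average = sum(durations.values(), timedelta()) / len(durations)
total_minutes = int(average.total_seconds() // 60)
hours, minutes = divmod(total_minutes, 60)
print(f"{hours} Hours, {minutes} Minutes")
```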


System Details

Fields are listed for a system only where the source table gives a separate value for it.

NVIDIA RTX 4070 SUPER
  Processor: Intel Core i9-13900K @ 5.50GHz (24 Cores / 32 Threads)
  Motherboard: ASUS TUF GAMING Z790-PRO WIFI (1401 BIOS)
  Chipset: Intel Device 7a27
  Memory: 32GB
  Disk: 4001GB Seagate ZP4000GP304001
  Graphics: ASUS NVIDIA GeForce RTX 4070 SUPER 12GB
  Audio: Realtek ALC1220
  Monitor: ARZOPA
  Network: Intel I226-V + Intel Device 7a70
  OS: EndeavourOS rolling
  Kernel: 6.7.1-arch1-1 (x86_64)
  Desktop: KDE Plasma 5.27.10
  Display Server: X Server 1.21.1.11
  Display Driver: NVIDIA 550.40.07
  OpenGL: 4.6.0
  OpenCL: OpenCL 3.0 CUDA 12.4.74
  Compiler: GCC 13.2.1 20230801
  File-System: ext4
  Screen Resolution: 1920x1080

intel-gpu
  Processor: Intel Core i5-10300H @ 4.50GHz (4 Cores / 8 Threads)
  Motherboard: CML Stonic_CMS (V1.00 BIOS)
  Chipset: Intel Comet Lake PCH
  Memory: 16GB
  Disk: 1000GB CT1000P3SSD8 + 256GB Western Digital PC SN530 SDBPNPZ-256G-1014
  Graphics: Intel UHD CML GT2 4GB (1350/6000MHz)
  Audio: Intel Comet Lake PCH cAVS
  Network: Realtek Killer E2600 GbE + Intel Comet Lake PCH CNVi WiFi
  OS: Ubuntu 24.04
  Kernel: 6.8.0-49-generic (x86_64)
  Desktop: GNOME Shell 46.0
  Display Server: X Server 1.20.13
  Display Driver: NVIDIA 535.183.01
  OpenGL: 4.6 Mesa 24.0.9-0ubuntu0.2
  Compiler: GCC 13.2.0

nvidia-gpu
  Graphics: NVIDIA GeForce GTX 1650 Ti 4GB
  OpenGL: 4.6.0

Intel ARC A770 8Gb
  Processor: Intel Core Ultra 9 285K @ 5.10GHz (24 Cores)
  Motherboard: MSI MEG Z890 UNIFY-X (MS-7E20) v1.0 (1.A10 BIOS)
  Chipset: Intel Device ae7f
  Memory: 2 x 16GB DDR5-6000MT/s Corsair CMH32GX5M2B6000Z30
  Disk: 1024GB Wodposit NVMe SSD
  Graphics: MSI Intel Arc A770 DG2 8GB
  Audio: Intel DG2 Audio
  Monitor: PiKVM V3
  Network: Realtek Device 5000 + Intel Wi-Fi 7
  OS: Ubuntu 24.10
  Kernel: 6.12.1-061201-generic (x86_64)
  Desktop: GNOME Shell 47.0
  Display Server: X Server + Wayland
  OpenGL: 4.6 Mesa 24.3.1 kisak-mesa PPA
  Compiler: GCC 14.2.0

Intel ARC A750
  Graphics: Intel Arc A750 DG2 8GB
  OpenCL: OpenCL 3.0

Intel ARC A580
  Graphics: Intel Arc A580 DG2 8GB
  Screen Resolution: 1280x720

Kernel Details
- NVIDIA RTX 4070 SUPER: Transparent Huge Pages: always
- intel-gpu, nvidia-gpu, Intel ARC A770 8Gb, Intel ARC A750, Intel ARC A580: Transparent Huge Pages: madvise

Compiler Details
- NVIDIA RTX 4070 SUPER: --disable-libssp --disable-libstdcxx-pch --disable-werror --enable-__cxa_atexit --enable-bootstrap --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-languages=ada,c,c++,d,fortran,go,lto,objc,obj-c++ --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-build-config=bootstrap-lto --with-linker-hash-style=gnu
- Intel ARC A770 8Gb, Intel ARC A750, Intel ARC A580: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- NVIDIA RTX 4070 SUPER: Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x11d
- intel-gpu, nvidia-gpu: Scaling Governor: intel_pstate powersave (EPP: performance) - CPU Microcode: 0xfc - Thermald 2.5.6
- Intel ARC A770 8Gb, Intel ARC A750, Intel ARC A580: Scaling Governor: intel_pstate powersave (EPP: performance) - CPU Microcode: 0x110 - Thermald 2.5.8

Graphics Details
- NVIDIA RTX 4070 SUPER: BAR1 / Visible vRAM Size: 16384 MiB - vBIOS Version: 95.04.69.00.c1
- intel-gpu, nvidia-gpu: BAR1 / Visible vRAM Size: 256 MiB - vBIOS Version: 90.17.4c.00.1d

Security Details
- NVIDIA RTX 4070 SUPER: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
- intel-gpu, nvidia-gpu: gather_data_sampling: Mitigation of Microcode + itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable + reg_file_data_sampling: Not affected + retbleed: Mitigation of Enhanced IBRS + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: SW sequence; BHI: SW loop KVM: SW loop + srbds: Mitigation of Microcode + tsx_async_abort: Not affected
- Intel ARC A770 8Gb, Intel ARC A750, Intel ARC A580: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Environment Details
- nvidia-gpu: __GLX_VENDOR_LIBRARY_NAME=nvidia

Python Details
- Intel ARC A770 8Gb, Intel ARC A750, Intel ARC A580: Python 3.12.7
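Each Security Details entry above is a run of "name: status" pairs joined by " + ". A minimal sketch of splitting such a string into a dictionary follows; the function name is hypothetical and the sample is a shortened excerpt of the intel-gpu entry.

```python
def parse_security_details(line: str) -> dict[str, str]:
    """Split a ' + '-joined list of 'name: status' pairs into a dict."""
    entries = {}
    for item in line.split(" + "):
        # Split on the first ': ' only, since a status may itself contain colons.
        name, _, status = item.strip().partition(": ")
        entries[name] = status
    return entries


# Shortened excerpt of the intel-gpu entry above.
sample = (
    "gather_data_sampling: Mitigation of Microcode + "
    "itlb_multihit: KVM: Mitigation of VMX disabled + "
    "l1tf: Not affected + "
    "mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable"
)

parsed = parse_security_details(sample)
mitigated = [name for name, status in parsed.items() if status != "Not affected"]
print(mitigated)
```

Splitting on the first ": " keeps statuses such as "KVM: Mitigation of VMX disabled" intact.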

[Result overview table: one row per test, one column per system (NVIDIA RTX 4070 SUPER, intel-gpu, nvidia-gpu, Intel ARC A770 8Gb, Intel ARC A750, Intel ARC A580). Tests covered: tensorflow, indigobench, ncnn, specviewperf2020, unigine-heaven, unigine-valley, vkpeak, vkfft, xonotic, luxmark, shoc, realsr-ncnn, blender, openarena, paraview, opencl-benchmark, fahbench, viennacl, vkresample, clpeak, hashcat, mandelgpu, darktable, cl-mem, financebench, waifu2x-ncnn, neatbench, betsy.]

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note that the Phoronix Test Suite also provides pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries, if complementary metrics are desired. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.16.1 - Device: GPU - Batch Size: 64 - Model: VGG-16 (images/sec, More Is Better)
  Intel ARC A750: 2.43 (SE +/- 0.01, N = 3)
  Intel ARC A580: 2.43 (SE +/- 0.01, N = 2)
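Each result is reported with "SE +/- x, N = n". Assuming SE is the standard error of the mean over N runs (SE = sd / sqrt(N)), the run-to-run standard deviation can be estimated as sketched below. The function name is hypothetical, and because the published SE is rounded to two decimals the recovered value is only approximate.

```python
import math


def std_dev_from_se(se: float, n: int) -> float:
    """Estimate the sample standard deviation from a standard error of the mean."""
    return se * math.sqrt(n)


# (mean images/sec, SE, N) as listed for TensorFlow 2.16.1, batch size 64, VGG-16.
results = {
    "Intel ARC A750": (2.43, 0.01, 3),
    "Intel ARC A580": (2.43, 0.01, 2),
}

for name, (mean, se, n) in results.items():
    sd = std_dev_from_se(se, n)
    print(f"{name}: mean={mean}, sd~{sd:.3f}, relative SE={se / mean:.2%}")
```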

IndigoBench

This is a test of Indigo Renderer's IndigoBench benchmark. Learn more via the OpenBenchmarking.org test page.

IndigoBench 4.4 - Acceleration: OpenCL GPU - Scene: Bedroom (M samples/s, More Is Better)
  NVIDIA RTX 4070 SUPER: 19.801 (SE +/- 0.009, N = 3)
  Intel ARC A750: 9.302 (SE +/- 0.011, N = 3)
  Intel ARC A580: 9.257 (SE +/- 0.004, N = 3)
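To put the IndigoBench Bedroom numbers on a common scale, the sketch below normalizes each result to the slowest card. The values are the three M samples/s figures listed above; using the Intel ARC A580 as baseline is an arbitrary choice made here for illustration. The systems also differ in CPU, OS and driver stack, so the ratios describe whole systems rather than the GPUs alone.

```python
# M samples/s for IndigoBench 4.4, OpenCL GPU, Bedroom scene (more is better).
indigo_bedroom = {
    "NVIDIA RTX 4070 SUPER": 19.801,
    "Intel ARC A750": 9.302,
    "Intel ARC A580": 9.257,
}

# Arbitrary baseline chosen for illustration: the lowest-scoring system.
baseline = indigo_bedroom["Intel ARC A580"]
relative = {name: value / baseline for name, value in indigo_bedroom.items()}

for name, ratio in relative.items():
    print(f"{name}: {ratio:.2f}x")
```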

TensorFlow

TensorFlow 2.16.1 - Device: GPU - Batch Size: 32 - Model: VGG-16 (images/sec, More Is Better)
  Intel ARC A750: 2.43 (SE +/- 0.00, N = 3)
  Intel ARC A580: 2.42 (SE +/- 0.01, N = 3)

TensorFlow 2.12 - Device: GPU - Batch Size: 32 - Model: VGG-16 (images/sec, More Is Better)
  NVIDIA RTX 4070 SUPER: 1.50 (SE +/- 0.00, N = 3)
  Intel ARC A750: 2.42 (SE +/- 0.00, N = 3)
  Intel ARC A580: 2.42 (SE +/- 0.01, N = 3)

Device: GPU - Batch Size: 32 - Model: VGG-16

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'
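The ModuleNotFoundError above means the Python interpreter used for the test could not import `absl`, which is distributed on PyPI as `absl-py`. A small pre-flight check along these lines could catch that before a long run starts. It is a sketch only: the helper name is hypothetical and the list of modules to check is an assumption, not the test profile's actual dependency list.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# 'absl' is the module named in the error message above.
print(missing_modules(["absl"]))
```

An empty list means the module is importable in the current environment. It does not guarantee that the test suite runs under the same interpreter.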

TensorFlow 2.12 - Device: GPU - Batch Size: 64 - Model: VGG-16 (images/sec, More Is Better)
  Intel ARC A750: 2.43 (SE +/- 0.00, N = 3)

Device: GPU - Batch Size: 64 - Model: VGG-16

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status. E: UnboundLocalError: cannot access local variable 'decorators' where it is not associated with a value

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A580: The test quit with a non-zero exit status.

TensorFlow 2.16.1 - Device: GPU - Batch Size: 16 - Model: VGG-16 (images/sec, More Is Better)
  Intel ARC A750: 2.40 (SE +/- 0.00, N = 3)
  Intel ARC A580: 2.40 (SE +/- 0.00, N = 3)

NCNN

NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. Learn more via the OpenBenchmarking.org test page.

NCNN 20230517 - Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better)
  Intel ARC A750: 46.28 (SE +/- 0.17, N = 3; MIN: 27.5 / MAX: 48.94)
  NVIDIA RTX 4070 SUPER: 117.81 (SE +/- 29.60, N = 9; MIN: 17.16 / MAX: 647.67)
  Intel ARC A770 8Gb: 46.04 (SE +/- 0.14, N = 3; MIN: 28.48 / MAX: 48.53)
  Intel ARC A580: 46.17 (SE +/- 0.28, N = 3; MIN: 29.78 / MAX: 48.34)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: FastestDet (ms, Fewer Is Better)
  Intel ARC A750: 92.94 (SE +/- 4.31, N = 3)
  NVIDIA RTX 4070 SUPER: 2.86 (SE +/- 0.29, N = 9)
  Intel ARC A770 8Gb: 88.99 (SE +/- 2.65, N = 3)
  Intel ARC A580: 93.54 (SE +/- 1.06, N = 3)
  MIN / MAX (listed for 3 of the 4 systems): 5.37 / 102.7; 5.39 / 101.65; 5.44 / 102.43
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: vision_transformer (ms, Fewer Is Better)
  Intel ARC A750: 118.97 (SE +/- 0.44, N = 3)
  NVIDIA RTX 4070 SUPER: 844.61 (SE +/- 87.53, N = 9)
  Intel ARC A770 8Gb: 119.05 (SE +/- 0.24, N = 3)
  Intel ARC A580: 119.00 (SE +/- 0.39, N = 3)
  MIN / MAX (listed for 1 of the 4 systems): 46.34 / 1866.93
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better)
  Intel ARC A750: 454.67 (SE +/- 4.65, N = 3)
  NVIDIA RTX 4070 SUPER: 11.11 (SE +/- 3.28, N = 9)
  Intel ARC A770 8Gb: 453.80 (SE +/- 11.18, N = 3)
  Intel ARC A580: 462.80 (SE +/- 2.21, N = 3)
  MIN / MAX (listed for 3 of the 4 systems): 23.93 / 530.68; 23.74 / 528.85; 23.85 / 530.99
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better)
  Intel ARC A750: 100.62 (SE +/- 0.48, N = 3)
  NVIDIA RTX 4070 SUPER: 6.86 (SE +/- 1.76, N = 9)
  Intel ARC A770 8Gb: 100.40 (SE +/- 0.61, N = 3)
  Intel ARC A580: 101.57 (SE +/- 0.58, N = 3)
  MIN / MAX (listed for 3 of the 4 systems): 7.79 / 107.95; 7.63 / 107.86; 7.63 / 108.58
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better)
  Intel ARC A750: 49.20 (SE +/- 0.22, N = 3; MIN: 22.47 / MAX: 52.7)
  NVIDIA RTX 4070 SUPER: 63.82 (SE +/- 10.56, N = 9; MIN: 10.28 / MAX: 858.44)
  Intel ARC A770 8Gb: 49.15 (SE +/- 0.10, N = 3; MIN: 20.41 / MAX: 52.43)
  Intel ARC A580: 49.56 (SE +/- 0.02, N = 3; MIN: 18.28 / MAX: 52.1)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better)
  Intel ARC A750: 92.83 (SE +/- 0.98, N = 3; MIN: 10.52 / MAX: 101.84)
  NVIDIA RTX 4070 SUPER: 46.26 (SE +/- 14.70, N = 9; MIN: 7.71 / MAX: 1829.99)
  Intel ARC A770 8Gb: 93.18 (SE +/- 1.06, N = 3; MIN: 10.54 / MAX: 100.7)
  Intel ARC A580: 93.48 (SE +/- 0.83, N = 3; MIN: 10.45 / MAX: 101.32)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better)
  Intel ARC A750: 23.42 (SE +/- 0.07, N = 3; MIN: 3.57 / MAX: 25.48)
  NVIDIA RTX 4070 SUPER: 16.17 (SE +/- 5.86, N = 9; MIN: 3.52 / MAX: 436.52)
  Intel ARC A770 8Gb: 23.30 (SE +/- 0.18, N = 3; MIN: 3.6 / MAX: 25.53)
  Intel ARC A580: 23.41 (SE +/- 0.05, N = 3; MIN: 3.58 / MAX: 25.27)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better)
  Intel ARC A750: 47.30 (SE +/- 0.13, N = 3; MIN: 4.97 / MAX: 51.26)
  NVIDIA RTX 4070 SUPER: 8.97 (SE +/- 3.49, N = 9; MIN: 3.94 / MAX: 922.04)
  Intel ARC A770 8Gb: 47.11 (SE +/- 0.89, N = 3; MIN: 4.95 / MAX: 51.24)
  Intel ARC A580: 47.58 (SE +/- 0.19, N = 3; MIN: 5.09 / MAX: 52.05)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better)
  Intel ARC A750: 106.07 (SE +/- 0.29, N = 3; MIN: 8.51 / MAX: 115.08)
  NVIDIA RTX 4070 SUPER: 11.04 (SE +/- 1.21, N = 9; MIN: 5.28 / MAX: 1769.19)
  Intel ARC A770 8Gb: 105.35 (SE +/- 0.15, N = 3; MIN: 8.44 / MAX: 114.62)
  Intel ARC A580: 106.32 (SE +/- 1.35, N = 3; MIN: 8.23 / MAX: 114.52)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better)
  Intel ARC A750: 7.71 (SE +/- 2.78, N = 3)
  NVIDIA RTX 4070 SUPER: 0.84 (SE +/- 0.04, N = 9)
  Intel ARC A770 8Gb: 27.01 (SE +/- 13.01, N = 3)
  Intel ARC A580: 24.06 (SE +/- 9.84, N = 3)
  MIN / MAX (listed for 3 of the 4 systems): 2.58 / 55.58; 2.51 / 56.02; 2.55 / 57
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  Intel ARC A750: 88.85 (SE +/- 3.68, N = 3)
  NVIDIA RTX 4070 SUPER: 5.07 (SE +/- 0.97, N = 9)
  Intel ARC A770 8Gb: 95.55 (SE +/- 1.81, N = 3)
  Intel ARC A580: 106.56 (SE +/- 3.44, N = 3)
  MIN / MAX (listed for 3 of the 4 systems): 6.65 / 121.5; 6.7 / 121.28; 6.77 / 121.55
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better)
  Intel ARC A750: 28.58 (SE +/- 1.57, N = 3; MIN: 3.92 / MAX: 70.92)
  NVIDIA RTX 4070 SUPER: 3.85 (SE +/- 1.31, N = 9; MIN: 1.89 / MAX: 1093.29)
  Intel ARC A770 8Gb: 17.29 (SE +/- 4.53, N = 3; MIN: 3.85 / MAX: 70.64)
  Intel ARC A580: 15.70 (SE +/- 7.50, N = 3; MIN: 3.96 / MAX: 70.57)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  Intel ARC A750: 7.64 (SE +/- 2.08, N = 3)
  NVIDIA RTX 4070 SUPER: 2.31 (SE +/- 0.34, N = 8)
  Intel ARC A770 8Gb: 5.93 (SE +/- 0.64, N = 3)
  Intel ARC A580: 37.05 (SE +/- 23.69, N = 3)
  MIN / MAX (listed for 3 of the 4 systems): 4.66 / 92.69; 4.68 / 91.15; 4.64 / 94.33
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: mobilenet-v3 (ms, Fewer Is Better)
  Intel ARC A750: 27.22 (SE +/- 16.89, N = 3)
  NVIDIA RTX 4070 SUPER: 2.25 (SE +/- 0.16, N = 9)
  Intel ARC A770 8Gb: 15.46 (SE +/- 5.47, N = 3)
  Intel ARC A580: 29.04 (SE +/- 13.05, N = 3)
  MIN / MAX (listed for 3 of the 4 systems): 4.4 / 85.17; 4.32 / 85.45; 4.39 / 85.51
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: mobilenet-v2 (ms, Fewer Is Better)
  Intel ARC A750: 59.99 (SE +/- 1.62, N = 3)
  NVIDIA RTX 4070 SUPER: 3.03 (SE +/- 0.44, N = 9)
  Intel ARC A770 8Gb: 57.30 (SE +/- 5.70, N = 3)
  Intel ARC A580: 51.04 (SE +/- 2.81, N = 3)
  MIN / MAX (listed for 3 of the 4 systems): 4.11 / 72.49; 4.06 / 71.75; 4.14 / 72.47
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better)
  Intel ARC A750: 80.11 (SE +/- 0.38, N = 3; MIN: 9.8 / MAX: 84.4)
  NVIDIA RTX 4070 SUPER: 8.62 (SE +/- 0.47, N = 9; MIN: 6.42 / MAX: 1101.3)
  Intel ARC A770 8Gb: 79.79 (SE +/- 0.36, N = 3; MIN: 21.72 / MAX: 84.6)
  Intel ARC A580: 80.53 (SE +/- 0.40, N = 3; MIN: 12.67 / MAX: 84.4)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: mobilenetv2-yolov3 (ms, Fewer Is Better)
  Intel ARC A750: 80.11 (SE +/- 0.38, N = 3; MIN: 9.8 / MAX: 84.4)
  Intel ARC A770 8Gb: 79.79 (SE +/- 0.36, N = 3; MIN: 21.72 / MAX: 84.6)
  Intel ARC A580: 80.53 (SE +/- 0.40, N = 3; MIN: 12.67 / MAX: 84.4)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
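The NCNN results vary widely from model to model, so a single summary figure can help. The sketch below takes the 17 models for which both the NVIDIA RTX 4070 SUPER and the Intel ARC A770 8Gb report a latency, and computes the geometric mean of the A770-to-NVIDIA latency ratio. The choice of this pair and of the geometric mean is made here for illustration only. The two cards sit in different host systems and several NVIDIA runs show a large SE, so the figure is a rough indication at best.

```python
import math

# Average latency in ms (fewer is better): (NVIDIA RTX 4070 SUPER, Intel ARC A770 8Gb).
ncnn_ms = {
    "vgg16": (117.81, 46.04),
    "FastestDet": (2.86, 88.99),
    "vision_transformer": (844.61, 119.05),
    "regnety_400m": (11.11, 453.80),
    "squeezenet_ssd": (6.86, 100.40),
    "yolov4-tiny": (63.82, 49.15),
    "resnet50": (46.26, 93.18),
    "alexnet": (16.17, 23.30),
    "resnet18": (8.97, 47.11),
    "googlenet": (11.04, 105.35),
    "blazeface": (0.84, 27.01),
    "efficientnet-b0": (5.07, 95.55),
    "mnasnet": (3.85, 17.29),
    "shufflenet-v2": (2.31, 5.93),
    "mobilenet-v3": (2.25, 15.46),
    "mobilenet-v2": (3.03, 57.30),
    "mobilenet": (8.62, 79.79),
}

# Ratio > 1 means the A770 system took longer than the NVIDIA system on that model.
ratios = {model: a770 / nvidia for model, (nvidia, a770) in ncnn_ms.items()}

# Geometric mean via the mean of logarithms.
geo_mean = math.exp(sum(math.log(r) for r in ratios.values()) / len(ratios))

slower_on_nvidia = sorted(model for model, r in ratios.items() if r < 1)
print(f"geometric mean latency ratio (A770 / NVIDIA): {geo_mean:.2f}")
print("models where the NVIDIA system was slower:", slower_on_nvidia)
```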

TensorFlow

TensorFlow 2.12 - Device: GPU - Batch Size: 16 - Model: VGG-16 (images/sec, More Is Better)
  NVIDIA RTX 4070 SUPER: 1.48 (SE +/- 0.00, N = 2)
  Intel ARC A750: 2.39 (SE +/- 0.00, N = 3)
  Intel ARC A580: 2.39 (SE +/- 0.01, N = 3)

Device: GPU - Batch Size: 16 - Model: VGG-16

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.16.1 - Device: GPU - Batch Size: 64 - Model: ResNet-50 (images/sec, More Is Better)
  Intel ARC A750: 8.20 (SE +/- 0.02, N = 3)

Device: GPU - Batch Size: 64 - Model: ResNet-50

Intel ARC A580: The test quit with a non-zero exit status.

TensorFlow 2.12 - Device: GPU - Batch Size: 64 - Model: ResNet-50 (images/sec, More Is Better)
  NVIDIA RTX 4070 SUPER: 5.55 (SE +/- 0.01, N = 2)
  Intel ARC A750: 8.23 (SE +/- 0.01, N = 3)

Device: GPU - Batch Size: 64 - Model: ResNet-50

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A580: The test quit with a non-zero exit status.

SPECViewPerf 2020

This test runs SPECViewPerf 2020 if available on your system. SPECViewPerf is made up of real-world OpenGL workstation tests such as CATIA and SolidWorks. Learn more via the OpenBenchmarking.org test page.

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: CREO-03 (Composite Score, More Is Better)
  Intel ARC A750: 62.65 (SE +/- 0.06, N = 3)
  Intel ARC A580: 61.80 (SE +/- 0.04, N = 3)

Unigine Heaven

Unigine Heaven 4.0 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL (Frames Per Second, More Is Better)
  Intel ARC A750: 224.52 (SE +/- 0.14, N = 3)
  Intel ARC A580: 205.25 (SE +/- 0.23, N = 3)

SPECViewPerf 2020

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: SOLIDWORKS-07 (Composite Score, More Is Better)
  Intel ARC A750: 73.13 (SE +/- 0.01, N = 3)
  Intel ARC A580: 70.72 (SE +/- 0.00, N = 3)

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: MAYA-06 (Composite Score, More Is Better)
  Intel ARC A750: 165.76 (SE +/- 0.32, N = 3)
  Intel ARC A580: 159.42 (SE +/- 0.08, N = 3)

TensorFlow

TensorFlow 2.12 - Device: GPU - Batch Size: 32 - Model: ResNet-50 (images/sec, More Is Better)
  NVIDIA RTX 4070 SUPER: 5.51 (SE +/- 0.01, N = 2)
  Intel ARC A750: 8.07 (SE +/- 0.03, N = 3)

Device: GPU - Batch Size: 32 - Model: ResNet-50

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A580: The test quit with a non-zero exit status.

SPECViewPerf 2020

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: MEDICAL-03 (Composite Score, More Is Better)
  Intel ARC A750: 40.97 (SE +/- 0.00, N = 3)
  Intel ARC A580: 39.13 (SE +/- 0.00, N = 3)

TensorFlow

TensorFlow 2.16.1 - Device: GPU - Batch Size: 32 - Model: ResNet-50 (images/sec, More Is Better)
  Intel ARC A750: 8.08 (SE +/- 0.06, N = 3)

Device: GPU - Batch Size: 32 - Model: ResNet-50

Intel ARC A580: The test quit with a non-zero exit status.

TensorFlow 2.16.1 - Device: GPU - Batch Size: 16 - Model: ResNet-50 (images/sec, More Is Better)
  Intel ARC A750: 7.87 (SE +/- 0.02, N = 3)
  Intel ARC A580: 7.87 (SE +/- 0.01, N = 3)

SPECViewPerf 2020

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: SNX-04 (Composite Score, More Is Better)
  Intel ARC A750: 172.03 (SE +/- 0.08, N = 3)
  Intel ARC A580: 168.75 (SE +/- 0.33, N = 3)

Unigine Valley

This test calculates the average frame-rate within the Valley demo for the Unigine engine, released in February 2013. This engine is extremely demanding on the system's graphics card. Unigine Valley relies upon an OpenGL 3 core profile context. Learn more via the OpenBenchmarking.org test page.
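
As a rough illustration of what an average frame-rate figure represents, the sketch below divides the number of rendered frames by the total elapsed time. This is an assumption about the general technique, not a description of how the Valley demo computes its score internally, and the frame times are hypothetical.

```python
def average_fps(frame_times_s):
    """Average frame rate: rendered frames divided by total elapsed seconds."""
    total = sum(frame_times_s)
    if total <= 0:
        raise ValueError("total frame time must be positive")
    return len(frame_times_s) / total


# Hypothetical per-frame render times in seconds (illustrative only).
frame_times = [0.004, 0.005, 0.004, 0.007]
print(f"average: {average_fps(frame_times):.1f} FPS")
```

Dividing frames by total time weights slow frames correctly; averaging the per-frame FPS values instead would overstate the result whenever frame times vary.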

Unigine Valley 1.0 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL (Frames Per Second, More Is Better)
  Intel ARC A750: 226.62100 (SE +/- 0.53294, N = 3)
  intel-gpu: 9.98304 (SE +/- 0.00191, N = 3)
  nvidia-gpu: 74.81640 (SE +/- 0.36174, N = 3)
  Intel ARC A580: 216.33100 (SE +/- 0.14945, N = 3)

SPECViewPerf 2020

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: CATIA-06 (Composite Score, More Is Better)
  Intel ARC A750: 47.85 (SE +/- 0.05, N = 3)
  Intel ARC A580: 45.84 (SE +/- 0.10, N = 3)

TensorFlow

TensorFlow 2.16.1 - Device: GPU - Batch Size: 16 - Model: AlexNet (images/sec, More Is Better)
  Intel ARC A750: 39.62 (SE +/- 0.26, N = 15)
  Intel ARC A580: 39.74 (SE +/- 0.33, N = 8)

TensorFlow 2.12 - Device: GPU - Batch Size: 16 - Model: ResNet-50 (images/sec, More Is Better)
  NVIDIA RTX 4070 SUPER: 5.46 (SE +/- 0.00, N = 2)
  Intel ARC A750: 7.83 (SE +/- 0.03, N = 3)
  Intel ARC A580: 7.83 (SE +/- 0.02, N = 3)

Device: GPU - Batch Size: 16 - Model: ResNet-50

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

vkpeak

Vkpeak is a Vulkan compute benchmark inspired by OpenCL's clpeak. Vkpeak provides Vulkan compute performance measurements for FP16 / FP32 / FP64 / INT16 / INT32 scalar and vec4 performance. Learn more via the OpenBenchmarking.org test page.

vkpeak 20230730 - int32-scalar (GIOPS, More Is Better)
  Intel ARC A750: 4081.97 (SE +/- 0.07, N = 3)
  Intel ARC A770 8Gb: 4675.06 (SE +/- 0.04, N = 3)
  Intel ARC A580: 3504.13 (SE +/- 0.11, N = 3)

vkpeak 20230730 - int32-vec4 (GIOPS, More Is Better)
  Intel ARC A750: 4242.14 (SE +/- 0.17, N = 3)
  Intel ARC A770 8Gb: 4853.46 (SE +/- 0.03, N = 3)
  Intel ARC A580: 3636.35 (SE +/- 0.06, N = 3)

vkpeak 20230730 - int16-vec4 (GIOPS, More Is Better)
  Intel ARC A750: 8426.52 (SE +/- 0.19, N = 3)
  Intel ARC A770 8Gb: 9624.66 (SE +/- 0.11, N = 3)
  Intel ARC A580: 7230.54 (SE +/- 0.13, N = 3)

vkpeak 20230730 - int16-scalar (GIOPS, More Is Better)
  Intel ARC A750: 8053.11 (SE +/- 0.12, N = 3)
  Intel ARC A770 8Gb: 9201.64 (SE +/- 0.19, N = 3)
  Intel ARC A580: 6905.61 (SE +/- 0.07, N = 3)

vkpeak 20230730 - fp16-vec4 (GFLOPS, More Is Better)
  Intel ARC A750: 33768.52 (SE +/- 0.50, N = 3)
  Intel ARC A770 8Gb: 38575.11 (SE +/- 0.43, N = 3)
  Intel ARC A580: 28957.52 (SE +/- 0.72, N = 3)

vkpeak 20230730 - fp16-scalar (GFLOPS, More Is Better)
  Intel ARC A750: 21412.36 (SE +/- 0.35, N = 3)
  Intel ARC A770 8Gb: 24468.27 (SE +/- 0.22, N = 3)
  Intel ARC A580: 18359.49 (SE +/- 0.10, N = 3)

vkpeak 20230730 - fp32-vec4 (GFLOPS, More Is Better)
  Intel ARC A750: 9779.36 (SE +/- 0.13, N = 3)
  Intel ARC A770 8Gb: 11171.13 (SE +/- 0.27, N = 3)
  Intel ARC A580: 8389.62 (SE +/- 0.09, N = 3)

vkpeak 20230730 - fp32-scalar (GFLOPS, More Is Better)
  Intel ARC A750: 15200.59 (SE +/- 0.69, N = 3)
  Intel ARC A770 8Gb: 17337.54 (SE +/- 3.56, N = 3)
  Intel ARC A580: 13055.92 (SE +/- 0.91, N = 3)
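
To compare the three Arc cards across the vkpeak 20230730 tests above, one option is to take per-test ratios and summarize them with a geometric mean. The sketch below does this for the A770 8Gb relative to the A580, using the values copied from these charts. The dictionary layout and function name are illustrative choices, not something produced by the Phoronix Test Suite.

```python
import math

# vkpeak 20230730 results from the charts above: (A750, A770 8Gb, A580).
# The int* tests are in GIOPS and the fp* tests in GFLOPS; ratios are unitless.
results = {
    "int32-scalar": (4081.97, 4675.06, 3504.13),
    "int32-vec4": (4242.14, 4853.46, 3636.35),
    "int16-vec4": (8426.52, 9624.66, 7230.54),
    "int16-scalar": (8053.11, 9201.64, 6905.61),
    "fp16-vec4": (33768.52, 38575.11, 28957.52),
    "fp16-scalar": (21412.36, 24468.27, 18359.49),
    "fp32-vec4": (9779.36, 11171.13, 8389.62),
    "fp32-scalar": (15200.59, 17337.54, 13055.92),
}


def geometric_mean(values):
    """Geometric mean, the conventional way to average performance ratios."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Ratio of A770 8Gb throughput to A580 throughput for each test.
ratios = {name: a770 / a580 for name, (_a750, a770, a580) in results.items()}
for name, ratio in ratios.items():
    print(f"{name}: A770 8Gb = {ratio:.3f}x A580")

overall = geometric_mean(list(ratios.values()))
print(f"geometric mean: {overall:.3f}x")
```

The same approach works for any pair of systems in this result file, provided both have a value for every test included in the mean.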

TensorFlow

TensorFlow 2.16.1 - Device: GPU - Batch Size: 64 - Model: AlexNet (images/sec, More Is Better)
  Intel ARC A750: 45.57 (SE +/- 0.18, N = 3)
  Intel ARC A580: 44.61 (SE +/- 0.12, N = 3)

VkFFT

VkFFT is a Fast Fourier Transform (FFT) library that is GPU-accelerated by means of the Vulkan API. The VkFFT benchmark measures FFT performance across many different sizes before returning an overall benchmark score. Learn more via the OpenBenchmarking.org test page.

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in half precision (Benchmark Score, More Is Better)
  Intel ARC A750: 70571 (SE +/- 2534.09, N = 15)
  Intel ARC A580: 72346 (SE +/- 1441.23, N = 12)
  1. (CXX) g++ options: -O3

Xonotic

This is a benchmark of Xonotic, a fork of the DarkPlaces-based Nexuiz game. Development of this open-source first-person shooter began in March 2010. Learn more via the OpenBenchmarking.org test page.

Xonotic 0.8.6 - Resolution: 1920 x 1080 - Effects Quality: High (Frames Per Second, More Is Better)
  Intel ARC A750: 777.93 (SE +/- 3.18, N = 3; MIN: 437 / MAX: 1200)
  Intel ARC A580: 856.72 (SE +/- 6.14, N = 15; MIN: 458 / MAX: 1375)

SPECViewPerf 2020

SPECViewPerf 2020 3.0 - Resolution: 1920 x 1080 - Viewset: ENERGY-03 (Composite Score, More Is Better)
  Intel ARC A750: 31.17 (SE +/- 0.00, N = 3)
  Intel ARC A580: 27.74 (SE +/- 0.00, N = 3)

TensorFlow

TensorFlow 2.16.1 - Device: GPU - Batch Size: 32 - Model: GoogLeNet (images/sec, More Is Better)
  Intel ARC A750: 26.52 (SE +/- 0.03, N = 3)
  Intel ARC A580: 26.34 (SE +/- 0.06, N = 3)

TensorFlow 2.16.1 - Device: GPU - Batch Size: 64 - Model: GoogLeNet (images/sec, More Is Better)
  Intel ARC A750: 27.95 (SE +/- 0.01, N = 3)

Device: GPU - Batch Size: 64 - Model: GoogLeNet

Intel ARC A580: The test quit with a non-zero exit status.

LuxMark

LuxMark is a multi-platform OpenCL benchmark using LuxRender. LuxMark supports targeting different OpenCL devices and has multiple scenes available for rendering. LuxMark is a fully open-source OpenCL program with real-world rendering examples. Learn more via the OpenBenchmarking.org test page.

LuxMark 3.1 - OpenCL Device: GPU - Scene: Microphone (Score, More Is Better)
  Intel ARC A750: 45988 (SE +/- 191.00, N = 3)
  Intel ARC A580: 38157 (SE +/- 58.92, N = 3)

LuxMark 3.1 - OpenCL Device: GPU - Scene: Hotel (Score, More Is Better)
  Intel ARC A750: 13236 (SE +/- 25.67, N = 3)
  Intel ARC A580: 11015 (SE +/- 11.33, N = 3)

LuxMark 3.1 - OpenCL Device: CPU+GPU - Scene: Microphone (Score, More Is Better)
  Intel ARC A750: 45844 (SE +/- 14.17, N = 3)
  Intel ARC A580: 38224 (SE +/- 9.50, N = 3)

LuxMark 3.1 - OpenCL Device: GPU - Scene: Luxball HDR (Score, More Is Better)
  Intel ARC A750: 60251 (SE +/- 144.44, N = 3)
  Intel ARC A580: 50588 (SE +/- 167.49, N = 3)

LuxMark 3.1 - OpenCL Device: CPU+GPU - Scene: Luxball HDR (Score, More Is Better)
  Intel ARC A750: 60539 (SE +/- 154.09, N = 3)
  Intel ARC A580: 50788 (SE +/- 62.27, N = 3)

LuxMark 3.1 - OpenCL Device: CPU+GPU - Scene: Hotel (Score, More Is Better)
  Intel ARC A750: 13262 (SE +/- 0.33, N = 3)
  Intel ARC A580: 11010 (SE +/- 1.76, N = 3)

vkpeak

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status (message reported three times).

TensorFlow

TensorFlow 2.12 - Device: GPU - Batch Size: 32 - Model: GoogLeNet (images/sec, More Is Better)
  NVIDIA RTX 4070 SUPER: 15.61 (SE +/- 0.01, N = 2)
  Intel ARC A750: 26.47 (SE +/- 0.04, N = 3)
  Intel ARC A580: 26.46 (SE +/- 0.06, N = 3)

Device: GPU - Batch Size: 32 - Model: GoogLeNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

TensorFlow 2.12 - Device: GPU - Batch Size: 64 - Model: GoogLeNet (images/sec, More Is Better)
  NVIDIA RTX 4070 SUPER: 15.52
  Intel ARC A750: 27.84
  SE +/- 0.06, N = 3 (the only SE entry shown for this chart)

Device: GPU - Batch Size: 64 - Model: GoogLeNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A580: The test quit with a non-zero exit status.

IndigoBench

This is a test of Indigo Renderer's IndigoBench benchmark. Learn more via the OpenBenchmarking.org test page.

IndigoBench 4.4 - Acceleration: OpenCL GPU - Scene: Supercar (M samples/s, More Is Better)
  NVIDIA RTX 4070 SUPER: 52.81 (SE +/- 0.03, N = 3)
  Intel ARC A750: 19.71 (SE +/- 0.02, N = 3)
  Intel ARC A580: 20.17 (SE +/- 0.01, N = 3)

TensorFlow

TensorFlow 2.16.1 - Device: GPU - Batch Size: 32 - Model: AlexNet (images/sec, More Is Better)
  Intel ARC A750: 43.01 (SE +/- 0.23, N = 3)
  Intel ARC A580: 43.09 (SE +/- 0.27, N = 3)

VkFFT

VkFFT 1.2.31 - Test: FFT + iFFT C2C multidimensional in single precision (Benchmark Score, More Is Better)
  NVIDIA RTX 4070 SUPER: 50299 (SE +/- 407.19, N = 15)
  Intel ARC A750: 33011 (SE +/- 17.37, N = 3)
  Intel ARC A580: 30196 (SE +/- 408.80, N = 13)
  1. (CXX) g++ options: -O3 (the chart also lists a per-system -lrt flag)

Test: FFT + iFFT C2C multidimensional in single precision

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

TensorFlow

TensorFlow 2.16.1 - Device: GPU - Batch Size: 16 - Model: GoogLeNet (images/sec, More Is Better)
  Intel ARC A750: 23.95 (SE +/- 0.03, N = 3)
  Intel ARC A580: 24.13 (SE +/- 0.20, N = 3)

Xonotic

Xonotic 0.8.6 - Resolution: 1920 x 1080 - Effects Quality: Ultimate (Frames Per Second, More Is Better)
  Intel ARC A750: 551.35 (SE +/- 2.85, N = 3; MIN: 110 / MAX: 1221)
  Intel ARC A580: 575.51 (SE +/- 5.72, N = 3; MIN: 106 / MAX: 1264)

SHOC Scalable HeterOgeneous Computing

This is the CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing (SHOC) benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Texture Read Bandwidth (GB/s, More Is Better)
  Intel ARC A750: 883.46 (SE +/- 0.34, N = 3)
  Intel ARC A580: 762.70 (SE +/- 0.14, N = 3)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

TensorFlow

TensorFlow 2.16.1 - Device: GPU - Batch Size: 1 - Model: VGG-16 (images/sec, More Is Better)
  Intel ARC A750: 1.71 (SE +/- 0.01, N = 3)
  Intel ARC A580: 1.73 (SE +/- 0.01, N = 3)

TensorFlow 2.12 - Device: GPU - Batch Size: 16 - Model: GoogLeNet (images/sec, More Is Better)
  NVIDIA RTX 4070 SUPER: 15.67 (SE +/- 0.03, N = 3)
  Intel ARC A750: 24.05 (SE +/- 0.08, N = 3)
  Intel ARC A580: 24.10 (SE +/- 0.20, N = 3)

Device: GPU - Batch Size: 16 - Model: GoogLeNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

VkFFT

VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in half precision (Benchmark Score, More Is Better)
  NVIDIA RTX 4070 SUPER: 131705 (SE +/- 159.17, N = 3)
  Intel ARC A750: 100426 (SE +/- 82.39, N = 3)
  Intel ARC A580: 69895 (SE +/- 1784.34, N = 12)
  1. (CXX) g++ options: -O3 (the chart also lists a per-system -lrt flag)

Test: FFT + iFFT C2C 1D batched in half precision

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

IndigoBench

IndigoBench 4.4 - Acceleration: CPU - Scene: Bedroom (M samples/s, More Is Better)
  Intel ARC A750: 4.932 (SE +/- 0.046, N = 3)
  Intel ARC A580: 4.978 (SE +/- 0.015, N = 3)

IndigoBench 4.4 - Acceleration: CPU - Scene: Supercar (M samples/s, More Is Better)
  Intel ARC A750: 12.53 (SE +/- 0.03, N = 3)
  Intel ARC A580: 12.86 (SE +/- 0.13, N = 3)

TensorFlow

TensorFlow 2.12 - Device: GPU - Batch Size: 64 - Model: AlexNet (images/sec, More Is Better)
  NVIDIA RTX 4070 SUPER: 33.97
  Intel ARC A750: 44.82
  SE +/- 0.18, N = 3 (the only SE entry shown for this chart)

Device: GPU - Batch Size: 64 - Model: AlexNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Intel ARC A580: The test quit with a non-zero exit status.

VkFFT

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in single precision (Benchmark Score, More Is Better)
  Intel ARC A750: 58625 (SE +/- 70.72, N = 3)
  Intel ARC A580: 58426 (SE +/- 583.21, N = 3)
  1. (CXX) g++ options: -O3

TensorFlow

TensorFlow 2.12 - Device: GPU - Batch Size: 32 - Model: AlexNet (images/sec, More Is Better)
  NVIDIA RTX 4070 SUPER: 33.40 (SE +/- 0.15, N = 2)
  Intel ARC A750: 42.79 (SE +/- 0.19, N = 3)
  Intel ARC A580: 42.85 (SE +/- 0.03, N = 3)

Device: GPU - Batch Size: 32 - Model: AlexNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

vkpeak

vkpeak 20240505 - fp16-vec4 (GFLOPS, More Is Better)
  Intel ARC A750: 22840.73 (SE +/- 0.55, N = 3)
  Intel ARC A580: 19567.54 (SE +/- 0.47, N = 3)

vkpeak 20240505 - fp16-scalar (GFLOPS, More Is Better)
  Intel ARC A750: 21379.22 (SE +/- 1.00, N = 3)
  Intel ARC A580: 18324.92 (SE +/- 0.21, N = 3)

vkpeak 20240505 - fp32-vec4 (GFLOPS, More Is Better)
  Intel ARC A750: 13898.32 (SE +/- 0.79, N = 3)
  Intel ARC A580: 11908.73 (SE +/- 0.40, N = 3)

vkpeak 20240505 - fp32-scalar (GFLOPS, More Is Better)
  Intel ARC A750: 16868.93 (SE +/- 26.85, N = 3)
  Intel ARC A580: 14354.21 (SE +/- 6.23, N = 3)

VkFFT

VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in single precision (Benchmark Score, More Is Better)
  NVIDIA RTX 4070 SUPER: 73929 (SE +/- 7.94, N = 3)
  Intel ARC A750: 58800 (SE +/- 38.85, N = 3)
  Intel ARC A580: 58986 (SE +/- 54.76, N = 3)
  1. (CXX) g++ options: -O3 (the chart also lists a per-system -lrt flag)

Test: FFT + iFFT C2C 1D batched in single precision

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

RealSR-NCNN

RealSR-NCNN is an NCNN neural network implementation of the RealSR project and accelerated using the Vulkan API. RealSR is the Real-World Super Resolution via Kernel Estimation and Noise Injection. NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. This test profile times how long it takes to increase the resolution of a sample image by a scale of 4x with Vulkan. Learn more via the OpenBenchmarking.org test page.

RealSR-NCNN 20200818 - Scale: 4x - TAA: Yes (Seconds, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 34.89 (SE +/- 0.02, N = 3)
  Intel ARC A770 8Gb: 61.86 (SE +/- 0.02, N = 3)
  Intel ARC A750: 66.07 (SE +/- 0.01, N = 3)
  Intel ARC A580: 72.28 (SE +/- 0.01, N = 3)

Blender

Blender is an open-source 3D creation and modeling software project. This test is of Blender's Cycles performance with various sample files. GPU computing via NVIDIA OptiX and NVIDIA CUDA is currently supported as well as HIP for AMD Radeon GPUs and Intel oneAPI for Intel Graphics. Learn more via the OpenBenchmarking.org test page.

Blender 4.0 - Blend File: Barbershop - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 51.30 (SE +/- 0.10, N = 3)

VkFFT

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling (Benchmark Score, More Is Better)
  Intel ARC A750: 63113 (SE +/- 527.31, N = 3)
  Intel ARC A580: 63157 (SE +/- 36.04, N = 3)
  1. (CXX) g++ options: -O3

VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling (Benchmark Score, More Is Better)
  NVIDIA RTX 4070 SUPER: 75078 (SE +/- 37.77, N = 3)
  Intel ARC A750: 64004 (SE +/- 48.89, N = 3)
  Intel ARC A580: 63530 (SE +/- 4.48, N = 3)
  1. (CXX) g++ options: -O3 (the chart also lists a per-system -lrt flag)

Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

Xonotic

Xonotic 0.8.6 - Resolution: 1920 x 1080 - Effects Quality: Ultra (Frames Per Second, More Is Better)
  Intel ARC A750: 735.34 (SE +/- 8.54, N = 3; MIN: 308 / MAX: 1171)
  Intel ARC A580: 809.46 (SE +/- 3.16, N = 3; MIN: 319 / MAX: 1303)

VkFFT

VkFFT 1.2.31 - Test: FFT + iFFT C2C Bluestein in single precision (Benchmark Score, More Is Better)
  NVIDIA RTX 4070 SUPER: 15166 (SE +/- 102.52, N = 3)
  Intel ARC A750: 5573 (SE +/- 3.76, N = 3)
  Intel ARC A580: 5093 (SE +/- 66.00, N = 12)
  1. (CXX) g++ options: -O3 (the chart also lists a per-system -lrt flag)

Test: FFT + iFFT C2C Bluestein in single precision

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

TensorFlow

TensorFlow 2.16.1 - Device: GPU - Batch Size: 1 - Model: ResNet-50 (images/sec, More Is Better)
  Intel ARC A750: 5.25 (SE +/- 0.05, N = 8)
  Intel ARC A580: 5.23 (SE +/- 0.05, N = 4)

VkFFT

VkFFT 1.2.31 - Test: FFT + iFFT R2C / C2R (Benchmark Score, More Is Better)
  NVIDIA RTX 4070 SUPER: 54794 (SE +/- 702.53, N = 15)
  Intel ARC A750: 32544 (SE +/- 57.59, N = 3)
  Intel ARC A580: 29941 (SE +/- 257.48, N = 15)
  1. (CXX) g++ options: -O3 (the chart also lists a per-system -lrt flag)

Test: FFT + iFFT R2C / C2R

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

Blender

Blender 4.0 - Blend File: Fishy Cat - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  NVIDIA RTX 4070 SUPER: 9.45 (SE +/- 0.06, N = 13)

Xonotic

Xonotic 0.8.6 - Resolution: 1920 x 1080 - Effects Quality: Low (Frames Per Second, More Is Better)
  Intel ARC A750: 932.79 (SE +/- 7.95, N = 3; MIN: 597 / MAX: 1516)
  Intel ARC A580: 1051.51 (SE +/- 11.78, N = 3; MIN: 704 / MAX: 1770)

TensorFlow

TensorFlow 2.12 - Device: GPU - Batch Size: 1 - Model: VGG-16 (images/sec, More Is Better)
  NVIDIA RTX 4070 SUPER: 1.35
  Intel ARC A750: 1.70
  Intel ARC A580: 1.69
  SE +/- 0.01, N = 3; SE +/- 0.00, N = 3 (only two SE entries are shown for the three systems)

Device: GPU - Batch Size: 1 - Model: VGG-16

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

OpenArena

OpenArena 0.8.8 - Resolution: 1920 x 1080 - Total Frame Time (Milliseconds, Fewer Is Better)
  Intel ARC A750: Min: 1 / Avg: 2.05 / Max: 14
  Intel ARC A580: Min: 1 / Avg: 1.99 / Max: 14

OpenArena 0.8.8 - Resolution: 1920 x 1080 (Frames Per Second, More Is Better)
  Intel ARC A750: 469.5 (SE +/- 6.89, N = 15; MIN: 1)
  Intel ARC A580: 524.2 (SE +/- 4.78, N = 15; MIN: 1)

VkFFT

VkFFT 1.2.31 - Test: FFT + iFFT C2C Bluestein benchmark in double precision (Benchmark Score, More Is Better)
  NVIDIA RTX 4070 SUPER: 4451 (SE +/- 12.55, N = 3)
  1. (CXX) g++ options: -O3 -lrt

Test: FFT + iFFT C2C Bluestein benchmark in double precision

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

Intel ARC A750: The test quit with a non-zero exit status.

Intel ARC A580: The test quit with a non-zero exit status.

TensorFlow

TensorFlow 2.12 - Device: GPU - Batch Size: 16 - Model: AlexNet (images/sec, More Is Better)
  NVIDIA RTX 4070 SUPER: 31.59 (SE +/- 0.17, N = 3)
  Intel ARC A750: 39.83 (SE +/- 0.42, N = 3)
  Intel ARC A580: 39.55 (SE +/- 0.40, N = 3)

Device: GPU - Batch Size: 16 - Model: AlexNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

ParaView

ParaView 5.13 - Test: Many Spheres - Resolution: 1920 x 1080 (MiPolys / Sec, More Is Better)
  Intel ARC A750: 7245.45 (SE +/- 80.48, N = 5)
  Intel ARC A580: 6519.45 (SE +/- 98.62, N = 15)

ParaView 5.13 - Test: Many Spheres - Resolution: 1920 x 1080 (Frames / Sec, More Is Better)
  Intel ARC A750: 72.27 (SE +/- 0.80, N = 5)
  Intel ARC A580: 65.03 (SE +/- 0.98, N = 15)

VkFFT

VkFFT 1.3.4 - Test: FFT + iFFT C2C multidimensional in single precision (Benchmark Score, More Is Better)
  Intel ARC A750: 31532 (SE +/- 251.38, N = 12)
  Intel ARC A580: 30521 (SE +/- 20.17, N = 3)
  1. (CXX) g++ options: -O3

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark provides various OpenCL compute and memory bandwidth micro-benchmarks. Learn more via the OpenBenchmarking.org test page.

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: FP16 Compute (TFLOPs/s, more is better)
Intel ARC A750: 15.60 (SE +/- 0.03, N = 3)
Intel ARC A580: 13.64 (SE +/- 0.03, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

TensorFlow

TensorFlow 2.16.1 - Device: GPU - Batch Size: 1 - Model: AlexNet (images/sec, more is better)
Intel ARC A750: 15.93 (SE +/- 0.11, N = 12)
Intel ARC A580: 15.81 (SE +/- 0.15, N = 6)
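
Each result carries an "SE +/- x, N = n" annotation. Assuming SE denotes the standard error of the mean over N runs, it can be sketched as below. The per-run samples are hypothetical, since this page lists only the aggregated figures.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation divided by sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical per-run images/sec values, NOT taken from the result file.
runs = [15.7, 15.9, 16.2]
print(f"mean = {statistics.mean(runs):.2f}, "
      f"SE +/- {standard_error(runs):.2f}, N = {len(runs)}")
```

A larger N with a similar spread gives a smaller SE, which is why a result with N = 12 can show a tighter error than one with N = 3.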

Blender

Blender is an open-source 3D creation and modeling software project. This test is of Blender's Cycles performance with various sample files. GPU computing via NVIDIA OptiX and NVIDIA CUDA is currently supported as well as HIP for AMD Radeon GPUs and Intel oneAPI for Intel Graphics. Learn more via the OpenBenchmarking.org test page.

Blender 4.0 - Blend File: BMW27 - Compute: NVIDIA OptiX (Seconds, fewer is better)
NVIDIA RTX 4070 SUPER: 5.57 (SE +/- 0.06, N = 13)

FAHBench

FAHBench is a Folding@Home benchmark on the GPU. Learn more via the OpenBenchmarking.org test page.

FAHBench 2.3.2 (Ns Per Day, more is better)
NVIDIA RTX 4070 SUPER: 366.06 (SE +/- 0.39, N = 3)

Intel ARC A770 8Gb: The test run did not produce a result.

Intel ARC A750: The test run did not produce a result.

Intel ARC A580: The test run did not produce a result.

TensorFlow

TensorFlow 2.12 - Device: GPU - Batch Size: 1 - Model: ResNet-50 (images/sec, more is better)
NVIDIA RTX 4070 SUPER: 4.35
Intel ARC A750: 5.27
Intel ARC A580: 5.23
Reported SE entries (two for three systems; their assignment is not recoverable from the extracted text): SE +/- 0.06, N = 3; SE +/- 0.04, N = 8

Device: GPU - Batch Size: 1 - Model: ResNet-50

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

VkFFT

VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in double precision (Benchmark Score, more is better)
NVIDIA RTX 4070 SUPER: 24317 (SE +/- 146.69, N = 3)
(CXX) g++ options: -O3 -lrt

Test: FFT + iFFT C2C 1D batched in double precision

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./vkfft: 3: ./Vulkan_FFT: not found

Intel ARC A750: The test quit with a non-zero exit status.

Intel ARC A580: The test quit with a non-zero exit status.

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: INT8 Compute (TIOPs/s, more is better)
Intel ARC A750: 9.548 (SE +/- 0.031, N = 3)
NVIDIA RTX 4070 SUPER: 14.307 (SE +/- 0.046, N = 3)
Intel ARC A580: 8.701 (SE +/- 0.057, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: INT8 Compute

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: Memory Bandwidth Coalesced Read (GB/s, more is better)
Intel ARC A750: 202.92 (SE +/- 0.74, N = 3)
NVIDIA RTX 4070 SUPER: 464.86 (SE +/- 0.01, N = 3)
Intel ARC A580: 187.20 (SE +/- 0.63, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: Memory Bandwidth Coalesced Read

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: INT16 Compute (TIOPs/s, more is better)
Intel ARC A750: 24.45 (SE +/- 0.20, N = 3)
NVIDIA RTX 4070 SUPER: 17.17 (SE +/- 0.00, N = 3)
Intel ARC A580: 24.06 (SE +/- 0.17, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: INT16 Compute

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: Memory Bandwidth Coalesced Write (GB/s, more is better)
Intel ARC A750: 400.34 (SE +/- 1.65, N = 3)
NVIDIA RTX 4070 SUPER: 455.01 (SE +/- 0.14, N = 3)
Intel ARC A580: 408.90 (SE +/- 0.52, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: Memory Bandwidth Coalesced Write

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: INT32 Compute (TIOPs/s, more is better)
Intel ARC A750: 5.082 (SE +/- 0.016, N = 3)
NVIDIA RTX 4070 SUPER: 19.889 (SE +/- 0.002, N = 3)
Intel ARC A580: 4.096 (SE +/- 0.018, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: INT32 Compute

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: INT64 Compute (TIOPs/s, more is better)
Intel ARC A750: 0.958 (SE +/- 0.037, N = 3)
NVIDIA RTX 4070 SUPER: 4.214 (SE +/- 0.015, N = 3)
Intel ARC A580: 1.013 (SE +/- 0.004, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: INT64 Compute

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: FP32 Compute (TFLOPs/s, more is better)
Intel ARC A750: 10.259 (SE +/- 0.065, N = 3)
NVIDIA RTX 4070 SUPER: 38.594 (SE +/- 0.031, N = 3)
Intel ARC A580: 9.002 (SE +/- 0.013, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: FP32 Compute

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |
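
To summarize the ProjectPhysX results above, one option is to take the per-operation ratio of the RTX 4070 SUPER to the ARC A750 and combine the ratios with a geometric mean. The sketch below uses the values listed above; the dictionary layout and names are illustrative. The two cards were tested in different host systems (CPU, OS, driver), so the ratios describe these two setups rather than the GPUs in isolation.

```python
from statistics import geometric_mean

# (Intel ARC A750, NVIDIA RTX 4070 SUPER) values copied from the results above.
# Every operation here is "more is better".
results = {
    "INT8 Compute": (9.548, 14.307),
    "Memory Bandwidth Coalesced Read": (202.92, 464.86),
    "INT16 Compute": (24.45, 17.17),
    "Memory Bandwidth Coalesced Write": (400.34, 455.01),
    "INT32 Compute": (5.082, 19.889),
    "INT64 Compute": (0.958, 4.214),
    "FP32 Compute": (10.259, 38.594),
}

# Ratio > 1 means the RTX 4070 SUPER result is higher than the A750 result.
rtx_vs_a750 = {op: rtx / a750 for op, (a750, rtx) in results.items()}

for op, ratio in rtx_vs_a750.items():
    print(f"{op}: {ratio:.2f}x")

overall = geometric_mean(list(rtx_vs_a750.values()))
print(f"Geometric mean of ratios: {overall:.2f}x")
```

The geometric mean is used here because it treats a 2x win and a 2x loss symmetrically, which an arithmetic mean of ratios does not.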

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Max SP Flops (GFLOPS, more is better)
Intel ARC A750: 2804852 (SE +/- 132865.40, N = 12)
Intel ARC A580: 2345224 (SE +/- 154699.96, N = 15)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
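
The Max SP Flops result has a large standard error relative to its mean. A quick way to gauge that noise is the relative standard error (SE divided by the mean). The sketch below uses the figures from the result above; the function name is illustrative.

```python
def relative_standard_error(mean: float, se: float) -> float:
    """Standard error expressed as a fraction of the mean."""
    if mean == 0:
        raise ValueError("mean must be non-zero")
    return se / mean


# (mean GFLOPS, SE) pairs copied from the SHOC Max SP Flops result above.
shoc_max_sp_flops = {
    "Intel ARC A750": (2804852, 132865.40),
    "Intel ARC A580": (2345224, 154699.96),
}

for system, (mean, se) in shoc_max_sp_flops.items():
    print(f"{system}: {100 * relative_standard_error(mean, se):.1f}% relative SE")
```

Values of several percent, as here, suggest treating the gap between the two cards on this test with more caution than on tests where the SE is a fraction of a percent.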

ViennaCL

ViennaCL is an open-source linear algebra library written in C++ and with support for OpenCL and OpenMP. This test profile makes use of ViennaCL's built-in benchmarks. Learn more via the OpenBenchmarking.org test page.

ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-TT (GFLOPs/s, more is better)
NVIDIA RTX 4070 SUPER: 122 (SE +/- 2.08, N = 3)
Intel ARC A770 8Gb: 133 (SE +/- 0.58, N = 3)
Intel ARC A750: 133 (SE +/- 0.33, N = 3)
Intel ARC A580: 133 (SE +/- 0.33, N = 3)

ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-TN (GFLOPs/s, more is better)
NVIDIA RTX 4070 SUPER: 115 (SE +/- 1.00, N = 2)
Intel ARC A770 8Gb: 138 (SE +/- 0.33, N = 3)
Intel ARC A750: 139 (SE +/- 0.33, N = 3)
Intel ARC A580: 137 (SE +/- 1.86, N = 3)

ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-NT (GFLOPs/s, more is better)
NVIDIA RTX 4070 SUPER: 117 (SE +/- 2.08, N = 3)
Intel ARC A770 8Gb: 127 (SE +/- 0.33, N = 3)
Intel ARC A750: 125 (SE +/- 1.53, N = 3)
Intel ARC A580: 127 (SE +/- 1.20, N = 3)

ViennaCL 1.7.1 - Test: CPU BLAS - dGEMM-NN (GFLOPs/s, more is better)
NVIDIA RTX 4070 SUPER: 119 (SE +/- 4.04, N = 3)
Intel ARC A770 8Gb: 134 (SE +/- 0.00, N = 3)
Intel ARC A750: 134 (SE +/- 0.33, N = 3)
Intel ARC A580: 133 (SE +/- 0.67, N = 3)

ViennaCL 1.7.1 - Test: CPU BLAS - dGEMV-T (GB/s, more is better)
NVIDIA RTX 4070 SUPER: 109 (SE +/- 0.33, N = 3)
Intel ARC A770 8Gb: 106 (SE +/- 0.33, N = 3)
Intel ARC A750: 105 (SE +/- 0.33, N = 3)
Intel ARC A580: 105 (SE +/- 0.33, N = 3)

ViennaCL 1.7.1 - Test: CPU BLAS - dGEMV-N (GB/s, more is better)
NVIDIA RTX 4070 SUPER: 102 (SE +/- 0.33, N = 3)
Intel ARC A770 8Gb: 101 (SE +/- 0.33, N = 3)
Intel ARC A750: 101 (SE +/- 0.00, N = 3)
Intel ARC A580: 101 (SE +/- 0.00, N = 3)

ViennaCL 1.7.1 - Test: CPU BLAS - dDOT (GB/s, more is better)
NVIDIA RTX 4070 SUPER: 96.8 (SE +/- 0.09, N = 3)
Intel ARC A770 8Gb: 77.6 (SE +/- 0.53, N = 3)
Intel ARC A750: 76.7 (SE +/- 0.15, N = 3)
Intel ARC A580: 77.8 (SE +/- 0.27, N = 3)

ViennaCL 1.7.1 - Test: CPU BLAS - dAXPY (GB/s, more is better)
NVIDIA RTX 4070 SUPER: 87.2 (SE +/- 0.12, N = 3)
Intel ARC A770 8Gb: 78.8 (SE +/- 0.03, N = 3)
Intel ARC A750: 78.7 (SE +/- 0.09, N = 3)
Intel ARC A580: 78.6 (SE +/- 0.03, N = 3)

ViennaCL 1.7.1 - Test: CPU BLAS - dCOPY (GB/s, more is better)
NVIDIA RTX 4070 SUPER: 70.8 (SE +/- 0.32, N = 3)
Intel ARC A770 8Gb: 56.8 (SE +/- 0.12, N = 3)
Intel ARC A750: 56.4 (SE +/- 0.06, N = 3)
Intel ARC A580: 56.5 (SE +/- 0.07, N = 3)

ViennaCL 1.7.1 - Test: CPU BLAS - sDOT (GB/s, more is better)
NVIDIA RTX 4070 SUPER: 165.0 (SE +/- 2.73, N = 3)
Intel ARC A770 8Gb: 80.6 (SE +/- 0.27, N = 3)
Intel ARC A750: 81.3 (SE +/- 0.17, N = 3)
Intel ARC A580: 81.3 (SE +/- 0.92, N = 3)

ViennaCL 1.7.1 - Test: CPU BLAS - sAXPY (GB/s, more is better)
NVIDIA RTX 4070 SUPER: 156 (SE +/- 2.19, N = 3)
Intel ARC A770 8Gb: 122 (SE +/- 0.00, N = 3)
Intel ARC A750: 122 (SE +/- 0.33, N = 3)
Intel ARC A580: 122 (SE +/- 0.88, N = 3)

ViennaCL 1.7.1 - Test: CPU BLAS - sCOPY (GB/s, more is better)
NVIDIA RTX 4070 SUPER: 132.0 (SE +/- 1.20, N = 3)
Intel ARC A770 8Gb: 83.9 (SE +/- 0.41, N = 3)
Intel ARC A750: 83.7 (SE +/- 0.20, N = 3)
Intel ARC A580: 83.3 (SE +/- 0.20, N = 3)

All ViennaCL CPU BLAS results: (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

VkFFT

VkFFT 1.3.4 - Test: FFT + iFFT R2C / C2R (Benchmark Score, more is better)
Intel ARC A750: 31218 (SE +/- 232.00, N = 15)
Intel ARC A580: 30388 (SE +/- 118.37, N = 3)
(CXX) g++ options: -O3

ViennaCL

ViennaCL 1.7.1 - Test: OpenCL BLAS - dGEMM-TT (GFLOPs/s, more is better)
NVIDIA RTX 4070 SUPER: 613 (SE +/- 0.00, N = 3)

ViennaCL 1.7.1 - Test: OpenCL BLAS - dGEMM-TN (GFLOPs/s, more is better)
NVIDIA RTX 4070 SUPER: 599 (SE +/- 0.00, N = 3)

ViennaCL 1.7.1 - Test: OpenCL BLAS - dGEMM-NT (GFLOPs/s, more is better)
NVIDIA RTX 4070 SUPER: 584 (SE +/- 0.00, N = 3)

ViennaCL 1.7.1 - Test: OpenCL BLAS - dGEMM-NN (GFLOPs/s, more is better)
NVIDIA RTX 4070 SUPER: 577 (SE +/- 0.00, N = 3)

ViennaCL 1.7.1 - Test: OpenCL BLAS - dGEMV-T (GB/s, more is better)
NVIDIA RTX 4070 SUPER: 389 (SE +/- 0.00, N = 3)

ViennaCL 1.7.1 - Test: OpenCL BLAS - dGEMV-N (GB/s, more is better)
NVIDIA RTX 4070 SUPER: 210 (SE +/- 0.33, N = 3)

ViennaCL 1.7.1 - Test: OpenCL BLAS - dDOT (GB/s, more is better)
NVIDIA RTX 4070 SUPER: 458 (SE +/- 0.00, N = 3)

ViennaCL 1.7.1 - Test: OpenCL BLAS - dAXPY (GB/s, more is better)
NVIDIA RTX 4070 SUPER: 437 (SE +/- 0.00, N = 3)

ViennaCL 1.7.1 - Test: OpenCL BLAS - dCOPY (GB/s, more is better)
NVIDIA RTX 4070 SUPER: 423 (SE +/- 0.33, N = 3)

All ViennaCL OpenCL BLAS double-precision results: (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

VkResample

VkResample is a Vulkan-based image upscaling library based on VkFFT. The sample input file is upscaling a 4K image to 8K using Vulkan-based GPU acceleration. Learn more via the OpenBenchmarking.org test page.

VkResample 1.0 - Upscale: 2x - Precision: Double (ms, fewer is better)
NVIDIA RTX 4070 SUPER: 339.59 (SE +/- 0.30, N = 3)
(CXX) g++ options: -O3

Upscale: 2x - Precision: Double

Intel ARC A770 8Gb: The test quit with a non-zero exit status.

Intel ARC A750: The test quit with a non-zero exit status.

Intel ARC A580: The test quit with a non-zero exit status.

VkFFT

VkFFT 1.3.4 - Test: FFT + iFFT C2C Bluestein in single precision (Benchmark Score, more is better)
Intel ARC A750: 5425 (SE +/- 58.43, N = 3)
Intel ARC A580: 5134 (SE +/- 18.21, N = 3)
(CXX) g++ options: -O3

RealSR-NCNN

RealSR-NCNN is an NCNN neural network implementation of the RealSR project and accelerated using the Vulkan API. RealSR is the Real-World Super Resolution via Kernel Estimation and Noise Injection. NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. This test profile times how long it takes to increase the resolution of a sample image by a scale of 4x with Vulkan. Learn more via the OpenBenchmarking.org test page.

RealSR-NCNN 20200818 - Scale: 4x - TAA: No (Seconds, fewer is better)
NVIDIA RTX 4070 SUPER: 6.323 (SE +/- 0.150, N = 15)
Intel ARC A770 8Gb: 9.808 (SE +/- 0.018, N = 3)
Intel ARC A750: 10.356 (SE +/- 0.008, N = 3)
Intel ARC A580: 11.099 (SE +/- 0.021, N = 3)

Blender

Blender 4.0 - Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX (Seconds, fewer is better)
NVIDIA RTX 4070 SUPER: 14.29 (SE +/- 0.03, N = 3)

ViennaCL

ViennaCL 1.7.1 - Test: OpenCL BLAS - sCOPY (GB/s, more is better)
Intel ARC A750: 98.2 (SE +/- 0.35, N = 3)
Intel ARC A580: 117.0 (SE +/- 0.88, N = 15)
NVIDIA RTX 4070 SUPER: 334.0 (SE +/- 0.33, N = 3)

ViennaCL 1.7.1 - Test: OpenCL BLAS - sAXPY (GB/s, more is better)
Intel ARC A750: 88.0 (SE +/- 0.12, N = 3)
Intel ARC A580: 95.5 (SE +/- 2.14, N = 3)
NVIDIA RTX 4070 SUPER: 392.0 (SE +/- 0.00, N = 3)

Both results: (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

Blender

Blender 4.0 - Blend File: Classroom - Compute: NVIDIA OptiX (Seconds, fewer is better)
NVIDIA RTX 4070 SUPER: 12.60 (SE +/- 0.00, N = 3)

ViennaCL

ViennaCL 1.7.1 - Test: OpenCL BLAS - sDOT (GB/s, more is better)
Intel ARC A750: 134 (SE +/- 1.33, N = 3)
Intel ARC A580: 187 (SE +/- 5.03, N = 3)
NVIDIA RTX 4070 SUPER: 370 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.

clpeak 1.1.2 - OpenCL Test: Single-Precision Float (GFLOPS, more is better)
NVIDIA RTX 4070 SUPER: 35492.69 (SE +/- 0.99, N = 3)
Intel ARC A750: 11380.91 (SE +/- 3.31, N = 3)
Intel ARC A580: 9758.86 (SE +/- 1.40, N = 3)

clpeak 1.1.2 - OpenCL Test: Integer Compute INT (GIOPS, more is better)
NVIDIA RTX 4070 SUPER: 18170.54 (SE +/- 3.14, N = 3)
Intel ARC A750: 4885.36 (SE +/- 2.34, N = 3)
Intel ARC A580: 4133.83 (SE +/- 3.17, N = 3)

Both results: (CXX) g++ options: -O3

Hashcat

Hashcat is an open-source, advanced password recovery tool supporting GPU acceleration with OpenCL, NVIDIA CUDA, and Radeon ROCm. Learn more via the OpenBenchmarking.org test page.

Hashcat 6.2.4 - Benchmark: SHA1 (H/s, more is better)
NVIDIA RTX 4070 SUPER: 22132600000 (SE +/- 5140363.15, N = 3)
Intel ARC A750: 5401850000 (SE +/- 65365351.42, N = 4)
Intel ARC A580: 4576025000 (SE +/- 49049760.02, N = 4)

Benchmark: SHA1

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./hashcat: 3: ./hashcat.bin: not found
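
Hashcat reports raw hashes per second, which are hard to compare at a glance. A small formatting helper, sketched below with a hypothetical function name, scales the SHA1 figures from the result above into more readable units.

```python
def format_hash_rate(hashes_per_second: float) -> str:
    """Scale a raw H/s figure to the largest unit that keeps the value below 1000."""
    units = ["H/s", "kH/s", "MH/s", "GH/s", "TH/s"]
    value = float(hashes_per_second)
    index = 0
    while value >= 1000 and index < len(units) - 1:
        value /= 1000
        index += 1
    return f"{value:.2f} {units[index]}"


# SHA1 results read from the Hashcat section above.
sha1 = {
    "NVIDIA RTX 4070 SUPER": 22132600000,
    "Intel ARC A750": 5401850000,
    "Intel ARC A580": 4576025000,
}

for system, rate in sha1.items():
    print(f"{system}: {format_hash_rate(rate)}")
```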

Hashcat 6.2.4 - Benchmark: 7-Zip (H/s, more is better)
NVIDIA RTX 4070 SUPER: 1176467 (SE +/- 1991.93, N = 3)
Intel ARC A750: 246633 (SE +/- 240.37, N = 3)
Intel ARC A580: 249667 (SE +/- 66.67, N = 3)

Benchmark: 7-Zip

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./hashcat: 3: ./hashcat.bin: not found

clpeak

clpeak 1.1.2 - OpenCL Test: Double-Precision Double (GFLOPS, more is better)
NVIDIA RTX 4070 SUPER: 630.11 (SE +/- 0.98, N = 3)
(CXX) g++ options: -O3

OpenCL Test: Double-Precision Double

Intel ARC A750: The test run did not produce a result.

Intel ARC A580: The test run did not produce a result.

TensorFlow

TensorFlow 2.16.1 - Device: GPU - Batch Size: 1 - Model: GoogLeNet (images/sec, more is better)
Intel ARC A750: 15.75 (SE +/- 0.10, N = 3)
Intel ARC A580: 15.68 (SE +/- 0.07, N = 3)

VkResample

VkResample 1.0 - Upscale: 2x - Precision: Single (ms, fewer is better)
NVIDIA RTX 4070 SUPER: 18.49 (SE +/- 0.00, N = 3)
Intel ARC A770 8Gb: 18.20 (SE +/- 0.00, N = 3)
Intel ARC A750: 18.89 (SE +/- 0.04, N = 3)
Intel ARC A580: 20.41 (SE +/- 0.02, N = 3)
(CXX) g++ options: -O3

Hashcat

Hashcat 6.2.4 - Benchmark: SHA-512 (H/s, more is better)
NVIDIA RTX 4070 SUPER: 3232733333 (SE +/- 1530068.99, N = 3)
Intel ARC A750: 943466667 (SE +/- 5228554.08, N = 3)
Intel ARC A580: 807266667 (SE +/- 1166666.67, N = 3)

Benchmark: SHA-512

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./hashcat: 3: ./hashcat.bin: not found

MandelGPU

MandelGPU is an OpenCL benchmark and this test runs with the OpenCL rendering float4 kernel with a maximum of 4096 iterations. Learn more via the OpenBenchmarking.org test page.

MandelGPU 1.3pts1 - OpenCL Device: GPU (Samples/sec, more is better)
NVIDIA RTX 4070 SUPER: 587219538.2 (SE +/- 467034.80, N = 3)
Intel ARC A750: 279683403.2 (SE +/- 5085622.05, N = 15)
Intel ARC A580: 248630520.1 (SE +/- 600066.13, N = 3)
(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

Hashcat

Hashcat 6.2.4 - Benchmark: MD5 (H/s, more is better)
NVIDIA RTX 4070 SUPER: 67583033333 (SE +/- 22430807.19, N = 3)
Intel ARC A750: 31010500000 (SE +/- 260097949.50, N = 3)
Intel ARC A580: 26539200000 (SE +/- 254278908.55, N = 3)

Benchmark: MD5

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./hashcat: 3: ./hashcat.bin: not found

Darktable

Darktable is an open-source photography / workflow application. This test uses any system-installed Darktable program or, on Windows, automatically downloads the pre-built binary from the project. Learn more via the OpenBenchmarking.org test page.

Darktable 4.8.1 - Test: Boat - Acceleration: OpenCL (Seconds, fewer is better)
Intel ARC A750: 2.900 (SE +/- 0.011, N = 3)
Intel ARC A580: 2.916 (SE +/- 0.004, N = 3)

ParaView

ParaView 5.13 - Test: Wavelet Volume - Resolution: 1920 x 1080 (MiVoxels / Sec, more is better)
Intel ARC A750: 3901.28 (SE +/- 73.43, N = 15)
Intel ARC A580: 3692.08 (SE +/- 35.11, N = 15)

ParaView 5.13 - Test: Wavelet Volume - Resolution: 1920 x 1080 (Frames / Sec, more is better)
Intel ARC A750: 243.83 (SE +/- 4.59, N = 15)
Intel ARC A580: 230.75 (SE +/- 2.19, N = 15)

TensorFlow

TensorFlow 2.12 - Device: GPU - Batch Size: 1 - Model: GoogLeNet (images/sec, more is better)
NVIDIA RTX 4070 SUPER: 12.62 (SE +/- 0.17, N = 2)
Intel ARC A750: 15.48 (SE +/- 0.13, N = 3)
Intel ARC A580: 15.71 (SE +/- 0.05, N = 3)

Device: GPU - Batch Size: 1 - Model: GoogLeNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

cl-mem

A basic OpenCL memory benchmark. Learn more via the OpenBenchmarking.org test page.

cl-mem 2017-01-13 - Benchmark: Read (GB/s, more is better)
NVIDIA RTX 4070 SUPER: 446.2 (SE +/- 0.12, N = 3)
Intel ARC A750: 153.7 (SE +/- 0.10, N = 3)
Intel ARC A580: 144.0 (SE +/- 0.03, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

Benchmark: Read

Intel ARC A770 8Gb: The test quit with a non-zero exit status.

cl-mem 2017-01-13 - Benchmark: Write (GB/s, more is better)
NVIDIA RTX 4070 SUPER: 407.5 (SE +/- 1.11, N = 3)
Intel ARC A750: 280.1 (SE +/- 0.15, N = 3)
Intel ARC A580: 292.8 (SE +/- 0.03, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

Benchmark: Write

Intel ARC A770 8Gb: The test quit with a non-zero exit status.

SHOC Scalable HeterOgeneous Computing

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: GEMM SGEMM_N (GFLOPS, more is better)
Intel ARC A750: 2049.20 (SE +/- 15.19, N = 15)
Intel ARC A580: 1777.18 (SE +/- 3.79, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

TensorFlow

TensorFlow 2.12 - Device: GPU - Batch Size: 1 - Model: AlexNet (images/sec, more is better)
NVIDIA RTX 4070 SUPER: 13.92 (SE +/- 0.22, N = 2)
Intel ARC A750: 15.79 (SE +/- 0.16, N = 3)
Intel ARC A580: 15.78 (SE +/- 0.16, N = 3)

Device: GPU - Batch Size: 1 - Model: AlexNet

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark 1.2 - Operation: FP64 Compute (TFLOPs/s, more is better)
NVIDIA RTX 4070 SUPER: 0.621 (SE +/- 0.000, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

Operation: FP64 Compute

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: | Error: There are no OpenCL devices available. Make sure that the OpenCL 1.2 |

SHOC Scalable HeterOgeneous Computing

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Reduction (GB/s, more is better)
Intel ARC A750: 74.53 (SE +/- 0.04, N = 3)
Intel ARC A580: 69.48 (SE +/- 0.64, N = 7)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

Waifu2x-NCNN Vulkan

Waifu2x-NCNN is an NCNN neural network implementation of the Waifu2x converter project and accelerated using the Vulkan API. NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. This test profile times how long it takes to increase the resolution of a sample image with Vulkan. Learn more via the OpenBenchmarking.org test page.

Waifu2x-NCNN Vulkan 20200818 - Scale: 2x - Denoise: 3 - TAA: Yes (Seconds, fewer is better)
NVIDIA RTX 4070 SUPER: 2.855 (SE +/- 0.014, N = 3)
Intel ARC A770 8Gb: 5.146 (SE +/- 0.019, N = 3)
Intel ARC A750: 5.291 (SE +/- 0.006, N = 3)
Intel ARC A580: 5.594 (SE +/- 0.011, N = 3)
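
Because this result is "fewer is better", a speedup has to be computed as baseline time over candidate time, the reverse of throughput metrics. The sketch below applies that to the Waifu2x-NCNN timings above, using the ARC A770 as an arbitrary baseline; the helper name and its signature are illustrative.

```python
def speedup(baseline: float, candidate: float, lower_is_better: bool) -> float:
    """Return how many times better `candidate` is than `baseline` (> 1 means better)."""
    if baseline <= 0 or candidate <= 0:
        raise ValueError("results must be positive")
    return baseline / candidate if lower_is_better else candidate / baseline


# Seconds per run, copied from the Waifu2x-NCNN Vulkan result above.
waifu2x_seconds = {
    "NVIDIA RTX 4070 SUPER": 2.855,
    "Intel ARC A770 8Gb": 5.146,
    "Intel ARC A750": 5.291,
    "Intel ARC A580": 5.594,
}

baseline = waifu2x_seconds["Intel ARC A770 8Gb"]
for system, seconds in waifu2x_seconds.items():
    ratio = speedup(baseline, seconds, lower_is_better=True)
    print(f"{system}: {ratio:.2f}x vs. Intel ARC A770 8Gb")
```

Passing `lower_is_better=False` makes the same helper usable for throughput results such as images/sec or GB/s.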

cl-mem

cl-mem 2017-01-13 - Benchmark: Copy (GB/s, more is better)
NVIDIA RTX 4070 SUPER: 331.8 (SE +/- 0.03, N = 3)
Intel ARC A750: 269.6 (SE +/- 0.13, N = 3)
Intel ARC A580: 259.7 (SE +/- 0.03, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

Benchmark: Copy

Intel ARC A770 8Gb: The test quit with a non-zero exit status.

VkFFT

Test: FFT + iFFT C2C Bluestein benchmark in double precision

Intel ARC A750: The test quit with a non-zero exit status.

Intel ARC A580: The test quit with a non-zero exit status.

Hashcat

Hashcat 6.2.4 - Benchmark: TrueCrypt RIPEMD160 + XTS (H/s, more is better)
NVIDIA RTX 4070 SUPER: 802967 (SE +/- 633.33, N = 3)
Intel ARC A750: 328400 (SE +/- 200.00, N = 3)
Intel ARC A580: 277900 (SE +/- 57.74, N = 3)

Benchmark: TrueCrypt RIPEMD160 + XTS

Intel ARC A770 8Gb: The test quit with a non-zero exit status. E: ./hashcat: 3: ./hashcat.bin: not found

Darktable

Darktable 4.8.1 - Test: Boat - Acceleration: CPU-only (Seconds, fewer is better)
Intel ARC A750: 2.905 (SE +/- 0.012, N = 3)
Intel ARC A580: 2.920 (SE +/- 0.005, N = 3)

ParaView

ParaView 5.13 - Test: Wavelet Contour - Resolution: 1920 x 1080 (MiPolys / Sec, more is better)
Intel ARC A750: 2898.67 (SE +/- 10.34, N = 3)
Intel ARC A580: 2633.04 (SE +/- 28.63, N = 4)

ParaView 5.13 - Test: Wavelet Contour - Resolution: 1920 x 1080 (Frames / Sec, more is better)
Intel ARC A750: 278.15 (SE +/- 0.99, N = 3)
Intel ARC A580: 252.66 (SE +/- 2.75, N = 4)

clpeak

clpeak 1.1.2 - OpenCL Test: Global Memory Bandwidth (GBPS, more is better)
NVIDIA RTX 4070 SUPER: 437.65 (SE +/- 0.02, N = 3)
Intel ARC A750: 396.72 (SE +/- 0.13, N = 3)
Intel ARC A580: 388.85 (SE +/- 0.13, N = 3)
(CXX) g++ options: -O3

Darktable

Darktable 4.8.1 - Test: Masskrug - Acceleration: OpenCL (Seconds, fewer is better)
Intel ARC A750: 1.690 (SE +/- 0.005, N = 3)
Intel ARC A580: 1.680 (SE +/- 0.006, N = 3)

Caffe

This is a benchmark of the Caffe deep learning framework. It currently supports the AlexNet and GoogleNet models and execution on both CPUs and NVIDIA GPUs. Learn more via the OpenBenchmarking.org test page.

Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 1000

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status. E: @ 0x7670bcda4450 google::LogMessageFatal::~LogMessageFatal()

Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 200

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status. E: @ 0x7b5ea59be450 google::LogMessageFatal::~LogMessageFatal()

Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 200

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status. E: @ 0x7d7151816450 google::LogMessageFatal::~LogMessageFatal()

Darktable

Darktable 4.8.1 - Test: Server Room - Acceleration: OpenCL (Seconds, Fewer Is Better)
Intel ARC A750: 1.362 (SE +/- 0.001, N = 3)
Intel ARC A580: 1.385 (SE +/- 0.005, N = 3)

Caffe

Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 100

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status. E: @ 0x7dd7c6de3450 google::LogMessageFatal::~LogMessageFatal()

FinanceBench

FinanceBench is a collection of financial program benchmarks with support for benchmarking on the GPU via OpenCL and on the CPU with OpenMP. The FinanceBench test cases focus on the Black-Scholes-Merton process with an analytic European option engine, the QMC (Sobol) Monte Carlo method (equity option example), a fixed-rate bond with a flat forward curve, and a securities repurchase agreement (repo). FinanceBench was originally written by the Cavazos Lab at the University of Delaware. Learn more via the OpenBenchmarking.org test page.

FinanceBench 2016-07-25 - Benchmark: Black-Scholes OpenCL (ms, Fewer Is Better)
NVIDIA RTX 4070 SUPER: 5.912 (SE +/- 0.114, N = 15)
Intel ARC A750: 4.842 (SE +/- 0.213, N = 12)
Intel ARC A580: 4.816 (SE +/- 0.279, N = 15)
1. (CXX) g++ options: -O3 -march=native -fopenmp
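
For readers unfamiliar with the workload, the analytic Black-Scholes price of a European call can be sketched in a few lines. This is the textbook closed-form formula, not FinanceBench's own OpenCL kernel. The function names and sample inputs are illustrative assumptions.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function via erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(spot, strike, rate, vol, t):
    """Closed-form Black-Scholes price of a European call (no dividends).

    spot:   current price of the underlying
    strike: option strike price
    rate:   continuously compounded risk-free rate (annual)
    vol:    volatility of the underlying (annual)
    t:      time to expiry in years
    """
    sqrt_t = math.sqrt(t)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * t) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * t) * norm_cdf(d2)


# Illustrative inputs only; these are not taken from the benchmark.
price = black_scholes_call(spot=100.0, strike=100.0, rate=0.05, vol=0.2, t=1.0)
print(f"European call price: {price:.4f}")
```

A GPU benchmark of this kind typically evaluates such a formula over a large array of options in parallel, which is why the reported figure is a time in milliseconds rather than a price.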

Caffe

Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 100

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status. E: @ 0x73552c3e3450 google::LogMessageFatal::~LogMessageFatal()

Darktable

Darktable 4.8.1 - Test: Masskrug - Acceleration: CPU-only (Seconds, Fewer Is Better)
Intel ARC A750: 1.689 (SE +/- 0.009, N = 3)
Intel ARC A580: 1.688 (SE +/- 0.005, N = 3)

Caffe

Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 1000

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status. E: @ 0x74746a490450 google::LogMessageFatal::~LogMessageFatal()

Darktable

Darktable 4.8.1 - Test: Server Room - Acceleration: CPU-only (Seconds, Fewer Is Better)
Intel ARC A750: 1.381 (SE +/- 0.008, N = 3)
Intel ARC A580: 1.373 (SE +/- 0.004, N = 3)

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Triad (GB/s, More Is Better)
Intel ARC A750: 18.04 (SE +/- 0.01, N = 3)
Intel ARC A580: 17.77 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

Waifu2x-NCNN Vulkan

Waifu2x-NCNN is an NCNN neural network implementation of the Waifu2x converter project and accelerated using the Vulkan API. NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. This test profile times how long it takes to increase the resolution of a sample image with Vulkan. Learn more via the OpenBenchmarking.org test page.

Scale: 2x - Denoise: 3 - TAA: No

NVIDIA RTX 4070 SUPER: The test run did not produce a result. The test run did not produce a result. The test run did not produce a result.

Intel ARC A770 8Gb: The test run did not produce a result.

Intel ARC A750: The test run did not produce a result.

Intel ARC A580: The test run did not produce a result.

SHOC Scalable HeterOgeneous Computing

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: MD5 Hash (GHash/s, More Is Better)
Intel ARC A750: 22.72 (SE +/- 0.01, N = 3)
Intel ARC A580: 19.53 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Bus Speed Readback (GB/s, More Is Better)
Intel ARC A750: 22.44 (SE +/- 0.00, N = 3)
Intel ARC A580: 22.43 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

Darktable

Darktable 4.8.1 - Test: Server Rack - Acceleration: OpenCL (Seconds, Fewer Is Better)
Intel ARC A750: 0.172 (SE +/- 0.000, N = 3)
Intel ARC A580: 0.176 (SE +/- 0.002, N = 3)

SHOC Scalable HeterOgeneous Computing

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Bus Speed Download (GB/s, More Is Better)
Intel ARC A750: 18.66 (SE +/- 0.00, N = 3)
Intel ARC A580: 18.66 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: FFT SP (GFLOPS, More Is Better)
Intel ARC A750: 1187.36 (SE +/- 9.13, N = 3)
Intel ARC A580: 1192.99 (SE +/- 12.21, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: S3D (GFLOPS, More Is Better)
Intel ARC A750: 224.72 (SE +/- 0.74, N = 3)
Intel ARC A580: 216.39 (SE +/- 0.20, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

Darktable

Darktable 4.8.1 - Test: Server Rack - Acceleration: CPU-only (Seconds, Fewer Is Better)
Intel ARC A750: 0.173 (SE +/- 0.001, N = 3)
Intel ARC A580: 0.172 (SE +/- 0.001, N = 3)

NeatBench

NeatBench is a benchmark of the cross-platform Neat Video software on the CPU and optional GPU (OpenCL / CUDA) support. Learn more via the OpenBenchmarking.org test page.

NeatBench 5 - Acceleration: GPU (FPS, More Is Better)
NVIDIA RTX 4070 SUPER: 4070 (SE +/- 0.00, N = 3)

Acceleration: GPU

Intel ARC A770 8Gb: The test run did not produce a result. E: Failed to load CUDA driver ("/usr/lib64/libcuda.so.1")

Intel ARC A750: The test run did not produce a result.

Intel ARC A580: The test run did not produce a result.

VkFFT

VkFFT is a Fast Fourier Transform (FFT) library that is GPU-accelerated by means of the Vulkan API. The VkFFT benchmark measures FFT performance across many different sizes before returning an overall benchmark score. Learn more via the OpenBenchmarking.org test page.

Test: FFT + iFFT C2C 1D batched in double precision

Intel ARC A750: The test quit with a non-zero exit status.

Intel ARC A580: The test quit with a non-zero exit status.

NCNN

NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.

Target: Vulkan GPU

NVIDIA RTX 4070 SUPER: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status. E: ncnn: line 3: ./benchncnn: No such file or directory

ArrayFire

ArrayFire is a GPU and CPU numeric processing library. This test uses the built-in CPU and OpenCL ArrayFire benchmarks. Learn more via the OpenBenchmarking.org test page.

Test: Conjugate Gradient OpenCL

NVIDIA RTX 4070 SUPER: The test run did not produce a result. The test run did not produce a result. The test run did not produce a result. E: arrayfire: line 3: ./cg_opencl: No such file or directory

Intel ARC A750: The test run did not produce a result. E: ./arrayfire: 3: ./cg_opencl: not found

Intel ARC A580: The test run did not produce a result. E: ./arrayfire: 3: ./cg_opencl: not found

Betsy GPU Compressor

Betsy is an open-source GPU compressor of various GPU compression techniques. Betsy is written in GLSL for Vulkan/OpenGL (compute shader) support for GPU-based texture compression. Learn more via the OpenBenchmarking.org test page.

Codec: ETC1 - Quality: Highest

Intel ARC A750: The test quit with a non-zero exit status. E: ./betsy: 3: ./betsy: not found

Intel ARC A580: The test quit with a non-zero exit status. E: ./betsy: 3: ./betsy: not found

GLmark2

This is a test of GLmark2, a basic OpenGL and OpenGL ES 2.0 benchmark supporting various windowing/display back-ends. Learn more via the OpenBenchmarking.org test page.

Resolution: $VIDEO_WIDTH x $VIDEO_HEIGHT

Intel ARC A750: The test quit with a non-zero exit status. E: ./glmark2: 2: ./bin/glmark2: not found

Intel ARC A580: The test quit with a non-zero exit status. E: ./glmark2: 2: ./bin/glmark2: not found

Betsy GPU Compressor

Codec: ETC2 RGB - Quality: Highest

Intel ARC A750: The test quit with a non-zero exit status. E: ./betsy: 3: ./betsy: not found

Intel ARC A580: The test quit with a non-zero exit status. E: ./betsy: 3: ./betsy: not found

184 Results Shown
