workingnow?

Intel Core i9-12900KF testing with a Gigabyte Z690 UD DDR4 (F7 BIOS) and MSI NVIDIA GeForce RTX 4090 24GB on Ubuntu 22.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2305189-NE-WORKINGNO89.

workingnow?

workingnow
  Processor: Intel Core i9-12900KF @ 5.10GHz (16 Cores / 24 Threads)
  Motherboard: Gigabyte Z690 UD DDR4 (F7 BIOS)
  Chipset: Intel Device 7aa7
  Memory: 32GB
  Disk: 2000GB KINGSTON SNVS2000G
  Graphics: MSI NVIDIA GeForce RTX 4090 24GB
  Audio: Realtek ALC897
  Monitor: DELL P2419H
  Network: Realtek RTL8125 2.5GbE + Intel Wi-Fi 6 AX200
  OS: Ubuntu 22.04
  Kernel: 5.15.0-71-generic (x86_64)
  Desktop: LXQt 0.17.0
  Display Server: X Server 1.21.1.3
  Display Driver: NVIDIA 530.30.02
  OpenGL: 4.6.0
  OpenCL: OpenCL 3.0 CUDA 12.1.68
  Vulkan: 1.3.236
  Compiler: GCC 11.3.0 + CUDA 12.1
  File-System: ext4
  Screen Resolution: 1920x1080

- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x2c - Thermald 2.4.9
- BAR1 / Visible vRAM Size: 256 MiB - vBIOS Version: 95.02.18.08.01
- GPU Compute Cores: 16384
- Python 3.10.6
- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected

workingnow?

Test = workingnow
vkpeak: fp32-scalar = 45893.56
vkpeak: fp32-vec4 = 60521.26
vkpeak: fp16-scalar = 45678.72
vkpeak: fp16-vec4 = 90594.96
vkpeak: fp64-scalar = 1443.97
vkpeak: fp64-vec4 = 1446.25
vkpeak: int32-scalar = 45787.93
vkpeak: int32-vec4 = 45552.24
vkpeak: int16-scalar = 30436.13
vkpeak: int16-vec4 = 40527.84
realsr-ncnn: 4x - No = 4.685
realsr-ncnn: 4x - Yes = 19.793
waifu2x-ncnn: 2x - 3 - Yes = 2.094
vkfft = 133483
hashcat: MD5 = 155800000000
hashcat: SHA1 = 50683000000
hashcat: 7-Zip = 2651800
hashcat: SHA-512 = 7424433333
hashcat: TrueCrypt RIPEMD160 + XTS = 1906267
mixbench: OpenCL - Integer = 40702.14
mixbench: NVIDIA CUDA - Integer = 35349.15
mixbench: OpenCL - Double Precision = 1098.71
mixbench: OpenCL - Single Precision = 77320.86
mixbench: NVIDIA CUDA - Half Precision = 80736.45
mixbench: NVIDIA CUDA - Double Precision = 1098.84
mixbench: NVIDIA CUDA - Single Precision = 75020.16
shoc: OpenCL - S3D = 647.515
shoc: OpenCL - Triad = 21.6700
shoc: OpenCL - FFT SP = 2789.70
shoc: OpenCL - MD5 Hash = 94.2226
shoc: OpenCL - Reduction = 991.175
shoc: OpenCL - GEMM SGEMM_N = 26966.3
shoc: OpenCL - Max SP Flops = 88834.0
shoc: OpenCL - Bus Speed Download = 24.7360
shoc: OpenCL - Bus Speed Readback = 26.3542
shoc: OpenCL - Texture Read Bandwidth = 3084.43
cl-mem: Copy = 414.4
cl-mem: Read = 888.9
cl-mem: Write = 806.8
namd-cuda: ATPase Simulation - 327,506 Atoms = 0.13372
vkresample: 2x - Double = 54.182
vkresample: 2x - Single = 7.747
octanebench: Total Score = 1312.604869
fahbench = 437.3845
clpeak: Integer Compute INT = 41343.87
clpeak: Single-Precision Float = 80554.10
clpeak: Double-Precision Double = 1434.39
clpeak: Global Memory Bandwidth = 873.43
lczero: OpenCL = 21422
arrayfire: Conjugate Gradient OpenCL = 0.8472
luxcorerender: DLSC - GPU = 25.86
luxcorerender: Danish Mood - GPU = 18.38
luxcorerender: Orange Juice - GPU = 20.04
luxcorerender: LuxCore Benchmark - GPU = 19.75
luxcorerender: Rainbow Colors and Prism - GPU = 44.73
financebench: Black-Scholes OpenCL = 2.815
viennacl: CPU BLAS - sCOPY = 41.6
viennacl: CPU BLAS - sAXPY = 46.0
viennacl: CPU BLAS - sDOT = 51.1
viennacl: CPU BLAS - dCOPY = 28.0
viennacl: CPU BLAS - dAXPY = 33.6
viennacl: CPU BLAS - dDOT = 33.7
viennacl: CPU BLAS - dGEMV-N = 35.0
viennacl: CPU BLAS - dGEMV-T = 39.8
viennacl: CPU BLAS - dGEMM-NN = 79.5
viennacl: CPU BLAS - dGEMM-NT = 84.3
viennacl: CPU BLAS - dGEMM-TN = 92.8
viennacl: CPU BLAS - dGEMM-TT = 95.8
viennacl: OpenCL BLAS - sCOPY = 487
viennacl: OpenCL BLAS - sAXPY = 600
viennacl: OpenCL BLAS - sDOT = 460
viennacl: OpenCL BLAS - dCOPY = 667
viennacl: OpenCL BLAS - dAXPY = 780
viennacl: OpenCL BLAS - dDOT = 732
viennacl: OpenCL BLAS - dGEMV-N = 224
viennacl: OpenCL BLAS - dGEMV-T = 450
viennacl: OpenCL BLAS - dGEMM-NN = 1190
viennacl: OpenCL BLAS - dGEMM-NT = 1320
viennacl: OpenCL BLAS - dGEMM-TN = 1337
viennacl: OpenCL BLAS - dGEMM-TT = 1380
gromacs: NVIDIA CUDA GPU - water_GMX50_bare = 41.711
caffe: AlexNet - NVIDIA CUDA - 100 = 443.703
caffe: AlexNet - NVIDIA CUDA - 200 = 870.773
caffe: AlexNet - NVIDIA CUDA - 1000 = 4291.08
caffe: GoogleNet - NVIDIA CUDA - 100 = 1661.77
caffe: GoogleNet - NVIDIA CUDA - 200 = 3303.73
caffe: GoogleNet - NVIDIA CUDA - 1000 = 16459.9
ncnn: Vulkan GPU - mobilenet = 2.99
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 = 0.95
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 = 1.23
ncnn: Vulkan GPU - shufflenet-v2 = 1.20
ncnn: Vulkan GPU - mnasnet = 1.13
ncnn: Vulkan GPU - efficientnet-b0 = 2.11
ncnn: Vulkan GPU - blazeface = 0.79
ncnn: Vulkan GPU - googlenet = 1.85
ncnn: Vulkan GPU - vgg16 = 1.57
ncnn: Vulkan GPU - alexnet = 1.03
ncnn: Vulkan GPU - resnet50 = 1.54
ncnn: Vulkan GPU - yolov4-tiny = 5.83
ncnn: Vulkan GPU - squeezenet_ssd = 2.32
ncnn: Vulkan GPU - regnety_400m = 1.44
ncnn: Vulkan GPU - vision_transformer = 208.30
ncnn: Vulkan GPU - FastestDet = 2.18
ncnn: Vulkan GPU - resnet18 = 0.93
blender: BMW27 - NVIDIA OptiX = 12.18
blender: Classroom - NVIDIA OptiX = 7.14
blender: Fishy Cat - NVIDIA OptiX = 5.32
blender: Barbershop - NVIDIA OptiX = 30.06
blender: Pabellon Barcelona - NVIDIA OptiX = 8.09
indigobench: OpenCL GPU - Bedroom = 35.764
indigobench: OpenCL GPU - Supercar = 80.004
mandelgpu: GPU = 1035158945.7
neatbench: GPU = 4090

vkpeak

fp32-scalar

vkpeak 20210424, fp32-scalar (GFLOPS, More Is Better)
workingnow: 45893.56 (SE +/- 47.33, N = 3)

vkpeak

fp32-vec4

vkpeak 20210424, fp32-vec4 (GFLOPS, More Is Better)
workingnow: 60521.26 (SE +/- 59.33, N = 3)

vkpeak

fp16-scalar

vkpeak 20210424, fp16-scalar (GFLOPS, More Is Better)
workingnow: 45678.72 (SE +/- 21.76, N = 3)

vkpeak

fp16-vec4

vkpeak 20210424, fp16-vec4 (GFLOPS, More Is Better)
workingnow: 90594.96 (SE +/- 28.95, N = 3)

vkpeak

fp64-scalar

vkpeak 20210424, fp64-scalar (GFLOPS, More Is Better)
workingnow: 1443.97 (SE +/- 0.99, N = 3)

vkpeak

fp64-vec4

vkpeak 20210424, fp64-vec4 (GFLOPS, More Is Better)
workingnow: 1446.25 (SE +/- 0.19, N = 3)

vkpeak

int32-scalar

vkpeak 20210424, int32-scalar (GIOPS, More Is Better)
workingnow: 45787.93 (SE +/- 13.40, N = 3)

vkpeak

int32-vec4

vkpeak 20210424, int32-vec4 (GIOPS, More Is Better)
workingnow: 45552.24 (SE +/- 0.36, N = 3)

vkpeak

int16-scalar

vkpeak 20210424, int16-scalar (GIOPS, More Is Better)
workingnow: 30436.13 (SE +/- 11.73, N = 3)

vkpeak

int16-vec4

vkpeak 20210424, int16-vec4 (GIOPS, More Is Better)
workingnow: 40527.84 (SE +/- 32.63, N = 3)

RealSR-NCNN

Scale: 4x - TAA: No

RealSR-NCNN 20200818, Scale: 4x - TAA: No (Seconds, Fewer Is Better)
workingnow: 4.685 (SE +/- 0.063, N = 3)

RealSR-NCNN

Scale: 4x - TAA: Yes

RealSR-NCNN 20200818, Scale: 4x - TAA: Yes (Seconds, Fewer Is Better)
workingnow: 19.79 (SE +/- 0.07, N = 3)

Waifu2x-NCNN Vulkan

Scale: 2x - Denoise: 3 - TAA: Yes

Waifu2x-NCNN Vulkan 20200818, Scale: 2x - Denoise: 3 - TAA: Yes (Seconds, Fewer Is Better)
workingnow: 2.094 (SE +/- 0.008, N = 3)

VkFFT

VkFFT 1.1.1 (Benchmark Score, More Is Better)
workingnow: 133483 (SE +/- 438.58, N = 3)
1. (CXX) g++ options: -O3

Hashcat

Benchmark: MD5

Hashcat 6.2.4, Benchmark: MD5 (H/s, More Is Better)
workingnow: 155800000000 (SE +/- 305505046.33, N = 3)

Hashcat

Benchmark: SHA1

Hashcat 6.2.4, Benchmark: SHA1 (H/s, More Is Better)
workingnow: 50683000000 (SE +/- 15159485.48, N = 3)

Hashcat

Benchmark: 7-Zip

Hashcat 6.2.4, Benchmark: 7-Zip (H/s, More Is Better)
workingnow: 2651800 (SE +/- 7559.32, N = 3)

Hashcat

Benchmark: SHA-512

Hashcat 6.2.4, Benchmark: SHA-512 (H/s, More Is Better)
workingnow: 7424433333 (SE +/- 7995276.38, N = 3)

Hashcat

Benchmark: TrueCrypt RIPEMD160 + XTS

Hashcat 6.2.4, Benchmark: TrueCrypt RIPEMD160 + XTS (H/s, More Is Better)
workingnow: 1906267 (SE +/- 733.33, N = 3)

Mixbench

Backend: OpenCL - Benchmark: Integer

Mixbench 2020-06-23, Backend: OpenCL - Benchmark: Integer (GIOPS, More Is Better)
workingnow: 40702.14 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: NVIDIA CUDA - Benchmark: Integer

Mixbench 2020-06-23, Backend: NVIDIA CUDA - Benchmark: Integer (GIOPS, More Is Better)
workingnow: 35349.15 (SE +/- 21.23, N = 3)
1. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: OpenCL - Benchmark: Double Precision

Mixbench 2020-06-23, Backend: OpenCL - Benchmark: Double Precision (GFLOPS, More Is Better)
workingnow: 1098.71 (SE +/- 0.08, N = 3)
1. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: OpenCL - Benchmark: Single Precision

Mixbench 2020-06-23, Backend: OpenCL - Benchmark: Single Precision (GFLOPS, More Is Better)
workingnow: 77320.86 (SE +/- 98.70, N = 3)
1. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: NVIDIA CUDA - Benchmark: Half Precision

Mixbench 2020-06-23, Backend: NVIDIA CUDA - Benchmark: Half Precision (GFLOPS, More Is Better)
workingnow: 80736.45 (SE +/- 58.22, N = 3)
1. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: NVIDIA CUDA - Benchmark: Double Precision

Mixbench 2020-06-23, Backend: NVIDIA CUDA - Benchmark: Double Precision (GFLOPS, More Is Better)
workingnow: 1098.84 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: NVIDIA CUDA - Benchmark: Single Precision

Mixbench 2020-06-23, Backend: NVIDIA CUDA - Benchmark: Single Precision (GFLOPS, More Is Better)
workingnow: 75020.16 (SE +/- 19.41, N = 3)
1. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: S3D

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: S3D (GFLOPS, More Is Better)
workingnow: 647.52 (SE +/- 0.37, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: Triad (GB/s, More Is Better)
workingnow: 21.67 (SE +/- 0.08, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: FFT SP (GFLOPS, More Is Better)
workingnow: 2789.70 (SE +/- 1.24, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: MD5 Hash (GHash/s, More Is Better)
workingnow: 94.22 (SE +/- 0.89, N = 15)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Reduction

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: Reduction (GB/s, More Is Better)
workingnow: 991.18 (SE +/- 11.65, N = 15)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: GEMM SGEMM_N

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: GEMM SGEMM_N (GFLOPS, More Is Better)
workingnow: 26966.3 (SE +/- 47.83, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Max SP Flops

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: Max SP Flops (GFLOPS, More Is Better)
workingnow: 88834.0 (SE +/- 182.99, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Download

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: Bus Speed Download (GB/s, More Is Better)
workingnow: 24.74 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Readback

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: Bus Speed Readback (GB/s, More Is Better)
workingnow: 26.35 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: Texture Read Bandwidth (GB/s, More Is Better)
workingnow: 3084.43 (SE +/- 4.12, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

cl-mem

Benchmark: Copy

cl-mem 2017-01-13, Benchmark: Copy (GB/s, More Is Better)
workingnow: 414.4 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Read

cl-mem 2017-01-13, Benchmark: Read (GB/s, More Is Better)
workingnow: 888.9 (SE +/- 0.73, N = 3)
1. (CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Write

cl-mem 2017-01-13, Benchmark: Write (GB/s, More Is Better)
workingnow: 806.8 (SE +/- 0.28, N = 3)
1. (CC) gcc options: -O2 -flto -lOpenCL

NAMD CUDA

ATPase Simulation - 327,506 Atoms

NAMD CUDA 2.14, ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better)
workingnow: 0.13372 (SE +/- 0.00029, N = 3)

VkResample

Upscale: 2x - Precision: Double

VkResample 1.0, Upscale: 2x - Precision: Double (ms, Fewer Is Better)
workingnow: 54.18 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -O3

VkResample

Upscale: 2x - Precision: Single

VkResample 1.0, Upscale: 2x - Precision: Single (ms, Fewer Is Better)
workingnow: 7.747 (SE +/- 0.002, N = 3)
1. (CXX) g++ options: -O3

OctaneBench

Total Score

OctaneBench 2020.1, Total Score (Score, More Is Better)
workingnow: 1312.60

FAHBench

FAHBench 2.3.2 (Ns Per Day, More Is Better)
workingnow: 437.38 (SE +/- 0.79, N = 3)

clpeak

OpenCL Test: Integer Compute INT

clpeak 1.1.2, OpenCL Test: Integer Compute INT (GIOPS, More Is Better)
workingnow: 41343.87 (SE +/- 1.32, N = 3)
1. (CXX) g++ options: -O3

clpeak

OpenCL Test: Single-Precision Float

clpeak 1.1.2, OpenCL Test: Single-Precision Float (GFLOPS, More Is Better)
workingnow: 80554.10 (SE +/- 124.24, N = 3)
1. (CXX) g++ options: -O3

clpeak

OpenCL Test: Double-Precision Double

clpeak 1.1.2, OpenCL Test: Double-Precision Double (GFLOPS, More Is Better)
workingnow: 1434.39 (SE +/- 1.74, N = 3)
1. (CXX) g++ options: -O3

clpeak

OpenCL Test: Global Memory Bandwidth

clpeak 1.1.2, OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better)
workingnow: 873.43 (SE +/- 0.45, N = 3)
1. (CXX) g++ options: -O3

LeelaChessZero

Backend: OpenCL

LeelaChessZero 0.28, Backend: OpenCL (Nodes Per Second, More Is Better)
workingnow: 21422 (SE +/- 222.66, N = 9)
1. (CXX) g++ options: -flto -pthread

ArrayFire

Test: Conjugate Gradient OpenCL

ArrayFire 3.7, Test: Conjugate Gradient OpenCL (ms, Fewer Is Better)
workingnow: 0.8472 (SE +/- 0.0012, N = 3)
1. (CXX) g++ options: -rdynamic

LuxCoreRender

Scene: DLSC - Acceleration: GPU

LuxCoreRender 2.6, Scene: DLSC - Acceleration: GPU (M samples/sec, More Is Better)
workingnow: 25.86 (SE +/- 0.02, N = 3; MIN: 24.57 / MAX: 26.16)

LuxCoreRender

Scene: Danish Mood - Acceleration: GPU

LuxCoreRender 2.6, Scene: Danish Mood - Acceleration: GPU (M samples/sec, More Is Better)
workingnow: 18.38 (SE +/- 0.13, N = 3; MIN: 4.67 / MAX: 22.55)

LuxCoreRender

Scene: Orange Juice - Acceleration: GPU

LuxCoreRender 2.6, Scene: Orange Juice - Acceleration: GPU (M samples/sec, More Is Better)
workingnow: 20.04 (SE +/- 0.02, N = 3; MIN: 17.15 / MAX: 27.53)

LuxCoreRender

Scene: LuxCore Benchmark - Acceleration: GPU

LuxCoreRender 2.6, Scene: LuxCore Benchmark - Acceleration: GPU (M samples/sec, More Is Better)
workingnow: 19.75 (SE +/- 0.06, N = 3; MIN: 7.72 / MAX: 24.83)

LuxCoreRender

Scene: Rainbow Colors and Prism - Acceleration: GPU

LuxCoreRender 2.6, Scene: Rainbow Colors and Prism - Acceleration: GPU (M samples/sec, More Is Better)
workingnow: 44.73 (SE +/- 0.04, N = 3; MIN: 39.81 / MAX: 49.16)

FinanceBench

Benchmark: Black-Scholes OpenCL

FinanceBench 2016-07-25, Benchmark: Black-Scholes OpenCL (ms, Fewer Is Better)
workingnow: 2.815 (SE +/- 0.001, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp

ViennaCL

Test: CPU BLAS - sCOPY

ViennaCL 1.7.1, Test: CPU BLAS - sCOPY (GB/s, More Is Better)
workingnow: 41.6 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - sAXPY

ViennaCL 1.7.1, Test: CPU BLAS - sAXPY (GB/s, More Is Better)
workingnow: 46.0 (SE +/- 1.82, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - sDOT

ViennaCL 1.7.1, Test: CPU BLAS - sDOT (GB/s, More Is Better)
workingnow: 51.1 (SE +/- 0.36, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dCOPY

ViennaCL 1.7.1, Test: CPU BLAS - dCOPY (GB/s, More Is Better)
workingnow: 28.0 (SE +/- 0.09, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dAXPY

ViennaCL 1.7.1, Test: CPU BLAS - dAXPY (GB/s, More Is Better)
workingnow: 33.6 (SE +/- 0.09, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dDOT

ViennaCL 1.7.1, Test: CPU BLAS - dDOT (GB/s, More Is Better)
workingnow: 33.7 (SE +/- 0.97, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMV-N

ViennaCL 1.7.1, Test: CPU BLAS - dGEMV-N (GB/s, More Is Better)
workingnow: 35.0 (SE +/- 1.74, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMV-T

ViennaCL 1.7.1, Test: CPU BLAS - dGEMV-T (GB/s, More Is Better)
workingnow: 39.8 (SE +/- 0.12, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-NN

ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-NN (GFLOPs/s, More Is Better)
workingnow: 79.5 (SE +/- 1.59, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-NT

ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-NT (GFLOPs/s, More Is Better)
workingnow: 84.3 (SE +/- 2.74, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-TN

ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-TN (GFLOPs/s, More Is Better)
workingnow: 92.8 (SE +/- 4.90, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-TT

ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-TT (GFLOPs/s, More Is Better)
workingnow: 95.8 (SE +/- 0.17, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sCOPY

ViennaCL 1.7.1, Test: OpenCL BLAS - sCOPY (GB/s, More Is Better)
workingnow: 487 (SE +/- 0.67, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sAXPY

ViennaCL 1.7.1, Test: OpenCL BLAS - sAXPY (GB/s, More Is Better)
workingnow: 600 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sDOT

ViennaCL 1.7.1, Test: OpenCL BLAS - sDOT (GB/s, More Is Better)
workingnow: 460 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dCOPY

ViennaCL 1.7.1, Test: OpenCL BLAS - dCOPY (GB/s, More Is Better)
workingnow: 667 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dAXPY

ViennaCL 1.7.1, Test: OpenCL BLAS - dAXPY (GB/s, More Is Better)
workingnow: 780 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dDOT

ViennaCL 1.7.1, Test: OpenCL BLAS - dDOT (GB/s, More Is Better)
workingnow: 732 (SE +/- 0.88, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-N

ViennaCL 1.7.1, Test: OpenCL BLAS - dGEMV-N (GB/s, More Is Better)
workingnow: 224 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-T

ViennaCL 1.7.1, Test: OpenCL BLAS - dGEMV-T (GB/s, More Is Better)
workingnow: 450 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-NN

ViennaCL 1.7.1, Test: OpenCL BLAS - dGEMM-NN (GFLOPs/s, More Is Better)
workingnow: 1190 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-NT

ViennaCL 1.7.1, Test: OpenCL BLAS - dGEMM-NT (GFLOPs/s, More Is Better)
workingnow: 1320 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-TN

ViennaCL 1.7.1, Test: OpenCL BLAS - dGEMM-TN (GFLOPs/s, More Is Better)
workingnow: 1337 (SE +/- 3.33, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-TT

ViennaCL 1.7.1, Test: OpenCL BLAS - dGEMM-TT (GFLOPs/s, More Is Better)
workingnow: 1380 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

GROMACS

Implementation: NVIDIA CUDA GPU - Input: water_GMX50_bare

GROMACS 2023, Implementation: NVIDIA CUDA GPU - Input: water_GMX50_bare (Ns Per Day, More Is Better)
workingnow: 41.71 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -O3

Caffe

Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 100

Caffe 2020-02-13, Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 100 (Milli-Seconds, Fewer Is Better)
workingnow: 443.70 (SE +/- 2.48, N = 3)
1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

Caffe

Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 200

Caffe 2020-02-13, Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 200 (Milli-Seconds, Fewer Is Better)
workingnow: 870.77 (SE +/- 2.69, N = 3)
1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

Caffe

Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 1000

Caffe 2020-02-13, Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 1000 (Milli-Seconds, Fewer Is Better)
workingnow: 4291.08 (SE +/- 2.18, N = 3)
1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

Caffe

Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 100

Caffe 2020-02-13, Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 100 (Milli-Seconds, Fewer Is Better)
workingnow: 1661.77 (SE +/- 7.59, N = 3)
1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

Caffe

Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 200

Caffe 2020-02-13, Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 200 (Milli-Seconds, Fewer Is Better)
workingnow: 3303.73 (SE +/- 3.41, N = 3)
1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

Caffe

Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 1000

Caffe 2020-02-13, Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 1000 (Milli-Seconds, Fewer Is Better)
workingnow: 16459.9 (SE +/- 35.06, N = 3)
1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

NCNN

Target: Vulkan GPU - Model: mobilenet

NCNN 20220729, Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better)
workingnow: 2.99 (SE +/- 0.03, N = 5; MIN: 2.89 / MAX: 14.66)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2

NCNN 20220729, Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
workingnow: 0.95 (SE +/- 0.01, N = 5; MIN: 0.91 / MAX: 18.22)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3

NCNN 20220729, Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
workingnow: 1.23 (SE +/- 0.01, N = 5; MIN: 1.19 / MAX: 5.32)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: shufflenet-v2

NCNN 20220729, Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better)
workingnow: 1.20 (SE +/- 0.01, N = 5; MIN: 1.12 / MAX: 5.99)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mnasnet

NCNN 20220729, Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better)
workingnow: 1.13 (SE +/- 0.19, N = 5; MIN: 0.91 / MAX: 35.41)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: efficientnet-b0

NCNN 20220729, Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better)
workingnow: 2.11 (SE +/- 0.09, N = 5; MIN: 1.8 / MAX: 20.24)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: blazeface

NCNN 20220729, Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better)
workingnow: 0.79 (SE +/- 0.01, N = 5; MIN: 0.75 / MAX: 4.92)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: googlenet

NCNN 20220729, Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better)
workingnow: 1.85 (SE +/- 0.35, N = 5; MIN: 1.46 / MAX: 26.49)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vgg16

NCNN 20220729, Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better)
workingnow: 1.57 (SE +/- 0.06, N = 5; MIN: 1.45 / MAX: 26.77)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: alexnet

NCNN 20220729, Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better)
workingnow: 1.03 (SE +/- 0.01, N = 5; MIN: 0.99 / MAX: 5.66)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet50

NCNN 20220729, Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better)
workingnow: 1.54 (SE +/- 0.03, N = 5; MIN: 1.49 / MAX: 32.21)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: yolov4-tiny

NCNN 20220729, Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better)
workingnow: 5.83 (SE +/- 0.22, N = 5; MIN: 4.9 / MAX: 54.4)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: squeezenet_ssd

NCNN 20220729, Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better)
workingnow: 2.32 (SE +/- 0.00, N = 5; MIN: 2.28 / MAX: 2.71)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: regnety_400m

NCNN 20220729, Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better)
workingnow: 1.44 (SE +/- 0.02, N = 5; MIN: 1.37 / MAX: 20.22)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vision_transformer

NCNN 20220729, Target: Vulkan GPU - Model: vision_transformer (ms, Fewer Is Better)
workingnow: 208.30 (SE +/- 6.29, N = 5; MIN: 128.21 / MAX: 992.02)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: FastestDet

NCNN 20220729, Target: Vulkan GPU - Model: FastestDet (ms, Fewer Is Better)
workingnow: 2.18 (SE +/- 0.03, N = 5; MIN: 1.68 / MAX: 24.01)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet18

NCNN 20220729, Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better)
workingnow: 0.93 (SE +/- 0.01, N = 3; MIN: 0.89 / MAX: 1.5)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Blender

Blend File: BMW27 - Compute: NVIDIA OptiX

Blender 3.5, Blend File: BMW27 - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
workingnow: 12.18 (SE +/- 8.77, N = 12)

Blender

Blend File: Classroom - Compute: NVIDIA OptiX

Blender 3.5, Blend File: Classroom - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
workingnow: 7.14 (SE +/- 0.03, N = 3)

Blender

Blend File: Fishy Cat - Compute: NVIDIA OptiX

Blender 3.5, Blend File: Fishy Cat - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
workingnow: 5.32 (SE +/- 0.06, N = 14)

Blender

Blend File: Barbershop - Compute: NVIDIA OptiX

Blender 3.5, Blend File: Barbershop - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
workingnow: 30.06 (SE +/- 0.04, N = 3)

Blender

Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX

Blender 3.5, Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
workingnow: 8.09 (SE +/- 0.01, N = 3)

IndigoBench

Acceleration: OpenCL GPU - Scene: Bedroom

IndigoBench 4.4, Acceleration: OpenCL GPU - Scene: Bedroom (M samples/s, More Is Better)
workingnow: 35.76 (SE +/- 0.06, N = 3)

IndigoBench

Acceleration: OpenCL GPU - Scene: Supercar

IndigoBench 4.4, Acceleration: OpenCL GPU - Scene: Supercar (M samples/s, More Is Better)
workingnow: 80.00 (SE +/- 0.03, N = 3)

MandelGPU

OpenCL Device: GPU

MandelGPU 1.3pts1, OpenCL Device: GPU (Samples/sec, More Is Better)
workingnow: 1035158945.7 (SE +/- 5134240.32, N = 3)
1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

NeatBench

Acceleration: GPU

NeatBench 5, Acceleration: GPU (FPS, More Is Better)
workingnow: 4090 (SE +/- 0.00, N = 3)


Phoronix Test Suite v10.8.4