Nvidia

2 x Intel Xeon Gold 6226R testing with an unidentified motherboard (5.14 BIOS) and ASPEED 16GB graphics on Ubuntu 24.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2407208-NE-NVIDIA79457.

Nvidia
Result identifier: ASPEED - 2 x Intel Xeon Gold 6226R

Processor: 2 x Intel Xeon Gold 6226R @ 3.90GHz (32 Cores / 64 Threads)
Motherboard: (5.14 BIOS)
Chipset: Intel Sky Lake-E DMI3 Registers
Memory: 512GB
Disk: 2 x 8002GB INTEL SSDPE2KX080T8
Graphics: ASPEED 16GB
Audio: NVIDIA GA104 HD Audio
Monitor: 27B2G5
Network: 2 x Intel X722 for 1GbE + 2 x Broadcom BCM57414 NetXtreme-E 10Gb/25Gb
OS: Ubuntu 24.04
Kernel: 6.8.0-38-generic (x86_64)
Display Server: X Server
Display Driver: NVIDIA
OpenCL: OpenCL 3.0 CUDA 12.4.131
Compiler: GCC 13.2.0 + CUDA 12.4
File-System: ext4
Screen Resolution: 1920x1080

- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: intel_pstate powersave (EPP: balance_performance)
- CPU Microcode: 0x5003605
- BAR1 / Visible vRAM Size: 256 MiB
- vBIOS Version: 94.04.57.00.08
- Python 3.8.13
- gather_data_sampling: Mitigation of Microcode + itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable + reg_file_data_sampling: Not affected + retbleed: Mitigation of Enhanced IBRS + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: SW sequence; BHI: SW loop KVM: SW loop + srbds: Not affected + tsx_async_abort: Mitigation of TSX disabled

Nvidia results summary (ASPEED - 2 x Intel Xeon Gold 6226R)

hashcat: MD5 = 156178112500
hashcat: SHA1 = 91940033333
hashcat: 7-Zip = 4224700
hashcat: SHA-512 = 13308000000
hashcat: TrueCrypt RIPEMD160 + XTS = 3454200
mixbench: OpenCL - Integer = 11601.09
mixbench: OpenCL - Double Precision = 309.39
mixbench: OpenCL - Single Precision = 18670.28
shoc: OpenCL - S3D = 211.904
shoc: OpenCL - Triad = 12.1173
shoc: OpenCL - FFT SP = 1094.66
shoc: OpenCL - MD5 Hash = 22.5655
shoc: OpenCL - Reduction = 324.182
shoc: OpenCL - GEMM SGEMM_N = 3630.55
shoc: OpenCL - Max SP Flops = 21619.9
shoc: OpenCL - Bus Speed Download = 12.3250
shoc: OpenCL - Bus Speed Readback = 13.1527
shoc: OpenCL - Texture Read Bandwidth = 1998.58
cl-mem: Copy = 283.4
cl-mem: Read = 380.1
cl-mem: Write = 376.4
fahbench = 240.1385
clpeak: Integer Compute INT = 9617.49
clpeak: Single-Precision Float = 18602.65
clpeak: Double-Precision Double = 365.83
clpeak: Global Memory Bandwidth = 377.04
rodinia: OpenCL Particle Filter = 7.105
luxcorerender: DLSC - GPU = 57.44
luxcorerender: Danish Mood - GPU = 35.82
luxcorerender: Orange Juice - GPU = 50.64
luxcorerender: LuxCore Benchmark - GPU = 26.70
luxcorerender: Rainbow Colors and Prism - GPU = 122.06
financebench: Black-Scholes OpenCL = 10.460
viennacl: CPU BLAS - sCOPY = 228
viennacl: CPU BLAS - sAXPY = 382
viennacl: CPU BLAS - sDOT = 261
viennacl: CPU BLAS - dCOPY = 117.9
viennacl: CPU BLAS - dAXPY = 183
viennacl: CPU BLAS - dDOT = 175
viennacl: CPU BLAS - dGEMV-N = 108
viennacl: CPU BLAS - dGEMV-T = 196
viennacl: CPU BLAS - dGEMM-NN = 59.7
viennacl: CPU BLAS - dGEMM-NT = 59.4
viennacl: CPU BLAS - dGEMM-TN = 62.2
viennacl: CPU BLAS - dGEMM-TT = 58.3
viennacl: OpenCL BLAS - sCOPY = 266
viennacl: OpenCL BLAS - sAXPY = 345
viennacl: OpenCL BLAS - sDOT = 312
viennacl: OpenCL BLAS - dCOPY = 359
viennacl: OpenCL BLAS - dAXPY = 385
viennacl: OpenCL BLAS - dDOT = 383
viennacl: OpenCL BLAS - dGEMV-N = 170
viennacl: OpenCL BLAS - dGEMV-T = 317
viennacl: OpenCL BLAS - dGEMM-NN = 344
viennacl: OpenCL BLAS - dGEMM-NT = 348
viennacl: OpenCL BLAS - dGEMM-TT = 340
ncnn: Vulkan GPU - mobilenet = 18.93
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 = 8.37
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 = 8.52
ncnn: Vulkan GPU - shufflenet-v2 = 9.72
ncnn: Vulkan GPU - mnasnet = 7.30
ncnn: Vulkan GPU - efficientnet-b0 = 11.02
ncnn: Vulkan GPU - blazeface = 4.13
ncnn: Vulkan GPU - googlenet = 18.15
ncnn: Vulkan GPU - vgg16 = 45.73
ncnn: Vulkan GPU - resnet18 = 10.92
ncnn: Vulkan GPU - alexnet = 7.97
ncnn: Vulkan GPU - resnet50 = 21.90
ncnn: Vulkan GPU - yolov4-tiny = 33.43
ncnn: Vulkan GPU - squeezenet_ssd = 20.32
ncnn: Vulkan GPU - regnety_400m = 32.77
ncnn: Vulkan GPU - vision_transformer = 58.46
ncnn: Vulkan GPU - FastestDet = 10.39
plaidml: No - Inference - IMDB LSTM - OpenCL = 751.93
plaidml: No - Inference - Mobilenet - OpenCL = 1898.70
plaidml: Yes - Inference - Mobilenet - OpenCL = 2201.61
plaidml: No - Inference - DenseNet 201 - OpenCL = 179.21
neatbench: GPU = (no value in export)

Hashcat

Benchmark: MD5

Hashcat 6.2.4, H/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 156178112500 (SE +/- 31239876948.73, N = 16)

Hashcat

Benchmark: SHA1

Hashcat 6.2.4, H/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 91940033333 (SE +/- 113680610.09, N = 3)

Hashcat

Benchmark: 7-Zip

Hashcat 6.2.4, H/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 4224700 (SE +/- 9462.73, N = 3)

Hashcat

Benchmark: SHA-512

Hashcat 6.2.4, H/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 13308000000 (SE +/- 26463244.95, N = 3)

Hashcat

Benchmark: TrueCrypt RIPEMD160 + XTS

Hashcat 6.2.4, H/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 3454200 (SE +/- 1039.23, N = 3)
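
The Hashcat MD5 result has a standard error of about 20% of its mean, far more than the other four Hashcat benchmarks. A minimal Python sketch of that check follows, using the means and standard errors reported above. The helper names and the 5% cutoff are illustrative assumptions, not Phoronix Test Suite settings.

```python
# Mean (H/s) and standard error copied from the Hashcat results above.
HASHCAT = {
    "MD5": (156178112500, 31239876948.73),
    "SHA1": (91940033333, 113680610.09),
    "7-Zip": (4224700, 9462.73),
    "SHA-512": (13308000000, 26463244.95),
    "TrueCrypt RIPEMD160 + XTS": (3454200, 1039.23),
}


def relative_se_pct(mean, se):
    """Standard error expressed as a percentage of the mean."""
    return se / mean * 100.0


def noisy_results(results, threshold_pct=5.0):
    """Names whose relative standard error exceeds threshold_pct.

    The 5% default is an arbitrary cutoff chosen for illustration.
    """
    return [
        name
        for name, (mean, se) in results.items()
        if relative_se_pct(mean, se) > threshold_pct
    ]


if __name__ == "__main__":
    for name, (mean, se) in HASHCAT.items():
        print(f"{name}: {relative_se_pct(mean, se):.2f}% relative SE")
    print("Above cutoff:", noisy_results(HASHCAT))
```

With these inputs only MD5 exceeds the cutoff, so that figure is best read with its run-to-run spread in mind.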

Mixbench

Backend: OpenCL - Benchmark: Integer

Mixbench 2020-06-23, GIOPS, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 11601.09 (SE +/- 4.62, N = 3)
(CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: OpenCL - Benchmark: Double Precision

Mixbench 2020-06-23, GFLOPS, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 309.39 (SE +/- 0.92, N = 3)
(CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: OpenCL - Benchmark: Single Precision

Mixbench 2020-06-23, GFLOPS, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 18670.28 (SE +/- 27.29, N = 3)
(CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: S3D

SHOC Scalable HeterOgeneous Computing 2020-04-17, GFLOPS, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 211.90 (SE +/- 0.07, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing 2020-04-17, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 12.12 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2020-04-17, GFLOPS, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 1094.66 (SE +/- 0.17, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2020-04-17, GHash/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 22.57 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Reduction

SHOC Scalable HeterOgeneous Computing 2020-04-17, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 324.18 (SE +/- 0.27, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: GEMM SGEMM_N

SHOC Scalable HeterOgeneous Computing 2020-04-17, GFLOPS, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 3630.55 (SE +/- 44.25, N = 4)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Max SP Flops

SHOC Scalable HeterOgeneous Computing 2020-04-17, GFLOPS, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 21619.9 (SE +/- 305.65, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Download

SHOC Scalable HeterOgeneous Computing 2020-04-17, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 12.33 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Readback

SHOC Scalable HeterOgeneous Computing 2020-04-17, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 13.15 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2020-04-17, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 1998.58 (SE +/- 4.68, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

cl-mem

Benchmark: Copy

cl-mem 2017-01-13, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 283.4 (SE +/- 0.07, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Read

cl-mem 2017-01-13, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 380.1 (SE +/- 0.03, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Write

cl-mem 2017-01-13, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 376.4 (SE +/- 0.10, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

FAHBench

FAHBench 2.3.2, Ns Per Day, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 240.14 (SE +/- 0.31, N = 3)

clpeak

OpenCL Test: Integer Compute INT

clpeak 1.1.2, GIOPS, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 9617.49 (SE +/- 87.03, N = 3)
(CXX) g++ options: -O3

clpeak

OpenCL Test: Single-Precision Float

clpeak 1.1.2, GFLOPS, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 18602.65 (SE +/- 76.05, N = 3)
(CXX) g++ options: -O3

clpeak

OpenCL Test: Double-Precision Double

clpeak 1.1.2, GFLOPS, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 365.83 (SE +/- 0.36, N = 3)
(CXX) g++ options: -O3

clpeak

OpenCL Test: Global Memory Bandwidth

clpeak 1.1.2, GBPS, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 377.04 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3

Rodinia

Test: OpenCL Particle Filter

Rodinia 3.1, Seconds, Fewer Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 7.105 (SE +/- 0.096, N = 3)
(CXX) g++ options: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl

LuxCoreRender

Scene: DLSC - Acceleration: GPU

LuxCoreRender 2.6, M samples/sec, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 57.44 (SE +/- 5.22, N = 12; MAX: 65.49)

LuxCoreRender

Scene: Danish Mood - Acceleration: GPU

LuxCoreRender 2.6, M samples/sec, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 35.82 (SE +/- 0.26, N = 3; MIN: 12.8 / MAX: 46.64)

LuxCoreRender

Scene: Orange Juice - Acceleration: GPU

LuxCoreRender 2.6, M samples/sec, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 50.64 (SE +/- 0.09, N = 3; MIN: 44.98 / MAX: 64.06)

LuxCoreRender

Scene: LuxCore Benchmark - Acceleration: GPU

LuxCoreRender 2.6, M samples/sec, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 26.70 (SE +/- 2.44, N = 12; MAX: 43.01)

LuxCoreRender

Scene: Rainbow Colors and Prism - Acceleration: GPU

LuxCoreRender 2.6, M samples/sec, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 122.06 (SE +/- 0.62, N = 3; MIN: 106.98 / MAX: 141.43)

FinanceBench

Benchmark: Black-Scholes OpenCL

FinanceBench 2016-07-25, ms, Fewer Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 10.46 (SE +/- 0.02, N = 3)
(CXX) g++ options: -O3 -march=native -fopenmp

ViennaCL

Test: CPU BLAS - sCOPY

ViennaCL 1.7.1, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 228 (SE +/- 3.89, N = 15)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - sAXPY

ViennaCL 1.7.1, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 382 (SE +/- 5.93, N = 15)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - sDOT

ViennaCL 1.7.1, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 261 (SE +/- 1.23, N = 15)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dCOPY

ViennaCL 1.7.1, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 117.9 (SE +/- 2.79, N = 15)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dAXPY

ViennaCL 1.7.1, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 183 (SE +/- 1.77, N = 15)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dDOT

ViennaCL 1.7.1, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 175 (SE +/- 1.40, N = 15)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMV-N

ViennaCL 1.7.1, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 108 (SE +/- 0.64, N = 15)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMV-T

ViennaCL 1.7.1, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 196 (SE +/- 1.22, N = 15)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-NN

ViennaCL 1.7.1, GFLOPs/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 59.7 (SE +/- 1.20, N = 15)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-NT

ViennaCL 1.7.1, GFLOPs/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 59.4 (SE +/- 1.11, N = 14)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-TN

ViennaCL 1.7.1, GFLOPs/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 62.2 (SE +/- 1.35, N = 14)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-TT

ViennaCL 1.7.1, GFLOPs/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 58.3 (SE +/- 1.46, N = 15)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sCOPY

ViennaCL 1.7.1, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 266 (SE +/- 0.67, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sAXPY

ViennaCL 1.7.1, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 345 (SE +/- 0.33, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sDOT

ViennaCL 1.7.1, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 312 (SE +/- 0.67, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dCOPY

ViennaCL 1.7.1, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 359 (SE +/- 0.58, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dAXPY

ViennaCL 1.7.1, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 385 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dDOT

ViennaCL 1.7.1, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 383 (SE +/- 0.33, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-N

ViennaCL 1.7.1, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 170 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-T

ViennaCL 1.7.1, GB/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 317 (SE +/- 1.76, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-NN

ViennaCL 1.7.1, GFLOPs/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 344 (SE +/- 1.45, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-NT

ViennaCL 1.7.1, GFLOPs/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 348
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-TT

ViennaCL 1.7.1, GFLOPs/s, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 340
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
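
The CPU BLAS and OpenCL BLAS runs share eleven tests (dGEMM-TN has no OpenCL result), so a per-test ratio is one quick way to compare the two backends. The Python sketch below uses the ViennaCL values reported above. The dictionary and function names are illustrative only.

```python
# ViennaCL 1.7.1 results copied from the sections above.
# Vector and GEMV tests are in GB/s, GEMM tests in GFLOPs/s.
CPU_BLAS = {
    "sCOPY": 228, "sAXPY": 382, "sDOT": 261,
    "dCOPY": 117.9, "dAXPY": 183, "dDOT": 175,
    "dGEMV-N": 108, "dGEMV-T": 196,
    "dGEMM-NN": 59.7, "dGEMM-NT": 59.4, "dGEMM-TN": 62.2, "dGEMM-TT": 58.3,
}
OPENCL_BLAS = {
    "sCOPY": 266, "sAXPY": 345, "sDOT": 312,
    "dCOPY": 359, "dAXPY": 385, "dDOT": 383,
    "dGEMV-N": 170, "dGEMV-T": 317,
    "dGEMM-NN": 344, "dGEMM-NT": 348, "dGEMM-TT": 340,
}


def opencl_over_cpu(cpu, opencl):
    """Ratio OpenCL / CPU for every test present in both result sets."""
    return {name: opencl[name] / cpu[name] for name in cpu if name in opencl}


if __name__ == "__main__":
    for name, ratio in opencl_over_cpu(CPU_BLAS, OPENCL_BLAS).items():
        print(f"{name}: {ratio:.2f}x")
```

A ratio above 1 means the OpenCL backend was faster for that test. sAXPY is the one shared test where the CPU result is higher.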

NCNN

Target: Vulkan GPU - Model: mobilenet

NCNN 20230517, ms, Fewer Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 18.93 (SE +/- 0.22, N = 12; MIN: 17.49 / MAX: 22.25)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2

NCNN 20230517, ms, Fewer Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 8.37 (SE +/- 0.10, N = 12; MIN: 7.43 / MAX: 30.63)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3

NCNN 20230517, ms, Fewer Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 8.52 (SE +/- 0.06, N = 12; MIN: 7.93 / MAX: 87.35)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: shufflenet-v2

NCNN 20230517, ms, Fewer Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 9.72 (SE +/- 0.08, N = 12; MIN: 8.97 / MAX: 16.96)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mnasnet

NCNN 20230517, ms, Fewer Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 7.30 (SE +/- 0.12, N = 12; MIN: 6.57 / MAX: 71.25)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: efficientnet-b0

NCNN 20230517, ms, Fewer Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 11.02 (SE +/- 0.13, N = 12; MIN: 9.86 / MAX: 77.38)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: blazeface

NCNN 20230517, ms, Fewer Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 4.13 (SE +/- 0.07, N = 12; MIN: 3.7 / MAX: 4.65)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: googlenet

NCNN 20230517, ms, Fewer Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 18.15 (SE +/- 0.33, N = 12; MIN: 15.73 / MAX: 36.32)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vgg16

NCNN 20230517, ms, Fewer Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 45.73 (SE +/- 0.42, N = 12; MIN: 41.81 / MAX: 716.68)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet18

NCNN 20230517, ms, Fewer Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 10.92 (SE +/- 0.11, N = 12; MIN: 10.17 / MAX: 12.21)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: alexnet

NCNN 20230517, ms, Fewer Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 7.97 (SE +/- 0.10, N = 12; MIN: 7.31 / MAX: 10.33)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet50

NCNN 20230517, ms, Fewer Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 21.90 (SE +/- 0.24, N = 12; MIN: 20.15 / MAX: 31.1)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: yolov4-tiny

NCNN 20230517, ms, Fewer Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 33.43 (SE +/- 0.49, N = 12; MIN: 29.11 / MAX: 256.67)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: squeezenet_ssd

NCNN 20230517, ms, Fewer Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 20.32 (SE +/- 0.34, N = 12; MIN: 18.17 / MAX: 30.87)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: regnety_400m

NCNN 20230517, ms, Fewer Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 32.77 (SE +/- 0.23, N = 12; MIN: 31.13 / MAX: 37.29)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vision_transformer

NCNN 20230517, ms, Fewer Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 58.46 (SE +/- 0.72, N = 12; MIN: 52.56 / MAX: 125.76)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: FastestDet

NCNN 20230517, ms, Fewer Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 10.39 (SE +/- 0.41, N = 11; MIN: 8.52 / MAX: 30.47)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

PlaidML

FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL

PlaidML, FPS, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 751.93 (SE +/- 1.13, N = 3)

PlaidML

FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL

PlaidML, FPS, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 1898.70 (SE +/- 2.99, N = 3)

PlaidML

FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL

PlaidML, FPS, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 2201.61 (SE +/- 0.40, N = 3)

PlaidML

FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL

PlaidML, FPS, More Is Better
ASPEED - 2 x Intel Xeon Gold 6226R: 179.21 (SE +/- 0.34, N = 3)
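
Mobilenet is the only PlaidML network here that was run both with and without FP16, so those two results give a direct speedup figure. A short Python sketch of that arithmetic follows, using the two FPS values reported above. The function name is an illustrative assumption.

```python
# PlaidML Mobilenet inference results (FPS) copied from the sections above.
MOBILENET_FP32_FPS = 1898.70  # FP16: No
MOBILENET_FP16_FPS = 2201.61  # FP16: Yes


def speedup(baseline_fps, candidate_fps):
    """Throughput ratio of candidate over baseline (higher FPS is better)."""
    return candidate_fps / baseline_fps


if __name__ == "__main__":
    ratio = speedup(MOBILENET_FP32_FPS, MOBILENET_FP16_FPS)
    print(f"FP16 vs FP32 Mobilenet: {ratio:.2f}x ({(ratio - 1) * 100:.1f}% faster)")
```

On these numbers FP16 comes out roughly 16% faster than FP32 for Mobilenet inference on this system.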


Phoronix Test Suite v10.8.5