pts-result-gpu-30-05-2024.list

2 x AMD EPYC 7452 32-Core testing with a Supermicro AS-2124GQ-NART H12DSG-Q-CPU6 v1.01 (1.0a BIOS) and NVIDIA A100-SXM4-40GB on CentOS Linux 7 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2407234-NE-PTSRESULT61&grw&sor.

System details (configurations: GPU-run-30-05-2024, pts-config-gpu-30-05-2024)

Processor: 2 x AMD EPYC 7452 32-Core @ 2.35GHz (64 Cores)
Motherboard: Supermicro AS-2124GQ-NART H12DSG-Q-CPU6 v1.01 (1.0a BIOS)
Chipset: AMD Starship/Matisse
Memory: 16 x 32 GB DDR4-3200MT/s Samsung M393A4K40DB3-CWE
Disk: 252GB
Graphics: NVIDIA A100-SXM4-40GB
Network: 2 x Intel 10-Gigabit X540-AT2
OS: CentOS Linux 7
Kernel: 5.4.265-1.el7.elrepo.x86_64 (x86_64)
Display Server: X Server
Display Driver: NVIDIA
Vulkan: 1.3.260
Compiler: GCC 4.8.5 20150623 + CUDA 12.3 (GPU-run-30-05-2024); GCC 8.3.0 + CUDA 12.3 (pts-config-gpu-30-05-2024)
File-System: tmpfs
Screen Resolution: 1024x768

Kernel Details
- Transparent Huge Pages: always

Compiler Details
- GPU-run-30-05-2024: --build=x86_64-redhat-linux --disable-libgcj --disable-libunwind-exceptions --enable-__cxa_atexit --enable-bootstrap --enable-checking=release --enable-gnu-indirect-function --enable-gnu-unique-object --enable-initfini-array --enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-arch_32=x86-64 --with-linker-hash-style=gnu --with-tune=generic
- pts-config-gpu-30-05-2024: --disable-multilib --enable-languages=c,c++

Processor Details
- Scaling Governor: acpi-cpufreq performance (Boost: Enabled)
- CPU Microcode: 0x830107a

Graphics Details
- BAR1 / Visible vRAM Size: 65536 MiB
- vBIOS Version: 92.00.19.00.13

Python Details
- GPU-run-30-05-2024: Python 2.7.5 + Python 3.6.8
- pts-config-gpu-30-05-2024: Python 2.7.5 + Python 3.9.7

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Vulnerable
- spec_store_bypass: Vulnerable
- spectre_v1: Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers
- spectre_v2: Vulnerable IBPB: disabled STIBP: disabled PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected

Environment Details
- pts-config-gpu-30-05-2024: EXTRA_NVCCFLAGS=-cudart=shared

Result overview (pts-config-gpu-30-05-2024)

ncnn: Vulkan GPU - mobilenet: 39.73
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2: 18.97
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3: 18.30
ncnn: Vulkan GPU - shufflenet-v2: 21.59
ncnn: Vulkan GPU - mnasnet: 15.40
ncnn: Vulkan GPU - efficientnet-b0: 23.16
ncnn: Vulkan GPU - blazeface: 8.97
ncnn: Vulkan GPU - googlenet: 38.14
ncnn: Vulkan GPU - vgg16: 124.62
ncnn: Vulkan GPU - resnet18: 28.99
ncnn: Vulkan GPU - alexnet: 17.89
ncnn: Vulkan GPU - resnet50: 58.77
ncnn: Vulkan GPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3: 39.73
ncnn: Vulkan GPU - yolov4-tiny: 70.32
ncnn: Vulkan GPU - squeezenet_ssd: 37.74
ncnn: Vulkan GPU - regnety_400m: 41.64
ncnn: Vulkan GPU - vision_transformer: 156.26
ncnn: Vulkan GPU - FastestDet: 18.25
rodinia: OpenCL Particle Filter: 3.302
hashcat: MD5: 170625637500
hashcat: SHA1: 88126433333
hashcat: 7-Zip: 4385933
hashcat: SHA-512: 12848366667
hashcat: TrueCrypt RIPEMD160 + XTS: 3299133
mixbench: OpenCL - Integer: 14801.87
mixbench: NVIDIA CUDA - Integer: 12812.49
mixbench: OpenCL - Double Precision: 7699.84
mixbench: OpenCL - Single Precision: 15389.52
mixbench: NVIDIA CUDA - Half Precision: 44809.08
mixbench: NVIDIA CUDA - Double Precision: 7814.50
mixbench: NVIDIA CUDA - Single Precision: 15201.95
financebench: Black-Scholes OpenCL: 1.191
cl-mem: Copy: 231.7
cl-mem: Read: 780.7
cl-mem: Write: 1242.6
clpeak: Integer Compute INT: 16043.49
clpeak: Single-Precision Float: 17926.38
clpeak: Double-Precision Double: 7979.31
clpeak: Global Memory Bandwidth: 1300.10
viennacl: CPU BLAS - sCOPY: 470
viennacl: CPU BLAS - sAXPY: 736
viennacl: CPU BLAS - sDOT: 418
viennacl: CPU BLAS - dCOPY: 742
viennacl: CPU BLAS - dAXPY: 948
viennacl: CPU BLAS - dDOT: 627
viennacl: CPU BLAS - dGEMV-N: 157.2
viennacl: CPU BLAS - dGEMV-T: 450
viennacl: CPU BLAS - dGEMM-NN: 74.6
viennacl: CPU BLAS - dGEMM-NT: 68.8
viennacl: CPU BLAS - dGEMM-TN: 85.1
viennacl: CPU BLAS - dGEMM-TT: 76.3
viennacl: OpenCL BLAS

NCNN

Target: Vulkan GPU - Model: mobilenet

NCNN 20230517 - ms, Fewer Is Better
pts-config-gpu-30-05-2024: 39.73 (SE +/- 1.15, N = 9; MIN: 29.4 / MAX: 277.23)
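Each result in this export is reported as a mean with "SE +/-" and "N". As a minimal sketch, assuming SE denotes the standard error of the mean over N runs (SE = s / sqrt(N)), the run-to-run spread can be put in context as follows. The function names are hypothetical helpers, not part of the Phoronix Test Suite.

```python
import math


def relative_standard_error(mean, se):
    """Return the standard error as a percentage of the mean."""
    return 100.0 * se / mean


def sample_std_dev(se, n):
    """Recover the sample standard deviation, assuming SE = s / sqrt(n)."""
    return se * math.sqrt(n)


# Values taken from the NCNN mobilenet result above.
mean_ms, se_ms, runs = 39.73, 1.15, 9
print(f"relative SE: {relative_standard_error(mean_ms, se_ms):.1f}%")
print(f"implied std dev: {sample_std_dev(se_ms, runs):.2f} ms")
```

For the mobilenet run this works out to a relative SE of about 2.9% and an implied standard deviation of about 3.45 ms. The same helpers apply to any other result in this file.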

NCNN

Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2

NCNN 20230517 - ms, Fewer Is Better
pts-config-gpu-30-05-2024: 18.97 (SE +/- 3.03, N = 9; MIN: 10.12 / MAX: 542.78)

NCNN

Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3

NCNN 20230517 - ms, Fewer Is Better
pts-config-gpu-30-05-2024: 18.30 (SE +/- 1.67, N = 9; MIN: 9.64 / MAX: 63.36)

NCNN

Target: Vulkan GPU - Model: shufflenet-v2

NCNN 20230517 - ms, Fewer Is Better
pts-config-gpu-30-05-2024: 21.59 (SE +/- 1.17, N = 9; MIN: 12.52 / MAX: 248.51)

NCNN

Target: Vulkan GPU - Model: mnasnet

NCNN 20230517 - ms, Fewer Is Better
pts-config-gpu-30-05-2024: 15.40 (SE +/- 1.11, N = 9; MIN: 8.91 / MAX: 466.02)

NCNN

Target: Vulkan GPU - Model: efficientnet-b0

NCNN 20230517 - ms, Fewer Is Better
pts-config-gpu-30-05-2024: 23.16 (SE +/- 2.28, N = 9; MIN: 12.29 / MAX: 221.56)

NCNN

Target: Vulkan GPU - Model: blazeface

NCNN 20230517 - ms, Fewer Is Better
pts-config-gpu-30-05-2024: 8.97 (SE +/- 0.98, N = 9; MIN: 4.17 / MAX: 172.12)

NCNN

Target: Vulkan GPU - Model: googlenet

NCNN 20230517 - ms, Fewer Is Better
pts-config-gpu-30-05-2024: 38.14 (SE +/- 1.64, N = 9; MIN: 27.74 / MAX: 523.7)

NCNN

Target: Vulkan GPU - Model: vgg16

NCNN 20230517 - ms, Fewer Is Better
pts-config-gpu-30-05-2024: 124.62 (SE +/- 5.02, N = 9; MIN: 66.23 / MAX: 682.25)

NCNN

Target: Vulkan GPU - Model: resnet18

NCNN 20230517 - ms, Fewer Is Better
pts-config-gpu-30-05-2024: 28.99 (SE +/- 1.18, N = 9; MIN: 19.8 / MAX: 221.98)

NCNN

Target: Vulkan GPU - Model: alexnet

NCNN 20230517 - ms, Fewer Is Better
pts-config-gpu-30-05-2024: 17.89 (SE +/- 1.37, N = 9; MIN: 13.81 / MAX: 182.2)

NCNN

Target: Vulkan GPU - Model: resnet50

NCNN 20230517 - ms, Fewer Is Better
pts-config-gpu-30-05-2024: 58.77 (SE +/- 1.60, N = 9; MIN: 44.97 / MAX: 337.56)

NCNN

Target: Vulkan GPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3

NCNN 20230517 - ms, Fewer Is Better
pts-config-gpu-30-05-2024: 39.73 (SE +/- 1.15, N = 9; MIN: 29.4 / MAX: 277.23)

NCNN

Target: Vulkan GPU - Model: yolov4-tiny

NCNN 20230517 - ms, Fewer Is Better
pts-config-gpu-30-05-2024: 70.32 (SE +/- 2.29, N = 9; MIN: 50.39 / MAX: 562.92)

NCNN

Target: Vulkan GPU - Model: squeezenet_ssd

NCNN 20230517 - ms, Fewer Is Better
pts-config-gpu-30-05-2024: 37.74 (SE +/- 1.52, N = 9; MIN: 27.11 / MAX: 623.11)

NCNN

Target: Vulkan GPU - Model: regnety_400m

NCNN 20230517 - ms, Fewer Is Better
pts-config-gpu-30-05-2024: 41.64 (SE +/- 2.44, N = 9; MIN: 28.21 / MAX: 229.16)

NCNN

Target: Vulkan GPU - Model: vision_transformer

NCNN 20230517 - ms, Fewer Is Better
pts-config-gpu-30-05-2024: 156.26 (SE +/- 3.66, N = 9; MIN: 137.55 / MAX: 893.71)

NCNN

Target: Vulkan GPU - Model: FastestDet

NCNN 20230517 - ms, Fewer Is Better
pts-config-gpu-30-05-2024: 18.25 (SE +/- 1.79, N = 9; MIN: 11.3 / MAX: 482.82)

Rodinia

Test: OpenCL Particle Filter

Rodinia 3.1 - Seconds, Fewer Is Better
pts-config-gpu-30-05-2024: 3.302 (SE +/- 0.027, N = 13)
1. (CXX) g++ options: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl

Hashcat

Benchmark: MD5

Hashcat 6.2.4 - H/s, More Is Better
pts-config-gpu-30-05-2024: 170625637500 (SE +/- 26429817415.14, N = 16)

Hashcat

Benchmark: SHA1

Hashcat 6.2.4 - H/s, More Is Better
pts-config-gpu-30-05-2024: 88126433333 (SE +/- 104285670.69, N = 3)

Hashcat

Benchmark: 7-Zip

Hashcat 6.2.4 - H/s, More Is Better
pts-config-gpu-30-05-2024: 4385933 (SE +/- 25606.34, N = 3)

Hashcat

Benchmark: SHA-512

Hashcat 6.2.4 - H/s, More Is Better
pts-config-gpu-30-05-2024: 12848366667 (SE +/- 3468589.21, N = 3)

Hashcat

Benchmark: TrueCrypt RIPEMD160 + XTS

Hashcat 6.2.4 - H/s, More Is Better
pts-config-gpu-30-05-2024: 3299133 (SE +/- 592.55, N = 3)

Mixbench

Backend: OpenCL - Benchmark: Integer

Mixbench 2020-06-23 - GIOPS, More Is Better
pts-config-gpu-30-05-2024: 14801.87 (SE +/- 4.35, N = 3)
1. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: NVIDIA CUDA - Benchmark: Integer

Mixbench 2020-06-23 - GIOPS, More Is Better
pts-config-gpu-30-05-2024: 12812.49 (SE +/- 5.64, N = 3)
1. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: OpenCL - Benchmark: Double Precision

Mixbench 2020-06-23 - GFLOPS, More Is Better
pts-config-gpu-30-05-2024: 7699.84 (SE +/- 113.22, N = 15)
1. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: OpenCL - Benchmark: Single Precision

Mixbench 2020-06-23 - GFLOPS, More Is Better
pts-config-gpu-30-05-2024: 15389.52 (SE +/- 262.04, N = 15)
1. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: NVIDIA CUDA - Benchmark: Half Precision

Mixbench 2020-06-23 - GFLOPS, More Is Better
pts-config-gpu-30-05-2024: 44809.08 (SE +/- 727.84, N = 15)
1. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: NVIDIA CUDA - Benchmark: Double Precision

Mixbench 2020-06-23 - GFLOPS, More Is Better
pts-config-gpu-30-05-2024: 7814.50 (SE +/- 160.94, N = 15)
1. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: NVIDIA CUDA - Benchmark: Single Precision

Mixbench 2020-06-23 - GFLOPS, More Is Better
pts-config-gpu-30-05-2024: 15201.95 (SE +/- 290.53, N = 12)
1. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2
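Three of the Mixbench workloads above (Integer, Double Precision, Single Precision) were run on both the OpenCL and NVIDIA CUDA backends. The sketch below shows one way to express the CUDA figure as a percentage difference from the OpenCL figure. The dictionary layout and the function name are illustrative assumptions; the numbers are copied from the results above.

```python
# Mixbench results copied from the sections above (pts-config-gpu-30-05-2024).
results = {
    "Integer (GIOPS)": {"OpenCL": 14801.87, "CUDA": 12812.49},
    "Double Precision (GFLOPS)": {"OpenCL": 7699.84, "CUDA": 7814.50},
    "Single Precision (GFLOPS)": {"OpenCL": 15389.52, "CUDA": 15201.95},
}


def cuda_vs_opencl(results):
    """Return the CUDA result as a percentage difference from the OpenCL result."""
    return {
        name: 100.0 * (r["CUDA"] - r["OpenCL"]) / r["OpenCL"]
        for name, r in results.items()
    }


for name, pct in cuda_vs_opencl(results).items():
    print(f"{name}: CUDA is {pct:+.1f}% relative to OpenCL")
```

With these inputs the differences come to roughly -13.4% (Integer), +1.5% (Double Precision), and -1.2% (Single Precision). The floating-point differences are of the same order as the reported standard errors, so they should be read with that in mind.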

FinanceBench

Benchmark: Black-Scholes OpenCL

FinanceBench 2016-07-25 - ms, Fewer Is Better
pts-config-gpu-30-05-2024: 1.191 (SE +/- 0.005, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp

cl-mem

Benchmark: Copy

cl-mem 2017-01-13 - GB/s, More Is Better
pts-config-gpu-30-05-2024: 231.7 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Read

cl-mem 2017-01-13 - GB/s, More Is Better
pts-config-gpu-30-05-2024: 780.7 (SE +/- 0.06, N = 3)
1. (CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Write

cl-mem 2017-01-13 - GB/s, More Is Better
pts-config-gpu-30-05-2024: 1242.6 (SE +/- 0.45, N = 3)
1. (CC) gcc options: -O2 -flto -lOpenCL

clpeak

OpenCL Test: Integer Compute INT

clpeak 1.1.2 - GIOPS, More Is Better
pts-config-gpu-30-05-2024: 16043.49 (SE +/- 199.00, N = 15)

clpeak

OpenCL Test: Single-Precision Float

clpeak 1.1.2 - GFLOPS, More Is Better
pts-config-gpu-30-05-2024: 17926.38 (SE +/- 165.14, N = 3)

clpeak

OpenCL Test: Double-Precision Double

clpeak 1.1.2 - GFLOPS, More Is Better
pts-config-gpu-30-05-2024: 7979.31 (SE +/- 119.81, N = 12)

clpeak

OpenCL Test: Global Memory Bandwidth

clpeak 1.1.2 - GBPS, More Is Better
pts-config-gpu-30-05-2024: 1300.10 (SE +/- 3.13, N = 3)

ViennaCL

Test: CPU BLAS - sCOPY

ViennaCL 1.7.1 - GB/s, More Is Better
pts-config-gpu-30-05-2024: 470 (SE +/- 17.64, N = 15)

ViennaCL

Test: CPU BLAS - sAXPY

ViennaCL 1.7.1 - GB/s, More Is Better
pts-config-gpu-30-05-2024: 736 (SE +/- 25.06, N = 15)

ViennaCL

Test: CPU BLAS - sDOT

ViennaCL 1.7.1 - GB/s, More Is Better
pts-config-gpu-30-05-2024: 418 (SE +/- 1.84, N = 15)

ViennaCL

Test: CPU BLAS - dCOPY

ViennaCL 1.7.1 - GB/s, More Is Better
pts-config-gpu-30-05-2024: 742 (SE +/- 36.07, N = 15)

ViennaCL

Test: CPU BLAS - dAXPY

ViennaCL 1.7.1 - GB/s, More Is Better
pts-config-gpu-30-05-2024: 948 (SE +/- 42.05, N = 15)

ViennaCL

Test: CPU BLAS - dDOT

ViennaCL 1.7.1 - GB/s, More Is Better
pts-config-gpu-30-05-2024: 627 (SE +/- 19.94, N = 15)

ViennaCL

Test: CPU BLAS - dGEMV-N

ViennaCL 1.7.1 - GB/s, More Is Better
pts-config-gpu-30-05-2024: 157.2 (SE +/- 6.04, N = 15)

ViennaCL

Test: CPU BLAS - dGEMV-T

ViennaCL 1.7.1 - GB/s, More Is Better
pts-config-gpu-30-05-2024: 450 (SE +/- 5.98, N = 15)

ViennaCL

Test: CPU BLAS - dGEMM-NN

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
pts-config-gpu-30-05-2024: 74.6 (SE +/- 0.41, N = 15)

ViennaCL

Test: CPU BLAS - dGEMM-NT

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
pts-config-gpu-30-05-2024: 68.8 (SE +/- 0.48, N = 15)

ViennaCL

Test: CPU BLAS - dGEMM-TN

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
pts-config-gpu-30-05-2024: 85.1 (SE +/- 0.59, N = 15)

ViennaCL

Test: CPU BLAS - dGEMM-TT

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
pts-config-gpu-30-05-2024: 76.3 (SE +/- 0.79, N = 15)


Phoronix Test Suite v10.8.5