pts-result-gpu-30-05-2024.list

2 x AMD EPYC 7452 32-Core testing with a Supermicro AS-2124GQ-NART H12DSG-Q-CPU6 v1.01 (1.0a BIOS) and NVIDIA A100-SXM4-40GB on CentOS Linux 7 via the Phoronix Test Suite.

GPU-run-30-05-2024:

  Processor: 2 x AMD EPYC 7452 32-Core @ 2.35GHz (64 Cores), Motherboard: Supermicro AS-2124GQ-NART H12DSG-Q-CPU6 v1.01 (1.0a BIOS), Chipset: AMD Starship/Matisse, Memory: 16 x 32 GB DDR4-3200MT/s Samsung M393A4K40DB3-CWE, Disk: 252GB, Graphics: NVIDIA A100-SXM4-40GB, Network: 2 x Intel 10-Gigabit X540-AT2

  OS: CentOS Linux 7, Kernel: 5.4.265-1.el7.elrepo.x86_64 (x86_64), Display Server: X Server, Display Driver: NVIDIA, Vulkan: 1.3.260, Compiler: GCC 4.8.5 20150623 + CUDA 12.3, File-System: tmpfs, Screen Resolution: 1024x768

pts-config-gpu-30-05-2024:

  Processor: 2 x AMD EPYC 7452 32-Core @ 2.35GHz (64 Cores), Motherboard: Supermicro AS-2124GQ-NART H12DSG-Q-CPU6 v1.01 (1.0a BIOS), Chipset: AMD Starship/Matisse, Memory: 16 x 32 GB DDR4-3200MT/s Samsung M393A4K40DB3-CWE, Disk: 252GB, Graphics: NVIDIA A100-SXM4-40GB, Network: 2 x Intel 10-Gigabit X540-AT2

  OS: CentOS Linux 7, Kernel: 5.4.265-1.el7.elrepo.x86_64 (x86_64), Display Server: X Server, Display Driver: NVIDIA, Vulkan: 1.3.260, Compiler: GCC 8.3.0 + CUDA 12.3, File-System: tmpfs, Screen Resolution: 1024x768

ArrayFire 3.9
Test: Conjugate Gradient OpenCL

Blender 4.1
Blend File: BMW27 - Compute: NVIDIA OptiX
Seconds < Lower Is Better

Blender 4.1
Blend File: Junkshop - Compute: NVIDIA OptiX
Seconds < Lower Is Better

Blender 4.1
Blend File: Classroom - Compute: NVIDIA OptiX
Seconds < Lower Is Better

Blender 4.1
Blend File: Fishy Cat - Compute: NVIDIA OptiX
Seconds < Lower Is Better

Blender 4.1
Blend File: Barbershop - Compute: NVIDIA OptiX
Seconds < Lower Is Better

Blender 4.1
Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX
Seconds < Lower Is Better

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 100
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 200
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 1000
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 100
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 200
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 1000
Milli-Seconds < Lower Is Better

cl-mem 2017-01-13
Benchmark: Copy
GB/s > Higher Is Better
pts-config-gpu-30-05-2024 . 231.7 |============================================

cl-mem 2017-01-13
Benchmark: Read
GB/s > Higher Is Better
pts-config-gpu-30-05-2024 . 780.7 |============================================

cl-mem 2017-01-13
Benchmark: Write
GB/s > Higher Is Better
pts-config-gpu-30-05-2024 . 1242.6 |===========================================

clpeak 1.1.2
OpenCL Test: Integer Compute INT
GIOPS > Higher Is Better
pts-config-gpu-30-05-2024 . 16043.49 |=========================================

clpeak 1.1.2
OpenCL Test: Single-Precision Float
GFLOPS > Higher Is Better
pts-config-gpu-30-05-2024 . 17926.38 |=========================================

clpeak 1.1.2
OpenCL Test: Double-Precision Double
GFLOPS > Higher Is Better
pts-config-gpu-30-05-2024 . 7979.31 |==========================================

clpeak 1.1.2
OpenCL Test: Global Memory Bandwidth
GBPS > Higher Is Better
pts-config-gpu-30-05-2024 . 1300.10 |==========================================

FAHBench 2.3.2
Ns Per Day > Higher Is Better

FinanceBench 2016-07-25
Benchmark: Black-Scholes OpenCL
ms < Lower Is Better
pts-config-gpu-30-05-2024 . 1.191 |============================================

GROMACS 2024
Implementation: NVIDIA CUDA GPU - Input: water_GMX50_bare
Ns Per Day > Higher Is Better

Hashcat 6.2.4
Benchmark: MD5
H/s > Higher Is Better
pts-config-gpu-30-05-2024 . 170625637500 |=====================================

Hashcat 6.2.4
Benchmark: SHA1
H/s > Higher Is Better
pts-config-gpu-30-05-2024 . 88126433333 |======================================

Hashcat 6.2.4
Benchmark: 7-Zip
H/s > Higher Is Better
pts-config-gpu-30-05-2024 . 4385933 |==========================================

Hashcat 6.2.4
Benchmark: SHA-512
H/s > Higher Is Better
pts-config-gpu-30-05-2024 . 12848366667 |======================================

Hashcat 6.2.4
Benchmark: TrueCrypt RIPEMD160 + XTS
H/s > Higher Is Better
pts-config-gpu-30-05-2024 . 3299133 |==========================================

LeelaChessZero 0.30
Backend: OpenCL
Nodes Per Second > Higher Is Better

LuxCoreRender 2.6
Scene: DLSC - Acceleration: GPU
M samples/sec > Higher Is Better

LuxCoreRender 2.6
Scene: Danish Mood - Acceleration: GPU
M samples/sec > Higher Is Better

LuxCoreRender 2.6
Scene: Orange Juice - Acceleration: GPU
M samples/sec > Higher Is Better

LuxCoreRender 2.6
Scene: LuxCore Benchmark - Acceleration: GPU
M samples/sec > Higher Is Better

LuxCoreRender 2.6
Scene: Rainbow Colors and Prism - Acceleration: GPU
M samples/sec > Higher Is Better

MandelGPU 1.3pts1
OpenCL Device: GPU
Samples/sec > Higher Is Better

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Integer
GIOPS > Higher Is Better
pts-config-gpu-30-05-2024 . 14801.87 |=========================================

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Integer
GIOPS > Higher Is Better
pts-config-gpu-30-05-2024 . 12812.49 |=========================================

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Double Precision
GFLOPS > Higher Is Better
pts-config-gpu-30-05-2024 . 7699.84 |==========================================

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Single Precision
GFLOPS > Higher Is Better
pts-config-gpu-30-05-2024 . 15389.52 |=========================================

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Half Precision
GFLOPS > Higher Is Better
pts-config-gpu-30-05-2024 . 44809.08 |=========================================

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Double Precision
GFLOPS > Higher Is Better
pts-config-gpu-30-05-2024 . 7814.50 |==========================================

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Single Precision
GFLOPS > Higher Is Better
pts-config-gpu-30-05-2024 . 15201.95 |=========================================

NCNN 20230517
Target: Vulkan GPU - Model: mobilenet
ms < Lower Is Better
pts-config-gpu-30-05-2024 . 39.73 |============================================

NCNN 20230517
Target: Vulkan GPU - Model: mobilenet-v2
ms < Lower Is Better
pts-config-gpu-30-05-2024 . 18.97 |============================================

NCNN 20230517
Target: Vulkan GPU - Model: mobilenet-v3
ms < Lower Is Better
pts-config-gpu-30-05-2024 . 18.30 |============================================

NCNN 20230517
Target: Vulkan GPU - Model: shufflenet-v2
ms < Lower Is Better
pts-config-gpu-30-05-2024 . 21.59 |============================================

NCNN 20230517
Target: Vulkan GPU - Model: mnasnet
ms < Lower Is Better
pts-config-gpu-30-05-2024 . 15.40 |============================================

NCNN 20230517
Target: Vulkan GPU - Model: efficientnet-b0
ms < Lower Is Better
pts-config-gpu-30-05-2024 . 23.16 |============================================

NCNN 20230517
Target: Vulkan GPU - Model: blazeface
ms < Lower Is Better
pts-config-gpu-30-05-2024 . 8.97 |=============================================

NCNN 20230517
Target: Vulkan GPU - Model: googlenet
ms < Lower Is Better
pts-config-gpu-30-05-2024 . 38.14 |============================================

NCNN 20230517
Target: Vulkan GPU - Model: vgg16
ms < Lower Is Better
pts-config-gpu-30-05-2024 . 124.62 |===========================================

NCNN 20230517
Target: Vulkan GPU - Model: resnet18
ms < Lower Is Better
pts-config-gpu-30-05-2024 . 28.99 |============================================

NCNN 20230517
Target: Vulkan GPU - Model: alexnet
ms < Lower Is Better
pts-config-gpu-30-05-2024 . 17.89 |============================================

NCNN 20230517
Target: Vulkan GPU - Model: resnet50
ms < Lower Is Better
pts-config-gpu-30-05-2024 . 58.77 |============================================

NCNN 20230517
Target: Vulkan GPU - Model: mobilenetv2-yolov3
ms < Lower Is Better
pts-config-gpu-30-05-2024 . 39.73 |============================================

NCNN 20230517
Target: Vulkan GPU - Model: yolov4-tiny
ms < Lower Is Better
pts-config-gpu-30-05-2024 . 70.32 |============================================

NCNN 20230517
Target: Vulkan GPU - Model: squeezenet_ssd
ms < Lower Is Better
pts-config-gpu-30-05-2024 . 37.74 |============================================

NCNN 20230517
Target: Vulkan GPU - Model: regnety_400m
ms < Lower Is Better
pts-config-gpu-30-05-2024 . 41.64 |============================================

NCNN 20230517
Target: Vulkan GPU - Model: vision_transformer
ms < Lower Is Better
pts-config-gpu-30-05-2024 . 156.26 |===========================================

NCNN 20230517
Target: Vulkan GPU - Model: FastestDet
ms < Lower Is Better
pts-config-gpu-30-05-2024 . 18.25 |============================================

NeatBench 5
Acceleration: GPU
FPS > Higher Is Better

PlaidML
FP16: No - Mode: Training - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL
Examples Per Second > Higher Is Better

RedShift Demo 3.0
Seconds < Lower Is Better

Rodinia 3.1
Test: OpenCL Particle Filter
Seconds < Lower Is Better
pts-config-gpu-30-05-2024 . 3.302 |============================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: S3D

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Reduction

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: GEMM SGEMM_N

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Max SP Flops

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Download

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Readback

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Texture Read Bandwidth

ViennaCL 1.7.1
Test: CPU BLAS - sCOPY
GB/s > Higher Is Better
pts-config-gpu-30-05-2024 . 470 |==============================================

ViennaCL 1.7.1
Test: CPU BLAS - sAXPY
GB/s > Higher Is Better
pts-config-gpu-30-05-2024 . 736 |==============================================

ViennaCL 1.7.1
Test: CPU BLAS - sDOT
GB/s > Higher Is Better
pts-config-gpu-30-05-2024 . 418 |==============================================

ViennaCL 1.7.1
Test: CPU BLAS - dCOPY
GB/s > Higher Is Better
pts-config-gpu-30-05-2024 . 742 |==============================================

ViennaCL 1.7.1
Test: CPU BLAS - dAXPY
GB/s > Higher Is Better
pts-config-gpu-30-05-2024 . 948 |==============================================

ViennaCL 1.7.1
Test: CPU BLAS - dDOT
GB/s > Higher Is Better
pts-config-gpu-30-05-2024 . 627 |==============================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-N
GB/s > Higher Is Better
pts-config-gpu-30-05-2024 . 157.2 |============================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-T
GB/s > Higher Is Better
pts-config-gpu-30-05-2024 . 450 |==============================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NN
GFLOPs/s > Higher Is Better
pts-config-gpu-30-05-2024 . 74.6 |=============================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NT
GFLOPs/s > Higher Is Better
pts-config-gpu-30-05-2024 . 68.8 |=============================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
pts-config-gpu-30-05-2024 . 85.1 |=============================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TT
GFLOPs/s > Higher Is Better
pts-config-gpu-30-05-2024 . 76.3 |=============================================

ViennaCL 1.7.1
Test: OpenCL BLAS
GFLOPS > Higher Is Better