workingnow?

Intel Core i9-12900KF testing with a Gigabyte Z690 UD DDR4 (F7 BIOS) and MSI NVIDIA GeForce RTX 4090 24GB on Ubuntu 22.04 via the Phoronix Test Suite.

workingnow:

  Processor: Intel Core i9-12900KF @ 5.10GHz (16 Cores / 24 Threads), Motherboard: Gigabyte Z690 UD DDR4 (F7 BIOS), Chipset: Intel Device 7aa7, Memory: 32GB, Disk: 2000GB KINGSTON SNVS2000G, Graphics: MSI NVIDIA GeForce RTX 4090 24GB, Audio: Realtek ALC897, Monitor: DELL P2419H, Network: Realtek RTL8125 2.5GbE + Intel Wi-Fi 6 AX200

  OS: Ubuntu 22.04, Kernel: 5.15.0-71-generic (x86_64), Desktop: LXQt 0.17.0, Display Server: X Server 1.21.1.3, Display Driver: NVIDIA 530.30.02, OpenGL: 4.6.0, OpenCL: OpenCL 3.0 CUDA 12.1.68, Vulkan: 1.3.236, Compiler: GCC 11.3.0 + CUDA 12.1, File-System: ext4, Screen Resolution: 1920x1080

vkpeak 20210424
fp32-scalar
GFLOPS > Higher Is Better
workingnow . 45893.56

vkpeak 20210424
fp32-vec4
GFLOPS > Higher Is Better
workingnow . 60521.26

vkpeak 20210424
fp16-scalar
GFLOPS > Higher Is Better
workingnow . 45678.72

vkpeak 20210424
fp16-vec4
GFLOPS > Higher Is Better
workingnow . 90594.96

vkpeak 20210424
fp64-scalar
GFLOPS > Higher Is Better
workingnow . 1443.97

vkpeak 20210424
fp64-vec4
GFLOPS > Higher Is Better
workingnow . 1446.25

vkpeak 20210424
int32-scalar
GIOPS > Higher Is Better
workingnow . 45787.93

vkpeak 20210424
int32-vec4
GIOPS > Higher Is Better
workingnow . 45552.24

vkpeak 20210424
int16-scalar
GIOPS > Higher Is Better
workingnow . 30436.13

vkpeak 20210424
int16-vec4
GIOPS > Higher Is Better
workingnow . 40527.84

RealSR-NCNN 20200818
Scale: 4x - TAA: No
Seconds < Lower Is Better
workingnow . 4.685

RealSR-NCNN 20200818
Scale: 4x - TAA: Yes
Seconds < Lower Is Better
workingnow . 19.79

Waifu2x-NCNN Vulkan 20200818
Scale: 2x - Denoise: 3 - TAA: No
Seconds < Lower Is Better

Waifu2x-NCNN Vulkan 20200818
Scale: 2x - Denoise: 3 - TAA: Yes
Seconds < Lower Is Better
workingnow . 2.094

VkFFT 1.1.1
Benchmark Score > Higher Is Better
workingnow . 133483

Hashcat 6.2.4
Benchmark: MD5
H/s > Higher Is Better
workingnow . 155800000000

Hashcat 6.2.4
Benchmark: SHA1
H/s > Higher Is Better
workingnow . 50683000000

Hashcat 6.2.4
Benchmark: 7-Zip
H/s > Higher Is Better
workingnow . 2651800

Hashcat 6.2.4
Benchmark: SHA-512
H/s > Higher Is Better
workingnow . 7424433333

Hashcat 6.2.4
Benchmark: TrueCrypt RIPEMD160 + XTS
H/s > Higher Is Better
workingnow . 1906267

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Integer
GIOPS > Higher Is Better
workingnow . 40702.14

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Integer
GIOPS > Higher Is Better
workingnow . 35349.15

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Double Precision
GFLOPS > Higher Is Better
workingnow . 1098.71

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Single Precision
GFLOPS > Higher Is Better
workingnow . 77320.86

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Half Precision
GFLOPS > Higher Is Better
workingnow . 80736.45

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Double Precision
GFLOPS > Higher Is Better
workingnow . 1098.84

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Single Precision
GFLOPS > Higher Is Better
workingnow . 75020.16

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: S3D
GFLOPS > Higher Is Better
workingnow . 647.52

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Triad
GB/s > Higher Is Better
workingnow . 21.67

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: FFT SP
GFLOPS > Higher Is Better
workingnow . 2789.70

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: MD5 Hash
GHash/s > Higher Is Better
workingnow . 94.22

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Reduction
GB/s > Higher Is Better
workingnow . 991.18

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: GEMM SGEMM_N
GFLOPS > Higher Is Better
workingnow . 26966.3

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Max SP Flops
GFLOPS > Higher Is Better
workingnow . 88834.0

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Download
GB/s > Higher Is Better
workingnow . 24.74

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Readback
GB/s > Higher Is Better
workingnow . 26.35

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Texture Read Bandwidth
GB/s > Higher Is Better
workingnow . 3084.43

Libplacebo 5.229.1
FPS > Higher Is Better

cl-mem 2017-01-13
Benchmark: Copy
GB/s > Higher Is Better
workingnow . 414.4

cl-mem 2017-01-13
Benchmark: Read
GB/s > Higher Is Better
workingnow . 888.9

cl-mem 2017-01-13
Benchmark: Write
GB/s > Higher Is Better
workingnow . 806.8

NAMD CUDA 2.14
ATPase Simulation - 327,506 Atoms
days/ns < Lower Is Better
workingnow . 0.13372

Betsy GPU Compressor 1.1 Beta
Codec: ETC1 - Quality: Highest
Seconds < Lower Is Better

Betsy GPU Compressor 1.1 Beta
Codec: ETC2 RGB - Quality: Highest
Seconds < Lower Is Better

VkResample 1.0
Upscale: 2x - Precision: Double
ms < Lower Is Better
workingnow . 54.18

VkResample 1.0
Upscale: 2x - Precision: Single
ms < Lower Is Better
workingnow . 7.747

OctaneBench 2020.1
Total Score
Score > Higher Is Better
workingnow . 1312.60

RedShift Demo 3.0
Seconds < Lower Is Better

FAHBench 2.3.2
Ns Per Day > Higher Is Better
workingnow . 437.38

clpeak 1.1.2
OpenCL Test: Integer Compute INT
GIOPS > Higher Is Better
workingnow . 41343.87

clpeak 1.1.2
OpenCL Test: Single-Precision Float
GFLOPS > Higher Is Better
workingnow . 80554.10

clpeak 1.1.2
OpenCL Test: Double-Precision Double
GFLOPS > Higher Is Better
workingnow . 1434.39

clpeak 1.1.2
OpenCL Test: Global Memory Bandwidth
GBPS > Higher Is Better
workingnow . 873.43

LeelaChessZero 0.28
Backend: OpenCL
Nodes Per Second > Higher Is Better
workingnow . 21422

Rodinia 3.1
Test: OpenCL Particle Filter
Seconds < Lower Is Better

ArrayFire 3.7
Test: Conjugate Gradient OpenCL
ms < Lower Is Better
workingnow . 0.8472

LuxCoreRender 2.6
Scene: DLSC - Acceleration: GPU
M samples/sec > Higher Is Better
workingnow . 25.86

LuxCoreRender 2.6
Scene: Danish Mood - Acceleration: GPU
M samples/sec > Higher Is Better
workingnow . 18.38

LuxCoreRender 2.6
Scene: Orange Juice - Acceleration: GPU
M samples/sec > Higher Is Better
workingnow . 20.04

LuxCoreRender 2.6
Scene: LuxCore Benchmark - Acceleration: GPU
M samples/sec > Higher Is Better
workingnow . 19.75

LuxCoreRender 2.6
Scene: Rainbow Colors and Prism - Acceleration: GPU
M samples/sec > Higher Is Better
workingnow . 44.73

FinanceBench 2016-07-25
Benchmark: Black-Scholes OpenCL
ms < Lower Is Better
workingnow . 2.815

ViennaCL 1.7.1
Test: CPU BLAS - sCOPY
GB/s > Higher Is Better
workingnow . 41.6

ViennaCL 1.7.1
Test: CPU BLAS - sAXPY
GB/s > Higher Is Better
workingnow . 46.0

ViennaCL 1.7.1
Test: CPU BLAS - sDOT
GB/s > Higher Is Better
workingnow . 51.1

ViennaCL 1.7.1
Test: CPU BLAS - dCOPY
GB/s > Higher Is Better
workingnow . 28.0

ViennaCL 1.7.1
Test: CPU BLAS - dAXPY
GB/s > Higher Is Better
workingnow . 33.6

ViennaCL 1.7.1
Test: CPU BLAS - dDOT
GB/s > Higher Is Better
workingnow . 33.7

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-N
GB/s > Higher Is Better
workingnow . 35.0

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-T
GB/s > Higher Is Better
workingnow . 39.8

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NN
GFLOPs/s > Higher Is Better
workingnow . 79.5

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NT
GFLOPs/s > Higher Is Better
workingnow . 84.3

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
workingnow . 92.8

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TT
GFLOPs/s > Higher Is Better
workingnow . 95.8

ViennaCL 1.7.1
Test: OpenCL BLAS - sCOPY
GB/s > Higher Is Better
workingnow . 487

ViennaCL 1.7.1
Test: OpenCL BLAS - sAXPY
GB/s > Higher Is Better
workingnow . 600

ViennaCL 1.7.1
Test: OpenCL BLAS - sDOT
GB/s > Higher Is Better
workingnow . 460

ViennaCL 1.7.1
Test: OpenCL BLAS - dCOPY
GB/s > Higher Is Better
workingnow . 667

ViennaCL 1.7.1
Test: OpenCL BLAS - dAXPY
GB/s > Higher Is Better
workingnow . 780

ViennaCL 1.7.1
Test: OpenCL BLAS - dDOT
GB/s > Higher Is Better
workingnow . 732

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-N
GB/s > Higher Is Better
workingnow . 224

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-T
GB/s > Higher Is Better
workingnow . 450

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-NN
GFLOPs/s > Higher Is Better
workingnow . 1190

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-NT
GFLOPs/s > Higher Is Better
workingnow . 1320

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
workingnow . 1337

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-TT
GFLOPs/s > Higher Is Better
workingnow . 1380

GROMACS 2023
Implementation: NVIDIA CUDA GPU - Input: water_GMX50_bare
Ns Per Day > Higher Is Better
workingnow . 41.71

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 100
Milli-Seconds < Lower Is Better
workingnow . 443.70

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 200
Milli-Seconds < Lower Is Better
workingnow . 870.77

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 1000
Milli-Seconds < Lower Is Better
workingnow . 4291.08

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 100
Milli-Seconds < Lower Is Better
workingnow . 1661.77

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 200
Milli-Seconds < Lower Is Better
workingnow . 3303.73

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 1000
Milli-Seconds < Lower Is Better
workingnow . 16459.9

NCNN 20220729
Target: Vulkan GPU - Model: mobilenet
ms < Lower Is Better
workingnow . 2.99

NCNN 20220729
Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2
ms < Lower Is Better
workingnow . 0.95

NCNN 20220729
Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3
ms < Lower Is Better
workingnow . 1.23

NCNN 20220729
Target: Vulkan GPU - Model: shufflenet-v2
ms < Lower Is Better
workingnow . 1.20

NCNN 20220729
Target: Vulkan GPU - Model: mnasnet
ms < Lower Is Better
workingnow . 1.13

NCNN 20220729
Target: Vulkan GPU - Model: efficientnet-b0
ms < Lower Is Better
workingnow . 2.11

NCNN 20220729
Target: Vulkan GPU - Model: blazeface
ms < Lower Is Better
workingnow . 0.79

NCNN 20220729
Target: Vulkan GPU - Model: googlenet
ms < Lower Is Better
workingnow . 1.85

NCNN 20220729
Target: Vulkan GPU - Model: vgg16
ms < Lower Is Better
workingnow . 1.57

NCNN 20220729
Target: Vulkan GPU - Model: alexnet
ms < Lower Is Better
workingnow . 1.03

NCNN 20220729
Target: Vulkan GPU - Model: resnet50
ms < Lower Is Better
workingnow . 1.54

NCNN 20220729
Target: Vulkan GPU - Model: yolov4-tiny
ms < Lower Is Better
workingnow . 5.83

NCNN 20220729
Target: Vulkan GPU - Model: squeezenet_ssd
ms < Lower Is Better
workingnow . 2.32

NCNN 20220729
Target: Vulkan GPU - Model: regnety_400m
ms < Lower Is Better
workingnow . 1.44

NCNN 20220729
Target: Vulkan GPU - Model: vision_transformer
ms < Lower Is Better
workingnow . 208.30

NCNN 20220729
Target: Vulkan GPU - Model: FastestDet
ms < Lower Is Better
workingnow . 2.18

NCNN 20220729
Target: Vulkan GPU - Model: resnet18
ms < Lower Is Better
workingnow . 0.93

PlaidML
FP16: No - Mode: Training - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL
Examples Per Second > Higher Is Better

Blender 3.5
Blend File: BMW27 - Compute: NVIDIA OptiX
Seconds < Lower Is Better
workingnow . 12.18

Blender 3.5
Blend File: Classroom - Compute: NVIDIA OptiX
Seconds < Lower Is Better
workingnow . 7.14

Blender 3.5
Blend File: Fishy Cat - Compute: NVIDIA OptiX
Seconds < Lower Is Better
workingnow . 5.32

Blender 3.5
Blend File: Barbershop - Compute: NVIDIA OptiX
Seconds < Lower Is Better
workingnow . 30.06

Blender 3.5
Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX
Seconds < Lower Is Better
workingnow . 8.09

IndigoBench 4.4
Acceleration: OpenCL GPU - Scene: Bedroom
M samples/s > Higher Is Better
workingnow . 35.76

IndigoBench 4.4
Acceleration: OpenCL GPU - Scene: Supercar
M samples/s > Higher Is Better
workingnow . 80.00

MandelGPU 1.3pts1
OpenCL Device: GPU
Samples/sec > Higher Is Better
workingnow . 1035158945.7

NeatBench 5
Acceleration: GPU
FPS > Higher Is Better
workingnow . 4090