Ubuntu 22.04.02 LTS 7900X 7900XTX opencl
AMD Ryzen 9 7900X 12-Core testing with an ASUS ROG STRIX B650E-F GAMING WIFI (1410 BIOS) and ASUS NVIDIA GeForce RTX 4080 16GB on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2305210-NE-2305163NE65&grr&rdt .
Ubuntu 22.04.02 LTS 7900X 7900XTX opencl
  Processor: AMD Ryzen 9 7900X 12-Core @ 4.70GHz (12 Cores / 24 Threads)
  Motherboard: ASUS ROG STRIX B650E-F GAMING WIFI (1410 BIOS)
  Chipset: AMD Device 14d8
  Memory: 64GB
  Disk: 2000GB SHPP41-2000GM + 120GB TOSHIBA RC100 + 1000GB Western Digital WD_BLACK SN750 SE NVMe 1TB + 32GB Flash Drive
  Graphics: AMD Radeon RX 7900 XTX 24GB (3220/1249MHz)
  Audio: AMD Device ab30
  Monitor: LG HDR 4K + LG Ultra HD
  Network: Intel I225-V + MEDIATEK Device 0608
  OS: Ubuntu 22.04
  Kernel: 5.19.0-41-generic (x86_64)
  Desktop: Budgie 10.6.1
  Display Server: X Server 1.21.1.4
  OpenGL: 4.6 Mesa 22.3.0-devel (LLVM 15.0.3 DRM 3.48)
  OpenCL: OpenCL 2.1 AMD-APP (3513.0)
  Compiler: GCC 11.3.0
  File-System: ext4
  Screen Resolution: 7680x2160
Ubuntu 22.04.02 LTS 7900X 4080 opencl (only the fields listed for this configuration in the export)
  Memory: 32GB
  Disk: 2000GB SHPP41-2000GM + 120GB TOSHIBA RC100 + 1000GB Western Digital WD_BLACK SN750 SE NVMe 1TB
  Graphics: ASUS NVIDIA GeForce RTX 4080 16GB
  Audio: NVIDIA Device 22bb
  Display Driver: NVIDIA 530.41.03
  OpenGL: 4.6.0
  OpenCL: OpenCL 3.0 CUDA 12.1.98
Kernel Details
  - Transparent Huge Pages: madvise
Compiler Details
  - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details
  - Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
  - CPU Microcode: 0xa601203
Graphics Details
  - Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: GLAMOR - BAR1 / Visible vRAM Size: 256 MB - vBIOS Version: 113-TIC106615-100
  - Ubuntu 22.04.02 LTS 7900X 4080 opencl: BAR1 / Visible vRAM Size: 256 MiB - vBIOS Version: 95.03.2b.00.8c
Python Details
  - Python 3.10.6
Security Details
  - itlb_multihit: Not affected
  - l1tf: Not affected
  - mds: Not affected
  - meltdown: Not affected
  - mmio_stale_data: Not affected
  - retbleed: Not affected
  - spec_store_bypass: Mitigation of SSB disabled via prctl
  - spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  - spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
  - srbds: Not affected
  - tsx_async_abort: Not affected
OpenCL Details
  - Ubuntu 22.04.02 LTS 7900X 4080 opencl: GPU Compute Cores: 9728
Ubuntu 22.04.02 LTS 7900X 7900XTX opencl
Columns: 7900XTX = Ubuntu 22.04.02 LTS 7900X 7900XTX opencl; 4080 = Ubuntu 22.04.02 LTS 7900X 4080 opencl; "-" = no result listed.
7900XTX     4080          Test
-           53452.7       shoc: OpenCL - Max SP Flops
-           75300         luxmark: CPU+GPU - Microphone
-           75029         luxmark: GPU - Microphone
-           22310         luxmark: GPU - Hotel
-           99792         luxmark: CPU+GPU - Luxball HDR
-           99385         luxmark: GPU - Luxball HDR
-           22294         luxmark: CPU+GPU - Hotel
-           1684649225    smallpt-gpu: GPU - 7680 x 2160 - Caustic
-           1684649501    smallpt-gpu: GPU - 7680 x 2160 - Caustic3
-           1684649363    smallpt-gpu: GPU - 7680 x 2160 - Cornell
-           12.63         clpeak: Transfer Bandwidth enqueueWriteBuffer
-           11.41         clpeak: Transfer Bandwidth enqueueReadBuffer
-           3772          fluidx3d: FP32-FP32
66.2        66.2          viennacl: CPU BLAS - dGEMM-TT
69.9        70.3          viennacl: CPU BLAS - dGEMM-TN
61.6        61.9          viennacl: CPU BLAS - dGEMM-NT
64.1        64.4          viennacl: CPU BLAS - dGEMM-NN
132         78.8          viennacl: CPU BLAS - dGEMV-T
115         73.3          viennacl: CPU BLAS - dGEMV-N
99.2        59.9          viennacl: CPU BLAS - dDOT
95.4        54.4          viennacl: CPU BLAS - dAXPY
63.5        36.0          viennacl: CPU BLAS - dCOPY
287         178           viennacl: CPU BLAS - sDOT
289         164           viennacl: CPU BLAS - sAXPY
179         103           viennacl: CPU BLAS - sCOPY
-           817           viennacl: OpenCL BLAS - dGEMM-TT
-           801           viennacl: OpenCL BLAS - dGEMM-TN
-           766           viennacl: OpenCL BLAS - dGEMM-NT
-           743           viennacl: OpenCL BLAS - dGEMM-NN
-           427           viennacl: OpenCL BLAS - dGEMV-T
-           219           viennacl: OpenCL BLAS - dGEMV-N
-           596           viennacl: OpenCL BLAS - dDOT
-           605           viennacl: OpenCL BLAS - dAXPY
-           524           viennacl: OpenCL BLAS - dCOPY
-           412           viennacl: OpenCL BLAS - sDOT
-           483           viennacl: OpenCL BLAS - sAXPY
-           367           viennacl: OpenCL BLAS - sCOPY
-           7705          fluidx3d: FP32-FP16C
-           7786          fluidx3d: FP32-FP16S
-           19.916        rodinia: OpenCL Myocyte
-           817.20        clpeak: Double-Precision Compute
-           2970.92       shoc: OpenCL - Texture Read Bandwidth
-           16861.6       shoc: OpenCL - GEMM SGEMM_N
-           2.393         rodinia: OpenCL Leukocyte
2.758       1.187         darktable: Boat - OpenCL
3.027       1.864         darktable: Masskrug - OpenCL
-           2.929         rodinia: OpenCL Particle Filter
-           374.9         cl-mem: Copy
-           520.5         cl-mem: Write
-           620.7         cl-mem: Read
-           26.3975       shoc: OpenCL - Bus Speed Readback
2.504       0.647         darktable: Server Room - OpenCL
-           577.23        clpeak: Global Memory Bandwidth
-           422.946       shoc: OpenCL - S3D
-           946.679       shoc: OpenCL - Reduction
-           3.89          clpeak: Kernel Latency
-           9189.9160     lulesh-cl:
-           25.9744       shoc: OpenCL - Triad
-           1812.90       shoc: OpenCL - FFT SP
0.121       0.111         darktable: Server Rack - OpenCL
-           26.8876       shoc: OpenCL - Bus Speed Download
-           46337.02      clpeak: Single-Precision Compute
-           23875.52      clpeak: Integer 24-bit Compute
-           23782.67      clpeak: Integer Compute
-           57.0867       shoc: OpenCL - MD5 Hash
-           -             parboil: OpenCL LBM
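For the tests that both configurations completed, the overview values can be turned into a relative figure. What follows is a minimal Python sketch and not part of the exported result: the dictionary layout and the helper name `speedup` are illustrative assumptions, and only a few rows from the overview are copied in. For "Fewer Is Better" tests (Darktable, in seconds) the ratio is inverted, so a value above 1 always means the 4080 configuration scored better.

```python
# Values copied from the overview above: (7900XTX run, 4080 run, higher_is_better).
# The layout and names are illustrative, not a Phoronix Test Suite API.
RESULTS = {
    "viennacl: CPU BLAS - dGEMM-TT": (66.2, 66.2, True),
    "viennacl: CPU BLAS - dGEMV-T": (132.0, 78.8, True),
    "darktable: Boat - OpenCL": (2.758, 1.187, False),
    "darktable: Server Room - OpenCL": (2.504, 0.647, False),
}


def speedup(xtx, rtx, higher_is_better):
    """Return how many times better the 4080 run scored than the 7900XTX run."""
    return rtx / xtx if higher_is_better else xtx / rtx


for name, (xtx, rtx, hib) in RESULTS.items():
    print(f"{name}: {speedup(xtx, rtx, hib):.2f}x")
```

Note that the ViennaCL CPU BLAS tests run on the processor, so ratios for those rows compare the two system configurations rather than the two graphics cards.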
SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: Max SP Flops (GFLOPS, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 53452.7 (SE +/- 92.32, N = 3)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
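Each result carries an "SE +/- x, N = n" figure, read here as the standard error over n runs. The sketch below shows how such a figure is conventionally computed, assuming the sample standard deviation divided by the square root of N. The run values are hypothetical, since the export lists only the averaged result and the SE, and the exact estimator used by the Phoronix Test Suite is not stated in this document.

```python
import math
import statistics


def standard_error(runs):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    return statistics.stdev(runs) / math.sqrt(len(runs))


# Hypothetical per-run values; the export does not list individual runs.
runs = [10.0, 12.0, 14.0]
print(f"mean = {statistics.mean(runs):.2f}, "
      f"SE +/- {standard_error(runs):.2f}, N = {len(runs)}")
```

A small SE relative to the result suggests the runs agreed closely; entries with N above 3 (for example N = 15) indicate that more runs were recorded for that test.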
LuxMark 3.1, OpenCL Device: CPU+GPU - Scene: Microphone (Score, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 75300 (SE +/- 12.53, N = 3)
LuxMark 3.1, OpenCL Device: GPU - Scene: Microphone (Score, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 75029 (SE +/- 252.33, N = 3)
LuxMark 3.1, OpenCL Device: GPU - Scene: Hotel (Score, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 22310 (SE +/- 37.00, N = 3)
LuxMark 3.1, OpenCL Device: CPU+GPU - Scene: Luxball HDR (Score, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 99792 (SE +/- 6.36, N = 3)
LuxMark 3.1, OpenCL Device: GPU - Scene: Luxball HDR (Score, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 99385 (SE +/- 399.21, N = 3)
LuxMark 3.1, OpenCL Device: CPU+GPU - Scene: Hotel (Score, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 22294 (SE +/- 11.26, N = 3)
SmallPT GPU 1.6pts1, OpenCL Device: GPU - Resolution: 7680 x 2160 - Scene: Caustic (Samples/sec, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 1684649225 (SE +/- 25.69, N = 3)
  1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
SmallPT GPU 1.6pts1, OpenCL Device: GPU - Resolution: 7680 x 2160 - Scene: Caustic3 (Samples/sec, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 1684649501 (SE +/- 25.40, N = 3)
  1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
SmallPT GPU 1.6pts1, OpenCL Device: GPU - Resolution: 7680 x 2160 - Scene: Cornell (Samples/sec, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 1684649363 (SE +/- 25.12, N = 3)
  1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
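The three SmallPT GPU values are nearly identical across scenes and far larger than any other figure in this result. Read as Unix epoch seconds, they all fall on 2023-05-21, which appears to match the date prefix of the result ID (2305210) in the export URL, so they may be a result-parsing artifact rather than genuine samples/sec. The Python check below tests that reading; it is a sanity test under that assumption, not a confirmed diagnosis, and the helper name `as_utc` is illustrative.

```python
from datetime import datetime, timezone

# SmallPT GPU values as reported above, keyed by scene.
values = {"Caustic": 1684649225, "Caustic3": 1684649501, "Cornell": 1684649363}


def as_utc(seconds):
    """Interpret an integer as Unix epoch seconds and return a UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


for scene, value in values.items():
    print(scene, as_utc(value).isoformat())
```

If the reading holds, these three entries should not be used when comparing SmallPT GPU throughput.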
clpeak 1.1.2, OpenCL Test: Transfer Bandwidth enqueueWriteBuffer (GBPS, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 12.63 (SE +/- 0.16, N = 3)
  1. (CXX) g++ options: -O3
clpeak 1.1.2, OpenCL Test: Transfer Bandwidth enqueueReadBuffer (GBPS, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 11.41 (SE +/- 0.03, N = 3)
  1. (CXX) g++ options: -O3
FluidX3D 2.3, Test: FP32-FP32 (MLUPs/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 3772 (SE +/- 4.37, N = 3)
ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-TT (GFLOPs/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 66.2 (SE +/- 0.12, N = 3)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 66.2 (SE +/- 0.17, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-TN (GFLOPs/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 69.9 (SE +/- 0.12, N = 3)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 70.3 (SE +/- 0.17, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-NT (GFLOPs/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 61.6 (SE +/- 0.17, N = 3)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 61.9 (SE +/- 0.19, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dGEMM-NN (GFLOPs/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 64.1 (SE +/- 0.20, N = 3)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 64.4 (SE +/- 0.20, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dGEMV-T (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 132.0 (SE +/- 0.33, N = 3)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 78.8 (SE +/- 0.12, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dGEMV-N (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 115.0 (SE +/- 0.67, N = 3)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 73.3 (SE +/- 0.07, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dDOT (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 99.2 (SE +/- 0.19, N = 3)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 59.9 (SE +/- 0.03, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dAXPY (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 95.4 (SE +/- 0.12, N = 3)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 54.4 (SE +/- 0.03, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - dCOPY (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 63.5 (SE +/- 0.09, N = 3)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 36.0 (SE +/- 0.12, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - sDOT (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 287 (SE +/- 0.00, N = 2)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 178 (SE +/- 0.67, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - sAXPY (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 289 (SE +/- 0.33, N = 3)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 164 (SE +/- 1.20, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: CPU BLAS - sCOPY (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 179 (SE +/- 0.58, N = 3)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 103 (SE +/- 1.00, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: OpenCL BLAS - dGEMM-TT (GFLOPs/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 817 (SE +/- 1.53, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: OpenCL BLAS - dGEMM-TN (GFLOPs/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 801 (SE +/- 1.20, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: OpenCL BLAS - dGEMM-NT (GFLOPs/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 766 (SE +/- 1.33, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: OpenCL BLAS - dGEMM-NN (GFLOPs/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 743 (SE +/- 1.53, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: OpenCL BLAS - dGEMV-T (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 427 (SE +/- 0.33, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: OpenCL BLAS - dGEMV-N (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 219 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: OpenCL BLAS - dDOT (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 596 (SE +/- 0.33, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: OpenCL BLAS - dAXPY (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 605 (SE +/- 0.33, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: OpenCL BLAS - dCOPY (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 524 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: OpenCL BLAS - sDOT (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 412 (SE +/- 0.33, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: OpenCL BLAS - sAXPY (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 483 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, Test: OpenCL BLAS - sCOPY (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 367 (SE +/- 1.53, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
FluidX3D 2.3, Test: FP32-FP16C (MLUPs/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 7705 (SE +/- 1.00, N = 3)
FluidX3D 2.3, Test: FP32-FP16S (MLUPs/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 7786 (SE +/- 0.67, N = 3)
Rodinia 3.1, Test: OpenCL Myocyte (Seconds, Fewer Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 19.92 (SE +/- 0.09, N = 3)
  1. (CXX) g++ options: -O2 -lOpenCL
clpeak 1.1.2, OpenCL Test: Double-Precision Compute (GFLOPS, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 817.20 (SE +/- 1.38, N = 3)
  1. (CXX) g++ options: -O3
SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: Texture Read Bandwidth (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 2970.92 (SE +/- 2.71, N = 3)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: GEMM SGEMM_N (GFLOPS, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 16861.6 (SE +/- 152.29, N = 15)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
Rodinia 3.1, Test: OpenCL Leukocyte (Seconds, Fewer Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 2.393 (SE +/- 0.030, N = 12)
  1. (CXX) g++ options: -O2 -lOpenCL
Darktable 3.8.1, Test: Boat - Acceleration: OpenCL (Seconds, Fewer Is Better)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 2.758 (SE +/- 0.022, N = 3)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 1.187 (SE +/- 0.006, N = 3)
Darktable 3.8.1, Test: Masskrug - Acceleration: OpenCL (Seconds, Fewer Is Better)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 3.027 (SE +/- 0.007, N = 3)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 1.864 (SE +/- 0.008, N = 3)
Rodinia 3.1, Test: OpenCL Particle Filter (Seconds, Fewer Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 2.929 (SE +/- 0.028, N = 7)
  1. (CXX) g++ options: -O2 -lOpenCL
cl-mem 2017-01-13, Benchmark: Copy (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 374.9 (SE +/- 0.12, N = 3)
  1. (CC) gcc options: -O2 -flto -lOpenCL
cl-mem 2017-01-13, Benchmark: Write (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 520.5 (SE +/- 1.33, N = 3)
  1. (CC) gcc options: -O2 -flto -lOpenCL
cl-mem 2017-01-13, Benchmark: Read (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 620.7 (SE +/- 0.22, N = 3)
  1. (CC) gcc options: -O2 -flto -lOpenCL
SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: Bus Speed Readback (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 26.40 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
Darktable 3.8.1, Test: Server Room - Acceleration: OpenCL (Seconds, Fewer Is Better)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 2.504 (SE +/- 0.006, N = 3)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 0.647 (SE +/- 0.003, N = 3)
clpeak 1.1.2, OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 577.23 (SE +/- 3.45, N = 3)
  1. (CXX) g++ options: -O3
SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: S3D (GFLOPS, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 422.95 (SE +/- 0.21, N = 3)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: Reduction (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 946.68 (SE +/- 8.26, N = 8)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
clpeak 1.1.2, OpenCL Test: Kernel Latency (us, Fewer Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 3.89 (SE +/- 0.05, N = 15)
  1. (CXX) g++ options: -O3
Lulesh OpenCL 2017-07-06 (z/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 9189.92 (SE +/- 33.49, N = 3)
  1. (CXX) g++ options: -std=c++11 -lOpenCL -O3 -lm
SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: Triad (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 25.97 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: FFT SP (GFLOPS, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 1812.90 (SE +/- 2.20, N = 3)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
Darktable 3.8.1, Test: Server Rack - Acceleration: OpenCL (Seconds, Fewer Is Better)
  Ubuntu 22.04.02 LTS 7900X 7900XTX opencl: 0.121 (SE +/- 0.002, N = 3)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 0.111 (SE +/- 0.000, N = 3)
SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: Bus Speed Download (GB/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 26.89 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
clpeak 1.1.2, OpenCL Test: Single-Precision Compute (GFLOPS, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 46337.02 (SE +/- 82.14, N = 3)
  1. (CXX) g++ options: -O3
clpeak 1.1.2, OpenCL Test: Integer 24-bit Compute (GIOPS, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 23875.52 (SE +/- 101.49, N = 3)
  1. (CXX) g++ options: -O3
clpeak 1.1.2, OpenCL Test: Integer Compute (GIOPS, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 23782.67 (SE +/- 61.13, N = 3)
  1. (CXX) g++ options: -O3
SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: MD5 Hash (GHash/s, More Is Better)
  Ubuntu 22.04.02 LTS 7900X 4080 opencl: 57.09 (SE +/- 0.32, N = 3)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
Phoronix Test Suite v10.8.5