clpeak benchmark

AMD EPYC 7262 8-Core testing with a GIGABYTE MZ32-AR0-00 v01000100 (R21 BIOS) and NVIDIA GeForce RTX 4090 24GB on Ubuntu 22.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2401230-NE-CLPEAKBEN88
Run Management

Result Identifier: NVIDIA GeForce RTX 4090
Date Run: January 23
Test Duration: 5 Minutes


Processor: AMD EPYC 7262 8-Core @ 3.20GHz (8 Cores / 16 Threads)
Motherboard: GIGABYTE MZ32-AR0-00 v01000100 (R21 BIOS)
Chipset: AMD Starship/Matisse
Memory: 128GB
Disk: 1000GB Samsung SSD 980 PRO 1TB
Graphics: NVIDIA GeForce RTX 4090 24GB
Audio: NVIDIA Device 22ba
Monitor: DELL U2720Q
Network: 2 x Intel I350
OS: Ubuntu 22.04
Kernel: 6.5.0-14-generic (x86_64)
Desktop: GNOME Shell 42.9
Display Server: X Server 1.21.1.4
Display Driver: NVIDIA 535.154.05
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 12.2.148
Vulkan: 1.3.242
Compiler: GCC 11.4.0 + CUDA 11.8
File-System: ext4
Screen Resolution: 3840x2160

System Logs

- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: acpi-cpufreq performance (Boost: Enabled)
- CPU Microcode: 0x830107a
- BAR1 / Visible vRAM Size: 256 MiB
- vBIOS Version: 95.02.18.80.53
- GPU Compute Cores: 16384
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

clpeak benchmark: result overview

Test                                          NVIDIA GeForce RTX 4090
clpeak: Kernel Latency                        6.21
clpeak: Integer Compute                       40578.10
clpeak: Integer 24-bit Compute                40776.55
clpeak: Global Memory Bandwidth               869.39
clpeak: Double-Precision Compute              1346.22
clpeak: Single-Precision Compute              78861.24
clpeak: Transfer Bandwidth enqueueReadBuffer  9.32
clpeak: Transfer Bandwidth enqueueWriteBuffer 11.71
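A few derived figures can help put the overview numbers in context. The sketch below is not part of the result file. It takes the single- and double-precision GFLOPS, the two transfer bandwidths, and the "GPU Compute Cores: 16384" entry from the system logs, then computes the FP32:FP64 ratio, the write:read transfer ratio, and the clock speed the FP32 result would imply. The clock estimate rests on an assumption the result file does not state: that each compute core retires one fused multiply-add (counted as 2 FLOPs) per clock.

```python
# Values copied from the result overview and system logs.
fp32_gflops = 78861.24      # Single-Precision Compute
fp64_gflops = 1346.22       # Double-Precision Compute
read_gbps = 9.32            # Transfer Bandwidth enqueueReadBuffer
write_gbps = 11.71          # Transfer Bandwidth enqueueWriteBuffer
compute_cores = 16384       # "GPU Compute Cores" from the system logs

# ASSUMPTION (not in the result file): one FMA per core per clock = 2 FLOPs.
FLOPS_PER_CORE_PER_CLOCK = 2

# How many times faster FP32 ran than FP64 in this run.
fp32_to_fp64 = fp32_gflops / fp64_gflops

# Host-to-device (write) versus device-to-host (read) transfer bandwidth.
write_to_read = write_gbps / read_gbps

# Clock (GHz) at which the assumed per-core throughput yields the FP32 figure.
implied_clock_ghz = fp32_gflops / (compute_cores * FLOPS_PER_CORE_PER_CLOCK)

print(f"FP32:FP64 ratio     : {fp32_to_fp64:.2f}")
print(f"Write:read ratio    : {write_to_read:.2f}")
print(f"Implied FP32 clock  : {implied_clock_ghz:.3f} GHz")
```

If the implied clock lands far from the clock the card actually sustained, the 2-FLOPs-per-clock assumption (or the core count) is the first thing to question.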

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.

clpeak 1.1.2, OpenCL Test: Kernel Latency (us, Fewer Is Better)
NVIDIA GeForce RTX 4090: 6.21 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -O3

clpeak 1.1.2, OpenCL Test: Integer Compute (GIOPS, More Is Better)
NVIDIA GeForce RTX 4090: 40578.10 (SE +/- 186.80, N = 3)
1. (CXX) g++ options: -O3

clpeak 1.1.2, OpenCL Test: Integer 24-bit Compute (GIOPS, More Is Better)
NVIDIA GeForce RTX 4090: 40776.55 (SE +/- 82.15, N = 3)
1. (CXX) g++ options: -O3

clpeak 1.1.2, OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better)
NVIDIA GeForce RTX 4090: 869.39 (SE +/- 2.22, N = 3)
1. (CXX) g++ options: -O3

clpeak 1.1.2, OpenCL Test: Double-Precision Compute (GFLOPS, More Is Better)
NVIDIA GeForce RTX 4090: 1346.22 (SE +/- 0.63, N = 3)
1. (CXX) g++ options: -O3

clpeak 1.1.2, OpenCL Test: Single-Precision Compute (GFLOPS, More Is Better)
NVIDIA GeForce RTX 4090: 78861.24 (SE +/- 505.70, N = 3)
1. (CXX) g++ options: -O3

clpeak 1.1.2, OpenCL Test: Transfer Bandwidth enqueueReadBuffer (GBPS, More Is Better)
NVIDIA GeForce RTX 4090: 9.32 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -O3

clpeak 1.1.2, OpenCL Test: Transfer Bandwidth enqueueWriteBuffer (GBPS, More Is Better)
NVIDIA GeForce RTX 4090: 11.71 (SE +/- 0.08, N = 3)
1. (CXX) g++ options: -O3
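Each result above is a mean over N = 3 runs with a reported standard error (SE). Because the tests use different units and scales, the raw SE values are hard to compare directly. As a rough sketch (not something the result file provides), the relative standard error, SE divided by the mean, expresses run-to-run spread as a percentage. The dictionary keys and the helper name `relative_se_percent` are illustrative; the numbers are copied from the results above.

```python
# (mean, standard error) per test, copied from the results above; N = 3 each.
results = {
    "Kernel Latency (us)": (6.21, 0.02),
    "Integer Compute (GIOPS)": (40578.10, 186.80),
    "Integer 24-bit Compute (GIOPS)": (40776.55, 82.15),
    "Global Memory Bandwidth (GBPS)": (869.39, 2.22),
    "Double-Precision Compute (GFLOPS)": (1346.22, 0.63),
    "Single-Precision Compute (GFLOPS)": (78861.24, 505.70),
    "Transfer Bandwidth enqueueReadBuffer (GBPS)": (9.32, 0.04),
    "Transfer Bandwidth enqueueWriteBuffer (GBPS)": (11.71, 0.08),
}


def relative_se_percent(mean: float, se: float) -> float:
    """Return the standard error as a percentage of the mean."""
    return 100.0 * se / mean


# Map each test name to its relative standard error in percent.
relative = {name: relative_se_percent(m, se) for name, (m, se) in results.items()}

# Print the tests from noisiest to most stable.
for name, pct in sorted(relative.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<46} {pct:.2f}%")
```

With these inputs every test comes out below 1%, so the three runs agreed closely for each workload.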