gpu3Multicore1

AMD Ryzen Threadripper 2950X 16-Core testing with an ASRock X399 Professional Gaming (P3.80 BIOS) and MSI NVIDIA GeForce GTX 1080 8GB on Ubuntu 16.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2102113-HA-GPU3MULTI38

Result Identifier: gpu3Multicore1
Date Run: February 10 2021
Test Duration: 23 Hours, 10 Minutes


gpu3Multicore1 system details (OpenBenchmarking.org, Phoronix Test Suite)

Processor: AMD Ryzen Threadripper 2950X 16-Core @ 3.50GHz (16 Cores / 32 Threads)
Motherboard: ASRock X399 Professional Gaming (P3.80 BIOS)
Chipset: AMD 17h
Memory: 126GB
Disk: 1000GB Samsung SSD 860
Graphics: MSI NVIDIA GeForce GTX 1080 8GB
Audio: NVIDIA GP104 HD Audio
Network: Aquantia AQC107 NBase-T/IEEE + 2 x Intel I211 + Intel Dual Band-AC 3168NGW
OS: Ubuntu 16.04
Kernel: 4.19.174-custom (x86_64)
Display Server: X Server 1.19.6
Display Driver: NVIDIA
OpenCL: OpenCL 1.2 CUDA 10.1.120
Vulkan: 1.1.99
Compiler: GCC 5.4.0 20160609 + Clang 3.8.0-2ubuntu4 + CUDA 9.2
File-System: ext4
Screen Resolution: 640x480

System Logs
- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-browser-plugin --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-arch-directory=amd64 --with-default-libstdcxx-abi=new --with-multilib-list=m32,m64,mx32 --with-tune=generic -v
- Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled)
- CPU Microcode: 0x800820b
- Python 2.7.12 + Python 3.5.2
- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + srbds: Not affected + tsx_async_abort: Not affected
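The security-mitigation entry in the system logs is a single string of `name: status` pairs joined by ` + `. As a minimal sketch for anyone post-processing this result file, the string can be split into a dictionary; the function name `parse_mitigations` is hypothetical and is not part of the Phoronix Test Suite.

```python
# Sketch only: parse_mitigations is an illustrative helper, not a PTS API.
MITIGATIONS = (
    "itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + "
    "meltdown: Not affected + "
    "spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + "
    "spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + "
    "spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + "
    "srbds: Not affected + tsx_async_abort: Not affected"
)


def parse_mitigations(line):
    """Split 'name: status + name: status' into a {name: status} dict."""
    result = {}
    for entry in line.split(" + "):
        # Only the first ': ' separates the name from its status; later
        # colons (as in the spectre_v2 entry) belong to the status text.
        name, _, status = entry.partition(": ")
        result[name.strip()] = status.strip()
    return result


parsed = parse_mitigations(MITIGATIONS)
mitigated = sorted(k for k, v in parsed.items() if v.startswith("Mitigation"))
print(mitigated)
```

Filtering on the `Mitigation` prefix separates the vulnerabilities that are actively mitigated on this system from those reported as not affecting it.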

gpu3Multicore1 results overview (OpenBenchmarking.org). Tests in this result file, grouped by test profile; the values are listed per test in the individual result entries:

hpcg
npb: BT.C | CG.C | EP.C | EP.D | FT.C | IS.D | LU.C | MG.C | SP.B
parboil: OpenMP LBM | OpenMP CUTCP | OpenMP MRI-Q | OpenMP Stencil | OpenMP MRI Gridding
rodinia: OpenMP LavaMD | OpenMP HotSpot3D | OpenMP Leukocyte | OpenMP CFD Solver | OpenMP Streamcluster
namd: ATPase Simulation - 327,506 Atoms
pennant: sedovbig | leblancbig
lammps: 20k Atoms | Rhodopsin Protein
libgav1: Chimera 1080p | Summer Nature 4K | Summer Nature 1080p | Chimera 1080p 10-bit
compress-zstd: 3 | 19
arrayfire: BLAS CPU
john-the-ripper: Blowfish | MD5
nero2d: Total Time
onednn: IP Shapes 1D - f32 - CPU | IP Shapes 3D - f32 - CPU | IP Shapes 1D - u8s8f32 - CPU | IP Shapes 3D - u8s8f32 - CPU | Convolution Batch Shapes Auto - f32 - CPU | Deconvolution Batch shapes_1d - f32 - CPU | Deconvolution Batch shapes_3d - f32 - CPU | Convolution Batch Shapes Auto - u8s8f32 - CPU | Deconvolution Batch shapes_1d - u8s8f32 - CPU | Deconvolution Batch shapes_3d - u8s8f32 - CPU | Recurrent Neural Network Training - f32 - CPU | Recurrent Neural Network Inference - f32 - CPU | Recurrent Neural Network Training - u8s8f32 - CPU | Recurrent Neural Network Inference - u8s8f32 - CPU | Matrix Multiply Batch Shapes Transformer - f32 - CPU | Recurrent Neural Network Training - bf16bf16bf16 - CPU | Recurrent Neural Network Inference - bf16bf16bf16 - CPU | Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU
ospray: San Miguel - SciVis | XFrog Forest - SciVis | San Miguel - Path Tracer | NASA Streamlines - SciVis | XFrog Forest - Path Tracer | Magnetic Reconnection - SciVis | NASA Streamlines - Path Tracer | Magnetic Reconnection - Path Tracer
aom-av1: Speed 0 Two-Pass | Speed 4 Two-Pass | Speed 6 Realtime | Speed 6 Two-Pass | Speed 8 Realtime
kvazaar: Bosphorus 4K - Slow | Bosphorus 4K - Medium | Bosphorus 1080p - Slow | Bosphorus 1080p - Medium | Bosphorus 4K - Very Fast | Bosphorus 4K - Ultra Fast | Bosphorus 1080p - Very Fast | Bosphorus 1080p - Ultra Fast
rav1e: 1 | 5 | 6 | 10
svt-av1: Enc Mode 0 - 1080p | Enc Mode 4 - 1080p | Enc Mode 8 - 1080p
svt-hevc: 1080p 8-bit YUV To HEVC Video Encode
vpxenc: Speed 0 | Speed 5
x265: Bosphorus 4K | Bosphorus 1080p
mt-dgemm: Sustained Floating-Point Rate
oidn: Memorial
openvkl: vklBenchmark | vklBenchmarkVdbVolume | vklBenchmarkStructuredVolume | vklBenchmarkUnstructuredVolume
coremark: CoreMark Size 666 - Iterations Per Second
compress-7zip: Compress Speed Test
asmfish: 1024 Hash Memory, 26 Depth
swet: Average
ebizzy
build-ffmpeg: Time To Compile
build-gcc: Time To Compile
build-imagemagick: Time To Compile
build-linux-kernel: Time To Compile
build-llvm: Time To Compile
build-mplayer: Time To Compile
build2: Time To Compile
c-ray: Total Time - 4K, 16 Rays Per Pixel
compress-pbzip2: 256MB File Compression
primesieve: 1e12 Prime Number Generation
rust-mandel: Time To Complete Serial/Parallel Mandelbrot
rust-prime: Prime Number Test To 200,000,000
smallpt: Global Illumination Renderer; 128 Samples
tungsten: Hair | Water Caustic | Non-Exponential | Volumetric Caustic
aobench: 2048 x 2048 - Total Time
build-eigen: Time To Compile
ffmpeg: H.264 HD To NTSC DV
m-queens: Time To Solve
n-queens: Elapsed Time
radiance: Serial | SMP Parallel
tachyon: Total Time
aircrack-ng
askap: tConvolve MT - Gridding | tConvolve MT - Degridding | tConvolve OpenMP - Gridding | tConvolve OpenMP - Degridding | Hogbom Clean OpenMP
intel-mpi: IMB-P2P PingPong | IMB-MPI1 Exchange | IMB-MPI1 Exchange | IMB-MPI1 PingPong | IMB-MPI1 Sendrecv | IMB-MPI1 Sendrecv
gromacs: Water Benchmark
mysqlslap: 1 | 4 | 8 | 16 | 32 | 64 | 128 | 256 | 512
sysbench: Memory | CPU
blender: BMW27 - CUDA | BMW27 - OpenCL | BMW27 - CPU-Only | Classroom - CUDA | Fishy Cat - CUDA | Barbershop - CUDA | Classroom - OpenCL | Fishy Cat - OpenCL | Barbershop - OpenCL | BMW27 - NVIDIA OptiX | Classroom - CPU-Only | Fishy Cat - CPU-Only | Barbershop - CPU-Only | Classroom - NVIDIA OptiX | Fishy Cat - NVIDIA OptiX | Barbershop - NVIDIA OptiX | Pabellon Barcelona - CUDA | Pabellon Barcelona - OpenCL | Pabellon Barcelona - CPU-Only | Pabellon Barcelona - NVIDIA OptiX
xsbench

High Performance Conjugate Gradient

HPCG is the High Performance Conjugate Gradient benchmark, a newer scientific benchmark from Sandia National Labs intended for supercomputer testing with modern real-world workloads, in contrast to HPCC. Learn more via the OpenBenchmarking.org test page.

High Performance Conjugate Gradient 3.1 (OpenBenchmarking.org)
GFLOP/s, More Is Better
gpu3Multicore1: 7.02881 (SE +/- 0.01315, N = 3)
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -pthread -lmpi_cxx -lmpi
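Each result in this file is reported as an average with "SE +/-" and "N", the number of runs. The sketch below shows one common way such figures are derived. It assumes SE means the sample standard deviation divided by the square root of N, which the result page does not spell out. The sample values and function names are made up for illustration and are not the actual HPCG runs.

```python
import math
import statistics


def mean_and_standard_error(samples):
    """Return (mean, standard error) for a list of per-run results."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    mean = statistics.mean(samples)
    # Assumption: SE = sample standard deviation / sqrt(N).
    se = statistics.stdev(samples) / math.sqrt(n)
    return mean, se


def relative_se_percent(value, se):
    """Express a standard error as a percentage of the reported value."""
    return 100.0 * se / value


# Hypothetical per-run GFLOP/s values (not taken from the result file).
runs = [7.01, 7.02, 7.06]
mean, se = mean_and_standard_error(runs)
print(f"mean = {mean:.5f}, SE +/- {se:.5f}, N = {len(runs)}")

# Relative spread of the HPCG figure reported above (7.02881, SE 0.01315).
hpcg_rel = relative_se_percent(7.02881, 0.01315)
print(f"HPCG relative SE = {hpcg_rel:.3f}%")
```

A relative SE well under 1%, as for HPCG here, suggests the three runs agreed closely. Results reported with N = 15 in this file were run more often than the usual three.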

NAS Parallel Benchmarks

NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB. This test profile offers selecting the different NPB tests/problems and varying problem sizes. Learn more via the OpenBenchmarking.org test page.

NAS Parallel Benchmarks 3.4 (OpenBenchmarking.org)
Total Mop/s, More Is Better; results for gpu3Multicore1
Test / Class: BT.C = 37405.63 (SE +/- 37.34, N = 3)
Test / Class: CG.C = 8001.35 (SE +/- 8.06, N = 3)
Test / Class: EP.C = 631.90 (SE +/- 0.54, N = 3)
Test / Class: EP.D = 621.18 (SE +/- 0.35, N = 3)
Test / Class: FT.C = 19497.22 (SE +/- 18.12, N = 3)
Test / Class: IS.D = 877.87 (SE +/- 0.89, N = 3)
Test / Class: LU.C = 42621.39 (SE +/- 122.06, N = 3)
Test / Class: MG.C = 17236.60 (SE +/- 8.84, N = 3)
Test / Class: SP.B = 13344.88 (SE +/- 29.92, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 1.10.2

Parboil

The Parboil Benchmarks from the IMPACT Research Group at University of Illinois are a set of throughput computing applications for looking at computing architecture and compilers. Parboil test-cases support OpenMP, OpenCL, and CUDA multi-processing environments. However, at this time the test profile is just making use of the OpenMP and OpenCL test workloads. Learn more via the OpenBenchmarking.org test page.

Parboil 2.5 (OpenBenchmarking.org)
Seconds, Fewer Is Better; results for gpu3Multicore1
Test: OpenMP LBM = 71.68 (SE +/- 0.02, N = 3)
Test: OpenMP CUTCP = 2.122072 (SE +/- 0.020422, N = 3)
Test: OpenMP MRI-Q = 6.089053 (SE +/- 0.001530, N = 3)
Test: OpenMP Stencil = 8.251748 (SE +/- 0.029315, N = 3)
Test: OpenMP MRI Gridding = 170.79 (SE +/- 0.80, N = 3)
1. (CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

Rodinia

Rodinia is a suite focused upon accelerating compute-intensive applications with accelerators. CUDA, OpenMP, and OpenCL parallel models are supported by the included applications. This profile utilizes select OpenCL, NVIDIA CUDA and OpenMP test binaries at the moment. Learn more via the OpenBenchmarking.org test page.

Rodinia 3.1 (OpenBenchmarking.org)
Seconds, Fewer Is Better; results for gpu3Multicore1
Test: OpenMP LavaMD = 330.71 (SE +/- 0.71, N = 3)
Test: OpenMP HotSpot3D = 97.44 (SE +/- 0.83, N = 3)
Test: OpenMP Leukocyte = 108.62 (SE +/- 0.57, N = 3)
Test: OpenMP CFD Solver = 17.99 (SE +/- 0.06, N = 3)
Test: OpenMP Streamcluster = 19.51 (SE +/- 0.23, N = 15)
1. (CXX) g++ options: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl

NAMD

NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. Learn more via the OpenBenchmarking.org test page.

NAMD 2.14 (OpenBenchmarking.org)
days/ns, Fewer Is Better
ATPase Simulation - 327,506 Atoms, gpu3Multicore1: 1.30977 (SE +/- 0.00223, N = 3)
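NAMD reports days per simulated nanosecond (fewer is better), whereas the LAMMPS results in this file use nanoseconds per day (more is better). The two units are reciprocals of each other. A small sketch of the conversion, with a hypothetical helper name:

```python
def days_per_ns_to_ns_per_day(days_per_ns):
    """Convert a days/ns figure (fewer is better) to ns/day (more is better)."""
    if days_per_ns <= 0:
        raise ValueError("days/ns must be positive")
    return 1.0 / days_per_ns


# NAMD ATPase result from this file: 1.30977 days/ns.
namd_ns_per_day = days_per_ns_to_ns_per_day(1.30977)
print(f"{namd_ns_per_day:.4f} ns/day")
```

The converted figure is only a unit change. The NAMD and LAMMPS tests simulate different systems, so their ns/day values are not directly comparable.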

Pennant

Pennant is an application focused on hydrodynamics on general unstructured meshes in 2D. Learn more via the OpenBenchmarking.org test page.

Pennant 1.0.1 (OpenBenchmarking.org)
Hydro Cycle Time - Seconds, Fewer Is Better; results for gpu3Multicore1
Test: sedovbig = 51.03 (SE +/- 0.04, N = 3)
Test: leblancbig = 38.67 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -fopenmp -pthread -lmpi_cxx -lmpi

LAMMPS Molecular Dynamics Simulator

LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. Learn more via the OpenBenchmarking.org test page.

LAMMPS Molecular Dynamics Simulator 29Oct2020 (OpenBenchmarking.org)
ns/day, More Is Better; results for gpu3Multicore1
Model: 20k Atoms = 10.92 (SE +/- 0.03, N = 3)
Model: Rhodopsin Protein = 10.02 (SE +/- 0.11, N = 15)
1. (CXX) g++ options: -O3 -pthread -lm

libgav1

Libgav1 is an AV1 decoder developed by Google for AV1 profile 0/1 compliance. Learn more via the OpenBenchmarking.org test page.

libgav1 2019-10-05 (OpenBenchmarking.org)
FPS, More Is Better; results for gpu3Multicore1
Video Input: Chimera 1080p = 37.25 (SE +/- 0.05, N = 3)
Video Input: Summer Nature 4K = 17.03 (SE +/- 0.01, N = 3)
Video Input: Summer Nature 1080p = 53.71 (SE +/- 0.04, N = 3)
Video Input: Chimera 1080p 10-bit = 15.29 (SE +/- 0.15, N = 3)
1. (CXX) g++ options: -O3 -lpthread

Zstd Compression

This test measures the time needed to compress a sample file (an Ubuntu ISO) using Zstd compression. Learn more via the OpenBenchmarking.org test page.

Zstd Compression 1.4.5 (OpenBenchmarking.org)
MB/s, More Is Better; results for gpu3Multicore1
Compression Level: 3 = 5059.1 (SE +/- 3.85, N = 3)
Compression Level: 19 = 45.4 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -O3 -pthread -lz

ArrayFire

ArrayFire is a GPU and CPU numeric processing library. This test uses the built-in CPU and OpenCL ArrayFire benchmarks. Learn more via the OpenBenchmarking.org test page.

ArrayFire 3.7 (OpenBenchmarking.org)
GFLOPS, More Is Better
Test: BLAS CPU, gpu3Multicore1: 410.33 (SE +/- 0.49, N = 3)
1. (CXX) g++ options: -rdynamic

John The Ripper

This is a benchmark of John The Ripper, which is a password cracker. Learn more via the OpenBenchmarking.org test page.

John The Ripper 1.9.0-jumbo-1 (OpenBenchmarking.org)
Real C/S, More Is Better; results for gpu3Multicore1
Test: Blowfish = 34430 (SE +/- 93.23, N = 3)
Test: MD5 = 1052667 (SE +/- 6489.31, N = 3)
1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -pthread -lm -lz -ldl -lcrypt -lbz2

Open FMM Nero2D

This is a test of Nero2D, which is a two-dimensional TM/TE solver for Open FMM. Open FMM is a free collection of electromagnetic software for scattering at very large objects. This test profile times how long it takes to solve one of the included 2D examples. Learn more via the OpenBenchmarking.org test page.

Open FMM Nero2D 2.0.2 (OpenBenchmarking.org)
Seconds, Fewer Is Better
Total Time, gpu3Multicore1: 32.45 (SE +/- 0.08, N = 3)
1. (CXX) g++ options: -O2 -lfftw3 -llapack -lblas -lgfortran -lquadmath -lm -pthread -lmpi_cxx -lmpi

oneDNN

This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.

oneDNN 2.0 (OpenBenchmarking.org)
ms, Fewer Is Better; results for gpu3Multicore1
Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU = 4.10359 (SE +/- 0.01135, N = 3; MIN: 3.82)
Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU = 5.03966 (SE +/- 0.00243, N = 3; MIN: 4.98)
Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU = 2.93945 (SE +/- 0.00193, N = 3; MIN: 2.79)
Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU = 1.50128 (SE +/- 0.00670, N = 3; MIN: 1.41)
Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU = 10.04 (SE +/- 0.01, N = 3; MIN: 9.89)
Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU = 4.75016 (SE +/- 0.01438, N = 3; MIN: 4.35)
Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU = 7.52129 (SE +/- 0.00364, N = 3; MIN: 6.83)
Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU = 13.43 (SE +/- 0.01, N = 3; MIN: 13.21)
Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU = 8.31723 (SE +/- 0.09293, N = 3; MIN: 7.94)
Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU = 5.31369 (SE +/- 0.00230, N = 3; MIN: 5.08)
Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU = 4043.69 (SE +/- 10.83, N = 3; MIN: 4015.01)
Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU = 2153.94 (SE +/- 2.49, N = 3; MIN: 2140.59)
Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU = 4055.95 (SE +/- 0.97, N = 3; MIN: 4045.37)
Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU = 2150.86 (SE +/- 1.08, N = 3; MIN: 2141.22)
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU = 2.59347 (SE +/- 0.00134, N = 3; MIN: 2.53)
Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU = 4043.52 (SE +/- 6.67, N = 3; MIN: 4021.89)
Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU = 2153.61 (SE +/- 1.16, N = 3; MIN: 2144.87)
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU = 2.45445 (SE +/- 0.00098, N = 3; MIN: 2.29)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

OSPray

Intel OSPray is a portable ray-tracing engine for high-performance, high-fidelity scientific visualizations. OSPray builds on Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OSPray 1.8.5 (OpenBenchmarking.org)
FPS, More Is Better; results for gpu3Multicore1
Demo: San Miguel - Renderer: SciVis = 17.54 (SE +/- 0.00, N = 3; MIN: 16.95 / MAX: 18.52)
Demo: XFrog Forest - Renderer: SciVis = 3.05 (SE +/- 0.00, N = 3; MIN: 3.01 / MAX: 3.09)
Demo: San Miguel - Renderer: Path Tracer = 1.33 (SE +/- 0.00, N = 3; MIN: 1.32 / MAX: 1.34)
Demo: NASA Streamlines - Renderer: SciVis = 22.22 (SE +/- 0.00, N = 3; MIN: 21.74 / MAX: 22.73)
Demo: XFrog Forest - Renderer: Path Tracer = 1.58 (SE +/- 0.00, N = 3; MIN: 1.56 / MAX: 1.61)
Demo: Magnetic Reconnection - Renderer: SciVis = 12.66 (SE +/- 0.00, N = 3; MIN: 12.5 / MAX: 12.82)
Demo: NASA Streamlines - Renderer: Path Tracer = 4.42 (SE +/- 0.01, N = 3; MIN: 4.35 / MAX: 4.55)
Demo: Magnetic Reconnection - Renderer: Path Tracer = 166.67 (SE +/- 0.00, N = 3; MIN: 125 / MAX: 200)

AOM AV1

This is a simple test of the AOMedia AV1 encoder run on the CPU with a sample video file. Learn more via the OpenBenchmarking.org test page.

AOM AV1 2.0 (OpenBenchmarking.org)
Frames Per Second, More Is Better; results for gpu3Multicore1
Encoder Mode: Speed 0 Two-Pass = 0.26 (SE +/- 0.00, N = 3)
Encoder Mode: Speed 4 Two-Pass = 1.87 (SE +/- 0.01, N = 3)
Encoder Mode: Speed 6 Realtime = 15.06 (SE +/- 0.02, N = 3)
Encoder Mode: Speed 6 Two-Pass = 2.97 (SE +/- 0.01, N = 3)
Encoder Mode: Speed 8 Realtime = 27.12 (SE +/- 0.17, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

Kvazaar

This is a test of Kvazaar as a CPU-based H.265 video encoder written in the C programming language and optimized in Assembly. Kvazaar is the winner of the 2016 ACM Open-Source Software Competition and developed at the Ultra Video Group, Tampere University, Finland. Learn more via the OpenBenchmarking.org test page.

Kvazaar 2.0 (OpenBenchmarking.org)
Frames Per Second, More Is Better; results for gpu3Multicore1
Video Input: Bosphorus 4K - Video Preset: Slow = 6.94 (SE +/- 0.02, N = 3)
Video Input: Bosphorus 4K - Video Preset: Medium = 7.06 (SE +/- 0.01, N = 3)
Video Input: Bosphorus 1080p - Video Preset: Slow = 25.99 (SE +/- 0.03, N = 3)
Video Input: Bosphorus 1080p - Video Preset: Medium = 26.92 (SE +/- 0.04, N = 3)
Video Input: Bosphorus 4K - Video Preset: Very Fast = 16.77 (SE +/- 0.01, N = 3)
Video Input: Bosphorus 4K - Video Preset: Ultra Fast = 31.16 (SE +/- 0.05, N = 3)
Video Input: Bosphorus 1080p - Video Preset: Very Fast = 59.84 (SE +/- 0.04, N = 3)
Video Input: Bosphorus 1080p - Video Preset: Ultra Fast = 106.81 (SE +/- 0.08, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

rav1e

Xiph rav1e is a Rust-written AV1 video encoder. Learn more via the OpenBenchmarking.org test page.

rav1e 0.4 (OpenBenchmarking.org)
Frames Per Second, More Is Better; results for gpu3Multicore1
Speed: 1 = 0.347 (SE +/- 0.001, N = 3)
Speed: 5 = 1.042 (SE +/- 0.001, N = 3)
Speed: 6 = 1.371 (SE +/- 0.002, N = 3)
Speed: 10 = 3.032 (SE +/- 0.009, N = 3)

SVT-AV1

This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-AV1 CPU-based multi-threaded video encoder for the AV1 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.

SVT-AV1 0.8 (OpenBenchmarking.org)
Frames Per Second, More Is Better; results for gpu3Multicore1
Encoder Mode: Enc Mode 0 - Input: 1080p = 0.126 (SE +/- 0.000, N = 3)
Encoder Mode: Enc Mode 4 - Input: 1080p = 4.703 (SE +/- 0.015, N = 3)
Encoder Mode: Enc Mode 8 - Input: 1080p = 35.69 (SE +/- 0.12, N = 3)
1. (CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie

SVT-HEVC

This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-HEVC CPU-based multi-threaded video encoder for the HEVC / H.265 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.

SVT-HEVC 1.4.1 (OpenBenchmarking.org)
Frames Per Second, More Is Better
1080p 8-bit YUV To HEVC Video Encode, gpu3Multicore1: 72.35 (SE +/- 0.05, N = 3)
1. (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt

VP9 libvpx Encoding

This is a standard video encoding performance test of Google's libvpx library and the vpxenc command for the VP9/WebM format using a sample 1080p video. Learn more via the OpenBenchmarking.org test page.

VP9 libvpx Encoding 1.8.2 (OpenBenchmarking.org)
Frames Per Second, More Is Better; results for gpu3Multicore1
Speed: Speed 0 = 5.93 (SE +/- 0.01, N = 3)
Speed: Speed 5 = 18.27 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=c++11

x265

This is a simple test of the x265 encoder, run on the CPU with 1080p and 4K inputs, measuring H.265 video encode performance. Learn more via the OpenBenchmarking.org test page.

x265 3.4 (Frames Per Second, More Is Better), gpu3Multicore1:
  Video Input: Bosphorus 4K = 5.07 (SE +/- 0.06, N = 3)
  Video Input: Bosphorus 1080p = 14.65 (SE +/- 0.06, N = 3)
(CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

ACES DGEMM

This is a multi-threaded DGEMM benchmark. Learn more via the OpenBenchmarking.org test page.

ACES DGEMM 1.0 (GFLOP/s, More Is Better), gpu3Multicore1:
  Sustained Floating-Point Rate = 1.610873 (SE +/- 0.007697, N = 3)
(CC) gcc options: -O3 -march=native -fopenmp
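For readers unfamiliar with how a DGEMM rate is expressed, the following is a minimal single-threaded sketch, not the ACES DGEMM code itself. It assumes the usual operation count of 2*n^3 floating-point operations for an n x n matrix multiply; the matrix size and the helper names `dgemm` and `gflops` are illustrative choices.

```python
import time


def dgemm(a, b, n):
    """Naive double-precision C = A * B for n x n matrices stored as lists of rows."""
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        row_c = c[i]
        for k in range(n):
            aik = a[i][k]
            row_b = b[k]
            for j in range(n):
                row_c[j] += aik * row_b[j]
    return c


def gflops(n, seconds):
    """Rate for one n x n multiply, counting 2*n^3 floating-point operations."""
    return (2.0 * n ** 3) / seconds / 1e9


n = 64  # illustrative size, far smaller than a real benchmark run
a = [[1.0] * n for _ in range(n)]
b = [[2.0] * n for _ in range(n)]

start = time.perf_counter()
c = dgemm(a, b, n)
elapsed = time.perf_counter() - start
print(f"{gflops(n, elapsed):.4f} GFLOP/s")
```

An interpreted triple loop like this will report a rate far below what an optimized, multi-threaded native kernel achieves; the point is only the relationship between problem size, elapsed time, and the reported GFLOP/s figure.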

Intel Open Image Denoise

Open Image Denoise is a denoising library for ray-tracing and is part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

Intel Open Image Denoise 1.2.0 (Images / Sec, More Is Better), gpu3Multicore1:
  Scene: Memorial = 6.93 (SE +/- 0.05, N = 3)

OpenVKL

OpenVKL is the Intel Open Volume Kernel Library, which offers high-performance volume computation kernels and is part of the Intel oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OpenVKL 0.9 (Items / Sec, More Is Better), gpu3Multicore1:
  Benchmark: vklBenchmark = 179 (MIN: 1 / MAX: 587)
  Benchmark: vklBenchmarkVdbVolume = 16110776 (SE +/- 189330.67, N = 3; MIN: 463741 / MAX: 96050880)
  Benchmark: vklBenchmarkStructuredVolume = 65718519 (SE +/- 249259.82, N = 3; MIN: 429600 / MAX: 792670464)
  Benchmark: vklBenchmarkUnstructuredVolume = 1358541 (SE +/- 2593.98, N = 3; MIN: 17295 / MAX: 4612260)

Coremark

This is a test of the EEMBC CoreMark processor benchmark. Learn more via the OpenBenchmarking.org test page.

Coremark 1.0 (Iterations/Sec, More Is Better), gpu3Multicore1:
  CoreMark Size 666 - Iterations Per Second = 490007.45 (SE +/- 885.70, N = 3)
(CC) gcc options: -O2 -lrt" -lrt

7-Zip Compression

This is a test of 7-Zip using p7zip with its integrated benchmark feature or upstream 7-Zip for the Windows x64 build. Learn more via the OpenBenchmarking.org test page.

7-Zip Compression 16.02 (MIPS, More Is Better), gpu3Multicore1:
  Compress Speed Test = 73186 (SE +/- 211.29, N = 3)
(CXX) g++ options: -pipe -lpthread

asmFish

This is a test of asmFish, an advanced chess benchmark written in Assembly. Learn more via the OpenBenchmarking.org test page.

asmFish 2018-07-23 (Nodes/second, More Is Better), gpu3Multicore1:
  1024 Hash Memory, 26 Depth = 39041533 (SE +/- 41169.03, N = 3)

Swet

Swet is a synthetic CPU/RAM benchmark that includes multi-processor test cases. Learn more via the OpenBenchmarking.org test page.

Swet 1.5.16 (Operations Per Second, More Is Better), gpu3Multicore1:
  Average = 757710060 (SE +/- 9820613.43, N = 3)
(CC) gcc options: -lm -lpthread -lcurses -lrt

ebizzy

This is a test of ebizzy, a program to generate workloads resembling web server workloads. Learn more via the OpenBenchmarking.org test page.

ebizzy 0.3 (Records/s, More Is Better), gpu3Multicore1:
  444948 (SE +/- 5403.48, N = 15)
(CC) gcc options: -pthread -lpthread -O3 -march=native

Timed FFmpeg Compilation

This test times how long it takes to build the FFmpeg multimedia library. Learn more via the OpenBenchmarking.org test page.

Timed FFmpeg Compilation 4.2.2 (Seconds, Fewer Is Better), gpu3Multicore1:
  Time To Compile = 39.72 (SE +/- 0.09, N = 3)

Timed GCC Compilation

This test times how long it takes to build the GNU Compiler Collection (GCC). Learn more via the OpenBenchmarking.org test page.

Timed GCC Compilation 9.3.0 (Seconds, Fewer Is Better), gpu3Multicore1:
  Time To Compile = 947.29 (SE +/- 0.65, N = 3)

Timed ImageMagick Compilation

This test times how long it takes to build ImageMagick. Learn more via the OpenBenchmarking.org test page.

Timed ImageMagick Compilation 6.9.0 (Seconds, Fewer Is Better), gpu3Multicore1:
  Time To Compile = 20.80 (SE +/- 0.10, N = 3)

Timed Linux Kernel Compilation

This test times how long it takes to build the Linux kernel in a default configuration. Learn more via the OpenBenchmarking.org test page.

Timed Linux Kernel Compilation 5.4 (Seconds, Fewer Is Better), gpu3Multicore1:
  Time To Compile = 49.81 (SE +/- 0.53, N = 3)

Timed LLVM Compilation

This test times how long it takes to build the LLVM compiler. Learn more via the OpenBenchmarking.org test page.

Timed LLVM Compilation 10.0 (Seconds, Fewer Is Better), gpu3Multicore1:
  Time To Compile = 429.22 (SE +/- 2.48, N = 3)

Timed MPlayer Compilation

This test times how long it takes to build the MPlayer open-source media player program. Learn more via the OpenBenchmarking.org test page.

Timed MPlayer Compilation 1.4 (Seconds, Fewer Is Better), gpu3Multicore1:
  Time To Compile = 25.11 (SE +/- 0.07, N = 3)

Build2

This test profile measures the time to bootstrap/install the build2 C++ build toolchain from source. Build2 is a cross-platform build toolchain for C/C++ code with Cargo-like features. Learn more via the OpenBenchmarking.org test page.

Build2 0.13 (Seconds, Fewer Is Better), gpu3Multicore1:
  Time To Compile = 90.05 (SE +/- 0.16, N = 3)

C-Ray

This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image. Learn more via the OpenBenchmarking.org test page.

C-Ray 1.1 (Seconds, Fewer Is Better), gpu3Multicore1:
  Total Time - 4K, 16 Rays Per Pixel = 34.58 (SE +/- 0.06, N = 3)
(CC) gcc options: -lm -lpthread -O3

Parallel BZIP2 Compression

This test measures the time needed to compress a file (a .tar package of the Linux kernel source code) using BZIP2 compression. Learn more via the OpenBenchmarking.org test page.

Parallel BZIP2 Compression 1.1.12 (Seconds, Fewer Is Better), gpu3Multicore1:
  256MB File Compression = 2.430 (SE +/- 0.009, N = 3)
(CXX) g++ options: -O2 -pthread -lbz2 -lpthread

Primesieve

Primesieve generates prime numbers using a highly optimized sieve of Eratosthenes implementation. Primesieve benchmarks the CPU's L1/L2 cache performance. Learn more via the OpenBenchmarking.org test page.

Primesieve 7.4 (Seconds, Fewer Is Better), gpu3Multicore1:
  1e12 Prime Number Generation = 13.64 (SE +/- 0.03, N = 3)
(CXX) g++ options: -O3 -lpthread
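To make the underlying technique concrete, here is a basic sieve of Eratosthenes in Python. It is a plain textbook sketch under the assumption that the whole range fits in memory, not Primesieve's optimized implementation, and the function name `count_primes` is illustrative.

```python
def count_primes(limit):
    """Count the primes <= limit with a basic sieve of Eratosthenes."""
    if limit < 2:
        return 0
    # One flag per integer in [0, limit]; 1 means "still considered prime".
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Cross off every multiple of p, starting at p*p.
            multiples = len(range(p * p, limit + 1, p))
            is_prime[p * p :: p] = bytearray(multiples)
        p += 1
    return sum(is_prime)


print(count_primes(1_000_000))
```

The flag array grows linearly with the limit, which is why a sieve that has to reach 1e12, as in the benchmark above, cannot simply scale up this approach and must work on smaller pieces of the range at a time.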

Rust Mandelbrot

This test profile is of the combined time for the serial and parallel Mandelbrot sets written in Rustlang via willi-kappler/mandel-rust. Learn more via the OpenBenchmarking.org test page.

Rust Mandelbrot (Seconds, Fewer Is Better), gpu3Multicore1:
  Time To Complete Serial/Parallel Mandelbrot = 38.55 (SE +/- 0.01, N = 3)
(CC) gcc options: -m64 -pie -nodefaultlibs -ldl -lrt -lpthread -lgcc_s -lc -lm -lutil

Rust Prime Benchmark

Based on petehunt/rust-benchmark, this is a prime number benchmark that is multi-threaded and written in Rustlang. Learn more via the OpenBenchmarking.org test page.

Rust Prime Benchmark (Seconds, Fewer Is Better), gpu3Multicore1:
  Prime Number Test To 200,000,000 = 20.47 (SE +/- 0.01, N = 3)
(CC) gcc options: -m64 -pie -nodefaultlibs -ldl -lrt -lpthread -lgcc_s -lc -lm -lutil

Smallpt

Smallpt is a C++ global illumination renderer written in less than 100 lines of code. Global illumination is done via unbiased Monte Carlo path tracing and there is multi-threading support via the OpenMP library. Learn more via the OpenBenchmarking.org test page.

Smallpt 1.0 (Seconds, Fewer Is Better), gpu3Multicore1:
  Global Illumination Renderer; 128 Samples = 6.618 (SE +/- 0.008, N = 3)
(CXX) g++ options: -fopenmp -O3

Tungsten Renderer

Tungsten is a C++ physically based renderer that makes use of Intel's Embree ray tracing library. Learn more via the OpenBenchmarking.org test page.

Tungsten Renderer 0.2.2 (Seconds, Fewer Is Better), gpu3Multicore1:
  Scene: Hair = 16.82 (SE +/- 0.03, N = 3)
  Scene: Water Caustic = 26.42 (SE +/- 0.09, N = 3)
  Scene: Non-Exponential = 10.10 (SE +/- 0.04, N = 3)
  Scene: Volumetric Caustic = 10.15 (SE +/- 0.01, N = 3)
(CXX) g++ options: -std=c++0x -march=broadwell -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -mfma -mbmi2 -mno-avx -mno-avx2 -mno-xop -mno-fma4 -mno-avx512f -mno-avx512vl -mno-avx512pf -mno-avx512er -mno-avx512cd -mno-avx512dq -mno-avx512bw -mno-avx512ifma -mno-avx512vbmi -fstrict-aliasing -O3 -rdynamic -lpthread -ldl

AOBench

AOBench is a lightweight ambient occlusion renderer, written in C. The test profile is using a size of 2048 x 2048. Learn more via the OpenBenchmarking.org test page.

AOBench (Seconds, Fewer Is Better), gpu3Multicore1:
  Size: 2048 x 2048 - Total Time = 44.50 (SE +/- 0.14, N = 3)
(CC) gcc options: -lm -O3

Timed Eigen Compilation

This test times how long it takes to build all Eigen examples. The Eigen examples are compiled serially. Eigen is a C++ template library for linear algebra. Learn more via the OpenBenchmarking.org test page.

Timed Eigen Compilation 3.3.9 (Seconds, Fewer Is Better), gpu3Multicore1:
  Time To Compile = 63.58 (SE +/- 0.06, N = 3)

FFmpeg

This test uses FFmpeg for testing the system's audio/video encoding performance. Learn more via the OpenBenchmarking.org test page.

FFmpeg 4.0.2 (Seconds, Fewer Is Better), gpu3Multicore1:
  H.264 HD To NTSC DV = 7.688 (SE +/- 0.050, N = 3)
(CC) gcc options: -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -lm -lxcb -lxcb-shape -lxcb-xfixes -lxcb-render -pthread -lbz2 -std=c11 -fomit-frame-pointer -O3 -fno-math-errno -fno-signed-zeros -fno-tree-vectorize -MMD -MF -MT

m-queens

A solver for the N-queens problem with multi-threading support via the OpenMP library. Learn more via the OpenBenchmarking.org test page.

m-queens 1.2 (Seconds, Fewer Is Better), gpu3Multicore1:
  Time To Solve = 36.96 (SE +/- 0.08, N = 3)
(CXX) g++ options: -fopenmp -O2 -march=native

N-Queens

This is a test of the OpenMP version of a test that solves the N-queens problem. The board problem size is 18. Learn more via the OpenBenchmarking.org test page.

N-Queens 1.0 (Seconds, Fewer Is Better), gpu3Multicore1:
  Elapsed Time = 8.010 (SE +/- 0.001, N = 3)
(CC) gcc options: -static -fopenmp -O3 -march=native
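As an illustration of the problem both queens benchmarks solve, the sketch below counts N-queens solutions with bitmask backtracking. It is a single-threaded Python sketch, not the OpenMP C/C++ code the test profiles run, and it uses a board size of 8 rather than 18 so that it finishes quickly; the function name `count_n_queens` is illustrative.

```python
def count_n_queens(n):
    """Count placements of n non-attacking queens on an n x n board."""
    full = (1 << n) - 1  # bitmask with one bit per column

    def place(cols, diag_left, diag_right):
        # cols, diag_left, diag_right mark squares attacked in the current row.
        if cols == full:
            return 1  # every row has a queen
        total = 0
        free = full & ~(cols | diag_left | diag_right)
        while free:
            bit = free & -free  # lowest free column
            free -= bit
            total += place(
                cols | bit,
                ((diag_left | bit) << 1) & full,
                (diag_right | bit) >> 1,
            )
        return total

    return place(0, 0, 0)


print(count_n_queens(8))
```

The search tree grows very quickly with the board size, which is why the size-18 problem takes seconds even for optimized, multi-threaded native code.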

Radiance Benchmark

This is a benchmark of NREL Radiance, a synthetic imaging system that is open-source and developed by the Lawrence Berkeley National Laboratory in California. Learn more via the OpenBenchmarking.org test page.

Radiance Benchmark 5.0 (Seconds, Fewer Is Better), gpu3Multicore1:
  Test: Serial = 758.36
  Test: SMP Parallel = 235.27

Tachyon

This is a test of the threaded Tachyon, a parallel ray-tracing system, measuring the time to ray-trace a sample scene. Learn more via the OpenBenchmarking.org test page.

Tachyon 0.99b6 (Seconds, Fewer Is Better), gpu3Multicore1:
  Total Time = 55.46 (SE +/- 0.15, N = 3)
(CC) gcc options: -m64 -O3 -fomit-frame-pointer -ffast-math -ltachyon -lm -lpthread

Aircrack-ng

Aircrack-ng is a tool for assessing WiFi/WLAN network security. Learn more via the OpenBenchmarking.org test page.

Aircrack-ng 1.5.2 (k/s, More Is Better), gpu3Multicore1:
  28284.28 (SE +/- 68.07, N = 3)
(CXX) g++ options: -O3 -fvisibility=hidden -masm=intel -fcommon -rdynamic -lpthread -lz -lcrypto -lhwloc -ldl -lm -pthread

ASKAP

ASKAP is a set of benchmarks from the Australian SKA Pathfinder. The principal ASKAP benchmarks are the Hogbom Clean Benchmark (tHogbomClean) and the Convolutional Resampling Benchmark (tConvolve); some earlier ASKAP benchmarks are also included for OpenCL and CUDA execution of tConvolve. Learn more via the OpenBenchmarking.org test page.

ASKAP 1.0 (More Is Better), gpu3Multicore1:
  Test: tConvolve MT - Gridding (Million Grid Points Per Second) = 1575.19 (SE +/- 0.61, N = 3)
  Test: tConvolve MT - Degridding (Million Grid Points Per Second) = 2273.88 (SE +/- 2.64, N = 3)
  Test: tConvolve OpenMP - Gridding (Million Grid Points Per Second) = 1857.74 (SE +/- 11.39, N = 3)
  Test: tConvolve OpenMP - Degridding (Million Grid Points Per Second) = 2716.9 (SE +/- 0.00, N = 3)
  Test: Hogbom Clean OpenMP (Iterations Per Second) = 268.34 (SE +/- 0.24, N = 3)
(CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

Intel MPI Benchmarks

The Intel MPI Benchmarks stress MPI implementations. At this point the test profile aggregates results for some common MPI functionality. Learn more via the OpenBenchmarking.org test page.

Intel MPI Benchmarks 2019.3, gpu3Multicore1:
  Test: IMB-P2P PingPong (Average Msg/sec, More Is Better) = 6329420 (SE +/- 35072.49, N = 3; MIN: 1185 / MAX: 15350414)
  Test: IMB-MPI1 Exchange (Average Mbytes/sec, More Is Better) = 1701.77 (SE +/- 136.55, N = 12; MAX: 18194.89)
  Test: IMB-MPI1 Exchange (Average usec, Fewer Is Better) = 575.42 (SE +/- 17.24, N = 12; MIN: 0.3 / MAX: 17330.78)
  Test: IMB-MPI1 PingPong (Average Mbytes/sec, More Is Better) = 2338.83 (SE +/- 442.13, N = 12; MIN: 3.77 / MAX: 11785.09)
  Test: IMB-MPI1 Sendrecv (Average Mbytes/sec, More Is Better) = 2001.41 (SE +/- 147.96, N = 15; MAX: 19536.02)
  Test: IMB-MPI1 Sendrecv (Average usec, Fewer Is Better) = 270.90 (SE +/- 5.97, N = 15; MIN: 0.16 / MAX: 7916.44)
(CXX) g++ options: -O0 -pedantic -fopenmp -pthread -lmpi_cxx -lmpi

GROMACS

This test runs the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package on the CPU with the water_GMX50 data. Learn more via the OpenBenchmarking.org test page.

GROMACS 2020.3 (Ns Per Day, More Is Better), gpu3Multicore1:
  Water Benchmark = 1.130 (SE +/- 0.002, N = 3)
(CXX) g++ options: -O3 -pthread -lrt -lpthread -lm

MariaDB

This is a MariaDB MySQL database server benchmark making use of mysqlslap. Learn more via the OpenBenchmarking.org test page.

MariaDB 10.5.2 (Queries Per Second, More Is Better), gpu3Multicore1:
  Clients: 1 = 1813 (SE +/- 7.16, N = 3)
  Clients: 4 = 1073 (SE +/- 2.39, N = 3)
  Clients: 8 = 1018 (SE +/- 3.01, N = 3)
  Clients: 16 = 945 (SE +/- 1.66, N = 3)
  Clients: 32 = 322 (SE +/- 62.25, N = 9)
  Clients: 64 = 219 (SE +/- 0.35, N = 3)
  Clients: 128 = 177 (SE +/- 0.10, N = 3)
  Clients: 256 = 171 (SE +/- 0.37, N = 3)
  Clients: 512 = 170 (SE +/- 0.29, N = 3)
(CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -lbz2 -lnuma -lcrypt -lz -lm -lssl -lcrypto -ldl

Sysbench

This is a benchmark of Sysbench with CPU and memory sub-tests. Learn more via the OpenBenchmarking.org test page.

Sysbench 2018-07-28 (Events Per Second, More Is Better), gpu3Multicore1:
  Test: Memory = 7093621.19 (SE +/- 3413.58, N = 3)
  Test: CPU = 34334.37 (SE +/- 2.71, N = 3)
(CC) gcc options: -pthread -O3 -funroll-loops -ggdb3 -march=amdfam10 -rdynamic -ldl -lm

Blender

Blender is an open-source 3D creation software project. This test is of Blender's Cycles benchmark with various sample files. GPU computing via OpenCL or CUDA is supported. Learn more via the OpenBenchmarking.org test page.

Blender 2.90 (Seconds, Fewer Is Better), gpu3Multicore1:
  Blend File: BMW27 - Compute: CUDA = 27.40 (SE +/- 0.02, N = 3)
  Blend File: BMW27 - Compute: OpenCL = 316.48 (SE +/- 0.57, N = 3)
  Blend File: BMW27 - Compute: CPU-Only = 99.46 (SE +/- 0.23, N = 3)
  Blend File: Classroom - Compute: CUDA = 67.35 (SE +/- 0.53, N = 3)
  Blend File: Fishy Cat - Compute: CUDA = 70.47 (SE +/- 0.72, N = 3)
  Blend File: Barbershop - Compute: CUDA = 236.16 (SE +/- 1.46, N = 3)
  Blend File: Classroom - Compute: OpenCL = 324.61 (SE +/- 1.77, N = 3)
  Blend File: Fishy Cat - Compute: OpenCL = 854.84 (SE +/- 4.14, N = 3)
  Blend File: Barbershop - Compute: OpenCL = 575.76 (SE +/- 2.24, N = 3)
  Blend File: BMW27 - Compute: NVIDIA OptiX = 316.48 (SE +/- 0.62, N = 3)
  Blend File: Classroom - Compute: CPU-Only = 290.41 (SE +/- 0.30, N = 3)
  Blend File: Fishy Cat - Compute: CPU-Only = 143.83 (SE +/- 0.04, N = 3)
  Blend File: Barbershop - Compute: CPU-Only = 449.85 (SE +/- 0.88, N = 3)
  Blend File: Classroom - Compute: NVIDIA OptiX = 321.26 (SE +/- 0.41, N = 3)
  Blend File: Fishy Cat - Compute: NVIDIA OptiX = 852.90 (SE +/- 4.49, N = 3)
  Blend File: Barbershop - Compute: NVIDIA OptiX = 585.55 (SE +/- 2.13, N = 3)
  Blend File: Pabellon Barcelona - Compute: CUDA = 192.15 (SE +/- 1.07, N = 3)
  Blend File: Pabellon Barcelona - Compute: OpenCL = 957.47 (SE +/- 1.53, N = 3)
  Blend File: Pabellon Barcelona - Compute: CPU-Only = 339.34 (SE +/- 0.86, N = 3)
  Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX = 954.94 (SE +/- 5.63, N = 3)
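Because these are "fewer is better" timings, comparing compute back ends is a matter of dividing one time by another. The short sketch below does that arithmetic for the CUDA and CPU-Only times reported above; the dictionary layout and variable names are illustrative, and the ratios describe only this one system and run.

```python
# Blender 2.90 render times in seconds, copied from the CUDA and CPU-Only results above.
times = {
    "BMW27": {"CUDA": 27.40, "CPU-Only": 99.46},
    "Classroom": {"CUDA": 67.35, "CPU-Only": 290.41},
    "Fishy Cat": {"CUDA": 70.47, "CPU-Only": 143.83},
    "Barbershop": {"CUDA": 236.16, "CPU-Only": 449.85},
    "Pabellon Barcelona": {"CUDA": 192.15, "CPU-Only": 339.34},
}

# Speedup of CUDA over CPU-Only: how many times shorter the CUDA render is.
speedup = {scene: t["CPU-Only"] / t["CUDA"] for scene, t in times.items()}

for scene, ratio in speedup.items():
    print(f"{scene}: CUDA is {ratio:.2f}x faster than CPU-Only")
```

The same division works for any pair of back ends in the list, as long as both times come from the same blend file.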

Xsbench

XSBench is a mini-app representing a key computational kernel of the Monte Carlo neutronics application OpenMC. Learn more via the OpenBenchmarking.org test page.

Xsbench 2017-07-06 (Lookups/s, More Is Better), gpu3Multicore1:
  2979914 (SE +/- 920.47, N = 3)
(CC) gcc options: -std=gnu99 -fopenmp -O3 -lm

167 Results Shown

High Performance Conjugate Gradient
NAS Parallel Benchmarks:
  BT.C
  CG.C
  EP.C
  EP.D
  FT.C
  IS.D
  LU.C
  MG.C
  SP.B
Parboil:
  OpenMP LBM
  OpenMP CUTCP
  OpenMP MRI-Q
  OpenMP Stencil
  OpenMP MRI Gridding
Rodinia:
  OpenMP LavaMD
  OpenMP HotSpot3D
  OpenMP Leukocyte
  OpenMP CFD Solver
  OpenMP Streamcluster
NAMD
Pennant:
  sedovbig
  leblancbig
LAMMPS Molecular Dynamics Simulator:
  20k Atoms
  Rhodopsin Protein
libgav1:
  Chimera 1080p
  Summer Nature 4K
  Summer Nature 1080p
  Chimera 1080p 10-bit
Zstd Compression:
  3
  19
ArrayFire
John The Ripper:
  Blowfish
  MD5
Open FMM Nero2D
oneDNN:
  IP Shapes 1D - f32 - CPU
  IP Shapes 3D - f32 - CPU
  IP Shapes 1D - u8s8f32 - CPU
  IP Shapes 3D - u8s8f32 - CPU
  Convolution Batch Shapes Auto - f32 - CPU
  Deconvolution Batch shapes_1d - f32 - CPU
  Deconvolution Batch shapes_3d - f32 - CPU
  Convolution Batch Shapes Auto - u8s8f32 - CPU
  Deconvolution Batch shapes_1d - u8s8f32 - CPU
  Deconvolution Batch shapes_3d - u8s8f32 - CPU
  Recurrent Neural Network Training - f32 - CPU
  Recurrent Neural Network Inference - f32 - CPU
  Recurrent Neural Network Training - u8s8f32 - CPU
  Recurrent Neural Network Inference - u8s8f32 - CPU
  Matrix Multiply Batch Shapes Transformer - f32 - CPU
  Recurrent Neural Network Training - bf16bf16bf16 - CPU
  Recurrent Neural Network Inference - bf16bf16bf16 - CPU
  Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU
OSPray:
  San Miguel - SciVis
  XFrog Forest - SciVis
  San Miguel - Path Tracer
  NASA Streamlines - SciVis
  XFrog Forest - Path Tracer
  Magnetic Reconnection - SciVis
  NASA Streamlines - Path Tracer
  Magnetic Reconnection - Path Tracer
AOM AV1:
  Speed 0 Two-Pass
  Speed 4 Two-Pass
  Speed 6 Realtime
  Speed 6 Two-Pass
  Speed 8 Realtime
Kvazaar:
  Bosphorus 4K - Slow
  Bosphorus 4K - Medium
  Bosphorus 1080p - Slow
  Bosphorus 1080p - Medium
  Bosphorus 4K - Very Fast
  Bosphorus 4K - Ultra Fast
  Bosphorus 1080p - Very Fast
  Bosphorus 1080p - Ultra Fast
rav1e:
  1
  5
  6
  10
SVT-AV1:
  Enc Mode 0 - 1080p
  Enc Mode 4 - 1080p
  Enc Mode 8 - 1080p
SVT-HEVC
VP9 libvpx Encoding:
  Speed 0
  Speed 5
x265:
  Bosphorus 4K
  Bosphorus 1080p
ACES DGEMM
Intel Open Image Denoise
OpenVKL:
  vklBenchmark
  vklBenchmarkVdbVolume
  vklBenchmarkStructuredVolume
  vklBenchmarkUnstructuredVolume
Coremark
7-Zip Compression
asmFish
Swet
ebizzy
Timed FFmpeg Compilation
Timed GCC Compilation
Timed ImageMagick Compilation
Timed Linux Kernel Compilation
Timed LLVM Compilation
Timed MPlayer Compilation
Build2
C-Ray
Parallel BZIP2 Compression
Primesieve
Rust Mandelbrot
Rust Prime Benchmark
Smallpt
Tungsten Renderer:
  Hair
  Water Caustic
  Non-Exponential
  Volumetric Caustic
AOBench
Timed Eigen Compilation
FFmpeg
m-queens
N-Queens
Radiance Benchmark:
  Serial
  SMP Parallel
Tachyon
Aircrack-ng
ASKAP:
  tConvolve MT - Gridding
  tConvolve MT - Degridding
  tConvolve OpenMP - Gridding
  tConvolve OpenMP - Degridding
  Hogbom Clean OpenMP
Intel MPI Benchmarks:
  IMB-P2P PingPong
  IMB-MPI1 Exchange
  IMB-MPI1 Exchange
  IMB-MPI1 PingPong
  IMB-MPI1 Sendrecv
  IMB-MPI1 Sendrecv
GROMACS
MariaDB:
  1
  4
  8
  16
  32
  64
  128
  256
  512
Sysbench:
  Memory
  CPU
Blender:
  BMW27 - CUDA
  BMW27 - OpenCL
  BMW27 - CPU-Only
  Classroom - CUDA
  Fishy Cat - CUDA
  Barbershop - CUDA
  Classroom - OpenCL
  Fishy Cat - OpenCL
  Barbershop - OpenCL
  BMW27 - NVIDIA OptiX
  Classroom - CPU-Only
  Fishy Cat - CPU-Only
  Barbershop - CPU-Only
  Classroom - NVIDIA OptiX
  Fishy Cat - NVIDIA OptiX
  Barbershop - NVIDIA OptiX
  Pabellon Barcelona - CUDA
  Pabellon Barcelona - OpenCL
  Pabellon Barcelona - CPU-Only
  Pabellon Barcelona - NVIDIA OptiX
Xsbench