gpu3Multicore1

AMD Ryzen Threadripper 2950X 16-Core testing with an ASRock X399 Professional Gaming (P3.80 BIOS) and MSI NVIDIA GeForce GTX 1080 8GB on Ubuntu 16.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2102113-HA-GPU3MULTI38.

gpu3Multicore1

Processor: AMD Ryzen Threadripper 2950X 16-Core @ 3.50GHz (16 Cores / 32 Threads)
Motherboard: ASRock X399 Professional Gaming (P3.80 BIOS)
Chipset: AMD 17h
Memory: 126GB
Disk: 1000GB Samsung SSD 860
Graphics: MSI NVIDIA GeForce GTX 1080 8GB
Audio: NVIDIA GP104 HD Audio
Network: Aquantia AQC107 NBase-T/IEEE + 2 x Intel I211 + Intel Dual Band-AC 3168NGW
OS: Ubuntu 16.04
Kernel: 4.19.174-custom (x86_64)
Display Server: X Server 1.19.6
Display Driver: NVIDIA
OpenCL: OpenCL 1.2 CUDA 10.1.120
Vulkan: 1.1.99
Compiler: GCC 5.4.0 20160609 + Clang 3.8.0-2ubuntu4 + CUDA 9.2
File-System: ext4
Screen Resolution: 640x480

- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-browser-plugin --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-arch-directory=amd64 --with-default-libstdcxx-abi=new --with-multilib-list=m32,m64,mx32 --with-tune=generic -v
- Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0x800820b
- Python 2.7.12 + Python 3.5.2
- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + srbds: Not affected + tsx_async_abort: Not affected

gpu3Multicore1hpcg: npb: BT.Cnpb: CG.Cnpb: EP.Cnpb: EP.Dnpb: FT.Cnpb: IS.Dnpb: LU.Cnpb: MG.Cnpb: SP.Bparboil: OpenMP LBMparboil: OpenMP CUTCPparboil: OpenMP MRI-Qparboil: OpenMP Stencilparboil: OpenMP MRI Griddingrodinia: OpenMP LavaMDrodinia: OpenMP HotSpot3Drodinia: OpenMP Leukocyterodinia: OpenMP CFD Solverrodinia: OpenMP Streamclusternamd: ATPase Simulation - 327,506 Atomspennant: sedovbigpennant: leblancbiglammps: 20k Atomslammps: Rhodopsin Proteinlibgav1: Chimera 1080plibgav1: Summer Nature 4Klibgav1: Summer Nature 1080plibgav1: Chimera 1080p 10-bitcompress-zstd: 3compress-zstd: 19arrayfire: BLAS CPUjohn-the-ripper: Blowfishjohn-the-ripper: MD5nero2d: Total Timeonednn: IP Shapes 1D - f32 - CPUonednn: IP Shapes 3D - f32 - CPUonednn: IP Shapes 1D - u8s8f32 - CPUonednn: IP Shapes 3D - u8s8f32 - CPUonednn: Convolution Batch Shapes Auto - f32 - CPUonednn: Deconvolution Batch shapes_1d - f32 - CPUonednn: Deconvolution Batch shapes_3d - f32 - CPUonednn: Convolution Batch Shapes Auto - u8s8f32 - CPUonednn: Deconvolution Batch shapes_1d - u8s8f32 - CPUonednn: Deconvolution Batch shapes_3d - u8s8f32 - CPUonednn: Recurrent Neural Network Training - f32 - CPUonednn: Recurrent Neural Network Inference - f32 - CPUonednn: Recurrent Neural Network Training - u8s8f32 - CPUonednn: Recurrent Neural Network Inference - u8s8f32 - CPUonednn: Matrix Multiply Batch Shapes Transformer - f32 - CPUonednn: Recurrent Neural Network Training - bf16bf16bf16 - CPUonednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPUonednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPUospray: San Miguel - SciVisospray: XFrog Forest - SciVisospray: San Miguel - Path Tracerospray: NASA Streamlines - SciVisospray: XFrog Forest - Path Tracerospray: Magnetic Reconnection - SciVisospray: NASA Streamlines - Path Tracerospray: Magnetic Reconnection - Path Traceraom-av1: Speed 0 Two-Passaom-av1: Speed 4 Two-Passaom-av1: Speed 6 Realtimeaom-av1: Speed 6 Two-Passaom-av1: Speed 8 Realtimekvazaar: 
Bosphorus 4K - Slowkvazaar: Bosphorus 4K - Mediumkvazaar: Bosphorus 1080p - Slowkvazaar: Bosphorus 1080p - Mediumkvazaar: Bosphorus 4K - Very Fastkvazaar: Bosphorus 4K - Ultra Fastkvazaar: Bosphorus 1080p - Very Fastkvazaar: Bosphorus 1080p - Ultra Fastrav1e: 1rav1e: 5rav1e: 6rav1e: 10svt-av1: Enc Mode 0 - 1080psvt-av1: Enc Mode 4 - 1080psvt-av1: Enc Mode 8 - 1080psvt-hevc: 1080p 8-bit YUV To HEVC Video Encodevpxenc: Speed 0vpxenc: Speed 5x265: Bosphorus 4Kx265: Bosphorus 1080pmt-dgemm: Sustained Floating-Point Rateoidn: Memorialopenvkl: vklBenchmarkopenvkl: vklBenchmarkVdbVolumeopenvkl: vklBenchmarkStructuredVolumeopenvkl: vklBenchmarkUnstructuredVolumecoremark: CoreMark Size 666 - Iterations Per Secondcompress-7zip: Compress Speed Testasmfish: 1024 Hash Memory, 26 Depthswet: Averageebizzy: build-ffmpeg: Time To Compilebuild-gcc: Time To Compilebuild-imagemagick: Time To Compilebuild-linux-kernel: Time To Compilebuild-llvm: Time To Compilebuild-mplayer: Time To Compilebuild2: Time To Compilec-ray: Total Time - 4K, 16 Rays Per Pixelcompress-pbzip2: 256MB File Compressionprimesieve: 1e12 Prime Number Generationrust-mandel: Time To Complete Serial/Parallel Mandelbrotrust-prime: Prime Number Test To 200,000,000smallpt: Global Illumination Renderer; 128 Samplestungsten: Hairtungsten: Water Caustictungsten: Non-Exponentialtungsten: Volumetric Causticaobench: 2048 x 2048 - Total Timebuild-eigen: Time To Compileffmpeg: H.264 HD To NTSC DVm-queens: Time To Solven-queens: Elapsed Timeradiance: Serialradiance: SMP Paralleltachyon: Total Timeaircrack-ng: askap: tConvolve MT - Griddingaskap: tConvolve MT - Degriddingaskap: tConvolve OpenMP - Griddingaskap: tConvolve OpenMP - Degriddingaskap: Hogbom Clean OpenMPintel-mpi: IMB-P2P PingPongintel-mpi: IMB-MPI1 Exchangeintel-mpi: IMB-MPI1 Exchangeintel-mpi: IMB-MPI1 PingPongintel-mpi: IMB-MPI1 Sendrecvintel-mpi: IMB-MPI1 Sendrecvgromacs: Water Benchmarkmysqlslap: 1mysqlslap: 4mysqlslap: 8mysqlslap: 16mysqlslap: 32mysqlslap: 
64mysqlslap: 128mysqlslap: 256mysqlslap: 512sysbench: Memorysysbench: CPUblender: BMW27 - CUDAblender: BMW27 - OpenCLblender: BMW27 - CPU-Onlyblender: Classroom - CUDAblender: Fishy Cat - CUDAblender: Barbershop - CUDAblender: Classroom - OpenCLblender: Fishy Cat - OpenCLblender: Barbershop - OpenCLblender: BMW27 - NVIDIA OptiXblender: Classroom - CPU-Onlyblender: Fishy Cat - CPU-Onlyblender: Barbershop - CPU-Onlyblender: Classroom - NVIDIA OptiXblender: Fishy Cat - NVIDIA OptiXblender: Barbershop - NVIDIA OptiXblender: Pabellon Barcelona - CUDAblender: Pabellon Barcelona - OpenCLblender: Pabellon Barcelona - CPU-Onlyblender: Pabellon Barcelona - NVIDIA OptiXxsbench: gpu3Multicore17.0288137405.638001.35631.90621.1819497.22877.8742621.3917236.6013344.8871.6793572.1220726.0890538.251748170.793706330.70597.437108.62017.98819.5081.3097751.0319538.6688110.92110.01937.2517.0353.7115.295059.145.4410.33234430105266732.4544.103595.039662.939451.5012810.04484.750167.5212913.43208.317235.313694043.692153.944055.952150.862.593474043.522153.612.4544517.543.051.3322.221.5812.664.42166.670.261.8715.062.9727.126.947.0625.9926.9216.7731.1659.84106.810.3471.0421.3713.0320.1264.70335.69472.355.9318.275.0714.651.6108736.9317916110776657185191358541490007.450348731863904153375771006044494839.720947.29020.80149.810429.22025.10690.04534.5842.43013.63938.55420.4656.61816.821426.417010.103810.146244.49663.5767.68836.9648.010758.359235.27255.463628284.2821575.192273.881857.742716.9268.33763294201701.77575.422338.832001.41270.901.1301813107310189453222191771711707093621.185334334.374527.40316.4899.4667.3570.47236.16324.61854.84575.76316.48290.41143.83449.85321.26852.90585.55192.15957.47339.34954.942979914OpenBenchmarking.org

High Performance Conjugate Gradient

High Performance Conjugate Gradient 3.1 (GFLOP/s, More Is Better)
gpu3Multicore1: 7.02881 (SE +/- 0.01315, N = 3)
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -pthread -lmpi_cxx -lmpi
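Each result in this export reports a mean together with a standard error (SE) and a run count (N). As a rough reading aid, the Python sketch below turns the HPCG figures (7.02881 GFLOP/s, SE +/- 0.01315, N = 3) into a relative standard error and an implied run-to-run standard deviation. It assumes SE = s / sqrt(N), the usual definition of the standard error of the mean; the function names are illustrative and are not part of the Phoronix Test Suite.

```python
import math


def relative_standard_error(mean, se):
    """Return the standard error as a percentage of the mean."""
    return 100.0 * se / mean


def implied_std_dev(se, n):
    """Recover the sample standard deviation, assuming SE = s / sqrt(n)."""
    return se * math.sqrt(n)


# Figures copied from the HPCG result in this file.
hpcg_mean, hpcg_se, hpcg_n = 7.02881, 0.01315, 3

print(f"relative SE: {relative_standard_error(hpcg_mean, hpcg_se):.3f}%")
print(f"implied std dev: {implied_std_dev(hpcg_se, hpcg_n):.5f} GFLOP/s")
```

A small relative SE suggests the runs were tightly clustered; results with N = 15 in this file are ones where more runs were recorded.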

NAS Parallel Benchmarks

Test / Class: BT.C

NAS Parallel Benchmarks 3.4, Test / Class: BT.C (Total Mop/s, More Is Better)
gpu3Multicore1: 37405.63 (SE +/- 37.34, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 1.10.2

NAS Parallel Benchmarks

Test / Class: CG.C

NAS Parallel Benchmarks 3.4, Test / Class: CG.C (Total Mop/s, More Is Better)
gpu3Multicore1: 8001.35 (SE +/- 8.06, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 1.10.2

NAS Parallel Benchmarks

Test / Class: EP.C

NAS Parallel Benchmarks 3.4, Test / Class: EP.C (Total Mop/s, More Is Better)
gpu3Multicore1: 631.90 (SE +/- 0.54, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 1.10.2

NAS Parallel Benchmarks

Test / Class: EP.D

NAS Parallel Benchmarks 3.4, Test / Class: EP.D (Total Mop/s, More Is Better)
gpu3Multicore1: 621.18 (SE +/- 0.35, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 1.10.2

NAS Parallel Benchmarks

Test / Class: FT.C

NAS Parallel Benchmarks 3.4, Test / Class: FT.C (Total Mop/s, More Is Better)
gpu3Multicore1: 19497.22 (SE +/- 18.12, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 1.10.2

NAS Parallel Benchmarks

Test / Class: IS.D

NAS Parallel Benchmarks 3.4, Test / Class: IS.D (Total Mop/s, More Is Better)
gpu3Multicore1: 877.87 (SE +/- 0.89, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 1.10.2

NAS Parallel Benchmarks

Test / Class: LU.C

NAS Parallel Benchmarks 3.4, Test / Class: LU.C (Total Mop/s, More Is Better)
gpu3Multicore1: 42621.39 (SE +/- 122.06, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 1.10.2

NAS Parallel Benchmarks

Test / Class: MG.C

NAS Parallel Benchmarks 3.4, Test / Class: MG.C (Total Mop/s, More Is Better)
gpu3Multicore1: 17236.60 (SE +/- 8.84, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 1.10.2

NAS Parallel Benchmarks

Test / Class: SP.B

NAS Parallel Benchmarks 3.4, Test / Class: SP.B (Total Mop/s, More Is Better)
gpu3Multicore1: 13344.88 (SE +/- 29.92, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 1.10.2

Parboil

Test: OpenMP LBM

Parboil 2.5, Test: OpenMP LBM (Seconds, Fewer Is Better)
gpu3Multicore1: 71.68 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

Parboil

Test: OpenMP CUTCP

Parboil 2.5, Test: OpenMP CUTCP (Seconds, Fewer Is Better)
gpu3Multicore1: 2.122072 (SE +/- 0.020422, N = 3)
1. (CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

Parboil

Test: OpenMP MRI-Q

Parboil 2.5, Test: OpenMP MRI-Q (Seconds, Fewer Is Better)
gpu3Multicore1: 6.089053 (SE +/- 0.001530, N = 3)
1. (CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

Parboil

Test: OpenMP Stencil

Parboil 2.5, Test: OpenMP Stencil (Seconds, Fewer Is Better)
gpu3Multicore1: 8.251748 (SE +/- 0.029315, N = 3)
1. (CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

Parboil

Test: OpenMP MRI Gridding

Parboil 2.5, Test: OpenMP MRI Gridding (Seconds, Fewer Is Better)
gpu3Multicore1: 170.79 (SE +/- 0.80, N = 3)
1. (CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

Rodinia

Test: OpenMP LavaMD

Rodinia 3.1, Test: OpenMP LavaMD (Seconds, Fewer Is Better)
gpu3Multicore1: 330.71 (SE +/- 0.71, N = 3)
1. (CXX) g++ options: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl

Rodinia

Test: OpenMP HotSpot3D

Rodinia 3.1, Test: OpenMP HotSpot3D (Seconds, Fewer Is Better)
gpu3Multicore1: 97.44 (SE +/- 0.83, N = 3)
1. (CXX) g++ options: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl

Rodinia

Test: OpenMP Leukocyte

Rodinia 3.1, Test: OpenMP Leukocyte (Seconds, Fewer Is Better)
gpu3Multicore1: 108.62 (SE +/- 0.57, N = 3)
1. (CXX) g++ options: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl

Rodinia

Test: OpenMP CFD Solver

Rodinia 3.1, Test: OpenMP CFD Solver (Seconds, Fewer Is Better)
gpu3Multicore1: 17.99 (SE +/- 0.06, N = 3)
1. (CXX) g++ options: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl

Rodinia

Test: OpenMP Streamcluster

Rodinia 3.1, Test: OpenMP Streamcluster (Seconds, Fewer Is Better)
gpu3Multicore1: 19.51 (SE +/- 0.23, N = 15)
1. (CXX) g++ options: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl

NAMD

ATPase Simulation - 327,506 Atoms

NAMD 2.14, ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better)
gpu3Multicore1: 1.30977 (SE +/- 0.00223, N = 3)
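NAMD reports its result in days/ns (fewer is better), whereas the LAMMPS results in this same file are in ns/day (more is better). As a convenience only, the Python sketch below takes the reciprocal to express the NAMD figure in ns/day; the converted number is derived here and is not reported by the export, and the function name is illustrative.

```python
def days_per_ns_to_ns_per_day(days_per_ns):
    """Convert a days/ns figure into ns/day by taking the reciprocal."""
    if days_per_ns <= 0:
        raise ValueError("days/ns must be positive")
    return 1.0 / days_per_ns


# NAMD ATPase result copied from this file (days/ns).
namd_days_per_ns = 1.30977

print(f"{days_per_ns_to_ns_per_day(namd_days_per_ns):.4f} ns/day")
```

The two workloads simulate different systems, so the converted figure only aligns the units and does not make the NAMD and LAMMPS results directly comparable.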

Pennant

Test: sedovbig

Pennant 1.0.1, Test: sedovbig (Hydro Cycle Time - Seconds, Fewer Is Better)
gpu3Multicore1: 51.03 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -fopenmp -pthread -lmpi_cxx -lmpi

Pennant

Test: leblancbig

Pennant 1.0.1, Test: leblancbig (Hydro Cycle Time - Seconds, Fewer Is Better)
gpu3Multicore1: 38.67 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -fopenmp -pthread -lmpi_cxx -lmpi

LAMMPS Molecular Dynamics Simulator

Model: 20k Atoms

LAMMPS Molecular Dynamics Simulator 29Oct2020, Model: 20k Atoms (ns/day, More Is Better)
gpu3Multicore1: 10.92 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -O3 -pthread -lm

LAMMPS Molecular Dynamics Simulator

Model: Rhodopsin Protein

LAMMPS Molecular Dynamics Simulator 29Oct2020, Model: Rhodopsin Protein (ns/day, More Is Better)
gpu3Multicore1: 10.02 (SE +/- 0.11, N = 15)
1. (CXX) g++ options: -O3 -pthread -lm

libgav1

Video Input: Chimera 1080p

libgav1 2019-10-05, Video Input: Chimera 1080p (FPS, More Is Better)
gpu3Multicore1: 37.25 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -O3 -lpthread

libgav1

Video Input: Summer Nature 4K

libgav1 2019-10-05, Video Input: Summer Nature 4K (FPS, More Is Better)
gpu3Multicore1: 17.03 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -lpthread

libgav1

Video Input: Summer Nature 1080p

libgav1 2019-10-05, Video Input: Summer Nature 1080p (FPS, More Is Better)
gpu3Multicore1: 53.71 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -O3 -lpthread

libgav1

Video Input: Chimera 1080p 10-bit

libgav1 2019-10-05, Video Input: Chimera 1080p 10-bit (FPS, More Is Better)
gpu3Multicore1: 15.29 (SE +/- 0.15, N = 3)
1. (CXX) g++ options: -O3 -lpthread

Zstd Compression

Compression Level: 3

Zstd Compression 1.4.5, Compression Level: 3 (MB/s, More Is Better)
gpu3Multicore1: 5059.1 (SE +/- 3.85, N = 3)
1. (CC) gcc options: -O3 -pthread -lz

Zstd Compression

Compression Level: 19

Zstd Compression 1.4.5, Compression Level: 19 (MB/s, More Is Better)
gpu3Multicore1: 45.4 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -O3 -pthread -lz

ArrayFire

Test: BLAS CPU

ArrayFire 3.7, Test: BLAS CPU (GFLOPS, More Is Better)
gpu3Multicore1: 410.33 (SE +/- 0.49, N = 3)
1. (CXX) g++ options: -rdynamic

John The Ripper

Test: Blowfish

John The Ripper 1.9.0-jumbo-1, Test: Blowfish (Real C/S, More Is Better)
gpu3Multicore1: 34430 (SE +/- 93.23, N = 3)
1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -pthread -lm -lz -ldl -lcrypt -lbz2

John The Ripper

Test: MD5

John The Ripper 1.9.0-jumbo-1, Test: MD5 (Real C/S, More Is Better)
gpu3Multicore1: 1052667 (SE +/- 6489.31, N = 3)
1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -pthread -lm -lz -ldl -lcrypt -lbz2

Open FMM Nero2D

Total Time

Open FMM Nero2D 2.0.2, Total Time (Seconds, Fewer Is Better)
gpu3Multicore1: 32.45 (SE +/- 0.08, N = 3)
1. (CXX) g++ options: -O2 -lfftw3 -llapack -lblas -lgfortran -lquadmath -lm -pthread -lmpi_cxx -lmpi

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

oneDNN 2.0, Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
gpu3Multicore1: 4.10359 (SE +/- 0.01135, N = 3; MIN: 3.82)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

oneDNN 2.0, Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
gpu3Multicore1: 5.03966 (SE +/- 0.00243, N = 3; MIN: 4.98)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0, Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
gpu3Multicore1: 2.93945 (SE +/- 0.00193, N = 3; MIN: 2.79)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0, Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
gpu3Multicore1: 1.50128 (SE +/- 0.00670, N = 3; MIN: 1.41)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

oneDNN 2.0, Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
gpu3Multicore1: 10.04 (SE +/- 0.01, N = 3; MIN: 9.89)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

oneDNN 2.0, Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
gpu3Multicore1: 4.75016 (SE +/- 0.01438, N = 3; MIN: 4.35)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

oneDNN 2.0, Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
gpu3Multicore1: 7.52129 (SE +/- 0.00364, N = 3; MIN: 6.83)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0, Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
gpu3Multicore1: 13.43 (SE +/- 0.01, N = 3; MIN: 13.21)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0, Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
gpu3Multicore1: 8.31723 (SE +/- 0.09293, N = 3; MIN: 7.94)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0, Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
gpu3Multicore1: 5.31369 (SE +/- 0.00230, N = 3; MIN: 5.08)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
gpu3Multicore1: 4043.69 (SE +/- 10.83, N = 3; MIN: 4015.01)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

oneDNN 2.0, Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
gpu3Multicore1: 2153.94 (SE +/- 2.49, N = 3; MIN: 2140.59)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
gpu3Multicore1: 4055.95 (SE +/- 0.97, N = 3; MIN: 4045.37)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0, Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
gpu3Multicore1: 2150.86 (SE +/- 1.08, N = 3; MIN: 2141.22)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

oneDNN 2.0, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
gpu3Multicore1: 2.59347 (SE +/- 0.00134, N = 3; MIN: 2.53)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
gpu3Multicore1: 4043.52 (SE +/- 6.67, N = 3; MIN: 4021.89)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.0, Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
gpu3Multicore1: 2153.61 (SE +/- 1.16, N = 3; MIN: 2144.87)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
gpu3Multicore1: 2.45445 (SE +/- 0.00098, N = 3; MIN: 2.29)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

OSPray

Demo: San Miguel - Renderer: SciVis

OSPray 1.8.5, Demo: San Miguel - Renderer: SciVis (FPS, More Is Better)
gpu3Multicore1: 17.54 (SE +/- 0.00, N = 3; MIN: 16.95 / MAX: 18.52)

OSPray

Demo: XFrog Forest - Renderer: SciVis

OSPray 1.8.5, Demo: XFrog Forest - Renderer: SciVis (FPS, More Is Better)
gpu3Multicore1: 3.05 (SE +/- 0.00, N = 3; MIN: 3.01 / MAX: 3.09)

OSPray

Demo: San Miguel - Renderer: Path Tracer

OSPray 1.8.5, Demo: San Miguel - Renderer: Path Tracer (FPS, More Is Better)
gpu3Multicore1: 1.33 (SE +/- 0.00, N = 3; MIN: 1.32 / MAX: 1.34)

OSPray

Demo: NASA Streamlines - Renderer: SciVis

OSPray 1.8.5, Demo: NASA Streamlines - Renderer: SciVis (FPS, More Is Better)
gpu3Multicore1: 22.22 (SE +/- 0.00, N = 3; MIN: 21.74 / MAX: 22.73)

OSPray

Demo: XFrog Forest - Renderer: Path Tracer

OSPray 1.8.5, Demo: XFrog Forest - Renderer: Path Tracer (FPS, More Is Better)
gpu3Multicore1: 1.58 (SE +/- 0.00, N = 3; MIN: 1.56 / MAX: 1.61)

OSPray

Demo: Magnetic Reconnection - Renderer: SciVis

OSPray 1.8.5, Demo: Magnetic Reconnection - Renderer: SciVis (FPS, More Is Better)
gpu3Multicore1: 12.66 (SE +/- 0.00, N = 3; MIN: 12.5 / MAX: 12.82)

OSPray

Demo: NASA Streamlines - Renderer: Path Tracer

OSPray 1.8.5, Demo: NASA Streamlines - Renderer: Path Tracer (FPS, More Is Better)
gpu3Multicore1: 4.42 (SE +/- 0.01, N = 3; MIN: 4.35 / MAX: 4.55)

OSPray

Demo: Magnetic Reconnection - Renderer: Path Tracer

OSPray 1.8.5, Demo: Magnetic Reconnection - Renderer: Path Tracer (FPS, More Is Better)
gpu3Multicore1: 166.67 (SE +/- 0.00, N = 3; MIN: 125 / MAX: 200)

AOM AV1

Encoder Mode: Speed 0 Two-Pass

AOM AV1 2.0, Encoder Mode: Speed 0 Two-Pass (Frames Per Second, More Is Better)
gpu3Multicore1: 0.26 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1

Encoder Mode: Speed 4 Two-Pass

AOM AV1 2.0, Encoder Mode: Speed 4 Two-Pass (Frames Per Second, More Is Better)
gpu3Multicore1: 1.87 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1

Encoder Mode: Speed 6 Realtime

AOM AV1 2.0, Encoder Mode: Speed 6 Realtime (Frames Per Second, More Is Better)
gpu3Multicore1: 15.06 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1

Encoder Mode: Speed 6 Two-Pass

AOM AV1 2.0, Encoder Mode: Speed 6 Two-Pass (Frames Per Second, More Is Better)
gpu3Multicore1: 2.97 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

AOM AV1

Encoder Mode: Speed 8 Realtime

AOM AV1 2.0, Encoder Mode: Speed 8 Realtime (Frames Per Second, More Is Better)
gpu3Multicore1: 27.12 (SE +/- 0.17, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Slow

Kvazaar 2.0, Video Input: Bosphorus 4K - Video Preset: Slow (Frames Per Second, More Is Better)
gpu3Multicore1: 6.94 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Medium

Kvazaar 2.0, Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better)
gpu3Multicore1: 7.06 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

Kvazaar

Video Input: Bosphorus 1080p - Video Preset: Slow

Kvazaar 2.0, Video Input: Bosphorus 1080p - Video Preset: Slow (Frames Per Second, More Is Better)
gpu3Multicore1: 25.99 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

Kvazaar

Video Input: Bosphorus 1080p - Video Preset: Medium

Kvazaar 2.0, Video Input: Bosphorus 1080p - Video Preset: Medium (Frames Per Second, More Is Better)
gpu3Multicore1: 26.92 (SE +/- 0.04, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Very Fast

Kvazaar 2.0, Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better)
gpu3Multicore1: 16.77 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Ultra Fast

Kvazaar 2.0, Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, More Is Better)
gpu3Multicore1: 31.16 (SE +/- 0.05, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

Kvazaar

Video Input: Bosphorus 1080p - Video Preset: Very Fast

Kvazaar 2.0, Video Input: Bosphorus 1080p - Video Preset: Very Fast (Frames Per Second, More Is Better)
gpu3Multicore1: 59.84 (SE +/- 0.04, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

Kvazaar

Video Input: Bosphorus 1080p - Video Preset: Ultra Fast

Kvazaar 2.0, Video Input: Bosphorus 1080p - Video Preset: Ultra Fast (Frames Per Second, More Is Better)
gpu3Multicore1: 106.81 (SE +/- 0.08, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

rav1e

Speed: 1

rav1e 0.4, Speed: 1 (Frames Per Second, More Is Better)
gpu3Multicore1: 0.347 (SE +/- 0.001, N = 3)

rav1e

Speed: 5

rav1e 0.4, Speed: 5 (Frames Per Second, More Is Better)
gpu3Multicore1: 1.042 (SE +/- 0.001, N = 3)

rav1e

Speed: 6

rav1e 0.4, Speed: 6 (Frames Per Second, More Is Better)
gpu3Multicore1: 1.371 (SE +/- 0.002, N = 3)

rav1e

Speed: 10

rav1e 0.4, Speed: 10 (Frames Per Second, More Is Better)
gpu3Multicore1: 3.032 (SE +/- 0.009, N = 3)

SVT-AV1

Encoder Mode: Enc Mode 0 - Input: 1080p

SVT-AV1 0.8, Encoder Mode: Enc Mode 0 - Input: 1080p (Frames Per Second, More Is Better)
gpu3Multicore1: 0.126 (SE +/- 0.000, N = 3)
1. (CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie

SVT-AV1

Encoder Mode: Enc Mode 4 - Input: 1080p

SVT-AV1 0.8, Encoder Mode: Enc Mode 4 - Input: 1080p (Frames Per Second, More Is Better)
gpu3Multicore1: 4.703 (SE +/- 0.015, N = 3)
1. (CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie

SVT-AV1

Encoder Mode: Enc Mode 8 - Input: 1080p

SVT-AV1 0.8, Encoder Mode: Enc Mode 8 - Input: 1080p (Frames Per Second, More Is Better)
gpu3Multicore1: 35.69 (SE +/- 0.12, N = 3)
1. (CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie

SVT-HEVC

1080p 8-bit YUV To HEVC Video Encode

SVT-HEVC 1.4.1, 1080p 8-bit YUV To HEVC Video Encode (Frames Per Second, More Is Better)
gpu3Multicore1: 72.35 (SE +/- 0.05, N = 3)
1. (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt

VP9 libvpx Encoding

Speed: Speed 0

VP9 libvpx Encoding 1.8.2, Speed: Speed 0 (Frames Per Second, More Is Better)
gpu3Multicore1: 5.93 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=c++11

VP9 libvpx Encoding

Speed: Speed 5

VP9 libvpx Encoding 1.8.2, Speed: Speed 5 (Frames Per Second, More Is Better)
gpu3Multicore1: 18.27 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=c++11

x265

Video Input: Bosphorus 4K

x265 3.4, Video Input: Bosphorus 4K (Frames Per Second, More Is Better)
gpu3Multicore1: 5.07 (SE +/- 0.06, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

x265

Video Input: Bosphorus 1080p

x265 3.4, Video Input: Bosphorus 1080p (Frames Per Second, More Is Better)
gpu3Multicore1: 14.65 (SE +/- 0.06, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

ACES DGEMM

Sustained Floating-Point Rate

ACES DGEMM 1.0, Sustained Floating-Point Rate (GFLOP/s, More Is Better)
gpu3Multicore1: 1.610873 (SE +/- 0.007697, N = 3)
1. (CC) gcc options: -O3 -march=native -fopenmp

Intel Open Image Denoise

Scene: Memorial

Intel Open Image Denoise 1.2.0, Scene: Memorial (Images / Sec, More Is Better)
gpu3Multicore1: 6.93 (SE +/- 0.05, N = 3)

OpenVKL

Benchmark: vklBenchmark

OpenVKL 0.9, Benchmark: vklBenchmark (Items / Sec, More Is Better)
gpu3Multicore1: 179 (MIN: 1 / MAX: 587)

OpenVKL

Benchmark: vklBenchmarkVdbVolume

OpenVKL 0.9, Benchmark: vklBenchmarkVdbVolume (Items / Sec, More Is Better)
gpu3Multicore1: 16110776 (SE +/- 189330.67, N = 3; MIN: 463741 / MAX: 96050880)

OpenVKL

Benchmark: vklBenchmarkStructuredVolume

OpenVKL 0.9, Benchmark: vklBenchmarkStructuredVolume (Items / Sec, More Is Better)
gpu3Multicore1: 65718519, SE +/- 249259.82, N = 3, MIN: 429600 / MAX: 792670464

OpenVKL

Benchmark: vklBenchmarkUnstructuredVolume

OpenVKL 0.9, Benchmark: vklBenchmarkUnstructuredVolume (Items / Sec, More Is Better)
gpu3Multicore1: 1358541, SE +/- 2593.98, N = 3, MIN: 17295 / MAX: 4612260

Coremark

CoreMark Size 666 - Iterations Per Second

Coremark 1.0, CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better)
gpu3Multicore1: 490007.45, SE +/- 885.70, N = 3
(CC) gcc options: -O2 -lrt" -lrt

7-Zip Compression

Compress Speed Test

7-Zip Compression 16.02, Compress Speed Test (MIPS, More Is Better)
gpu3Multicore1: 73186, SE +/- 211.29, N = 3
(CXX) g++ options: -pipe -lpthread

asmFish

1024 Hash Memory, 26 Depth

asmFish 2018-07-23, 1024 Hash Memory, 26 Depth (Nodes/second, More Is Better)
gpu3Multicore1: 39041533, SE +/- 41169.03, N = 3

Swet

Average

Swet 1.5.16, Average (Operations Per Second, More Is Better)
gpu3Multicore1: 757710060, SE +/- 9820613.43, N = 3
(CC) gcc options: -lm -lpthread -lcurses -lrt

ebizzy

ebizzy 0.3 (Records/s, More Is Better)
gpu3Multicore1: 444948, SE +/- 5403.48, N = 15
(CC) gcc options: -pthread -lpthread -O3 -march=native

Timed FFmpeg Compilation

Time To Compile

Timed FFmpeg Compilation 4.2.2, Time To Compile (Seconds, Fewer Is Better)
gpu3Multicore1: 39.72, SE +/- 0.09, N = 3

Timed GCC Compilation

Time To Compile

Timed GCC Compilation 9.3.0, Time To Compile (Seconds, Fewer Is Better)
gpu3Multicore1: 947.29, SE +/- 0.65, N = 3

Timed ImageMagick Compilation

Time To Compile

Timed ImageMagick Compilation 6.9.0, Time To Compile (Seconds, Fewer Is Better)
gpu3Multicore1: 20.80, SE +/- 0.10, N = 3

Timed Linux Kernel Compilation

Time To Compile

Timed Linux Kernel Compilation 5.4, Time To Compile (Seconds, Fewer Is Better)
gpu3Multicore1: 49.81, SE +/- 0.53, N = 3

Timed LLVM Compilation

Time To Compile

Timed LLVM Compilation 10.0, Time To Compile (Seconds, Fewer Is Better)
gpu3Multicore1: 429.22, SE +/- 2.48, N = 3

Timed MPlayer Compilation

Time To Compile

Timed MPlayer Compilation 1.4, Time To Compile (Seconds, Fewer Is Better)
gpu3Multicore1: 25.11, SE +/- 0.07, N = 3

Build2

Time To Compile

Build2 0.13, Time To Compile (Seconds, Fewer Is Better)
gpu3Multicore1: 90.05, SE +/- 0.16, N = 3

C-Ray

Total Time - 4K, 16 Rays Per Pixel

C-Ray 1.1, Total Time - 4K, 16 Rays Per Pixel (Seconds, Fewer Is Better)
gpu3Multicore1: 34.58, SE +/- 0.06, N = 3
(CC) gcc options: -lm -lpthread -O3

Parallel BZIP2 Compression

256MB File Compression

Parallel BZIP2 Compression 1.1.12, 256MB File Compression (Seconds, Fewer Is Better)
gpu3Multicore1: 2.430, SE +/- 0.009, N = 3
(CXX) g++ options: -O2 -pthread -lbz2 -lpthread

Primesieve

1e12 Prime Number Generation

Primesieve 7.4, 1e12 Prime Number Generation (Seconds, Fewer Is Better)
gpu3Multicore1: 13.64, SE +/- 0.03, N = 3
(CXX) g++ options: -O3 -lpthread

Rust Mandelbrot

Time To Complete Serial/Parallel Mandelbrot

Rust Mandelbrot, Time To Complete Serial/Parallel Mandelbrot (Seconds, Fewer Is Better)
gpu3Multicore1: 38.55, SE +/- 0.01, N = 3
(CC) gcc options: -m64 -pie -nodefaultlibs -ldl -lrt -lpthread -lgcc_s -lc -lm -lutil

Rust Prime Benchmark

Prime Number Test To 200,000,000

Rust Prime Benchmark, Prime Number Test To 200,000,000 (Seconds, Fewer Is Better)
gpu3Multicore1: 20.47, SE +/- 0.01, N = 3
(CC) gcc options: -m64 -pie -nodefaultlibs -ldl -lrt -lpthread -lgcc_s -lc -lm -lutil

Smallpt

Global Illumination Renderer; 128 Samples

Smallpt 1.0, Global Illumination Renderer; 128 Samples (Seconds, Fewer Is Better)
gpu3Multicore1: 6.618, SE +/- 0.008, N = 3
(CXX) g++ options: -fopenmp -O3

Tungsten Renderer

Scene: Hair

Tungsten Renderer 0.2.2, Scene: Hair (Seconds, Fewer Is Better)
gpu3Multicore1: 16.82, SE +/- 0.03, N = 3
(CXX) g++ options: -std=c++0x -march=broadwell -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -mfma -mbmi2 -mno-avx -mno-avx2 -mno-xop -mno-fma4 -mno-avx512f -mno-avx512vl -mno-avx512pf -mno-avx512er -mno-avx512cd -mno-avx512dq -mno-avx512bw -mno-avx512ifma -mno-avx512vbmi -fstrict-aliasing -O3 -rdynamic -lpthread -ldl

Tungsten Renderer

Scene: Water Caustic

Tungsten Renderer 0.2.2, Scene: Water Caustic (Seconds, Fewer Is Better)
gpu3Multicore1: 26.42, SE +/- 0.09, N = 3
(CXX) g++ options: -std=c++0x -march=broadwell -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -mfma -mbmi2 -mno-avx -mno-avx2 -mno-xop -mno-fma4 -mno-avx512f -mno-avx512vl -mno-avx512pf -mno-avx512er -mno-avx512cd -mno-avx512dq -mno-avx512bw -mno-avx512ifma -mno-avx512vbmi -fstrict-aliasing -O3 -rdynamic -lpthread -ldl

Tungsten Renderer

Scene: Non-Exponential

Tungsten Renderer 0.2.2, Scene: Non-Exponential (Seconds, Fewer Is Better)
gpu3Multicore1: 10.10, SE +/- 0.04, N = 3
(CXX) g++ options: -std=c++0x -march=broadwell -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -mfma -mbmi2 -mno-avx -mno-avx2 -mno-xop -mno-fma4 -mno-avx512f -mno-avx512vl -mno-avx512pf -mno-avx512er -mno-avx512cd -mno-avx512dq -mno-avx512bw -mno-avx512ifma -mno-avx512vbmi -fstrict-aliasing -O3 -rdynamic -lpthread -ldl

Tungsten Renderer

Scene: Volumetric Caustic

Tungsten Renderer 0.2.2, Scene: Volumetric Caustic (Seconds, Fewer Is Better)
gpu3Multicore1: 10.15, SE +/- 0.01, N = 3
(CXX) g++ options: -std=c++0x -march=broadwell -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -mfma -mbmi2 -mno-avx -mno-avx2 -mno-xop -mno-fma4 -mno-avx512f -mno-avx512vl -mno-avx512pf -mno-avx512er -mno-avx512cd -mno-avx512dq -mno-avx512bw -mno-avx512ifma -mno-avx512vbmi -fstrict-aliasing -O3 -rdynamic -lpthread -ldl

AOBench

Size: 2048 x 2048 - Total Time

AOBench, Size: 2048 x 2048 - Total Time (Seconds, Fewer Is Better)
gpu3Multicore1: 44.50, SE +/- 0.14, N = 3
(CC) gcc options: -lm -O3

Timed Eigen Compilation

Time To Compile

Timed Eigen Compilation 3.3.9, Time To Compile (Seconds, Fewer Is Better)
gpu3Multicore1: 63.58, SE +/- 0.06, N = 3

FFmpeg

H.264 HD To NTSC DV

FFmpeg 4.0.2, H.264 HD To NTSC DV (Seconds, Fewer Is Better)
gpu3Multicore1: 7.688, SE +/- 0.050, N = 3
(CC) gcc options: -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -lm -lxcb -lxcb-shape -lxcb-xfixes -lxcb-render -pthread -lbz2 -std=c11 -fomit-frame-pointer -O3 -fno-math-errno -fno-signed-zeros -fno-tree-vectorize -MMD -MF -MT

m-queens

Time To Solve

m-queens 1.2, Time To Solve (Seconds, Fewer Is Better)
gpu3Multicore1: 36.96, SE +/- 0.08, N = 3
(CXX) g++ options: -fopenmp -O2 -march=native

N-Queens

Elapsed Time

N-Queens 1.0, Elapsed Time (Seconds, Fewer Is Better)
gpu3Multicore1: 8.010, SE +/- 0.001, N = 3
(CC) gcc options: -static -fopenmp -O3 -march=native

Radiance Benchmark

Test: Serial

Radiance Benchmark 5.0, Test: Serial (Seconds, Fewer Is Better)
gpu3Multicore1: 758.36

Radiance Benchmark

Test: SMP Parallel

Radiance Benchmark 5.0, Test: SMP Parallel (Seconds, Fewer Is Better)
gpu3Multicore1: 235.27

Tachyon

Total Time

Tachyon 0.99b6, Total Time (Seconds, Fewer Is Better)
gpu3Multicore1: 55.46, SE +/- 0.15, N = 3
(CC) gcc options: -m64 -O3 -fomit-frame-pointer -ffast-math -ltachyon -lm -lpthread

Aircrack-ng

Aircrack-ng 1.5.2 (k/s, More Is Better)
gpu3Multicore1: 28284.28, SE +/- 68.07, N = 3
(CXX) g++ options: -O3 -fvisibility=hidden -masm=intel -fcommon -rdynamic -lpthread -lz -lcrypto -lhwloc -ldl -lm -pthread

ASKAP

Test: tConvolve MT - Gridding

ASKAP 1.0, Test: tConvolve MT - Gridding (Million Grid Points Per Second, More Is Better)
gpu3Multicore1: 1575.19, SE +/- 0.61, N = 3
(CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP

Test: tConvolve MT - Degridding

ASKAP 1.0, Test: tConvolve MT - Degridding (Million Grid Points Per Second, More Is Better)
gpu3Multicore1: 2273.88, SE +/- 2.64, N = 3
(CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP

Test: tConvolve OpenMP - Gridding

ASKAP 1.0, Test: tConvolve OpenMP - Gridding (Million Grid Points Per Second, More Is Better)
gpu3Multicore1: 1857.74, SE +/- 11.39, N = 3
(CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP

Test: tConvolve OpenMP - Degridding

ASKAP 1.0, Test: tConvolve OpenMP - Degridding (Million Grid Points Per Second, More Is Better)
gpu3Multicore1: 2716.9, SE +/- 0.00, N = 3
(CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP

Test: Hogbom Clean OpenMP

ASKAP 1.0, Test: Hogbom Clean OpenMP (Iterations Per Second, More Is Better)
gpu3Multicore1: 268.34, SE +/- 0.24, N = 3
(CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

Intel MPI Benchmarks

Test: IMB-P2P PingPong

Intel MPI Benchmarks 2019.3, Test: IMB-P2P PingPong (Average Msg/sec, More Is Better)
gpu3Multicore1: 6329420, SE +/- 35072.49, N = 3, MIN: 1185 / MAX: 15350414
(CXX) g++ options: -O0 -pedantic -fopenmp -pthread -lmpi_cxx -lmpi

Intel MPI Benchmarks

Test: IMB-MPI1 Exchange

Intel MPI Benchmarks 2019.3, Test: IMB-MPI1 Exchange (Average Mbytes/sec, More Is Better)
gpu3Multicore1: 1701.77, SE +/- 136.55, N = 12, MAX: 18194.89
(CXX) g++ options: -O0 -pedantic -fopenmp -pthread -lmpi_cxx -lmpi

Intel MPI Benchmarks

Test: IMB-MPI1 Exchange

Intel MPI Benchmarks 2019.3, Test: IMB-MPI1 Exchange (Average usec, Fewer Is Better)
gpu3Multicore1: 575.42, SE +/- 17.24, N = 12, MIN: 0.3 / MAX: 17330.78
(CXX) g++ options: -O0 -pedantic -fopenmp -pthread -lmpi_cxx -lmpi

Intel MPI Benchmarks

Test: IMB-MPI1 PingPong

Intel MPI Benchmarks 2019.3, Test: IMB-MPI1 PingPong (Average Mbytes/sec, More Is Better)
gpu3Multicore1: 2338.83, SE +/- 442.13, N = 12, MIN: 3.77 / MAX: 11785.09
(CXX) g++ options: -O0 -pedantic -fopenmp -pthread -lmpi_cxx -lmpi

Intel MPI Benchmarks

Test: IMB-MPI1 Sendrecv

Intel MPI Benchmarks 2019.3, Test: IMB-MPI1 Sendrecv (Average Mbytes/sec, More Is Better)
gpu3Multicore1: 2001.41, SE +/- 147.96, N = 15, MAX: 19536.02
(CXX) g++ options: -O0 -pedantic -fopenmp -pthread -lmpi_cxx -lmpi

Intel MPI Benchmarks

Test: IMB-MPI1 Sendrecv

Intel MPI Benchmarks 2019.3, Test: IMB-MPI1 Sendrecv (Average usec, Fewer Is Better)
gpu3Multicore1: 270.90, SE +/- 5.97, N = 15, MIN: 0.16 / MAX: 7916.44
(CXX) g++ options: -O0 -pedantic -fopenmp -pthread -lmpi_cxx -lmpi

GROMACS

Water Benchmark

GROMACS 2020.3, Water Benchmark (Ns Per Day, More Is Better)
gpu3Multicore1: 1.130, SE +/- 0.002, N = 3
(CXX) g++ options: -O3 -pthread -lrt -lpthread -lm

MariaDB

Clients: 1

MariaDB 10.5.2, Clients: 1 (Queries Per Second, More Is Better)
gpu3Multicore1: 1813, SE +/- 7.16, N = 3
(CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -lbz2 -lnuma -lcrypt -lz -lm -lssl -lcrypto -ldl

MariaDB

Clients: 4

MariaDB 10.5.2, Clients: 4 (Queries Per Second, More Is Better)
gpu3Multicore1: 1073, SE +/- 2.39, N = 3
(CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -lbz2 -lnuma -lcrypt -lz -lm -lssl -lcrypto -ldl

MariaDB

Clients: 8

MariaDB 10.5.2, Clients: 8 (Queries Per Second, More Is Better)
gpu3Multicore1: 1018, SE +/- 3.01, N = 3
(CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -lbz2 -lnuma -lcrypt -lz -lm -lssl -lcrypto -ldl

MariaDB

Clients: 16

MariaDB 10.5.2, Clients: 16 (Queries Per Second, More Is Better)
gpu3Multicore1: 945, SE +/- 1.66, N = 3
(CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -lbz2 -lnuma -lcrypt -lz -lm -lssl -lcrypto -ldl

MariaDB

Clients: 32

MariaDB 10.5.2, Clients: 32 (Queries Per Second, More Is Better)
gpu3Multicore1: 322, SE +/- 62.25, N = 9
(CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -lbz2 -lnuma -lcrypt -lz -lm -lssl -lcrypto -ldl

MariaDB

Clients: 64

MariaDB 10.5.2, Clients: 64 (Queries Per Second, More Is Better)
gpu3Multicore1: 219, SE +/- 0.35, N = 3
(CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -lbz2 -lnuma -lcrypt -lz -lm -lssl -lcrypto -ldl

MariaDB

Clients: 128

MariaDB 10.5.2, Clients: 128 (Queries Per Second, More Is Better)
gpu3Multicore1: 177, SE +/- 0.10, N = 3
(CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -lbz2 -lnuma -lcrypt -lz -lm -lssl -lcrypto -ldl

MariaDB

Clients: 256

MariaDB 10.5.2, Clients: 256 (Queries Per Second, More Is Better)
gpu3Multicore1: 171, SE +/- 0.37, N = 3
(CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -lbz2 -lnuma -lcrypt -lz -lm -lssl -lcrypto -ldl

MariaDB

Clients: 512

MariaDB 10.5.2, Clients: 512 (Queries Per Second, More Is Better)
gpu3Multicore1: 170, SE +/- 0.29, N = 3
(CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -lbz2 -lnuma -lcrypt -lz -lm -lssl -lcrypto -ldl

Sysbench

Test: Memory

Sysbench 2018-07-28, Test: Memory (Events Per Second, More Is Better)
gpu3Multicore1: 7093621.19, SE +/- 3413.58, N = 3
(CC) gcc options: -pthread -O3 -funroll-loops -ggdb3 -march=amdfam10 -rdynamic -ldl -lm

Sysbench

Test: CPU

Sysbench 2018-07-28, Test: CPU (Events Per Second, More Is Better)
gpu3Multicore1: 34334.37, SE +/- 2.71, N = 3
(CC) gcc options: -pthread -O3 -funroll-loops -ggdb3 -march=amdfam10 -rdynamic -ldl -lm

Blender

Blend File: BMW27 - Compute: CUDA

Blender 2.90, Blend File: BMW27 - Compute: CUDA (Seconds, Fewer Is Better)
gpu3Multicore1: 27.40, SE +/- 0.02, N = 3

Blender

Blend File: BMW27 - Compute: OpenCL

Blender 2.90, Blend File: BMW27 - Compute: OpenCL (Seconds, Fewer Is Better)
gpu3Multicore1: 316.48, SE +/- 0.57, N = 3

Blender

Blend File: BMW27 - Compute: CPU-Only

Blender 2.90, Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better)
gpu3Multicore1: 99.46, SE +/- 0.23, N = 3

Blender

Blend File: Classroom - Compute: CUDA

Blender 2.90, Blend File: Classroom - Compute: CUDA (Seconds, Fewer Is Better)
gpu3Multicore1: 67.35, SE +/- 0.53, N = 3

Blender

Blend File: Fishy Cat - Compute: CUDA

Blender 2.90, Blend File: Fishy Cat - Compute: CUDA (Seconds, Fewer Is Better)
gpu3Multicore1: 70.47, SE +/- 0.72, N = 3

Blender

Blend File: Barbershop - Compute: CUDA

Blender 2.90, Blend File: Barbershop - Compute: CUDA (Seconds, Fewer Is Better)
gpu3Multicore1: 236.16, SE +/- 1.46, N = 3

Blender

Blend File: Classroom - Compute: OpenCL

Blender 2.90, Blend File: Classroom - Compute: OpenCL (Seconds, Fewer Is Better)
gpu3Multicore1: 324.61, SE +/- 1.77, N = 3

Blender

Blend File: Fishy Cat - Compute: OpenCL

Blender 2.90, Blend File: Fishy Cat - Compute: OpenCL (Seconds, Fewer Is Better)
gpu3Multicore1: 854.84, SE +/- 4.14, N = 3

Blender

Blend File: Barbershop - Compute: OpenCL

Blender 2.90, Blend File: Barbershop - Compute: OpenCL (Seconds, Fewer Is Better)
gpu3Multicore1: 575.76, SE +/- 2.24, N = 3

Blender

Blend File: BMW27 - Compute: NVIDIA OptiX

Blender 2.90, Blend File: BMW27 - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
gpu3Multicore1: 316.48, SE +/- 0.62, N = 3

Blender

Blend File: Classroom - Compute: CPU-Only

Blender 2.90, Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better)
gpu3Multicore1: 290.41, SE +/- 0.30, N = 3

Blender

Blend File: Fishy Cat - Compute: CPU-Only

Blender 2.90, Blend File: Fishy Cat - Compute: CPU-Only (Seconds, Fewer Is Better)
gpu3Multicore1: 143.83, SE +/- 0.04, N = 3

Blender

Blend File: Barbershop - Compute: CPU-Only

Blender 2.90, Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better)
gpu3Multicore1: 449.85, SE +/- 0.88, N = 3

Blender

Blend File: Classroom - Compute: NVIDIA OptiX

Blender 2.90, Blend File: Classroom - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
gpu3Multicore1: 321.26, SE +/- 0.41, N = 3

Blender

Blend File: Fishy Cat - Compute: NVIDIA OptiX

Blender 2.90, Blend File: Fishy Cat - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
gpu3Multicore1: 852.90, SE +/- 4.49, N = 3

Blender

Blend File: Barbershop - Compute: NVIDIA OptiX

Blender 2.90, Blend File: Barbershop - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
gpu3Multicore1: 585.55, SE +/- 2.13, N = 3

Blender

Blend File: Pabellon Barcelona - Compute: CUDA

Blender 2.90, Blend File: Pabellon Barcelona - Compute: CUDA (Seconds, Fewer Is Better)
gpu3Multicore1: 192.15, SE +/- 1.07, N = 3

Blender

Blend File: Pabellon Barcelona - Compute: OpenCL

Blender 2.90, Blend File: Pabellon Barcelona - Compute: OpenCL (Seconds, Fewer Is Better)
gpu3Multicore1: 957.47, SE +/- 1.53, N = 3

Blender

Blend File: Pabellon Barcelona - Compute: CPU-Only

Blender 2.90, Blend File: Pabellon Barcelona - Compute: CPU-Only (Seconds, Fewer Is Better)
gpu3Multicore1: 339.34, SE +/- 0.86, N = 3

Blender

Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX

Blender 2.90, Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
gpu3Multicore1: 954.94, SE +/- 5.63, N = 3

Xsbench

Xsbench 2017-07-06 (Lookups/s, More Is Better)
gpu3Multicore1: 2979914, SE +/- 920.47, N = 3
(CC) gcc options: -std=gnu99 -fopenmp -O3 -lm


Phoronix Test Suite v10.8.4