gpu3Multicore1

AMD Ryzen Threadripper 2950X 16-Core testing with an ASRock X399 Professional Gaming (P3.80 BIOS) and MSI NVIDIA GeForce GTX 1080 8GB on Ubuntu 16.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2102113-HA-GPU3MULTI38&grr.

gpu3Multicore1

Processor: AMD Ryzen Threadripper 2950X 16-Core @ 3.50GHz (16 Cores / 32 Threads)
Motherboard: ASRock X399 Professional Gaming (P3.80 BIOS)
Chipset: AMD 17h
Memory: 126GB
Disk: 1000GB Samsung SSD 860
Graphics: MSI NVIDIA GeForce GTX 1080 8GB
Audio: NVIDIA GP104 HD Audio
Network: Aquantia AQC107 NBase-T/IEEE + 2 x Intel I211 + Intel Dual Band-AC 3168NGW
OS: Ubuntu 16.04
Kernel: 4.19.174-custom (x86_64)
Display Server: X Server 1.19.6
Display Driver: NVIDIA
OpenCL: OpenCL 1.2 CUDA 10.1.120
Vulkan: 1.1.99
Compiler: GCC 5.4.0 20160609 + Clang 3.8.0-2ubuntu4 + CUDA 9.2
File-System: ext4
Screen Resolution: 640x480

- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-browser-plugin --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-arch-directory=amd64 --with-default-libstdcxx-abi=new --with-multilib-list=m32,m64,mx32 --with-tune=generic -v
- Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0x800820b
- Python 2.7.12 + Python 3.5.2
- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + srbds: Not affected + tsx_async_abort: Not affected


MariaDB

Clients: 32

MariaDB 10.5.2 (Queries Per Second, More Is Better)
gpu3Multicore1: 322 (SE +/- 62.25, N = 9)
1. (CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -lbz2 -lnuma -lcrypt -lz -lm -lssl -lcrypto -ldl
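
The Clients: 32 result reports SE +/- 62.25 over N = 9 runs against a mean of 322, which is a far wider spread than the other MariaDB client counts show. The sketch below puts that in relative terms. It assumes the reported SE is the standard error of the mean (sample standard deviation divided by the square root of N); the helper names are illustrative and not part of the Phoronix Test Suite.

```python
import math


def relative_standard_error(mean, se):
    """Return the standard error as a percentage of the mean."""
    return 100.0 * se / mean


def implied_sample_stddev(se, n):
    """Recover the sample standard deviation, assuming SE = s / sqrt(n)."""
    return se * math.sqrt(n)


# Values taken from the MariaDB "Clients: 32" result.
rse = relative_standard_error(322, 62.25)
stddev = implied_sample_stddev(62.25, 9)

print(f"relative SE: {rse:.1f}%")
print(f"implied sample stddev: {stddev:.2f} queries/s")
```

A relative SE this large suggests the 32-client figure should be read with more caution than the neighbouring results.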

MariaDB

Clients: 512

MariaDB 10.5.2 (Queries Per Second, More Is Better)
gpu3Multicore1: 170 (SE +/- 0.29, N = 3)
1. (CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -lbz2 -lnuma -lcrypt -lz -lm -lssl -lcrypto -ldl

MariaDB

Clients: 256

MariaDB 10.5.2 (Queries Per Second, More Is Better)
gpu3Multicore1: 171 (SE +/- 0.37, N = 3)
1. (CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -lbz2 -lnuma -lcrypt -lz -lm -lssl -lcrypto -ldl

MariaDB

Clients: 128

MariaDB 10.5.2 (Queries Per Second, More Is Better)
gpu3Multicore1: 177 (SE +/- 0.10, N = 3)
1. (CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -lbz2 -lnuma -lcrypt -lz -lm -lssl -lcrypto -ldl

MariaDB

Clients: 64

MariaDB 10.5.2 (Queries Per Second, More Is Better)
gpu3Multicore1: 219 (SE +/- 0.35, N = 3)
1. (CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -lbz2 -lnuma -lcrypt -lz -lm -lssl -lcrypto -ldl

Blender

Blend File: Pabellon Barcelona - Compute: OpenCL

Blender 2.90 (Seconds, Fewer Is Better)
gpu3Multicore1: 957.47 (SE +/- 1.53, N = 3)

Blender

Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX

Blender 2.90 (Seconds, Fewer Is Better)
gpu3Multicore1: 954.94 (SE +/- 5.63, N = 3)

Timed GCC Compilation

Time To Compile

Timed GCC Compilation 9.3.0 (Seconds, Fewer Is Better)
gpu3Multicore1: 947.29 (SE +/- 0.65, N = 3)

OpenVKL

Benchmark: vklBenchmarkUnstructuredVolume

OpenVKL 0.9 (Items / Sec, More Is Better)
gpu3Multicore1: 1358541 (SE +/- 2593.98, N = 3; MIN: 17295 / MAX: 4612260)

Blender

Blend File: Fishy Cat - Compute: OpenCL

Blender 2.90 (Seconds, Fewer Is Better)
gpu3Multicore1: 854.84 (SE +/- 4.14, N = 3)

Blender

Blend File: Fishy Cat - Compute: NVIDIA OptiX

Blender 2.90 (Seconds, Fewer Is Better)
gpu3Multicore1: 852.90 (SE +/- 4.49, N = 3)

LAMMPS Molecular Dynamics Simulator

Model: 20k Atoms

LAMMPS Molecular Dynamics Simulator 29Oct2020 (ns/day, More Is Better)
gpu3Multicore1: 10.92 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -O3 -pthread -lm

libgav1

Video Input: Chimera 1080p 10-bit

libgav1 2019-10-05 (FPS, More Is Better)
gpu3Multicore1: 15.29 (SE +/- 0.15, N = 3)
1. (CXX) g++ options: -O3 -lpthread

Blender

Blend File: Barbershop - Compute: NVIDIA OptiX

Blender 2.90 (Seconds, Fewer Is Better)
gpu3Multicore1: 585.55 (SE +/- 2.13, N = 3)

Blender

Blend File: Barbershop - Compute: OpenCL

Blender 2.90 (Seconds, Fewer Is Better)
gpu3Multicore1: 575.76 (SE +/- 2.24, N = 3)

Blender

Blend File: Barbershop - Compute: CPU-Only

Blender 2.90 (Seconds, Fewer Is Better)
gpu3Multicore1: 449.85 (SE +/- 0.88, N = 3)

Timed LLVM Compilation

Time To Compile

Timed LLVM Compilation 10.0 (Seconds, Fewer Is Better)
gpu3Multicore1: 429.22 (SE +/- 2.48, N = 3)

OpenVKL

Benchmark: vklBenchmark

OpenVKL 0.9 (Items / Sec, More Is Better)
gpu3Multicore1: 179 (MIN: 1 / MAX: 587)

Blender

Blend File: Pabellon Barcelona - Compute: CPU-Only

Blender 2.90 (Seconds, Fewer Is Better)
gpu3Multicore1: 339.34 (SE +/- 0.86, N = 3)

Rodinia

Test: OpenMP LavaMD

Rodinia 3.1 (Seconds, Fewer Is Better)
gpu3Multicore1: 330.71 (SE +/- 0.71, N = 3)
1. (CXX) g++ options: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl

Blender

Blend File: Classroom - Compute: OpenCL

Blender 2.90 (Seconds, Fewer Is Better)
gpu3Multicore1: 324.61 (SE +/- 1.77, N = 3)

Blender

Blend File: Classroom - Compute: NVIDIA OptiX

Blender 2.90 (Seconds, Fewer Is Better)
gpu3Multicore1: 321.26 (SE +/- 0.41, N = 3)

Blender

Blend File: BMW27 - Compute: NVIDIA OptiX

Blender 2.90 (Seconds, Fewer Is Better)
gpu3Multicore1: 316.48 (SE +/- 0.62, N = 3)

Blender

Blend File: BMW27 - Compute: OpenCL

Blender 2.90 (Seconds, Fewer Is Better)
gpu3Multicore1: 316.48 (SE +/- 0.57, N = 3)

Blender

Blend File: Classroom - Compute: CPU-Only

Blender 2.90 (Seconds, Fewer Is Better)
gpu3Multicore1: 290.41 (SE +/- 0.30, N = 3)

Radiance Benchmark

Test: Serial

Radiance Benchmark 5.0 (Seconds, Fewer Is Better)
gpu3Multicore1: 758.36

libgav1

Video Input: Chimera 1080p

libgav1 2019-10-05 (FPS, More Is Better)
gpu3Multicore1: 37.25 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -O3 -lpthread

Blender

Blend File: Barbershop - Compute: CUDA

Blender 2.90 (Seconds, Fewer Is Better)
gpu3Multicore1: 236.16 (SE +/- 1.46, N = 3)

MariaDB

Clients: 16

MariaDB 10.5.2 (Queries Per Second, More Is Better)
gpu3Multicore1: 945 (SE +/- 1.66, N = 3)
1. (CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -lbz2 -lnuma -lcrypt -lz -lm -lssl -lcrypto -ldl

Intel MPI Benchmarks

Test: IMB-MPI1 Exchange

Intel MPI Benchmarks 2019.3 (Average usec, Fewer Is Better)
gpu3Multicore1: 575.42 (SE +/- 17.24, N = 12; MIN: 0.3 / MAX: 17330.78)
1. (CXX) g++ options: -O0 -pedantic -fopenmp -pthread -lmpi_cxx -lmpi

Intel MPI Benchmarks

Test: IMB-MPI1 Exchange

Intel MPI Benchmarks 2019.3 (Average Mbytes/sec, More Is Better)
gpu3Multicore1: 1701.77 (SE +/- 136.55, N = 12; MAX: 18194.89)
1. (CXX) g++ options: -O0 -pedantic -fopenmp -pthread -lmpi_cxx -lmpi

NAS Parallel Benchmarks

Test / Class: EP.D

NAS Parallel Benchmarks 3.4 (Total Mop/s, More Is Better)
gpu3Multicore1: 621.18 (SE +/- 0.35, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 1.10.2

MariaDB

Clients: 8

MariaDB 10.5.2 (Queries Per Second, More Is Better)
gpu3Multicore1: 1018 (SE +/- 3.01, N = 3)
1. (CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -lbz2 -lnuma -lcrypt -lz -lm -lssl -lcrypto -ldl

libgav1

Video Input: Summer Nature 4K

libgav1 2019-10-05 (FPS, More Is Better)
gpu3Multicore1: 17.03 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -lpthread

High Performance Conjugate Gradient

High Performance Conjugate Gradient 3.1 (GFLOP/s, More Is Better)
gpu3Multicore1: 7.02881 (SE +/- 0.01315, N = 3)
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -pthread -lmpi_cxx -lmpi

MariaDB

Clients: 4

MariaDB 10.5.2 (Queries Per Second, More Is Better)
gpu3Multicore1: 1073 (SE +/- 2.39, N = 3)
1. (CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -lbz2 -lnuma -lcrypt -lz -lm -lssl -lcrypto -ldl

Blender

Blend File: Pabellon Barcelona - Compute: CUDA

Blender 2.90 (Seconds, Fewer Is Better)
gpu3Multicore1: 192.15 (SE +/- 1.07, N = 3)

OSPray

Demo: San Miguel - Renderer: Path Tracer

OSPray 1.8.5 (FPS, More Is Better)
gpu3Multicore1: 1.33 (SE +/- 0.00, N = 3; MIN: 1.32 / MAX: 1.34)

asmFish

1024 Hash Memory, 26 Depth

asmFish 2018-07-23 (Nodes/second, More Is Better)
gpu3Multicore1: 39041533 (SE +/- 41169.03, N = 3)

Parboil

Test: OpenMP MRI Gridding

Parboil 2.5 (Seconds, Fewer Is Better)
gpu3Multicore1: 170.79 (SE +/- 0.80, N = 3)
1. (CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

Intel MPI Benchmarks

Test: IMB-MPI1 Sendrecv

Intel MPI Benchmarks 2019.3 (Average usec, Fewer Is Better)
gpu3Multicore1: 270.90 (SE +/- 5.97, N = 15; MIN: 0.16 / MAX: 7916.44)
1. (CXX) g++ options: -O0 -pedantic -fopenmp -pthread -lmpi_cxx -lmpi

Intel MPI Benchmarks

Test: IMB-MPI1 Sendrecv

Intel MPI Benchmarks 2019.3 (Average Mbytes/sec, More Is Better)
gpu3Multicore1: 2001.41 (SE +/- 147.96, N = 15; MAX: 19536.02)
1. (CXX) g++ options: -O0 -pedantic -fopenmp -pthread -lmpi_cxx -lmpi

SVT-AV1

Encoder Mode: Enc Mode 0 - Input: 1080p

SVT-AV1 0.8 (Frames Per Second, More Is Better)
gpu3Multicore1: 0.126 (SE +/- 0.000, N = 3)
1. (CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie

GROMACS

Water Benchmark

GROMACS 2020.3 (Ns Per Day, More Is Better)
gpu3Multicore1: 1.130 (SE +/- 0.002, N = 3)
1. (CXX) g++ options: -O3 -pthread -lrt -lpthread -lm

Blender

Blend File: Fishy Cat - Compute: CPU-Only

Blender 2.90 (Seconds, Fewer Is Better)
gpu3Multicore1: 143.83 (SE +/- 0.04, N = 3)

ACES DGEMM

Sustained Floating-Point Rate

ACES DGEMM 1.0 (GFLOP/s, More Is Better)
gpu3Multicore1: 1.610873 (SE +/- 0.007697, N = 3)
1. (CC) gcc options: -O3 -march=native -fopenmp

ASKAP

Test: tConvolve MT - Degridding

ASKAP 1.0 (Million Grid Points Per Second, More Is Better)
gpu3Multicore1: 2273.88 (SE +/- 2.64, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP

Test: tConvolve MT - Gridding

ASKAP 1.0 (Million Grid Points Per Second, More Is Better)
gpu3Multicore1: 1575.19 (SE +/- 0.61, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

OSPray

Demo: XFrog Forest - Renderer: Path Tracer

OSPray 1.8.5 (FPS, More Is Better)
gpu3Multicore1: 1.58 (SE +/- 0.00, N = 3; MIN: 1.56 / MAX: 1.61)

MariaDB

Clients: 1

MariaDB 10.5.2 (Queries Per Second, More Is Better)
gpu3Multicore1: 1813 (SE +/- 7.16, N = 3)
1. (CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -lbz2 -lnuma -lcrypt -lz -lm -lssl -lcrypto -ldl

x265

Video Input: Bosphorus 4K

x265 3.4 (Frames Per Second, More Is Better)
gpu3Multicore1: 5.07 (SE +/- 0.06, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

Rodinia

Test: OpenMP Leukocyte

Rodinia 3.1 (Seconds, Fewer Is Better)
gpu3Multicore1: 108.62 (SE +/- 0.57, N = 3)
1. (CXX) g++ options: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl

VP9 libvpx Encoding

Speed: Speed 0

VP9 libvpx Encoding 1.8.2 (Frames Per Second, More Is Better)
gpu3Multicore1: 5.93 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=c++11

Blender

Blend File: BMW27 - Compute: CPU-Only

Blender 2.90 (Seconds, Fewer Is Better)
gpu3Multicore1: 99.46 (SE +/- 0.23, N = 3)

ebizzy

ebizzy 0.3 (Records/s, More Is Better)
gpu3Multicore1: 444948 (SE +/- 5403.48, N = 15)
1. (CC) gcc options: -pthread -lpthread -O3 -march=native

Rodinia

Test: OpenMP Streamcluster

Rodinia 3.1 (Seconds, Fewer Is Better)
gpu3Multicore1: 19.51 (SE +/- 0.23, N = 15)
1. (CXX) g++ options: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl

Rodinia

Test: OpenMP HotSpot3D

Rodinia 3.1 (Seconds, Fewer Is Better)
gpu3Multicore1: 97.44 (SE +/- 0.83, N = 3)
1. (CXX) g++ options: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl

Build2

Time To Compile

Build2 0.13 (Seconds, Fewer Is Better)
gpu3Multicore1: 90.05 (SE +/- 0.16, N = 3)

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Slow

Kvazaar 2.0 (Frames Per Second, More Is Better)
gpu3Multicore1: 6.94 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Medium

Kvazaar 2.0 (Frames Per Second, More Is Better)
gpu3Multicore1: 7.06 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
gpu3Multicore1: 4055.95 (SE +/- 0.97, N = 3; MIN: 4045.37)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
gpu3Multicore1: 4043.52 (SE +/- 6.67, N = 3; MIN: 4021.89)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
gpu3Multicore1: 4043.69 (SE +/- 10.83, N = 3; MIN: 4015.01)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

NAMD

ATPase Simulation - 327,506 Atoms

NAMD 2.14 (days/ns, Fewer Is Better)
gpu3Multicore1: 1.30977 (SE +/- 0.00223, N = 3)
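
NAMD reports days/ns (fewer is better), the inverse of the ns/day unit used by the LAMMPS and GROMACS results in this file. The following is a minimal sketch of the unit conversion, using a hypothetical helper name. The three applications simulate different systems, so the converted value is only for reading the figure in a familiar unit, not for comparing applications.

```python
def days_per_ns_to_ns_per_day(days_per_ns):
    """Convert a days/ns figure (fewer is better) to ns/day (more is better)."""
    if days_per_ns <= 0:
        raise ValueError("days/ns must be positive")
    return 1.0 / days_per_ns


# NAMD ATPase result from this file: 1.30977 days/ns.
namd_ns_per_day = days_per_ns_to_ns_per_day(1.30977)
print(f"NAMD ATPase: {namd_ns_per_day:.4f} ns/day")
```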

Radiance Benchmark

Test: SMP Parallel

Radiance Benchmark 5.0 (Seconds, Fewer Is Better)
gpu3Multicore1: 235.27

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
gpu3Multicore1: 2153.61 (SE +/- 1.16, N = 3; MIN: 2144.87)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
gpu3Multicore1: 2153.94 (SE +/- 2.49, N = 3; MIN: 2140.59)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0 (ms, Fewer Is Better)
gpu3Multicore1: 2150.86 (SE +/- 1.08, N = 3; MIN: 2141.22)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

NAS Parallel Benchmarks

Test / Class: BT.C

NAS Parallel Benchmarks 3.4 (Total Mop/s, More Is Better)
gpu3Multicore1: 37405.63 (SE +/- 37.34, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 1.10.2

Parboil

Test: OpenMP LBM

Parboil 2.5 (Seconds, Fewer Is Better)
gpu3Multicore1: 71.68 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

OSPray

Demo: XFrog Forest - Renderer: SciVis

OSPray 1.8.5 (FPS, More Is Better)
gpu3Multicore1: 3.05 (SE +/- 0.00, N = 3; MIN: 3.01 / MAX: 3.09)

Blender

Blend File: Fishy Cat - Compute: CUDA

Blender 2.90 (Seconds, Fewer Is Better)
gpu3Multicore1: 70.47 (SE +/- 0.72, N = 3)

libgav1

Video Input: Summer Nature 1080p

libgav1 2019-10-05 (FPS, More Is Better)
gpu3Multicore1: 53.71 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -O3 -lpthread

Blender

Blend File: Classroom - Compute: CUDA

Blender 2.90 (Seconds, Fewer Is Better)
gpu3Multicore1: 67.35 (SE +/- 0.53, N = 3)

OpenVKL

Benchmark: vklBenchmarkVdbVolume

OpenVKL 0.9 (Items / Sec, More Is Better)
gpu3Multicore1: 16110776 (SE +/- 189330.67, N = 3; MIN: 463741 / MAX: 96050880)

Zstd Compression

Compression Level: 19

Zstd Compression 1.4.5 (MB/s, More Is Better)
gpu3Multicore1: 45.4 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -O3 -pthread -lz

Timed Eigen Compilation

Time To Compile

Timed Eigen Compilation 3.3.9 (Seconds, Fewer Is Better)
gpu3Multicore1: 63.58 (SE +/- 0.06, N = 3)

John The Ripper

Test: MD5

John The Ripper 1.9.0-jumbo-1 (Real C/S, More Is Better)
gpu3Multicore1: 1052667 (SE +/- 6489.31, N = 3)
1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -pthread -lm -lz -ldl -lcrypt -lbz2

OpenVKL

Benchmark: vklBenchmarkStructuredVolume

OpenVKL 0.9 (Items / Sec, More Is Better)
gpu3Multicore1: 65718519 (SE +/- 249259.82, N = 3; MIN: 429600 / MAX: 792670464)

rav1e

Speed: 1

rav1e 0.4 (Frames Per Second, More Is Better)
gpu3Multicore1: 0.347 (SE +/- 0.001, N = 3)

rav1e

Speed: 5

rav1e 0.4 (Frames Per Second, More Is Better)
gpu3Multicore1: 1.042 (SE +/- 0.001, N = 3)

Tachyon

Total Time

Tachyon 0.99b6 (Seconds, Fewer Is Better)
gpu3Multicore1: 55.46 (SE +/- 0.15, N = 3)
1. (CC) gcc options: -m64 -O3 -fomit-frame-pointer -ffast-math -ltachyon -lm -lpthread

Pennant

Test: sedovbig

Pennant 1.0.1 (Hydro Cycle Time - Seconds, Fewer Is Better)
gpu3Multicore1: 51.03 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -fopenmp -pthread -lmpi_cxx -lmpi

OSPray

Demo: NASA Streamlines - Renderer: Path Tracer

OSPray 1.8.5 (FPS, More Is Better)
gpu3Multicore1: 4.42 (SE +/- 0.01, N = 3; MIN: 4.35 / MAX: 4.55)

Timed Linux Kernel Compilation

Time To Compile

Timed Linux Kernel Compilation 5.4 (Seconds, Fewer Is Better)
gpu3Multicore1: 49.81 (SE +/- 0.53, N = 3)

NAS Parallel Benchmarks

Test / Class: LU.C

NAS Parallel Benchmarks 3.4 (Total Mop/s, More Is Better)
gpu3Multicore1: 42621.39 (SE +/- 122.06, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 1.10.2

NAS Parallel Benchmarks

Test / Class: IS.D

NAS Parallel Benchmarks 3.4 (Total Mop/s, More Is Better)
gpu3Multicore1: 877.87 (SE +/- 0.89, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 1.10.2

7-Zip Compression

Compress Speed Test

7-Zip Compression 16.02 (MIPS, More Is Better)
gpu3Multicore1: 73186 (SE +/- 211.29, N = 3)
1. (CXX) g++ options: -pipe -lpthread

AOBench

Size: 2048 x 2048 - Total Time

AOBench (Seconds, Fewer Is Better)
gpu3Multicore1: 44.50 (SE +/- 0.14, N = 3)
1. (CC) gcc options: -lm -O3

Intel MPI Benchmarks

Test: IMB-MPI1 PingPong

Intel MPI Benchmarks 2019.3 (Average Mbytes/sec, More Is Better)
gpu3Multicore1: 2338.83 (SE +/- 442.13, N = 12; MIN: 3.77 / MAX: 11785.09)
1. (CXX) g++ options: -O0 -pedantic -fopenmp -pthread -lmpi_cxx -lmpi

rav1e

Speed: 6

rav1e 0.4 (Frames Per Second, More Is Better)
gpu3Multicore1: 1.371 (SE +/- 0.002, N = 3)

Intel MPI Benchmarks

Test: IMB-P2P PingPong

Intel MPI Benchmarks 2019.3 (Average Msg/sec, More Is Better)
gpu3Multicore1: 6329420 (SE +/- 35072.49, N = 3; MIN: 1185 / MAX: 15350414)
1. (CXX) g++ options: -O0 -pedantic -fopenmp -pthread -lmpi_cxx -lmpi

x265

Video Input: Bosphorus 1080p

x265 3.4 (Frames Per Second, More Is Better)
gpu3Multicore1: 14.65 (SE +/- 0.06, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

AOM AV1

Encoder Mode: Speed 6 Realtime

AOM AV1 2.0 (Frames Per Second, More Is Better)
gpu3Multicore1: 15.06 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

Timed FFmpeg Compilation

Time To Compile

Timed FFmpeg Compilation 4.2.2 (Seconds, Fewer Is Better)
gpu3Multicore1: 39.72 (SE +/- 0.09, N = 3)

AOM AV1

Encoder Mode: Speed 0 Two-Pass

AOM AV1 2.0 (Frames Per Second, More Is Better)
gpu3Multicore1: 0.26 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

Pennant

Test: leblancbig

Pennant 1.0.1 (Hydro Cycle Time - Seconds, Fewer Is Better)
gpu3Multicore1: 38.67 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -fopenmp -pthread -lmpi_cxx -lmpi

OSPray

Demo: San Miguel - Renderer: SciVis

OSPray 1.8.5 (FPS, More Is Better)
gpu3Multicore1: 17.54 (SE +/- 0.00, N = 3; MIN: 16.95 / MAX: 18.52)

Rust Mandelbrot

Time To Complete Serial/Parallel Mandelbrot

Rust Mandelbrot (Seconds, Fewer Is Better)
gpu3Multicore1: 38.55 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -m64 -pie -nodefaultlibs -ldl -lrt -lpthread -lgcc_s -lc -lm -lutil

m-queens

Time To Solve

m-queens 1.2 (Seconds, Fewer Is Better)
gpu3Multicore1: 36.96 (SE +/- 0.08, N = 3)
1. (CXX) g++ options: -fopenmp -O2 -march=native

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Very Fast

Kvazaar 2.0 (Frames Per Second, More Is Better)
gpu3Multicore1: 16.77 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

C-Ray

Total Time - 4K, 16 Rays Per Pixel

OpenBenchmarking.org - Seconds, Fewer Is Better
C-Ray 1.1 - Total Time - 4K, 16 Rays Per Pixel
gpu3Multicore1: 34.58 (SE +/- 0.06, N = 3)
1. (CC) gcc options: -lm -lpthread -O3

AOM AV1

Encoder Mode: Speed 6 Two-Pass

OpenBenchmarking.org - Frames Per Second, More Is Better
AOM AV1 2.0 - Encoder Mode: Speed 6 Two-Pass
gpu3Multicore1: 2.97 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

VP9 libvpx Encoding

Speed: Speed 5

OpenBenchmarking.org - Frames Per Second, More Is Better
VP9 libvpx Encoding 1.8.2 - Speed: Speed 5
gpu3Multicore1: 18.27 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=c++11

Open FMM Nero2D

Total Time

OpenBenchmarking.org - Seconds, Fewer Is Better
Open FMM Nero2D 2.0.2 - Total Time
gpu3Multicore1: 32.45 (SE +/- 0.08, N = 3)
1. (CXX) g++ options: -O2 -lfftw3 -llapack -lblas -lgfortran -lquadmath -lm -pthread -lmpi_cxx -lmpi

Zstd Compression

Compression Level: 3

OpenBenchmarking.org - MB/s, More Is Better
Zstd Compression 1.4.5 - Compression Level: 3
gpu3Multicore1: 5059.1 (SE +/- 3.85, N = 3)
1. (CC) gcc options: -O3 -pthread -lz

John The Ripper

Test: Blowfish

OpenBenchmarking.org - Real C/S, More Is Better
John The Ripper 1.9.0-jumbo-1 - Test: Blowfish
gpu3Multicore1: 34430 (SE +/- 93.23, N = 3)
1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -pthread -lm -lz -ldl -lcrypt -lbz2

Aircrack-ng

OpenBenchmarking.org - k/s, More Is Better
Aircrack-ng 1.5.2
gpu3Multicore1: 28284.28 (SE +/- 68.07, N = 3)
1. (CXX) g++ options: -O3 -fvisibility=hidden -masm=intel -fcommon -rdynamic -lpthread -lz -lcrypto -lhwloc -ldl -lm -pthread

rav1e

Speed: 10

OpenBenchmarking.org - Frames Per Second, More Is Better
rav1e 0.4 - Speed: 10
gpu3Multicore1: 3.032 (SE +/- 0.009, N = 3)

Intel Open Image Denoise

Scene: Memorial

OpenBenchmarking.org - Images / Sec, More Is Better
Intel Open Image Denoise 1.2.0 - Scene: Memorial
gpu3Multicore1: 6.93 (SE +/- 0.05, N = 3)

ASKAP

Test: Hogbom Clean OpenMP

OpenBenchmarking.org - Iterations Per Second, More Is Better
ASKAP 1.0 - Test: Hogbom Clean OpenMP
gpu3Multicore1: 268.34 (SE +/- 0.24, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

Blender

Blend File: BMW27 - Compute: CUDA

OpenBenchmarking.org - Seconds, Fewer Is Better
Blender 2.90 - Blend File: BMW27 - Compute: CUDA
gpu3Multicore1: 27.40 (SE +/- 0.02, N = 3)

Tungsten Renderer

Scene: Water Caustic

OpenBenchmarking.org - Seconds, Fewer Is Better
Tungsten Renderer 0.2.2 - Scene: Water Caustic
gpu3Multicore1: 26.42 (SE +/- 0.09, N = 3)
1. (CXX) g++ options: -std=c++0x -march=broadwell -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -mfma -mbmi2 -mno-avx -mno-avx2 -mno-xop -mno-fma4 -mno-avx512f -mno-avx512vl -mno-avx512pf -mno-avx512er -mno-avx512cd -mno-avx512dq -mno-avx512bw -mno-avx512ifma -mno-avx512vbmi -fstrict-aliasing -O3 -rdynamic -lpthread -ldl

NAS Parallel Benchmarks

Test / Class: SP.B

OpenBenchmarking.org - Total Mop/s, More Is Better
NAS Parallel Benchmarks 3.4 - Test / Class: SP.B
gpu3Multicore1: 13344.88 (SE +/- 29.92, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 1.10.2

AOM AV1

Encoder Mode: Speed 4 Two-Pass

OpenBenchmarking.org - Frames Per Second, More Is Better
AOM AV1 2.0 - Encoder Mode: Speed 4 Two-Pass
gpu3Multicore1: 1.87 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

Timed MPlayer Compilation

Time To Compile

OpenBenchmarking.org - Seconds, Fewer Is Better
Timed MPlayer Compilation 1.4 - Time To Compile
gpu3Multicore1: 25.11 (SE +/- 0.07, N = 3)

Coremark

CoreMark Size 666 - Iterations Per Second

OpenBenchmarking.org - Iterations/Sec, More Is Better
Coremark 1.0 - CoreMark Size 666 - Iterations Per Second
gpu3Multicore1: 490007.45 (SE +/- 885.70, N = 3)
1. (CC) gcc options: -O2 -lrt" -lrt

Kvazaar

Video Input: Bosphorus 1080p - Video Preset: Slow

OpenBenchmarking.org - Frames Per Second, More Is Better
Kvazaar 2.0 - Video Input: Bosphorus 1080p - Video Preset: Slow
gpu3Multicore1: 25.99 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

AOM AV1

Encoder Mode: Speed 8 Realtime

OpenBenchmarking.org - Frames Per Second, More Is Better
AOM AV1 2.0 - Encoder Mode: Speed 8 Realtime
gpu3Multicore1: 27.12 (SE +/- 0.17, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread

Kvazaar

Video Input: Bosphorus 1080p - Video Preset: Medium

OpenBenchmarking.org - Frames Per Second, More Is Better
Kvazaar 2.0 - Video Input: Bosphorus 1080p - Video Preset: Medium
gpu3Multicore1: 26.92 (SE +/- 0.04, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

Xsbench

OpenBenchmarking.org - Lookups/s, More Is Better
Xsbench 2017-07-06
gpu3Multicore1: 2979914 (SE +/- 920.47, N = 3)
1. (CC) gcc options: -std=gnu99 -fopenmp -O3 -lm

NAS Parallel Benchmarks

Test / Class: FT.C

OpenBenchmarking.org - Total Mop/s, More Is Better
NAS Parallel Benchmarks 3.4 - Test / Class: FT.C
gpu3Multicore1: 19497.22 (SE +/- 18.12, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 1.10.2

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.org - ms, Fewer Is Better
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
gpu3Multicore1: 8.31723 (SE +/- 0.09293, N = 3; MIN: 7.94)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

OpenBenchmarking.org - ms, Fewer Is Better
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
gpu3Multicore1: 4.75016 (SE +/- 0.01438, N = 3; MIN: 4.35)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

Timed ImageMagick Compilation

Time To Compile

OpenBenchmarking.org - Seconds, Fewer Is Better
Timed ImageMagick Compilation 6.9.0 - Time To Compile
gpu3Multicore1: 20.80 (SE +/- 0.10, N = 3)

Rust Prime Benchmark

Prime Number Test To 200,000,000

OpenBenchmarking.org - Seconds, Fewer Is Better
Rust Prime Benchmark - Prime Number Test To 200,000,000
gpu3Multicore1: 20.47 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -m64 -pie -nodefaultlibs -ldl -lrt -lpthread -lgcc_s -lc -lm -lutil

Swet

Average

OpenBenchmarking.org - Operations Per Second, More Is Better
Swet 1.5.16 - Average
gpu3Multicore1: 757710060 (SE +/- 9820613.43, N = 3)
1. (CC) gcc options: -lm -lpthread -lcurses -lrt

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Ultra Fast

OpenBenchmarking.org - Frames Per Second, More Is Better
Kvazaar 2.0 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast
gpu3Multicore1: 31.16 (SE +/- 0.05, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

SVT-AV1

Encoder Mode: Enc Mode 4 - Input: 1080p

OpenBenchmarking.org - Frames Per Second, More Is Better
SVT-AV1 0.8 - Encoder Mode: Enc Mode 4 - Input: 1080p
gpu3Multicore1: 4.703 (SE +/- 0.015, N = 3)
1. (CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie

NAS Parallel Benchmarks

Test / Class: CG.C

OpenBenchmarking.org - Total Mop/s, More Is Better
NAS Parallel Benchmarks 3.4 - Test / Class: CG.C
gpu3Multicore1: 8001.35 (SE +/- 8.06, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 1.10.2

Rodinia

Test: OpenMP CFD Solver

OpenBenchmarking.org - Seconds, Fewer Is Better
Rodinia 3.1 - Test: OpenMP CFD Solver
gpu3Multicore1: 17.99 (SE +/- 0.06, N = 3)
1. (CXX) g++ options: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl

OSPray

Demo: Magnetic Reconnection - Renderer: SciVis

OpenBenchmarking.org - FPS, More Is Better
OSPray 1.8.5 - Demo: Magnetic Reconnection - Renderer: SciVis
gpu3Multicore1: 12.66 (SE +/- 0.00, N = 3; MIN: 12.5 / MAX: 12.82)

Tungsten Renderer

Scene: Hair

OpenBenchmarking.org - Seconds, Fewer Is Better
Tungsten Renderer 0.2.2 - Scene: Hair
gpu3Multicore1: 16.82 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -std=c++0x -march=broadwell -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -mfma -mbmi2 -mno-avx -mno-avx2 -mno-xop -mno-fma4 -mno-avx512f -mno-avx512vl -mno-avx512pf -mno-avx512er -mno-avx512cd -mno-avx512dq -mno-avx512bw -mno-avx512ifma -mno-avx512vbmi -fstrict-aliasing -O3 -rdynamic -lpthread -ldl

ArrayFire

Test: BLAS CPU

OpenBenchmarking.org - GFLOPS, More Is Better
ArrayFire 3.7 - Test: BLAS CPU
gpu3Multicore1: 410.33 (SE +/- 0.49, N = 3)
1. (CXX) g++ options: -rdynamic

ASKAP

Test: tConvolve OpenMP - Degridding

OpenBenchmarking.org - Million Grid Points Per Second, More Is Better
ASKAP 1.0 - Test: tConvolve OpenMP - Degridding
gpu3Multicore1: 2716.9 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP

Test: tConvolve OpenMP - Gridding

OpenBenchmarking.org - Million Grid Points Per Second, More Is Better
ASKAP 1.0 - Test: tConvolve OpenMP - Gridding
gpu3Multicore1: 1857.74 (SE +/- 11.39, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

OpenBenchmarking.org - ms, Fewer Is Better
oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU
gpu3Multicore1: 4.10359 (SE +/- 0.01135, N = 3; MIN: 3.82)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.org - ms, Fewer Is Better
oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU
gpu3Multicore1: 2.93945 (SE +/- 0.00193, N = 3; MIN: 2.79)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

NAS Parallel Benchmarks

Test / Class: EP.C

OpenBenchmarking.org - Total Mop/s, More Is Better
NAS Parallel Benchmarks 3.4 - Test / Class: EP.C
gpu3Multicore1: 631.90 (SE +/- 0.54, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 1.10.2

Primesieve

1e12 Prime Number Generation

OpenBenchmarking.org - Seconds, Fewer Is Better
Primesieve 7.4 - 1e12 Prime Number Generation
gpu3Multicore1: 13.64 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -O3 -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

OpenBenchmarking.org - ms, Fewer Is Better
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
gpu3Multicore1: 2.59347 (SE +/- 0.00134, N = 3; MIN: 2.53)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.org - ms, Fewer Is Better
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
gpu3Multicore1: 2.45445 (SE +/- 0.00098, N = 3; MIN: 2.29)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

OSPray

Demo: NASA Streamlines - Renderer: SciVis

OpenBenchmarking.org - FPS, More Is Better
OSPray 1.8.5 - Demo: NASA Streamlines - Renderer: SciVis
gpu3Multicore1: 22.22 (SE +/- 0.00, N = 3; MIN: 21.74 / MAX: 22.73)

LAMMPS Molecular Dynamics Simulator

Model: Rhodopsin Protein

OpenBenchmarking.org - ns/day, More Is Better
LAMMPS Molecular Dynamics Simulator 29Oct2020 - Model: Rhodopsin Protein
gpu3Multicore1: 10.02 (SE +/- 0.11, N = 15)
1. (CXX) g++ options: -O3 -pthread -lm

SVT-AV1

Encoder Mode: Enc Mode 8 - Input: 1080p

OpenBenchmarking.org - Frames Per Second, More Is Better
SVT-AV1 0.8 - Encoder Mode: Enc Mode 8 - Input: 1080p
gpu3Multicore1: 35.69 (SE +/- 0.12, N = 3)
1. (CXX) g++ options: -O3 -fcommon -fPIE -fPIC -pie

NAS Parallel Benchmarks

Test / Class: MG.C

OpenBenchmarking.org - Total Mop/s, More Is Better
NAS Parallel Benchmarks 3.4 - Test / Class: MG.C
gpu3Multicore1: 17236.60 (SE +/- 8.84, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi
2. Open MPI 1.10.2

Tungsten Renderer

Scene: Volumetric Caustic

OpenBenchmarking.org - Seconds, Fewer Is Better
Tungsten Renderer 0.2.2 - Scene: Volumetric Caustic
gpu3Multicore1: 10.15 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -std=c++0x -march=broadwell -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -mfma -mbmi2 -mno-avx -mno-avx2 -mno-xop -mno-fma4 -mno-avx512f -mno-avx512vl -mno-avx512pf -mno-avx512er -mno-avx512cd -mno-avx512dq -mno-avx512bw -mno-avx512ifma -mno-avx512vbmi -fstrict-aliasing -O3 -rdynamic -lpthread -ldl

Tungsten Renderer

Scene: Non-Exponential

OpenBenchmarking.org - Seconds, Fewer Is Better
Tungsten Renderer 0.2.2 - Scene: Non-Exponential
gpu3Multicore1: 10.10 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -std=c++0x -march=broadwell -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -mfma -mbmi2 -mno-avx -mno-avx2 -mno-xop -mno-fma4 -mno-avx512f -mno-avx512vl -mno-avx512pf -mno-avx512er -mno-avx512cd -mno-avx512dq -mno-avx512bw -mno-avx512ifma -mno-avx512vbmi -fstrict-aliasing -O3 -rdynamic -lpthread -ldl

Parboil

Test: OpenMP Stencil

OpenBenchmarking.org - Seconds, Fewer Is Better
Parboil 2.5 - Test: OpenMP Stencil
gpu3Multicore1: 8.251748 (SE +/- 0.029315, N = 3)
1. (CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

Kvazaar

Video Input: Bosphorus 1080p - Video Preset: Very Fast

OpenBenchmarking.org - Frames Per Second, More Is Better
Kvazaar 2.0 - Video Input: Bosphorus 1080p - Video Preset: Very Fast
gpu3Multicore1: 59.84 (SE +/- 0.04, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

Sysbench

Test: CPU

OpenBenchmarking.org - Events Per Second, More Is Better
Sysbench 2018-07-28 - Test: CPU
gpu3Multicore1: 34334.37 (SE +/- 2.71, N = 3)
1. (CC) gcc options: -pthread -O3 -funroll-loops -ggdb3 -march=amdfam10 -rdynamic -ldl -lm

Sysbench

Test: Memory

OpenBenchmarking.org - Events Per Second, More Is Better
Sysbench 2018-07-28 - Test: Memory
gpu3Multicore1: 7093621.19 (SE +/- 3413.58, N = 3)
1. (CC) gcc options: -pthread -O3 -funroll-loops -ggdb3 -march=amdfam10 -rdynamic -ldl -lm

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.org - ms, Fewer Is Better
oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU
gpu3Multicore1: 1.50128 (SE +/- 0.00670, N = 3; MIN: 1.41)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

OpenBenchmarking.org - ms, Fewer Is Better
oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
gpu3Multicore1: 5.03966 (SE +/- 0.00243, N = 3; MIN: 4.98)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

SVT-HEVC

1080p 8-bit YUV To HEVC Video Encode

OpenBenchmarking.org - Frames Per Second, More Is Better
SVT-HEVC 1.4.1 - 1080p 8-bit YUV To HEVC Video Encode
gpu3Multicore1: 72.35 (SE +/- 0.05, N = 3)
1. (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt

N-Queens

Elapsed Time

OpenBenchmarking.org - Seconds, Fewer Is Better
N-Queens 1.0 - Elapsed Time
gpu3Multicore1: 8.010 (SE +/- 0.001, N = 3)
1. (CC) gcc options: -static -fopenmp -O3 -march=native

FFmpeg

H.264 HD To NTSC DV

OpenBenchmarking.org - Seconds, Fewer Is Better
FFmpeg 4.0.2 - H.264 HD To NTSC DV
gpu3Multicore1: 7.688 (SE +/- 0.050, N = 3)
1. (CC) gcc options: -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -lm -lxcb -lxcb-shape -lxcb-xfixes -lxcb-render -pthread -lbz2 -std=c11 -fomit-frame-pointer -O3 -fno-math-errno -fno-signed-zeros -fno-tree-vectorize -MMD -MF -MT

Smallpt

Global Illumination Renderer; 128 Samples

OpenBenchmarking.org - Seconds, Fewer Is Better
Smallpt 1.0 - Global Illumination Renderer; 128 Samples
gpu3Multicore1: 6.618 (SE +/- 0.008, N = 3)
1. (CXX) g++ options: -fopenmp -O3

Parboil

Test: OpenMP MRI-Q

OpenBenchmarking.org - Seconds, Fewer Is Better
Parboil 2.5 - Test: OpenMP MRI-Q
gpu3Multicore1: 6.089053 (SE +/- 0.001530, N = 3)
1. (CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.org - ms, Fewer Is Better
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
gpu3Multicore1: 13.43 (SE +/- 0.01, N = 3; MIN: 13.21)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

OpenBenchmarking.org - ms, Fewer Is Better
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
gpu3Multicore1: 10.04 (SE +/- 0.01, N = 3; MIN: 9.89)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

Kvazaar

Video Input: Bosphorus 1080p - Video Preset: Ultra Fast

OpenBenchmarking.org - Frames Per Second, More Is Better
Kvazaar 2.0 - Video Input: Bosphorus 1080p - Video Preset: Ultra Fast
gpu3Multicore1: 106.81 (SE +/- 0.08, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.org - ms, Fewer Is Better
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU
gpu3Multicore1: 5.31369 (SE +/- 0.00230, N = 3; MIN: 5.08)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

OpenBenchmarking.org - ms, Fewer Is Better
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
gpu3Multicore1: 7.52129 (SE +/- 0.00364, N = 3; MIN: 6.83)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

OSPray

Demo: Magnetic Reconnection - Renderer: Path Tracer

OpenBenchmarking.org - FPS, More Is Better
OSPray 1.8.5 - Demo: Magnetic Reconnection - Renderer: Path Tracer
gpu3Multicore1: 166.67 (SE +/- 0.00, N = 3; MIN: 125 / MAX: 200)

Parallel BZIP2 Compression

256MB File Compression

OpenBenchmarking.org - Seconds, Fewer Is Better
Parallel BZIP2 Compression 1.1.12 - 256MB File Compression
gpu3Multicore1: 2.430 (SE +/- 0.009, N = 3)
1. (CXX) g++ options: -O2 -pthread -lbz2 -lpthread

Parboil

Test: OpenMP CUTCP

OpenBenchmarking.org - Seconds, Fewer Is Better
Parboil 2.5 - Test: OpenMP CUTCP
gpu3Multicore1: 2.122072 (SE +/- 0.020422, N = 3)
1. (CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp


Phoronix Test Suite v10.8.4