AMD Ryzen Threadripper 2990WX 32-Core testing with an ASUS ROG ZENITH EXTREME (1701 BIOS) and Gigabyte AMD Radeon RX 470/480/570/570X/580/580X/590 4GB on Ubuntu 20.10 via the Phoronix Test Suite.
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0x800820d
Graphics Notes: GLAMOR
Java Notes: OpenJDK Runtime Environment (build 11.0.9.1+1-Ubuntu-0ubuntu1.20.10)
Python Notes: Python 3.8.6
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Processor: AMD Ryzen Threadripper 2990WX 32-Core @ 3.00GHz (32 Cores / 64 Threads), Motherboard: ASUS ROG ZENITH EXTREME (1701 BIOS), Chipset: AMD 17h, Memory: 32GB, Disk: Samsung SSD 970 EVO 500GB + 250GB Western Digital WDS250G2X0C-00L350, Graphics: Gigabyte AMD Radeon RX 470/480/570/570X/580/580X/590 4GB (1244/1750MHz), Audio: Realtek ALC1220, Monitor: LG Ultra HD, Network: Intel I211 + Qualcomm Atheros QCA6174 802.11ac + Wilocity Wil6200 802.11ad
OS: Ubuntu 20.10, Kernel: 5.8.0-33-generic (x86_64), Desktop: GNOME Shell 3.38.1, Display Server: X Server 1.20.9, Display Driver: modesetting 1.20.9, OpenGL: 4.6 Mesa 20.2.1 (LLVM 11.0.0), Vulkan: 1.2.131, Compiler: GCC 10.2.0, File-System: ext4, Screen Resolution: 1920x1080
VkFFT
VkFFT is a Fast Fourier Transform (FFT) library that is GPU-accelerated via the Vulkan API. The VkFFT benchmark measures FFT performance across many different sizes before returning an overall benchmark score.
VkFFT 1.1.1 (Benchmark Score, more is better). Runs 3 / 2 / 1: 9417 / 9451 / 9412; SE +/- 6.74 / 49.17 / 13.38; N = 3 / 3 / 3
Build notes: (CXX) g++ options: -O3 -pthread
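Each result reports a standard error (SE) over N trial runs. As a minimal sketch of how such a figure can be derived, the code below assumes the SE is the sample standard deviation divided by the square root of N (the usual definition; the Phoronix Test Suite's exact computation is not shown in this result file). The per-trial scores and the helper name `standard_error` are hypothetical.

```python
import math
import statistics

def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    return statistics.stdev(samples) / math.sqrt(len(samples))

# Hypothetical per-trial scores for one run; these are NOT taken from the results.
trials = [9400.0, 9420.0, 9431.0]
mean = statistics.fmean(trials)
se = standard_error(trials)
print(f"mean = {mean:.0f}, SE +/- {se:.2f}, N = {len(trials)}")
```

A small SE relative to the mean suggests the trials within a run were consistent with each other.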
Libplacebo
Libplacebo is a multimedia rendering library based on the core rendering code of the MPV player. The libplacebo benchmark relies on the Vulkan API and tests various primitives.
Libplacebo 2.72.2, Test: deband_heavy (FPS, more is better). Runs 3 / 2 / 1: 192.22 / 192.43 / 192.54; SE +/- 0.29 / 0.37 / 0.36; N = 3 / 3 / 3
Libplacebo 2.72.2, Test: polar_nocompute (FPS, more is better). Runs 3 / 2 / 1: 267.90 / 268.36 / 268.72; SE +/- 0.38 / 0.53 / 0.83; N = 3 / 3 / 3
Libplacebo 2.72.2, Test: hdr_peakdetect (FPS, more is better). Runs 3 / 2 / 1: 1762.55 / 1762.78 / 1762.98; SE +/- 0.10 / 0.18 / 0.03; N = 3 / 3 / 3
Libplacebo 2.72.2, Test: av1_grain_lap (FPS, more is better). Runs 3 / 2 / 1: 446.38 / 441.37 / 448.24; SE +/- 1.25 / 3.25 / 3.42; N = 3 / 3 / 3
Build notes for all Libplacebo results: (CXX) g++ options: -lm -lglslang -lHLSL -lOGLCompiler -lOSDependent -lSPIRV -lSPVRemapper -lSPIRV-Tools -lSPIRV-Tools-opt -lpthread -pthread -pipe -std=c++11 -fvisibility=hidden -fPIC -MD -MQ -MF
yquake2
This is a test of Yamagi Quake II. Yamagi Quake II is an enhanced client for id Software's Quake II with focus on offline and coop gameplay.
yquake2 7.45, Renderer: OpenGL 1.x - Resolution: 1920 x 1080 (Frames Per Second, more is better). Runs 3 / 2 / 1: 623.8 / 619.9 / 683.8; SE +/- 5.54 / 6.33 / 4.44; N = 15 / 8 / 3
yquake2 7.45, Renderer: OpenGL 3.x - Resolution: 1920 x 1080 (Frames Per Second, more is better). Runs 3 / 2 / 1: 972.5 / 972.2 / 949.7; SE +/- 10.37 / 13.22 / 12.14; N = 3 / 3 / 3
yquake2 7.45, Renderer: Software CPU - Resolution: 1920 x 1080 (Frames Per Second, more is better). Runs 3 / 2 / 1: 106.2 / 107.1 / 106.6; SE +/- 0.93 / 1.01 / 1.02; N = 3 / 3 / 3
Build notes for all yquake2 results: (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
Embree 3.9.0, Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, more is better). Runs 3 / 2 / 1: 22.65 / 22.24 / 22.73; SE +/- 0.22 / 0.38 / 0.21; N = 3 / 3 / 3; MIN: 21.43 / 20.82 / 21.97; MAX: 23.62 / 23.48 / 23.66
Embree 3.9.0, Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, more is better). Runs 3 / 2 / 1: 22.27 / 22.50 / 21.96; SE +/- 0.26 / 0.25 / 0.31; N = 15 / 15 / 15; MIN: 20.37 / 20.59 / 19.03; MAX: 25.26 / 25.14 / 24.66
Embree 3.9.0, Binary: Pathtracer - Model: Asian Dragon Obj (Frames Per Second, more is better). Runs 3 / 2 / 1: 19.40 / 19.30 / 18.24; SE +/- 0.44 / 0.37 / 0.25; N = 12 / 15 / 15; MIN: 16.59 / 16.68 / 16.42; MAX: 22.2 / 22.31 / 21.1
Embree 3.9.0, Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, more is better). Runs 3 / 2 / 1: 21.56 / 21.67 / 21.32; SE +/- 0.19 / 0.26 / 0.09; N = 15 / 15 / 3; MIN: 19.33 / 19.41 / 20.36; MAX: 23.88 / 23.55 / 22.38
Embree 3.9.0, Binary: Pathtracer ISPC - Model: Asian Dragon Obj (Frames Per Second, more is better). Runs 3 / 2 / 1: 17.40 / 17.24 / 17.68; SE +/- 0.30 / 0.39 / 0.39; N = 15 / 12 / 12; MIN: 15.12 / 14.95 / 15.66; MAX: 19.79 / 20.29 / 20.6
Kvazaar
This is a test of Kvazaar as a CPU-based H.265 video encoder written in the C programming language and optimized in Assembly. Kvazaar is the winner of the 2016 ACM Open-Source Software Competition and developed at the Ultra Video Group, Tampere University, Finland.
Kvazaar 2.0, Video Input: Bosphorus 4K - Video Preset: Slow (Frames Per Second, more is better). Runs 3 / 2 / 1: 10.32 / 10.38 / 10.29; SE +/- 0.01 / 0.01 / 0.00; N = 3 / 3 / 3
Kvazaar 2.0, Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, more is better). Runs 3 / 2 / 1: 10.53 / 10.56 / 10.47; SE +/- 0.01 / 0.02 / 0.00; N = 3 / 3 / 3
Kvazaar 2.0, Video Input: Bosphorus 1080p - Video Preset: Slow (Frames Per Second, more is better). Runs 3 / 2 / 1: 26.92 / 27.13 / 26.99; SE +/- 0.18 / 0.07 / 0.04; N = 3 / 3 / 3
Kvazaar 2.0, Video Input: Bosphorus 1080p - Video Preset: Medium (Frames Per Second, more is better). Runs 3 / 2 / 1: 27.78 / 27.84 / 27.69; SE +/- 0.15 / 0.06 / 0.07; N = 3 / 3 / 3
Kvazaar 2.0, Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, more is better). Runs 3 / 2 / 1: 23.61 / 23.57 / 23.54; SE +/- 0.04 / 0.07 / 0.03; N = 3 / 3 / 3
Kvazaar 2.0, Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, more is better). Runs 3 / 2 / 1: 39.39 / 39.67 / 39.68; SE +/- 0.03 / 0.15 / 0.28; N = 3 / 3 / 3
Kvazaar 2.0, Video Input: Bosphorus 1080p - Video Preset: Very Fast (Frames Per Second, more is better). Runs 3 / 2 / 1: 60.79 / 60.69 / 60.69; SE +/- 0.26 / 0.20 / 0.21; N = 3 / 3 / 3
Kvazaar 2.0, Video Input: Bosphorus 1080p - Video Preset: Ultra Fast (Frames Per Second, more is better). Runs 3 / 2 / 1: 116.68 / 116.09 / 116.80; SE +/- 0.26 / 0.66 / 0.15; N = 3 / 3 / 3
Build notes for all Kvazaar results: (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
rav1e 0.4 Alpha, Speed: 5 (Frames Per Second, more is better). Runs 3 / 2 / 1: 0.953 / 0.951 / 0.954; SE +/- 0.004 / 0.002 / 0.002; N = 3 / 3 / 3
rav1e 0.4 Alpha, Speed: 6 (Frames Per Second, more is better). Runs 3 / 2 / 1: 1.274 / 1.273 / 1.270; SE +/- 0.002 / 0.002 / 0.001; N = 3 / 3 / 3
rav1e 0.4 Alpha, Speed: 10 (Frames Per Second, more is better). Runs 3 / 2 / 1: 2.853 / 2.830 / 2.859; SE +/- 0.012 / 0.018 / 0.006; N = 3 / 3 / 3
x265
This is a simple test of the x265 encoder run on the CPU, with 1080p and 4K inputs, measuring H.265 video encode performance.
x265 3.4, Video Input: Bosphorus 4K (Frames Per Second, more is better). Runs 3 / 2 / 1: 16.55 / 15.26 / 16.62; SE +/- 0.06 / 0.05 / 0.03; N = 3 / 3 / 3
x265 3.4, Video Input: Bosphorus 1080p (Frames Per Second, more is better). Runs 3 / 2 / 1: 41.06 / 40.36 / 41.02; SE +/- 0.04 / 0.05 / 0.23; N = 3 / 3 / 3
Build notes for all x265 results: (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
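To judge whether the gap between two runs exceeds run-to-run noise, one rough check is to express the difference in units of the combined standard error. The sketch below applies this to the first two reported Bosphorus 4K means (16.55 and 15.26 FPS) and the first two reported SEs, assuming the SE values are listed in the same order as the means. `difference_in_se` is a hypothetical helper, not part of the Phoronix Test Suite.

```python
import math

def difference_in_se(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means divided by their combined standard error."""
    combined = math.sqrt(se_a ** 2 + se_b ** 2)
    if combined == 0:
        raise ValueError("combined standard error is zero")
    return abs(mean_a - mean_b) / combined

# Means and SEs copied from the x265 Bosphorus 4K result (pairing by order is assumed).
ratio = difference_in_se(16.55, 0.06, 15.26, 0.05)
print(f"difference = {ratio:.1f} combined standard errors")
```

With only three trials per run this is a rough indicator rather than a formal significance test.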
HPC Challenge
HPC Challenge (HPCC) is a cluster-focused benchmark consisting of the HPL Linpack TPP benchmark, DGEMM, STREAM, PTRANS, RandomAccess, FFT, and communication bandwidth and latency. This HPC Challenge test profile attempts to ship with standard yet versatile configuration/input files though they can be modified.
HPC Challenge 1.5.0, Test / Class: G-Ptrans (GB/s, more is better). Runs 3 / 2 / 1: 4.17428 / 4.34106 / 4.00164; SE +/- 0.08946 / 0.03755 / 0.25728; N = 3 / 3 / 3
HPC Challenge 1.5.0, Test / Class: EP-STREAM Triad (GB/s, more is better). Runs 3 / 2 / 1: 1.33820 / 1.30771 / 1.30966; SE +/- 0.07867 / 0.03714 / 0.05676; N = 3 / 3 / 3
HPC Challenge 1.5.0, Test / Class: Random Ring Bandwidth (GB/s, more is better). Runs 3 / 2 / 1: 0.59385 / 0.58300 / 0.57120; SE +/- 0.00691 / 0.01337 / 0.00203; N = 3 / 3 / 3
Build notes for these HPC Challenge results: (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops; ATLAS + Open MPI 4.0.3
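The EP-STREAM Triad figure measures sustained memory bandwidth for the kernel a[i] = b[i] + q * c[i]. The following is a minimal pure-Python sketch of that kernel and of the usual bandwidth accounting (three 8-byte doubles moved per element). It is not HPCC's actual C implementation; the function name and array size are illustrative assumptions, and interpreter overhead makes the printed number far lower than a compiled build would report.

```python
import time
from array import array

def stream_triad(b, c, q):
    """Return a where a[i] = b[i] + q * c[i] (the STREAM Triad kernel)."""
    return array("d", (bi + q * ci for bi, ci in zip(b, c)))

n = 200_000  # illustrative size; real STREAM arrays are sized to exceed the caches
b = array("d", [1.0]) * n
c = array("d", [2.0]) * n

start = time.perf_counter()
a = stream_triad(b, c, 3.0)
elapsed = time.perf_counter() - start

# Triad reads b and c and writes a: 3 arrays * 8 bytes per element.
bytes_moved = 3 * 8 * n
print(f"Triad: {bytes_moved / elapsed / 1e9:.3f} GB/s (pure Python, not comparable to HPCC)")
```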
simdjson
This is a benchmark of SIMDJSON, a high performance JSON parser. SIMDJSON aims to be the fastest JSON parser and is used by projects like Microsoft FishStore, Yandex ClickHouse, Shopify, and others.
simdjson 0.7.1, Throughput Test: Kostya (GB/s, more is better). Runs 3 / 2 / 1: 0.44 / 0.44 / 0.44; SE +/- 0.00 / 0.00 / 0.00; N = 3 / 3 / 3
simdjson 0.7.1, Throughput Test: LargeRandom (GB/s, more is better). Runs 3 / 2 / 1: 0.40 / 0.39 / 0.39; SE +/- 0.00 / 0.00 / 0.00; N = 3 / 3 / 3
simdjson 0.7.1, Throughput Test: PartialTweets (GB/s, more is better). Runs 3 / 2 / 1: 0.51 / 0.50 / 0.51; SE +/- 0.00 / 0.01 / 0.00; N = 3 / 3 / 3
simdjson 0.7.1, Throughput Test: DistinctUserID (GB/s, more is better). Runs 3 / 2 / 1: 0.52 / 0.52 / 0.52; SE +/- 0.00 / 0.00 / 0.00; N = 3 / 3 / 3
Build notes for all simdjson results: (CXX) g++ options: -O3 -pthread
HPC Challenge 1.5.0, Test / Class: G-HPL (GFLOPS, more is better). Runs 3 / 2 / 1: 54.31 / 52.78 / 54.60; SE +/- 0.24 / 0.10 / 0.14; N = 3 / 3 / 3
HPC Challenge 1.5.0, Test / Class: G-Ffte (GFLOPS, more is better). Runs 3 / 2 / 1: 9.67646 / 9.74083 / 10.04669; SE +/- 0.14488 / 0.17781 / 0.04480; N = 3 / 3 / 3
HPC Challenge 1.5.0, Test / Class: EP-DGEMM (GFLOPS, more is better). Runs 3 / 2 / 1: 13.03417 / 8.80303 / 12.61727; SE +/- 0.34471 / 0.14154 / 0.86774; N = 3 / 3 / 3
HPC Challenge 1.5.0, Test / Class: G-Random Access (GUP/s, more is better). Runs 3 / 2 / 1: 0.02866 / 0.02798 / 0.02819; SE +/- 0.00018 / 0.00067 / 0.00038; N = 3 / 3 / 3
Build notes for these HPC Challenge results: (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops; ATLAS + Open MPI 4.0.3
IndigoBench 4.4, Acceleration: CPU - Scene: Supercar (M samples/s, more is better). Runs 3 / 2 / 1: 11.04 / 10.96 / 11.04; SE +/- 0.03 / 0.05 / 0.01; N = 3 / 3 / 3
HPC Challenge 1.5.0, Test / Class: Max Ping Pong Bandwidth (MB/s, more is better). Runs 3 / 2 / 1: 10972.49 / 10831.97 / 10862.04; SE +/- 206.43 / 181.12 / 250.74; N = 3 / 3 / 3
Build notes: (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops; ATLAS + Open MPI 4.0.3
LZ4 Compression 1.9.3, Compression Level: 1 - Decompression Speed (MB/s, more is better). Runs 3 / 2 / 1: 9540.0 / 9647.9 / 9506.2; SE +/- 47.16 / 9.81 / 34.06; N = 3 / 3 / 3
LZ4 Compression 1.9.3, Compression Level: 3 - Compression Speed (MB/s, more is better). Runs 3 / 2 / 1: 46.87 / 47.12 / 47.46; SE +/- 0.29 / 0.25 / 0.54; N = 3 / 3 / 3
LZ4 Compression 1.9.3, Compression Level: 3 - Decompression Speed (MB/s, more is better). Runs 3 / 2 / 1: 9188.3 / 9191.9 / 9128.2; SE +/- 57.74 / 19.51 / 64.14; N = 3 / 3 / 3
LZ4 Compression 1.9.3, Compression Level: 9 - Compression Speed (MB/s, more is better). Runs 3 / 2 / 1: 45.94 / 47.55 / 46.60; SE +/- 0.01 / 0.77 / 0.43; N = 3 / 3 / 15
LZ4 Compression 1.9.3, Compression Level: 9 - Decompression Speed (MB/s, more is better). Runs 3 / 2 / 1: 9130.5 / 9217.2 / 9106.9; SE +/- 18.60 / 4.46 / 20.12; N = 3 / 3 / 15
Build notes for all LZ4 results: (CC) gcc options: -O3
Stockfish
This is a test of Stockfish, an advanced C++11 chess benchmark that can scale up to 128 CPU cores.
Stockfish 12, Total Time (Nodes Per Second, more is better). Runs 3 / 2 / 1: 54984012 / 55404772 / 49821254; SE +/- 749500.39 / 577141.31 / 796222.48; N = 3 / 3 / 3
Build notes: (CXX) g++ options: -m64 -lpthread -fno-exceptions -std=c++17 -pedantic -O3 -msse -msse3 -mpopcnt -msse4.1 -mssse3 -msse2 -flto -flto=jobserver
GROMACS
This tests the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package on the CPU with the water_GMX50 data.
GROMACS 2020.3, Water Benchmark (Ns Per Day, more is better). Runs 3 / 2 / 1: 1.837 / 1.858 / 1.834; SE +/- 0.003 / 0.004 / 0.004; N = 2 / 3 / 3
Build notes: (CXX) g++ options: -O3 -pthread -lrt -lpthread -lm
Redis 6.0.9, Test: SADD (Requests Per Second, more is better). Runs 3 / 2 / 1: 1974193.46 / 1826296.85 / 1892562.21; SE +/- 16528.72 / 25792.40 / 20792.00; N = 3 / 15 / 15
Redis 6.0.9, Test: LPUSH (Requests Per Second, more is better). Runs 3 / 2 / 1: 1366164.67 / 1340079.29 / 1368828.88; SE +/- 13973.47 / 19080.84 / 18400.26; N = 3 / 4 / 3
Redis 6.0.9, Test: GET (Requests Per Second, more is better). Runs 3 / 2 / 1: 2091130.67 / 2099666.52 / 2201882.03; SE +/- 27660.05 / 31097.88 / 31653.62; N = 15 / 15 / 15
Redis 6.0.9, Test: SET (Requests Per Second, more is better). Runs 3 / 2 / 1: 1577289.79 / 1654163.60 / 1675529.38; SE +/- 22941.42 / 23714.82 / 19958.53; N = 3 / 4 / 3
Build notes for all Redis results: (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Node.js V8 Web Tooling Benchmark
This test runs the V8 project's Web-Tooling-Benchmark under Node.js. The Web-Tooling-Benchmark stresses JavaScript-related workloads common to web developers, such as Babel, TypeScript, and Babylon. This test profile can test the system's JavaScript performance with Node.js.
Node.js V8 Web Tooling Benchmark (runs/s, more is better). Runs 3 / 2 / 1: 8.21 / 8.25 / 8.47; SE +/- 0.04 / 0.12 / 0.03; N = 3 / 4 / 3
Build notes: Nodejs v12.18.2
PHPBench
PHPBench is a benchmark suite for PHP. It performs a large number of simple tests in order to bench various aspects of the PHP interpreter. PHPBench can be used to compare hardware, operating systems, PHP versions, PHP accelerators and caches, compiler options, etc.
PHPBench 0.8.1, PHP Benchmark Suite (Score, more is better). Runs 2 / 1: 579360 / 577229; SE +/- 145.08 / 2547.76; N = 3 / 3
CLOMP
CLOMP is the C version of the Livermore OpenMP benchmark developed to measure OpenMP overheads and other performance impacts due to threading in order to influence future system designs. This particular test profile configuration is currently set to look at the OpenMP static schedule speed-up across all available CPU cores using the recommended test configuration.
CLOMP 1.2, Static OMP Speedup (Speedup, more is better). Runs 3 / 2 / 1: 51.8 / 51.6 / 51.7; SE +/- 0.68 / 0.20 / 0.57; N = 3 / 2 / 3
Build notes: (CC) gcc options: -fopenmp -O3 -lm
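A speedup figure is the serial runtime divided by the parallel runtime; dividing it by the number of workers gives parallel efficiency. The sketch below applies that arithmetic to the first reported speedup of 51.8, assuming the benchmark used all 64 hardware threads of the 2990WX (the thread count actually used is not stated in the results). The helper names are hypothetical.

```python
def speedup(serial_seconds, parallel_seconds):
    """Speedup = serial runtime / parallel runtime."""
    return serial_seconds / parallel_seconds

def parallel_efficiency(speedup_value, workers):
    """Fraction of ideal linear scaling achieved (1.0 = perfect)."""
    return speedup_value / workers

# 51.8 is a reported CLOMP speedup; 64 threads and 32 cores come from the
# processor description, and using them as the worker count is an assumption.
print(f"efficiency vs 64 threads: {parallel_efficiency(51.8, 64):.2f}")
print(f"ratio vs 32 cores:        {parallel_efficiency(51.8, 32):.2f}")
```

A ratio above 1.0 against the physical core count would indicate that the second hardware thread per core contributed useful work, if the worker-count assumption holds.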
BRL-CAD
BRL-CAD 7.28.0 is a cross-platform, open-source solid modeling system with built-in benchmark mode.
BRL-CAD 7.30.8, VGR Performance Metric (VGR Performance Metric, more is better). Runs 2 / 1: 291013 / 286847
Build notes: (CXX) g++ options: -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -rdynamic -lSM -lICE -lXi -lGLU -lGL -lGLdispatch -lX11 -lXext -lXrender -lpthread -ldl -luuid -lm
VKMark 2020-05-21, Resolution: 1920 x 1080 (VKMark Score, more is better). Runs 3 / 2 / 1: 4268 / 4267 / 4384; SE +/- 2.85 / 2.65 / 2.96; N = 3 / 3 / 3
Build notes: (CXX) g++ options: -ldl -pipe -std=c++14 -fPIC -MD -MQ -MF
oneDNN
This is a test of Intel oneDNN, an Intel-optimized library for deep neural networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative.
oneDNN 2.0, Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, fewer is better). Runs 3 / 2 / 1: 6.21623 / 6.25011 / 6.27715; SE +/- 0.09143 / 0.08570 / 0.05437; N = 3 / 3 / 3; MIN: 5.67 / 5.72 / 5.63
oneDNN 2.0, Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, fewer is better). Runs 3 / 2 / 1: 10.93 / 11.21 / 11.47; SE +/- 0.01 / 0.15 / 0.19; N = 3 / 15 / 12; MIN: 10.81 / 10.67 / 10.94
oneDNN 2.0, Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better). Runs 3 / 2 / 1: 2.12611 / 2.10893 / 2.12408; SE +/- 0.00796 / 0.00369 / 0.00488; N = 3 / 3 / 3; MIN: 2.05 / 2.03 / 2.04
oneDNN 2.0, Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better). Runs 3 / 2 / 1: 3.51087 / 3.49064 / 3.45370; SE +/- 0.03734 / 0.04824 / 0.02421; N = 3 / 3 / 3; MIN: 2 / 2.01 / 1.93
oneDNN 2.0, Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, fewer is better). Runs 3 / 2 / 1: 20.25 / 20.15 / 19.97; SE +/- 0.01 / 0.04 / 0.13; N = 3 / 3 / 3; MIN: 19.16 / 19.19 / 14.86
oneDNN 2.0, Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, fewer is better). Runs 3 / 2 / 1: 3.01645 / 3.01721 / 2.99559; SE +/- 0.01010 / 0.01913 / 0.02534; N = 3 / 3 / 3; MIN: 2.85 / 2.84 / 2.82
oneDNN 2.0, Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, fewer is better). Runs 3 / 2 / 1: 5.92300 / 5.93856 / 5.92604; SE +/- 0.00240 / 0.00211 / 0.01022; N = 3 / 3 / 3; MIN: 5.5 / 5.71 / 5.7
oneDNN 2.0, Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better). Runs 3 / 2 / 1: 25.10 / 25.10 / 25.16; SE +/- 0.02 / 0.01 / 0.01; N = 3 / 3 / 3; MIN: 23.92 / 24.22 / 24.12
oneDNN 2.0, Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better). Runs 3 / 2 / 1: 3.41699 / 3.76164 / 3.69657; SE +/- 0.01146 / 0.07933 / 0.07703; N = 3 / 15 / 15; MIN: 3.22 / 3.26 / 3.25
oneDNN 2.0, Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better). Runs 3 / 2 / 1: 3.46909 / 3.44980 / 3.43727; SE +/- 0.02456 / 0.00155 / 0.00942; N = 3 / 3 / 3; MIN: 3.35 / 3.36 / 3.34
oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, fewer is better). Runs 3 / 2 / 1: 13753.9 / 13529.0 / 13861.2; SE +/- 132.58 / 161.18 / 110.93; N = 3 / 3 / 3; MIN: 12649.5 / 13190.1 / 13490.9
oneDNN 2.0, Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, fewer is better). Runs 3 / 2 / 1: 3783.40 / 3782.36 / 3749.18; SE +/- 8.76 / 8.72 / 16.86; N = 3 / 3 / 3; MIN: 3758.06 / 3756.39 / 3631.66
oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better). Runs 3 / 2 / 1: 13669.2 / 13295.0 / 13991.1; SE +/- 122.96 / 123.89 / 48.11; N = 11 / 10 / 3; MIN: 12754.3 / 12213.4 / 13812.5
oneDNN 2.0, Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better). Runs 3 / 2 / 1: 3711.96 / 3821.24 / 3802.52; SE +/- 38.84 / 46.19 / 10.92; N = 8 / 3 / 3; MIN: 3503.46 / 3746.24 / 3624.28
oneDNN 2.0, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, fewer is better). Runs 3 / 2 / 1: 2.00984 / 2.09567 / 1.65767; SE +/- 0.03417 / 0.05017 / 0.05725; N = 15 / 15 / 15; MIN: 1.39 / 1.4 / 1.11
oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better). Runs 3 / 2 / 1: 13668.4 / 13783.3 / 13826.2; SE +/- 128.38 / 53.39 / 141.29; N = 12 / 3 / 3; MIN: 12362.1 / 13567.9 / 13450.2
oneDNN 2.0, Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better). Runs 3 / 2 / 1: 3777.65 / 3780.47 / 3808.57; SE +/- 10.39 / 13.99 / 3.43; N = 3 / 3 / 3; MIN: 3644.35 / 3731.84 / 3705.89
oneDNN 2.0, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better). Runs 3 / 2 / 1: 1.48957 / 1.48549 / 1.46928; SE +/- 0.00013 / 0.00191 / 0.01123; N = 3 / 3 / 15; MIN: 1.44 / 1.45 / 1.34
Build notes for all oneDNN results: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
NCNN NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU - Model: mobilenet 3 2 1 9 18 27 36 45 SE +/- 0.99, N = 12 SE +/- 0.55, N = 12 SE +/- 1.43, N = 12 34.91 34.56 37.24 MIN: 29.38 / MAX: 419.39 MIN: 30.05 / MAX: 404.5 MIN: 29.37 / MAX: 419.38 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU-v2-v2 - Model: mobilenet-v2 3 2 1 4 8 12 16 20 SE +/- 0.73, N = 12 SE +/- 0.34, N = 12 SE +/- 0.44, N = 12 15.78 15.60 15.67 MIN: 13.2 / MAX: 388.68 MIN: 13.33 / MAX: 383.7 MIN: 13.15 / MAX: 357.88 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU-v3-v3 - Model: mobilenet-v3 3 2 1 4 8 12 16 20 SE +/- 1.31, N = 12 SE +/- 1.39, N = 12 SE +/- 0.50, N = 12 15.21 15.70 14.44 MIN: 12.52 / MAX: 378.12 MIN: 12.6 / MAX: 382.98 MIN: 12.49 / MAX: 358.11 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU - Model: shufflenet-v2 3 2 1 4 8 12 16 20 SE +/- 0.64, N = 12 SE +/- 0.15, N = 12 SE +/- 0.85, N = 12 15.30 14.42 15.28 MIN: 12.9 / MAX: 306.7 MIN: 13.2 / MAX: 104.86 MIN: 13.6 / MAX: 309.56 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU - Model: mnasnet 3 2 1 4 8 12 16 20 SE +/- 0.31, N = 12 SE +/- 0.64, N = 12 SE +/- 0.16, N = 12 14.17 14.50 13.60 MIN: 12.78 / MAX: 347.15 MIN: 12.55 / MAX: 348.4 MIN: 12.42 / MAX: 189.74 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU - Model: efficientnet-b0 3 2 1 5 10 15 20 25 SE +/- 0.37, N = 12 SE +/- 1.79, N = 12 SE +/- 0.67, N = 12 18.97 20.92 20.01 MIN: 17.3 / MAX: 414.55 MIN: 17.12 / MAX: 465.23 MIN: 17.38 / MAX: 456.25 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU - Model: blazeface 3 2 1 2 4 6 8 10 SE +/- 0.60, N = 12 SE +/- 0.10, N = 12 SE +/- 0.16, N = 12 7.65 6.67 6.83 MIN: 6.15 / MAX: 211.21 MIN: 6.12 / MAX: 175.66 MIN: 6.11 / MAX: 191.61 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU - Model: googlenet 3 2 1 11 22 33 44 55 SE +/- 3.13, N = 12 SE +/- 2.39, N = 12 SE +/- 2.92, N = 12 42.95 42.04 47.87 MIN: 27.93 / MAX: 530.03 MIN: 28.93 / MAX: 517.65 MIN: 27.73 / MAX: 542.74 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU - Model: vgg16 3 2 1 20 40 60 80 100 SE +/- 2.05, N = 12 SE +/- 1.61, N = 12 SE +/- 3.95, N = 12 95.61 92.98 102.38 MIN: 64.1 / MAX: 227.91 MIN: 61.58 / MAX: 223.7 MIN: 62.04 / MAX: 220.23 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU - Model: resnet18 3 2 1 14 28 42 56 70 SE +/- 4.57, N = 12 SE +/- 5.28, N = 12 SE +/- 4.97, N = 12 64.49 54.29 59.91 MIN: 21.25 / MAX: 228.14 MIN: 22 / MAX: 230.1 MIN: 23.15 / MAX: 227.75 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: CPU - Model: alexnet (ms, Fewer Is Better)
Result 3: 35.69 (SE +/- 1.62, N = 12; MIN: 16.27 / MAX: 106.47)
Result 2: 34.34 (SE +/- 2.63, N = 12; MIN: 15.18 / MAX: 103.77)
Result 1: 37.72 (SE +/- 1.89, N = 12; MIN: 15.66 / MAX: 106.73)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: CPU - Model: resnet50 (ms, Fewer Is Better)
Result 3: 73.65 (SE +/- 4.79, N = 12; MIN: 39.25 / MAX: 638.31)
Result 2: 75.03 (SE +/- 2.78, N = 12; MIN: 38.58 / MAX: 559.97)
Result 1: 79.53 (SE +/- 5.46, N = 12; MIN: 38.42 / MAX: 565.15)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better)
Result 3: 49.25 (SE +/- 0.86, N = 12; MIN: 39.85 / MAX: 267.24)
Result 2: 49.69 (SE +/- 1.04, N = 12; MIN: 39.24 / MAX: 224.21)
Result 1: 50.19 (SE +/- 0.68, N = 12; MIN: 39.36 / MAX: 224.81)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better)
Result 3: 39.13 (SE +/- 1.90, N = 12; MIN: 32.02 / MAX: 459.23)
Result 2: 40.80 (SE +/- 1.74, N = 12; MIN: 31.26 / MAX: 438.66)
Result 1: 41.89 (SE +/- 1.53, N = 12; MIN: 31.38 / MAX: 429.4)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: CPU - Model: regnety_400m (ms, Fewer Is Better)
Result 3: 103.89 (SE +/- 3.17, N = 12; MIN: 90.75 / MAX: 3380.17)
Result 2: 101.47 (SE +/- 1.29, N = 12; MIN: 90.53 / MAX: 2458.71)
Result 1: 102.93 (SE +/- 1.12, N = 12; MIN: 90.68 / MAX: 1519.43)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better)
Result 3: 34.39 (SE +/- 1.11, N = 9; MIN: 29.33 / MAX: 412.06)
Result 2: 33.78 (SE +/- 0.72, N = 12; MIN: 29.45 / MAX: 408.23)
Result 1: 37.30 (SE +/- 1.11, N = 12; MIN: 29.33 / MAX: 427.64)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: mobilenet-v2 (ms, Fewer Is Better)
Result 3: 14.97 (SE +/- 0.24, N = 9; MIN: 13.4 / MAX: 357.21)
Result 2: 16.30 (SE +/- 0.63, N = 12; MIN: 13.31 / MAX: 389.24)
Result 1: 15.59 (SE +/- 0.35, N = 12; MIN: 12.98 / MAX: 359.55)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: mobilenet-v3 (ms, Fewer Is Better)
Result 3: 15.95 (SE +/- 1.16, N = 9; MIN: 12.47 / MAX: 359.98)
Result 2: 14.22 (SE +/- 0.34, N = 12; MIN: 12.46 / MAX: 387.96)
Result 1: 14.58 (SE +/- 0.55, N = 12; MIN: 12.63 / MAX: 382.37)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better)
Result 3: 15.75 (SE +/- 0.93, N = 9; MIN: 12.97 / MAX: 295.26)
Result 2: 14.65 (SE +/- 0.14, N = 11; MIN: 13.43 / MAX: 114.64)
Result 1: 14.92 (SE +/- 0.51, N = 12; MIN: 13.15 / MAX: 283.76)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better)
Result 3: 13.77 (SE +/- 0.35, N = 9; MIN: 12.4 / MAX: 378.94)
Result 2: 16.26 (SE +/- 1.06, N = 12; MIN: 12.28 / MAX: 390.46)
Result 1: 15.64 (SE +/- 0.70, N = 11; MIN: 12.3 / MAX: 352.98)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better)
Result 3: 20.49 (SE +/- 0.66, N = 9; MIN: 17.45 / MAX: 438.66)
Result 2: 19.28 (SE +/- 0.70, N = 12; MIN: 16.96 / MAX: 430.49)
Result 1: 19.14 (SE +/- 0.25, N = 12; MIN: 17.51 / MAX: 352.05)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better)
Result 3: 7.09 (SE +/- 0.51, N = 9; MIN: 6.15 / MAX: 204.55)
Result 2: 7.40 (SE +/- 0.76, N = 12; MIN: 6.14 / MAX: 215.25)
Result 1: 9.82 (SE +/- 1.94, N = 12; MIN: 6.19 / MAX: 229.58)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better)
Result 3: 38.32 (SE +/- 1.78, N = 9; MIN: 28.17 / MAX: 505.55)
Result 2: 38.90 (SE +/- 1.77, N = 12; MIN: 28.91 / MAX: 532.2)
Result 1: 41.08 (SE +/- 2.33, N = 12; MIN: 28.69 / MAX: 513.63)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better)
Result 3: 100.17 (SE +/- 2.35, N = 9; MIN: 63.35 / MAX: 242.13)
Result 2: 100.73 (SE +/- 2.43, N = 12; MIN: 65.03 / MAX: 221.49)
Result 1: 103.41 (SE +/- 2.14, N = 12; MIN: 63 / MAX: 216.59)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better)
Result 3: 56.35 (SE +/- 3.64, N = 9; MIN: 22.77 / MAX: 219.04)
Result 2: 53.19 (SE +/- 5.95, N = 12; MIN: 21.74 / MAX: 222.51)
Result 1: 55.89 (SE +/- 3.22, N = 12; MIN: 23.8 / MAX: 226.22)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better)
Result 3: 35.93 (SE +/- 1.85, N = 9; MIN: 17.55 / MAX: 96.42)
Result 2: 32.39 (SE +/- 1.09, N = 12; MIN: 17.56 / MAX: 91.67)
Result 1: 33.56 (SE +/- 1.15, N = 12; MIN: 15.36 / MAX: 104.32)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better)
Result 3: 70.92 (SE +/- 6.02, N = 9; MIN: 39.2 / MAX: 557.13)
Result 2: 73.86 (SE +/- 5.68, N = 12; MIN: 40.4 / MAX: 546.35)
Result 1: 72.06 (SE +/- 3.10, N = 12; MIN: 39.39 / MAX: 562.21)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better)
Result 3: 48.23 (SE +/- 1.42, N = 9; MIN: 39.54 / MAX: 230.79)
Result 2: 50.57 (SE +/- 1.28, N = 12; MIN: 39.57 / MAX: 214.93)
Result 1: 50.25 (SE +/- 1.15, N = 12; MIN: 39.88 / MAX: 213.87)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better)
Result 3: 39.14 (SE +/- 1.29, N = 9; MIN: 31.19 / MAX: 435.95)
Result 2: 40.01 (SE +/- 1.39, N = 12; MIN: 31.52 / MAX: 514.49)
Result 1: 41.28 (SE +/- 1.34, N = 12; MIN: 31.56 / MAX: 448.33)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better)
Result 3: 101.27 (SE +/- 2.24, N = 9; MIN: 89.99 / MAX: 2458.9)
Result 2: 100.26 (SE +/- 2.13, N = 12; MIN: 90.99 / MAX: 1833.25)
Result 1: 101.01 (SE +/- 1.78, N = 12; MIN: 90.72 / MAX: 1587.31)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
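Many of the NCNN means above differ by less than their reported standard errors, so it helps to express a gap between two results in units of their combined standard error. The sketch below is only a rule of thumb, not part of the Phoronix Test Suite: the helper name is hypothetical, and it assumes the two results are independent so that their standard errors add in quadrature. The inputs are the first two listed values for the CPU resnet18 test (64.49 ms with SE 4.57, and 54.29 ms with SE 5.28).

```python
import math


def difference_in_se_units(mean_a, se_a, mean_b, se_b):
    """Return |mean_a - mean_b| divided by the combined standard error.

    Assumes the two results are independent, so the standard error of
    their difference is sqrt(se_a^2 + se_b^2).
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


# CPU resnet18, first two listed results (values taken from the report).
ratio = difference_in_se_units(64.49, 4.57, 54.29, 5.28)
print(f"{ratio:.2f} combined standard errors apart")
```

For this pair the gap is under two combined standard errors, so by this rule of thumb the roughly 10 ms difference between the two means is not clearly separable from run-to-run noise.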
Waifu2x-NCNN Vulkan
Waifu2x-NCNN is an NCNN neural network implementation of the Waifu2x converter project, accelerated using the Vulkan API. NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. This test profile times how long it takes to increase the resolution of a sample image with Vulkan. Learn more via the OpenBenchmarking.org test page.
Waifu2x-NCNN Vulkan 20200818, Scale: 2x - Denoise: 3 - TAA: No (Seconds, Fewer Is Better)
Result 1: 2.052
Opus Codec Encoding
Opus is an open, lossy audio compression format designed primarily for interactive real-time applications over the Internet. This test uses Opus-Tools and measures the time required to encode a WAV file to Opus. Learn more via the OpenBenchmarking.org test page.
Opus Codec Encoding 1.3.1, WAV To Opus Encode (Seconds, Fewer Is Better)
Result 3: 7.774 (SE +/- 0.021, N = 5)
Result 2: 7.795 (SE +/- 0.012, N = 5)
Result 1: 7.753 (SE +/- 0.012, N = 5)
1. (CXX) g++ options: -fvisibility=hidden -logg -lm
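A timed test such as the Opus encode reports a mean over N runs together with a standard error. The following is a minimal sketch of that kind of measurement, not the Phoronix Test Suite's own harness: the function name is hypothetical, and the standard error is assumed to be the sample standard deviation divided by the square root of N, which the report does not spell out. A trivial placeholder command is timed so the sketch runs anywhere.

```python
import math
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5):
    """Run cmd `runs` times and return (mean seconds, standard error)."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    mean = statistics.mean(samples)
    # Assumed definition: SE = sample standard deviation / sqrt(N).
    if len(samples) > 1:
        se = statistics.stdev(samples) / math.sqrt(len(samples))
    else:
        se = 0.0
    return mean, se


# Placeholder command. For an Opus encode one might substitute something like
# ["opusenc", "input.wav", "output.opus"] (file names are hypothetical).
mean, se = time_command([sys.executable, "-c", "pass"], runs=5)
print(f"{mean:.3f} s, SE +/- {se:.3f}, N = 5")
```

Wall-clock timing of a subprocess includes process start-up, which is negligible for a multi-second encode but would dominate a very short command.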
ASTC Encoder
ASTC Encoder (astcenc) is for the Adaptive Scalable Texture Compression (ASTC) format commonly used with the OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile runs a coding test covering both compression and decompression. Learn more via the OpenBenchmarking.org test page.
ASTC Encoder 2.0, Preset: Fast (Seconds, Fewer Is Better)
Result 3: 5.20 (SE +/- 0.00, N = 3)
Result 2: 5.22 (SE +/- 0.01, N = 3)
Result 1: 5.22 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
ASTC Encoder 2.0, Preset: Medium (Seconds, Fewer Is Better)
Result 3: 6.35 (SE +/- 0.02, N = 3)
Result 2: 6.32 (SE +/- 0.01, N = 3)
Result 1: 6.34 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
ASTC Encoder 2.0, Preset: Thorough (Seconds, Fewer Is Better)
Result 3: 9.45 (SE +/- 0.01, N = 3)
Result 2: 9.49 (SE +/- 0.05, N = 3)
Result 1: 9.54 (SE +/- 0.08, N = 3)
1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
ASTC Encoder 2.0, Preset: Exhaustive (Seconds, Fewer Is Better)
Result 3: 73.10 (SE +/- 0.10, N = 3)
Result 2: 73.14 (SE +/- 0.12, N = 3)
Result 1: 73.36 (SE +/- 0.09, N = 3)
1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
Basis Universal
Basis Universal is a GPU texture codec. This test times how long it takes to convert sRGB PNGs into Basis Universal assets with various settings. Learn more via the OpenBenchmarking.org test page.
Basis Universal 1.12, Settings: ETC1S (Seconds, Fewer Is Better)
Result 3: 48.16 (SE +/- 0.24, N = 3)
Result 2: 47.60 (SE +/- 0.10, N = 3)
Result 1: 47.60 (SE +/- 0.22, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Basis Universal 1.12, Settings: UASTC Level 0 (Seconds, Fewer Is Better)
Result 3: 7.600 (SE +/- 0.019, N = 3)
Result 2: 7.626 (SE +/- 0.021, N = 3)
Result 1: 7.595 (SE +/- 0.057, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Basis Universal 1.12, Settings: UASTC Level 2 (Seconds, Fewer Is Better)
Result 3: 15.76 (SE +/- 0.01, N = 3)
Result 2: 15.79 (SE +/- 0.04, N = 3)
Result 1: 15.79 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Basis Universal 1.12, Settings: UASTC Level 3 (Seconds, Fewer Is Better)
Result 3: 25.14 (SE +/- 0.03, N = 3)
Result 2: 25.22 (SE +/- 0.01, N = 3)
Result 1: 25.23 (SE +/- 0.06, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Basis Universal 1.12, Settings: UASTC Level 2 + RDO Post-Processing (Seconds, Fewer Is Better)
Result 3: 647.09 (SE +/- 0.07, N = 3)
Result 2: 647.92 (SE +/- 0.39, N = 3)
Result 1: 651.21 (SE +/- 1.61, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
HPC Challenge
HPC Challenge (HPCC) is a cluster-focused benchmark consisting of the HPL Linpack TPP benchmark, DGEMM, STREAM, PTRANS, RandomAccess, FFT, and communication bandwidth and latency tests. This HPC Challenge test profile attempts to ship with standard yet versatile configuration/input files, though they can be modified. Learn more via the OpenBenchmarking.org test page.
HPC Challenge 1.5.0, Test / Class: Random Ring Latency (usecs, Fewer Is Better)
Result 3: 1.53509 (SE +/- 0.00358, N = 3)
Result 2: 1.53453 (SE +/- 0.00474, N = 3)
Result 1: 1.53681 (SE +/- 0.00435, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 4.0.3
1 Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0x800820d
Graphics Notes: GLAMOR
Java Notes: OpenJDK Runtime Environment (build 11.0.9.1+1-Ubuntu-0ubuntu1.20.10)
Python Notes: Python 3.8.6
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 23 December 2020 13:17 by user phoronix.
2 Compiler Notes, Processor Notes, Graphics Notes, Java Notes, Python Notes, and Security Notes: identical to those listed for result 1.
Testing initiated at 24 December 2020 05:54 by user phoronix.
3 Processor: AMD Ryzen Threadripper 2990WX 32-Core @ 3.00GHz (32 Cores / 64 Threads), Motherboard: ASUS ROG ZENITH EXTREME (1701 BIOS), Chipset: AMD 17h, Memory: 32GB, Disk: Samsung SSD 970 EVO 500GB + 250GB Western Digital WDS250G2X0C-00L350, Graphics: Gigabyte AMD Radeon RX 470/480/570/570X/580/580X/590 4GB (1244/1750MHz), Audio: Realtek ALC1220, Monitor: LG Ultra HD, Network: Intel I211 + Qualcomm Atheros QCA6174 802.11ac + Wilocity Wil6200 802.11ad
OS: Ubuntu 20.10, Kernel: 5.8.0-33-generic (x86_64), Desktop: GNOME Shell 3.38.1, Display Server: X Server 1.20.9, Display Driver: modesetting 1.20.9, OpenGL: 4.6 Mesa 20.2.1 (LLVM 11.0.0), Vulkan: 1.2.131, Compiler: GCC 10.2.0, File-System: ext4, Screen Resolution: 1920x1080
Compiler Notes, Processor Notes, Graphics Notes, Java Notes, Python Notes, and Security Notes: identical to those listed for result 1.
Testing initiated at 24 December 2020 17:35 by user phoronix.