TR 2990WX 2020
AMD Ryzen Threadripper 2990WX 32-Core testing with an ASUS ROG ZENITH EXTREME (1701 BIOS) and Gigabyte AMD Radeon RX 470/480/570/570X/580/580X/590 4GB on Ubuntu 20.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2012260-HA-TR2990WX254&sor&grw&export=pdf .
TR 2990WX 2020 - System Details
  Result identifiers: 1, 2, 3
  Processor: AMD Ryzen Threadripper 2990WX 32-Core @ 3.00GHz (32 Cores / 64 Threads)
  Motherboard: ASUS ROG ZENITH EXTREME (1701 BIOS)
  Chipset: AMD 17h
  Memory: 32GB
  Disk: Samsung SSD 970 EVO 500GB + 250GB Western Digital WDS250G2X0C-00L350
  Graphics: Gigabyte AMD Radeon RX 470/480/570/570X/580/580X/590 4GB (1244/1750MHz)
  Audio: Realtek ALC1220
  Monitor: LG Ultra HD
  Network: Intel I211 + Qualcomm Atheros QCA6174 802.11ac + Wilocity Wil6200 802.11ad
  OS: Ubuntu 20.10
  Kernel: 5.8.0-33-generic (x86_64)
  Desktop: GNOME Shell 3.38.1
  Display Server: X Server 1.20.9
  Display Driver: modesetting 1.20.9
  OpenGL: 4.6 Mesa 20.2.1 (LLVM 11.0.0)
  Vulkan: 1.2.131
  Compiler: GCC 10.2.0
  File-System: ext4
  Screen Resolution: 1920x1080
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details - Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0x800820d
Graphics Details - GLAMOR
Java Details - OpenJDK Runtime Environment (build 11.0.9.1+1-Ubuntu-0ubuntu1.20.10)
Python Details - Python 3.8.6
Security Details
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling
  srbds: Not affected
  tsx_async_abort: Not affected
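The Security Details entries above use the same names as the files the Linux kernel exposes under /sys/devices/system/cpu/vulnerabilities. A minimal sketch for collecting that status on another machine follows. The helper names `read_vulnerabilities` and `format_report` are hypothetical, and the " + " joining only imitates the layout of this report; it is not taken from the Phoronix Test Suite itself.

```python
from pathlib import Path

# Standard sysfs location on Linux; other systems will not have it.
DEFAULT_DIR = "/sys/devices/system/cpu/vulnerabilities"


def read_vulnerabilities(base_dir=DEFAULT_DIR):
    """Return {name: status text} for each file in base_dir, sorted by name."""
    base = Path(base_dir)
    if not base.is_dir():
        return {}
    return {
        entry.name: entry.read_text().strip()
        for entry in sorted(base.iterdir())
        if entry.is_file()
    }


def format_report(status):
    """Join entries as 'name: status + name: status' (layout is an assumption)."""
    return " + ".join(f"{name}: {text}" for name, text in status.items())


if __name__ == "__main__":
    print(format_report(read_vulnerabilities()))
```

The kernel reports mitigations as "Mitigation: ..." whereas this report shows "Mitigation of ...", so the raw output will differ slightly in wording.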
TR 2990WX 2020 vkmark: 1280 x 1024 yquake2: OpenGL 3.x - 1920 x 1080 yquake2: Software CPU - 1920 x 1080 compress-lz4: 1 - Decompression Speed yquake2: OpenGL 1.x - 1920 x 1080 vkmark: 1920 x 1080 compress-lz4: 1 - Compression Speed compress-lz4: 3 - Compression Speed compress-lz4: 3 - Decompression Speed compress-lz4: 9 - Compression Speed compress-lz4: 9 - Decompression Speed crafty: Elapsed Time clomp: Static OMP Speedup basis: ETC1S basis: UASTC Level 0 basis: UASTC Level 2 basis: UASTC Level 3 basis: UASTC Level 2 + RDO Post-Processing brl-cad: VGR Performance Metric encode-ape: WAV To APE encode-opus: WAV To Opus Encode encode-wavpack: WAV To WavPack astcenc: Fast astcenc: Medium astcenc: Thorough astcenc: Exhaustive betsy: ETC1 - Highest betsy: ETC2 RGB - Highest espeak: Text-To-Speech Synthesis hmmer: Pfam Database Search mafft: Multiple Sequence Alignment - LSU RNA numpy: hpcc: G-HPL hpcc: G-Ffte hpcc: EP-DGEMM hpcc: G-Ptrans hpcc: EP-STREAM Triad hpcc: G-Rand Access hpcc: Rand Ring Latency hpcc: Rand Ring Bandwidth hpcc: Max Ping Pong Bandwidth ai-benchmark: Device Inference Score ai-benchmark: Device Training Score ai-benchmark: Device AI Score ncnn: CPU - mobilenet ncnn: CPU-v2-v2 - mobilenet-v2 ncnn: CPU-v3-v3 - mobilenet-v3 ncnn: CPU - shufflenet-v2 ncnn: CPU - mnasnet ncnn: CPU - efficientnet-b0 ncnn: CPU - blazeface ncnn: CPU - googlenet ncnn: CPU - vgg16 ncnn: CPU - resnet18 ncnn: CPU - alexnet ncnn: CPU - resnet50 ncnn: CPU - yolov4-tiny ncnn: CPU - squeezenet_ssd ncnn: CPU - regnety_400m ncnn: Vulkan GPU - mobilenet ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 ncnn: Vulkan GPU - shufflenet-v2 ncnn: Vulkan GPU - mnasnet ncnn: Vulkan GPU - efficientnet-b0 ncnn: Vulkan GPU - blazeface ncnn: Vulkan GPU - googlenet ncnn: Vulkan GPU - vgg16 ncnn: Vulkan GPU - resnet18 ncnn: Vulkan GPU - alexnet ncnn: Vulkan GPU - resnet50 ncnn: Vulkan GPU - yolov4-tiny ncnn: Vulkan GPU - squeezenet_ssd ncnn: Vulkan GPU - regnety_400m 
gromacs: Water Benchmark lammps: 20k Atoms lammps: Rhodopsin Protein onednn: IP Shapes 1D - f32 - CPU onednn: IP Shapes 3D - f32 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU onednn: IP Shapes 3D - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: Deconvolution Batch shapes_1d - f32 - CPU onednn: Deconvolution Batch shapes_3d - f32 - CPU onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU onednn: Recurrent Neural Network Training - f32 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU onednn: Recurrent Neural Network Training - u8s8f32 - CPU onednn: Recurrent Neural Network Inference - u8s8f32 - CPU onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU sunflow: Global Illumination + Image Synthesis build-clash: Time To Compile coremark: CoreMark Size 666 - Iterations Per Second build-ffmpeg: Time To Compile stockfish: Total Time compress-zstd: 3 compress-zstd: 19 asmfish: 1024 Hash Memory, 26 Depth kvazaar: Bosphorus 4K - Slow kvazaar: Bosphorus 4K - Medium kvazaar: Bosphorus 1080p - Slow kvazaar: Bosphorus 1080p - Medium kvazaar: Bosphorus 4K - Very Fast kvazaar: Bosphorus 4K - Ultra Fast kvazaar: Bosphorus 1080p - Very Fast kvazaar: Bosphorus 1080p - Ultra Fast x265: Bosphorus 4K x265: Bosphorus 1080p rav1e: 1 rav1e: 5 rav1e: 6 rav1e: 10 embree: Pathtracer - Crown embree: Pathtracer ISPC - Crown embree: Pathtracer - Asian Dragon embree: Pathtracer - Asian Dragon Obj embree: Pathtracer ISPC - Asian Dragon embree: Pathtracer ISPC - Asian Dragon Obj indigobench: CPU - Bedroom indigobench: CPU - Supercar build2: Time To Compile build-eigen: Time To Compile libplacebo: deband_heavy libplacebo: polar_nocompute libplacebo: hdr_peakdetect 
libplacebo: av1_grain_lap vkfft: vkresample: 2x - Single waifu2x-ncnn: 2x - 3 - No waifu2x-ncnn: 2x - 3 - Yes phpbench: PHP Benchmark Suite redis: LPOP redis: SADD redis: LPUSH redis: GET redis: SET sqlite-speedtest: Timed Time - Size 1,000 node-web-tooling: simdjson: Kostya simdjson: LargeRand simdjson: PartialTweets simdjson: DistinctUserID 1 2 3 5850 949.7 106.6 9506.2 683.8 4384 8518.04 47.46 9128.2 46.60 9106.9 7329531 51.7 47.602 7.595 15.785 25.229 651.213 286847 14.029 7.753 13.199 5.22 6.34 9.54 73.36 10.754 12.697 30.847 163.935 12.153 296.80 54.59837 10.04669 12.61727 4.00164 1.30966 0.02819 1.53681 0.57120 10862.041 1304 966 2270 37.24 15.67 14.44 15.28 13.60 20.01 6.83 47.87 102.38 59.91 37.72 79.53 50.19 41.89 102.93 37.30 15.59 14.58 14.92 15.64 19.14 9.82 41.08 103.41 55.89 33.56 72.06 50.25 41.28 101.01 1.834 15.501 13.399 6.27715 11.4694 2.12408 3.45370 19.9734 2.99559 5.92604 25.1643 3.69657 3.43727 13861.2 3749.18 13991.1 3802.52 1.65767 13826.2 3808.57 1.46928 0.789 463.220 1146059.294691 46.084 49821254 3606.7 33.9 73804001 10.29 10.47 26.99 27.69 23.54 39.68 60.69 116.80 16.62 41.02 0.315 0.954 1.270 2.859 24.0193 22.7259 21.9564 18.2418 21.3156 17.6791 5.036 11.035 96.200 93.252 192.54 268.72 1762.98 448.24 9412 49.012 2.052 10.038 577229 2431710.35 1892562.21 1368828.88 2201882.03 1675529.38 67.207 8.47 0.44 0.39 0.51 0.52 5644 972.2 107.1 9647.9 619.9 4267 8619.84 47.12 9191.9 47.55 9217.2 7371374 51.6 47.602 7.626 15.790 25.217 647.924 291013 14.010 7.795 13.215 5.22 6.32 9.49 73.14 10.519 12.543 31.153 162.561 12.264 298.31 52.77753 9.74083 8.80303 4.34106 1.30771 0.02798 1.53453 0.58300 10831.974 1319 998 2317 34.56 15.60 15.70 14.42 14.50 20.92 6.67 42.04 92.98 54.29 34.34 75.03 49.69 40.80 101.47 33.78 16.30 14.22 14.65 16.26 19.28 7.40 38.90 100.73 53.19 32.39 73.86 50.57 40.01 100.26 1.858 15.418 12.913 6.25011 11.2085 2.10893 3.49064 20.1538 3.01721 5.93856 25.1026 3.76164 3.44980 13529.0 3782.36 13295.0 3821.24 2.09567 13783.3 
3780.47 1.48549 0.731 477.084 1148000.044932 30.918 55404772 3798.9 42.0 74895187 10.38 10.56 27.13 27.84 23.57 39.67 60.69 116.09 15.26 40.36 0.320 0.951 1.273 2.830 24.6155 22.2365 22.4966 19.2992 21.6674 17.2357 4.994 10.963 96.340 92.981 192.43 268.36 1762.78 441.37 9451 48.996 10.055 579360 2058658.41 1826296.85 1340079.29 2099666.52 1654163.60 67.458 8.25 0.44 0.39 0.50 0.52 5640 972.5 106.2 9540.0 623.8 4268 8508.64 46.87 9188.3 45.94 9130.5 7366701 51.8 48.164 7.600 15.758 25.144 647.089 14.041 7.774 5.20 6.35 9.45 73.10 10.529 12.553 31.028 161.632 12.157 295.79 54.30507 9.67646 13.03417 4.17428 1.33820 0.02866 1.53509 0.59385 10972.494 34.91 15.78 15.21 15.30 14.17 18.97 7.65 42.95 95.61 64.49 35.69 73.65 49.25 39.13 103.89 34.39 14.97 15.95 15.75 13.77 20.49 7.09 38.32 100.17 56.35 35.93 70.92 48.23 39.14 101.27 1.837 15.187 12.551 6.21623 10.9285 2.12611 3.51087 20.2543 3.01645 5.92300 25.1041 3.41699 3.46909 13753.9 3783.40 13669.2 3711.96 2.00984 13668.4 3777.65 1.48957 485.485 1146667.406114 32.699 54984012 3781.2 36.2 72630737 10.32 10.53 26.92 27.78 23.61 39.39 60.79 116.68 16.55 41.06 0.318 0.953 1.274 2.853 24.3636 22.6453 22.2655 19.4014 21.5622 17.3999 5.057 11.036 93.058 192.22 267.90 1762.55 446.38 9417 49.021 10.074 1998586.16 1974193.46 1366164.67 2091130.67 1577289.79 67.247 8.21 0.44 0.40 0.51 0.52 OpenBenchmarking.org
VKMark 2020-05-21 - Resolution: 1280 x 1024 (VKMark Score, More Is Better)
  Run 1: 5850 (SE +/- 2.19, N = 3)
  Run 2: 5644 (SE +/- 5.90, N = 3)
  Run 3: 5640 (SE +/- 2.33, N = 3)
  1. (CXX) g++ options: -ldl -pipe -std=c++14 -fPIC -MD -MQ -MF
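Each result in this report is a mean with a standard error (SE) over N trials, so a rough way to judge whether two results really differ is to compare their gap with the combined SE. The sketch below does this for the VKMark 1280 x 1024 values of results 1 and 2. The function name `compare` is hypothetical, and treating the two SEs as independent with N = 3 is an assumption that makes this a coarse screen rather than a formal significance test.

```python
import math


def compare(mean_a, se_a, mean_b, se_b):
    """Return (percent difference of a over b, gap in units of combined SE)."""
    diff = mean_a - mean_b
    percent = 100.0 * diff / mean_b
    # Combine the two standard errors, assuming independent runs.
    combined_se = math.hypot(se_a, se_b)
    if combined_se == 0:
        ratio = 0.0 if diff == 0 else math.inf
    else:
        ratio = abs(diff) / combined_se
    return percent, ratio


if __name__ == "__main__":
    # VKMark 1280 x 1024: result 1 = 5850 (SE 2.19), result 2 = 5644 (SE 5.90)
    percent, ratio = compare(5850, 2.19, 5644, 5.90)
    print(f"result 1 vs result 2: {percent:+.2f}% ({ratio:.1f} combined SEs)")
```

A gap of only one or two combined SEs, as in many of the encoder results, is best read as noise between runs.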
yquake2 7.45 - Renderer: OpenGL 3.x - Resolution: 1920 x 1080 (Frames Per Second, More Is Better)
  Run 3: 972.5 (SE +/- 10.37, N = 3)
  Run 2: 972.2 (SE +/- 13.22, N = 3)
  Run 1: 949.7 (SE +/- 12.14, N = 3)
  1. (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
yquake2 7.45 - Renderer: Software CPU - Resolution: 1920 x 1080 (Frames Per Second, More Is Better)
  Run 2: 107.1 (SE +/- 1.01, N = 3)
  Run 1: 106.6 (SE +/- 1.02, N = 3)
  Run 3: 106.2 (SE +/- 0.93, N = 3)
  1. (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
LZ4 Compression 1.9.3 - Compression Level: 1 - Decompression Speed (MB/s, More Is Better)
  Run 2: 9647.9 (SE +/- 9.81, N = 3)
  Run 3: 9540.0 (SE +/- 47.16, N = 3)
  Run 1: 9506.2 (SE +/- 34.06, N = 3)
  1. (CC) gcc options: -O3
yquake2 7.45 - Renderer: OpenGL 1.x - Resolution: 1920 x 1080 (Frames Per Second, More Is Better)
  Run 1: 683.8 (SE +/- 4.44, N = 3)
  Run 3: 623.8 (SE +/- 5.54, N = 15)
  Run 2: 619.9 (SE +/- 6.33, N = 8)
  1. (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
VKMark 2020-05-21 - Resolution: 1920 x 1080 (VKMark Score, More Is Better)
  Run 1: 4384 (SE +/- 2.96, N = 3)
  Run 3: 4268 (SE +/- 2.85, N = 3)
  Run 2: 4267 (SE +/- 2.65, N = 3)
  1. (CXX) g++ options: -ldl -pipe -std=c++14 -fPIC -MD -MQ -MF
LZ4 Compression 1.9.3 - Compression Level: 1 - Compression Speed (MB/s, More Is Better)
  Run 2: 8619.84 (SE +/- 79.09, N = 3)
  Run 1: 8518.04 (SE +/- 65.90, N = 3)
  Run 3: 8508.64 (SE +/- 51.76, N = 3)
  1. (CC) gcc options: -O3
LZ4 Compression 1.9.3 - Compression Level: 3 - Compression Speed (MB/s, More Is Better)
  Run 1: 47.46 (SE +/- 0.54, N = 3)
  Run 2: 47.12 (SE +/- 0.25, N = 3)
  Run 3: 46.87 (SE +/- 0.29, N = 3)
  1. (CC) gcc options: -O3
LZ4 Compression 1.9.3 - Compression Level: 3 - Decompression Speed (MB/s, More Is Better)
  Run 2: 9191.9 (SE +/- 19.51, N = 3)
  Run 3: 9188.3 (SE +/- 57.74, N = 3)
  Run 1: 9128.2 (SE +/- 64.14, N = 3)
  1. (CC) gcc options: -O3
LZ4 Compression 1.9.3 - Compression Level: 9 - Compression Speed (MB/s, More Is Better)
  Run 2: 47.55 (SE +/- 0.77, N = 3)
  Run 1: 46.60 (SE +/- 0.43, N = 15)
  Run 3: 45.94 (SE +/- 0.01, N = 3)
  1. (CC) gcc options: -O3
LZ4 Compression 1.9.3 - Compression Level: 9 - Decompression Speed (MB/s, More Is Better)
  Run 2: 9217.2 (SE +/- 4.46, N = 3)
  Run 3: 9130.5 (SE +/- 18.60, N = 3)
  Run 1: 9106.9 (SE +/- 20.12, N = 15)
  1. (CC) gcc options: -O3
Crafty 25.2 - Elapsed Time (Nodes Per Second, More Is Better)
  Run 2: 7371374 (SE +/- 6808.24, N = 3)
  Run 3: 7366701 (SE +/- 5337.04, N = 3)
  Run 1: 7329531 (SE +/- 7194.80, N = 3)
  1. (CC) gcc options: -pthread -lstdc++ -fprofile-use -lm
CLOMP 1.2 - Static OMP Speedup (Speedup, More Is Better)
  Run 3: 51.8 (SE +/- 0.68, N = 3)
  Run 1: 51.7 (SE +/- 0.57, N = 3)
  Run 2: 51.6 (SE +/- 0.20, N = 2)
  1. (CC) gcc options: -fopenmp -O3 -lm
Basis Universal 1.12 - Settings: ETC1S (Seconds, Fewer Is Better)
  Run 1: 47.60 (SE +/- 0.22, N = 3)
  Run 2: 47.60 (SE +/- 0.10, N = 3)
  Run 3: 48.16 (SE +/- 0.24, N = 3)
  1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Basis Universal 1.12 - Settings: UASTC Level 0 (Seconds, Fewer Is Better)
  Run 1: 7.595 (SE +/- 0.057, N = 3)
  Run 3: 7.600 (SE +/- 0.019, N = 3)
  Run 2: 7.626 (SE +/- 0.021, N = 3)
  1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Basis Universal 1.12 - Settings: UASTC Level 2 (Seconds, Fewer Is Better)
  Run 3: 15.76 (SE +/- 0.01, N = 3)
  Run 1: 15.79 (SE +/- 0.01, N = 3)
  Run 2: 15.79 (SE +/- 0.04, N = 3)
  1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Basis Universal 1.12 - Settings: UASTC Level 3 (Seconds, Fewer Is Better)
  Run 3: 25.14 (SE +/- 0.03, N = 3)
  Run 2: 25.22 (SE +/- 0.01, N = 3)
  Run 1: 25.23 (SE +/- 0.06, N = 3)
  1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Basis Universal 1.12 - Settings: UASTC Level 2 + RDO Post-Processing (Seconds, Fewer Is Better)
  Run 3: 647.09 (SE +/- 0.07, N = 3)
  Run 2: 647.92 (SE +/- 0.39, N = 3)
  Run 1: 651.21 (SE +/- 1.61, N = 3)
  1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
BRL-CAD 7.30.8 - VGR Performance Metric (VGR Performance Metric, More Is Better)
  Run 2: 291013
  Run 1: 286847
  1. (CXX) g++ options: -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -rdynamic -lSM -lICE -lXi -lGLU -lGL -lGLdispatch -lX11 -lXext -lXrender -lpthread -ldl -luuid -lm
Monkey Audio Encoding 3.99.6 - WAV To APE (Seconds, Fewer Is Better)
  Run 2: 14.01 (SE +/- 0.06, N = 5)
  Run 1: 14.03 (SE +/- 0.06, N = 5)
  Run 3: 14.04 (SE +/- 0.06, N = 5)
  1. (CXX) g++ options: -O3 -pedantic -rdynamic -lrt
Opus Codec Encoding 1.3.1 - WAV To Opus Encode (Seconds, Fewer Is Better)
  Run 1: 7.753 (SE +/- 0.012, N = 5)
  Run 3: 7.774 (SE +/- 0.021, N = 5)
  Run 2: 7.795 (SE +/- 0.012, N = 5)
  1. (CXX) g++ options: -fvisibility=hidden -logg -lm
WavPack Audio Encoding 5.3 - WAV To WavPack (Seconds, Fewer Is Better)
  Run 1: 13.20 (SE +/- 0.01, N = 5)
  Run 2: 13.22 (SE +/- 0.01, N = 5)
  1. (CXX) g++ options: -rdynamic
ASTC Encoder 2.0 - Preset: Fast (Seconds, Fewer Is Better)
  Run 3: 5.20 (SE +/- 0.00, N = 3)
  Run 1: 5.22 (SE +/- 0.03, N = 3)
  Run 2: 5.22 (SE +/- 0.01, N = 3)
  1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
ASTC Encoder 2.0 - Preset: Medium (Seconds, Fewer Is Better)
  Run 2: 6.32 (SE +/- 0.01, N = 3)
  Run 1: 6.34 (SE +/- 0.02, N = 3)
  Run 3: 6.35 (SE +/- 0.02, N = 3)
  1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
ASTC Encoder 2.0 - Preset: Thorough (Seconds, Fewer Is Better)
  Run 3: 9.45 (SE +/- 0.01, N = 3)
  Run 2: 9.49 (SE +/- 0.05, N = 3)
  Run 1: 9.54 (SE +/- 0.08, N = 3)
  1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
ASTC Encoder 2.0 - Preset: Exhaustive (Seconds, Fewer Is Better)
  Run 3: 73.10 (SE +/- 0.10, N = 3)
  Run 2: 73.14 (SE +/- 0.12, N = 3)
  Run 1: 73.36 (SE +/- 0.09, N = 3)
  1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
Betsy GPU Compressor 1.1 Beta - Codec: ETC1 - Quality: Highest (Seconds, Fewer Is Better)
  Run 2: 10.52 (SE +/- 0.01, N = 3)
  Run 3: 10.53 (SE +/- 0.01, N = 3)
  Run 1: 10.75 (SE +/- 0.23, N = 15)
  1. (CXX) g++ options: -O3 -O2 -lpthread -ldl
Betsy GPU Compressor 1.1 Beta - Codec: ETC2 RGB - Quality: Highest (Seconds, Fewer Is Better)
  Run 2: 12.54 (SE +/- 0.02, N = 3)
  Run 3: 12.55 (SE +/- 0.04, N = 3)
  Run 1: 12.70 (SE +/- 0.24, N = 15)
  1. (CXX) g++ options: -O3 -O2 -lpthread -ldl
eSpeak-NG Speech Engine 20200907 - Text-To-Speech Synthesis (Seconds, Fewer Is Better)
  Run 1: 30.85 (SE +/- 0.06, N = 4)
  Run 3: 31.03 (SE +/- 0.07, N = 4)
  Run 2: 31.15 (SE +/- 0.17, N = 4)
  1. (CC) gcc options: -O2 -std=c99
Timed HMMer Search 3.3.1 - Pfam Database Search (Seconds, Fewer Is Better)
  Run 3: 161.63 (SE +/- 0.19, N = 3)
  Run 2: 162.56 (SE +/- 0.23, N = 3)
  Run 1: 163.94 (SE +/- 1.51, N = 10)
  1. (CC) gcc options: -O3 -pthread -lhmmer -leasel -lm
Timed MAFFT Alignment 7.471 - Multiple Sequence Alignment - LSU RNA (Seconds, Fewer Is Better)
  Run 1: 12.15 (SE +/- 0.04, N = 3)
  Run 3: 12.16 (SE +/- 0.12, N = 3)
  Run 2: 12.26 (SE +/- 0.15, N = 6)
  1. (CC) gcc options: -std=c99 -O3 -lm -lpthread
Numpy Benchmark (Score, More Is Better)
  Run 2: 298.31 (SE +/- 0.29, N = 3)
  Run 1: 296.80 (SE +/- 0.54, N = 3)
  Run 3: 295.79 (SE +/- 0.70, N = 3)
HPC Challenge 1.5.0 - Test / Class: G-HPL (GFLOPS, More Is Better)
  Run 1: 54.60 (SE +/- 0.14, N = 3)
  Run 3: 54.31 (SE +/- 0.24, N = 3)
  Run 2: 52.78 (SE +/- 0.10, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. ATLAS + Open MPI 4.0.3
HPC Challenge 1.5.0 - Test / Class: G-Ffte (GFLOPS, More Is Better)
  Run 1: 10.04669 (SE +/- 0.04480, N = 3)
  Run 2: 9.74083 (SE +/- 0.17781, N = 3)
  Run 3: 9.67646 (SE +/- 0.14488, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. ATLAS + Open MPI 4.0.3
HPC Challenge 1.5.0 - Test / Class: EP-DGEMM (GFLOPS, More Is Better)
  Run 3: 13.03417 (SE +/- 0.34471, N = 3)
  Run 1: 12.61727 (SE +/- 0.86774, N = 3)
  Run 2: 8.80303 (SE +/- 0.14154, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. ATLAS + Open MPI 4.0.3
HPC Challenge 1.5.0 - Test / Class: G-Ptrans (GB/s, More Is Better)
  Run 2: 4.34106 (SE +/- 0.03755, N = 3)
  Run 3: 4.17428 (SE +/- 0.08946, N = 3)
  Run 1: 4.00164 (SE +/- 0.25728, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. ATLAS + Open MPI 4.0.3
HPC Challenge 1.5.0 - Test / Class: EP-STREAM Triad (GB/s, More Is Better)
  Run 3: 1.33820 (SE +/- 0.07867, N = 3)
  Run 1: 1.30966 (SE +/- 0.05676, N = 3)
  Run 2: 1.30771 (SE +/- 0.03714, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. ATLAS + Open MPI 4.0.3
HPC Challenge 1.5.0 - Test / Class: G-Random Access (GUP/s, More Is Better)
  Run 3: 0.02866 (SE +/- 0.00018, N = 3)
  Run 1: 0.02819 (SE +/- 0.00038, N = 3)
  Run 2: 0.02798 (SE +/- 0.00067, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. ATLAS + Open MPI 4.0.3
HPC Challenge 1.5.0 - Test / Class: Random Ring Latency (usecs, Fewer Is Better)
  Run 2: 1.53453 (SE +/- 0.00474, N = 3)
  Run 3: 1.53509 (SE +/- 0.00358, N = 3)
  Run 1: 1.53681 (SE +/- 0.00435, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. ATLAS + Open MPI 4.0.3
HPC Challenge 1.5.0 - Test / Class: Random Ring Bandwidth (GB/s, More Is Better)
  Run 3: 0.59385 (SE +/- 0.00691, N = 3)
  Run 2: 0.58300 (SE +/- 0.01337, N = 3)
  Run 1: 0.57120 (SE +/- 0.00203, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. ATLAS + Open MPI 4.0.3
HPC Challenge 1.5.0 - Test / Class: Max Ping Pong Bandwidth (MB/s, More Is Better)
  Run 3: 10972.49 (SE +/- 206.43, N = 3)
  Run 1: 10862.04 (SE +/- 250.74, N = 3)
  Run 2: 10831.97 (SE +/- 181.12, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. ATLAS + Open MPI 4.0.3
AI Benchmark Alpha 0.1.2 - Device Inference Score (Score, More Is Better)
  Run 2: 1319
  Run 1: 1304
AI Benchmark Alpha 0.1.2 - Device Training Score (Score, More Is Better)
  Run 2: 998
  Run 1: 966
AI Benchmark Alpha 0.1.2 - Device AI Score (Score, More Is Better)
  Run 2: 2317
  Run 1: 2270
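In both results reported here, the Device AI Score equals the Device Inference Score plus the Device Training Score (1304 + 966 = 2270 and 1319 + 998 = 2317). The short check below confirms that arithmetic on the values from this report. That the benchmark defines the AI score as this sum in general is an assumption drawn only from these two data points, and the names `SCORES` and `score_matches_sum` are hypothetical.

```python
# Scores copied from this report, keyed by result identifier.
SCORES = {
    "1": {"inference": 1304, "training": 966, "ai": 2270},
    "2": {"inference": 1319, "training": 998, "ai": 2317},
}


def score_matches_sum(entry):
    """True when the AI score is exactly inference + training."""
    return entry["inference"] + entry["training"] == entry["ai"]


if __name__ == "__main__":
    for label, entry in SCORES.items():
        print(f"result {label}: sum matches AI score = {score_matches_sum(entry)}")
```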
NCNN 20201218 - Target: CPU - Model: mobilenet (ms, Fewer Is Better)
  Run 2: 34.56 (SE +/- 0.55, N = 12; MIN: 30.05 / MAX: 404.5)
  Run 3: 34.91 (SE +/- 0.99, N = 12; MIN: 29.38 / MAX: 419.39)
  Run 1: 37.24 (SE +/- 1.43, N = 12; MIN: 29.37 / MAX: 419.38)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  Run 2: 15.60 (SE +/- 0.34, N = 12; MIN: 13.33 / MAX: 383.7)
  Run 1: 15.67 (SE +/- 0.44, N = 12; MIN: 13.15 / MAX: 357.88)
  Run 3: 15.78 (SE +/- 0.73, N = 12; MIN: 13.2 / MAX: 388.68)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  Run 1: 14.44 (SE +/- 0.50, N = 12; MIN: 12.49 / MAX: 358.11)
  Run 3: 15.21 (SE +/- 1.31, N = 12; MIN: 12.52 / MAX: 378.12)
  Run 2: 15.70 (SE +/- 1.39, N = 12; MIN: 12.6 / MAX: 382.98)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  Run 2: 14.42 (SE +/- 0.15, N = 12; MIN: 13.2 / MAX: 104.86)
  Run 1: 15.28 (SE +/- 0.85, N = 12; MIN: 13.6 / MAX: 309.56)
  Run 3: 15.30 (SE +/- 0.64, N = 12; MIN: 12.9 / MAX: 306.7)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: mnasnet (ms, Fewer Is Better)
  Run 1: 13.60 (SE +/- 0.16, N = 12; MIN: 12.42 / MAX: 189.74)
  Run 3: 14.17 (SE +/- 0.31, N = 12; MIN: 12.78 / MAX: 347.15)
  Run 2: 14.50 (SE +/- 0.64, N = 12; MIN: 12.55 / MAX: 348.4)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  Run 3: 18.97 (SE +/- 0.37, N = 12; MIN: 17.3 / MAX: 414.55)
  Run 1: 20.01 (SE +/- 0.67, N = 12; MIN: 17.38 / MAX: 456.25)
  Run 2: 20.92 (SE +/- 1.79, N = 12; MIN: 17.12 / MAX: 465.23)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: blazeface (ms, Fewer Is Better)
  Run 2: 6.67 (SE +/- 0.10, N = 12; MIN: 6.12 / MAX: 175.66)
  Run 1: 6.83 (SE +/- 0.16, N = 12; MIN: 6.11 / MAX: 191.61)
  Run 3: 7.65 (SE +/- 0.60, N = 12; MIN: 6.15 / MAX: 211.21)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: googlenet (ms, Fewer Is Better)
  Run 2: 42.04 (SE +/- 2.39, N = 12; MIN: 28.93 / MAX: 517.65)
  Run 3: 42.95 (SE +/- 3.13, N = 12; MIN: 27.93 / MAX: 530.03)
  Run 1: 47.87 (SE +/- 2.92, N = 12; MIN: 27.73 / MAX: 542.74)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: vgg16 (ms, Fewer Is Better)
  Run 2: 92.98 (SE +/- 1.61, N = 12; MIN: 61.58 / MAX: 223.7)
  Run 3: 95.61 (SE +/- 2.05, N = 12; MIN: 64.1 / MAX: 227.91)
  Run 1: 102.38 (SE +/- 3.95, N = 12; MIN: 62.04 / MAX: 220.23)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: resnet18 (ms, Fewer Is Better)
  Run 2: 54.29 (SE +/- 5.28, N = 12; MIN: 22 / MAX: 230.1)
  Run 1: 59.91 (SE +/- 4.97, N = 12; MIN: 23.15 / MAX: 227.75)
  Run 3: 64.49 (SE +/- 4.57, N = 12; MIN: 21.25 / MAX: 228.14)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: alexnet (ms, Fewer Is Better)
  Run 2: 34.34 (SE +/- 2.63, N = 12; MIN: 15.18 / MAX: 103.77)
  Run 3: 35.69 (SE +/- 1.62, N = 12; MIN: 16.27 / MAX: 106.47)
  Run 1: 37.72 (SE +/- 1.89, N = 12; MIN: 15.66 / MAX: 106.73)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: resnet50 (ms, Fewer Is Better)
  Run 3: 73.65 (SE +/- 4.79, N = 12; MIN: 39.25 / MAX: 638.31)
  Run 2: 75.03 (SE +/- 2.78, N = 12; MIN: 38.58 / MAX: 559.97)
  Run 1: 79.53 (SE +/- 5.46, N = 12; MIN: 38.42 / MAX: 565.15)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better)
  Run 3: 49.25 (SE +/- 0.86, N = 12; MIN: 39.85 / MAX: 267.24)
  Run 2: 49.69 (SE +/- 1.04, N = 12; MIN: 39.24 / MAX: 224.21)
  Run 1: 50.19 (SE +/- 0.68, N = 12; MIN: 39.36 / MAX: 224.81)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better)
  Run 3: 39.13 (SE +/- 1.90, N = 12; MIN: 32.02 / MAX: 459.23)
  Run 2: 40.80 (SE +/- 1.74, N = 12; MIN: 31.26 / MAX: 438.66)
  Run 1: 41.89 (SE +/- 1.53, N = 12; MIN: 31.38 / MAX: 429.4)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: regnety_400m (ms, Fewer Is Better)
  Run 2: 101.47 (SE +/- 1.29, N = 12; MIN: 90.53 / MAX: 2458.71)
  Run 1: 102.93 (SE +/- 1.12, N = 12; MIN: 90.68 / MAX: 1519.43)
  Run 3: 103.89 (SE +/- 3.17, N = 12; MIN: 90.75 / MAX: 3380.17)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better)
  Run 2: 33.78 (SE +/- 0.72, N = 12; MIN: 29.45 / MAX: 408.23)
  Run 3: 34.39 (SE +/- 1.11, N = 9; MIN: 29.33 / MAX: 412.06)
  Run 1: 37.30 (SE +/- 1.11, N = 12; MIN: 29.33 / MAX: 427.64)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  Run 3: 14.97 (SE +/- 0.24, N = 9; MIN: 13.4 / MAX: 357.21)
  Run 1: 15.59 (SE +/- 0.35, N = 12; MIN: 12.98 / MAX: 359.55)
  Run 2: 16.30 (SE +/- 0.63, N = 12; MIN: 13.31 / MAX: 389.24)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  Run 2: 14.22 (SE +/- 0.34, N = 12; MIN: 12.46 / MAX: 387.96)
  Run 1: 14.58 (SE +/- 0.55, N = 12; MIN: 12.63 / MAX: 382.37)
  Run 3: 15.95 (SE +/- 1.16, N = 9; MIN: 12.47 / MAX: 359.98)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  Run 2: 14.65 (SE +/- 0.14, N = 11; MIN: 13.43 / MAX: 114.64)
  Run 1: 14.92 (SE +/- 0.51, N = 12; MIN: 13.15 / MAX: 283.76)
  Run 3: 15.75 (SE +/- 0.93, N = 9; MIN: 12.97 / MAX: 295.26)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better)
  Run 3: 13.77 (SE +/- 0.35, N = 9; MIN: 12.4 / MAX: 378.94)
  Run 1: 15.64 (SE +/- 0.70, N = 11; MIN: 12.3 / MAX: 352.98)
  Run 2: 16.26 (SE +/- 1.06, N = 12; MIN: 12.28 / MAX: 390.46)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  Run 1: 19.14 (SE +/- 0.25, N = 12; MIN: 17.51 / MAX: 352.05)
  Run 2: 19.28 (SE +/- 0.70, N = 12; MIN: 16.96 / MAX: 430.49)
  Run 3: 20.49 (SE +/- 0.66, N = 9; MIN: 17.45 / MAX: 438.66)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better)
  Run 3: 7.09 (SE +/- 0.51, N = 9; MIN: 6.15 / MAX: 204.55)
  Run 2: 7.40 (SE +/- 0.76, N = 12; MIN: 6.14 / MAX: 215.25)
  Run 1: 9.82 (SE +/- 1.94, N = 12; MIN: 6.19 / MAX: 229.58)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better)
  Run 3: 38.32 (SE +/- 1.78, N = 9; MIN: 28.17 / MAX: 505.55)
  Run 2: 38.90 (SE +/- 1.77, N = 12; MIN: 28.91 / MAX: 532.2)
  Run 1: 41.08 (SE +/- 2.33, N = 12; MIN: 28.69 / MAX: 513.63)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better)
  Run 3: 100.17 (SE +/- 2.35, N = 9; MIN: 63.35 / MAX: 242.13)
  Run 2: 100.73 (SE +/- 2.43, N = 12; MIN: 65.03 / MAX: 221.49)
  Run 1: 103.41 (SE +/- 2.14, N = 12; MIN: 63 / MAX: 216.59)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better)
  Run 2: 53.19 (SE +/- 5.95, N = 12; MIN: 21.74 / MAX: 222.51)
  Run 1: 55.89 (SE +/- 3.22, N = 12; MIN: 23.8 / MAX: 226.22)
  Run 3: 56.35 (SE +/- 3.64, N = 9; MIN: 22.77 / MAX: 219.04)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better)
  Run 2: 32.39 (SE +/- 1.09, N = 12; MIN: 17.56 / MAX: 91.67)
  Run 1: 33.56 (SE +/- 1.15, N = 12; MIN: 15.36 / MAX: 104.32)
  Run 3: 35.93 (SE +/- 1.85, N = 9; MIN: 17.55 / MAX: 96.42)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better)
  Run 3: 70.92 (SE +/- 6.02, N = 9; MIN: 39.2 / MAX: 557.13)
  Run 1: 72.06 (SE +/- 3.10, N = 12; MIN: 39.39 / MAX: 562.21)
  Run 2: 73.86 (SE +/- 5.68, N = 12; MIN: 40.4 / MAX: 546.35)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better)
  Run 3: 48.23 (SE +/- 1.42, N = 9; MIN: 39.54 / MAX: 230.79)
  Run 1: 50.25 (SE +/- 1.15, N = 12; MIN: 39.88 / MAX: 213.87)
  Run 2: 50.57 (SE +/- 1.28, N = 12; MIN: 39.57 / MAX: 214.93)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better)
  Run 3: 39.14 (SE +/- 1.29, N = 9; MIN: 31.19 / MAX: 435.95)
  Run 2: 40.01 (SE +/- 1.39, N = 12; MIN: 31.52 / MAX: 514.49)
  Run 1: 41.28 (SE +/- 1.34, N = 12; MIN: 31.56 / MAX: 448.33)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better)
  Run 2: 100.26 (SE +/- 2.13, N = 12; MIN: 90.99 / MAX: 1833.25)
  Run 1: 101.01 (SE +/- 1.78, N = 12; MIN: 90.72 / MAX: 1587.31)
  Run 3: 101.27 (SE +/- 2.24, N = 9; MIN: 89.99 / MAX: 2458.9)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
GROMACS 2020.3 - Water Benchmark (Ns Per Day, More Is Better): Run 2: 1.858 (SE +/- 0.004, N = 3); Run 3: 1.837 (SE +/- 0.003, N = 2); Run 1: 1.834 (SE +/- 0.004, N = 3). 1. (CXX) g++ options: -O3 -pthread -lrt -lpthread -lm
LAMMPS Molecular Dynamics Simulator 29Oct2020 - Model: 20k Atoms (ns/day, More Is Better): Run 1: 15.50 (SE +/- 0.21, N = 3); Run 2: 15.42 (SE +/- 0.02, N = 3); Run 3: 15.19 (SE +/- 0.21, N = 4). 1. (CXX) g++ options: -O3 -pthread -lm
LAMMPS Molecular Dynamics Simulator 29Oct2020 - Model: Rhodopsin Protein (ns/day, More Is Better): Run 1: 13.40 (SE +/- 0.36, N = 12); Run 2: 12.91 (SE +/- 0.30, N = 15); Run 3: 12.55 (SE +/- 0.15, N = 3). 1. (CXX) g++ options: -O3 -pthread -lm
oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): Run 3: 6.21623 (SE +/- 0.09143, N = 3; MIN: 5.67); Run 2: 6.25011 (SE +/- 0.08570, N = 3; MIN: 5.72); Run 1: 6.27715 (SE +/- 0.05437, N = 3; MIN: 5.63). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): Run 3: 10.93 (SE +/- 0.01, N = 3; MIN: 10.81); Run 2: 11.21 (SE +/- 0.15, N = 15; MIN: 10.67); Run 1: 11.47 (SE +/- 0.19, N = 12; MIN: 10.94). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): Run 2: 2.10893 (SE +/- 0.00369, N = 3; MIN: 2.03); Run 1: 2.12408 (SE +/- 0.00488, N = 3; MIN: 2.04); Run 3: 2.12611 (SE +/- 0.00796, N = 3; MIN: 2.05). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): Run 1: 3.45370 (SE +/- 0.02421, N = 3; MIN: 1.93); Run 2: 3.49064 (SE +/- 0.04824, N = 3; MIN: 2.01); Run 3: 3.51087 (SE +/- 0.03734, N = 3; MIN: 2). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): Run 1: 19.97 (SE +/- 0.13, N = 3; MIN: 14.86); Run 2: 20.15 (SE +/- 0.04, N = 3; MIN: 19.19); Run 3: 20.25 (SE +/- 0.01, N = 3; MIN: 19.16). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): Run 1: 2.99559 (SE +/- 0.02534, N = 3; MIN: 2.82); Run 3: 3.01645 (SE +/- 0.01010, N = 3; MIN: 2.85); Run 2: 3.01721 (SE +/- 0.01913, N = 3; MIN: 2.84). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): Run 3: 5.92300 (SE +/- 0.00240, N = 3; MIN: 5.5); Run 1: 5.92604 (SE +/- 0.01022, N = 3; MIN: 5.7); Run 2: 5.93856 (SE +/- 0.00211, N = 3; MIN: 5.71). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): Run 2: 25.10 (SE +/- 0.01, N = 3; MIN: 24.22); Run 3: 25.10 (SE +/- 0.02, N = 3; MIN: 23.92); Run 1: 25.16 (SE +/- 0.01, N = 3; MIN: 24.12). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): Run 3: 3.41699 (SE +/- 0.01146, N = 3; MIN: 3.22); Run 1: 3.69657 (SE +/- 0.07703, N = 15; MIN: 3.25); Run 2: 3.76164 (SE +/- 0.07933, N = 15; MIN: 3.26). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): Run 1: 3.43727 (SE +/- 0.00942, N = 3; MIN: 3.34); Run 2: 3.44980 (SE +/- 0.00155, N = 3; MIN: 3.36); Run 3: 3.46909 (SE +/- 0.02456, N = 3; MIN: 3.35). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): Run 2: 13529.0 (SE +/- 161.18, N = 3; MIN: 13190.1); Run 3: 13753.9 (SE +/- 132.58, N = 3; MIN: 12649.5); Run 1: 13861.2 (SE +/- 110.93, N = 3; MIN: 13490.9). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): Run 1: 3749.18 (SE +/- 16.86, N = 3; MIN: 3631.66); Run 2: 3782.36 (SE +/- 8.72, N = 3; MIN: 3756.39); Run 3: 3783.40 (SE +/- 8.76, N = 3; MIN: 3758.06). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): Run 2: 13295.0 (SE +/- 123.89, N = 10; MIN: 12213.4); Run 3: 13669.2 (SE +/- 122.96, N = 11; MIN: 12754.3); Run 1: 13991.1 (SE +/- 48.11, N = 3; MIN: 13812.5). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): Run 3: 3711.96 (SE +/- 38.84, N = 8; MIN: 3503.46); Run 1: 3802.52 (SE +/- 10.92, N = 3; MIN: 3624.28); Run 2: 3821.24 (SE +/- 46.19, N = 3; MIN: 3746.24). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): Run 1: 1.65767 (SE +/- 0.05725, N = 15; MIN: 1.11); Run 3: 2.00984 (SE +/- 0.03417, N = 15; MIN: 1.39); Run 2: 2.09567 (SE +/- 0.05017, N = 15; MIN: 1.4). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): Run 3: 13668.4 (SE +/- 128.38, N = 12; MIN: 12362.1); Run 2: 13783.3 (SE +/- 53.39, N = 3; MIN: 13567.9); Run 1: 13826.2 (SE +/- 141.29, N = 3; MIN: 13450.2). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): Run 3: 3777.65 (SE +/- 10.39, N = 3; MIN: 3644.35); Run 2: 3780.47 (SE +/- 13.99, N = 3; MIN: 3731.84); Run 1: 3808.57 (SE +/- 3.43, N = 3; MIN: 3705.89). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): Run 1: 1.46928 (SE +/- 0.01123, N = 15; MIN: 1.34); Run 2: 1.48549 (SE +/- 0.00191, N = 3; MIN: 1.45); Run 3: 1.48957 (SE +/- 0.00013, N = 3; MIN: 1.44). 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Sunflow Rendering System 0.07.2 - Global Illumination + Image Synthesis (Seconds, Fewer Is Better): Run 2: 0.731 (SE +/- 0.011, N = 15; MIN: 0.49 / MAX: 1.57); Run 1: 0.789 (SE +/- 0.004, N = 3; MIN: 0.57 / MAX: 1.63).
Timed Clash Compilation - Time To Compile (Seconds, Fewer Is Better): Run 1: 463.22 (SE +/- 13.67, N = 9); Run 2: 477.08 (SE +/- 4.96, N = 9); Run 3: 485.49 (SE +/- 0.58, N = 3).
Coremark 1.0 - CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better): Run 2: 1148000.04 (SE +/- 686.93, N = 3); Run 3: 1146667.41 (SE +/- 1713.53, N = 3); Run 1: 1146059.29 (SE +/- 2731.02, N = 3). 1. (CC) gcc options: -O2 -lrt" -lrt
Timed FFmpeg Compilation 4.2.2 - Time To Compile (Seconds, Fewer Is Better): Run 2: 30.92 (SE +/- 0.15, N = 3); Run 3: 32.70 (SE +/- 0.18, N = 3); Run 1: 46.08 (SE +/- 0.18, N = 3).
Stockfish 12 - Total Time (Nodes Per Second, More Is Better): Run 2: 55404772 (SE +/- 577141.31, N = 3); Run 3: 54984012 (SE +/- 749500.39, N = 3); Run 1: 49821254 (SE +/- 796222.48, N = 3). 1. (CXX) g++ options: -m64 -lpthread -fno-exceptions -std=c++17 -pedantic -O3 -msse -msse3 -mpopcnt -msse4.1 -mssse3 -msse2 -flto -flto=jobserver
Zstd Compression 1.4.5 - Compression Level: 3 (MB/s, More Is Better): Run 2: 3798.9 (SE +/- 129.73, N = 12); Run 3: 3781.2 (SE +/- 107.86, N = 15); Run 1: 3606.7 (SE +/- 88.27, N = 12). 1. (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression 1.4.5 - Compression Level: 19 (MB/s, More Is Better): Run 2: 42.0 (SE +/- 0.34, N = 3); Run 3: 36.2 (SE +/- 0.39, N = 3); Run 1: 33.9 (SE +/- 0.37, N = 15). 1. (CC) gcc options: -O3 -pthread -lz -llzma
asmFish 2018-07-23 - 1024 Hash Memory, 26 Depth (Nodes/second, More Is Better): Run 2: 74895187 (SE +/- 126998.98, N = 3); Run 1: 73804001 (SE +/- 738089.20, N = 3); Run 3: 72630737 (SE +/- 633798.86, N = 3).
Kvazaar 2.0 - Video Input: Bosphorus 4K - Video Preset: Slow (Frames Per Second, More Is Better): Run 2: 10.38 (SE +/- 0.01, N = 3); Run 3: 10.32 (SE +/- 0.01, N = 3); Run 1: 10.29 (SE +/- 0.00, N = 3). 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0 - Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better): Run 2: 10.56 (SE +/- 0.02, N = 3); Run 3: 10.53 (SE +/- 0.01, N = 3); Run 1: 10.47 (SE +/- 0.00, N = 3). 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0 - Video Input: Bosphorus 1080p - Video Preset: Slow (Frames Per Second, More Is Better): Run 2: 27.13 (SE +/- 0.07, N = 3); Run 1: 26.99 (SE +/- 0.04, N = 3); Run 3: 26.92 (SE +/- 0.18, N = 3). 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0 - Video Input: Bosphorus 1080p - Video Preset: Medium (Frames Per Second, More Is Better): Run 2: 27.84 (SE +/- 0.06, N = 3); Run 3: 27.78 (SE +/- 0.15, N = 3); Run 1: 27.69 (SE +/- 0.07, N = 3). 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0 - Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better): Run 3: 23.61 (SE +/- 0.04, N = 3); Run 2: 23.57 (SE +/- 0.07, N = 3); Run 1: 23.54 (SE +/- 0.03, N = 3). 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, More Is Better): Run 1: 39.68 (SE +/- 0.28, N = 3); Run 2: 39.67 (SE +/- 0.15, N = 3); Run 3: 39.39 (SE +/- 0.03, N = 3). 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0 - Video Input: Bosphorus 1080p - Video Preset: Very Fast (Frames Per Second, More Is Better): Run 3: 60.79 (SE +/- 0.26, N = 3); Run 2: 60.69 (SE +/- 0.20, N = 3); Run 1: 60.69 (SE +/- 0.21, N = 3). 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0 - Video Input: Bosphorus 1080p - Video Preset: Ultra Fast (Frames Per Second, More Is Better): Run 1: 116.80 (SE +/- 0.15, N = 3); Run 3: 116.68 (SE +/- 0.26, N = 3); Run 2: 116.09 (SE +/- 0.66, N = 3). 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
x265 3.4 - Video Input: Bosphorus 4K (Frames Per Second, More Is Better): Run 1: 16.62 (SE +/- 0.03, N = 3); Run 3: 16.55 (SE +/- 0.06, N = 3); Run 2: 15.26 (SE +/- 0.05, N = 3). 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
x265 3.4 - Video Input: Bosphorus 1080p (Frames Per Second, More Is Better): Run 3: 41.06 (SE +/- 0.04, N = 3); Run 1: 41.02 (SE +/- 0.23, N = 3); Run 2: 40.36 (SE +/- 0.05, N = 3). 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
rav1e 0.4 Alpha - Speed: 1 (Frames Per Second, More Is Better): Run 2: 0.320 (SE +/- 0.000, N = 3); Run 3: 0.318 (SE +/- 0.001, N = 3); Run 1: 0.315 (SE +/- 0.001, N = 3).
rav1e 0.4 Alpha - Speed: 5 (Frames Per Second, More Is Better): Run 1: 0.954 (SE +/- 0.002, N = 3); Run 3: 0.953 (SE +/- 0.004, N = 3); Run 2: 0.951 (SE +/- 0.002, N = 3).
rav1e 0.4 Alpha - Speed: 6 (Frames Per Second, More Is Better): Run 3: 1.274 (SE +/- 0.002, N = 3); Run 2: 1.273 (SE +/- 0.002, N = 3); Run 1: 1.270 (SE +/- 0.001, N = 3).
rav1e 0.4 Alpha - Speed: 10 (Frames Per Second, More Is Better): Run 1: 2.859 (SE +/- 0.006, N = 3); Run 3: 2.853 (SE +/- 0.012, N = 3); Run 2: 2.830 (SE +/- 0.018, N = 3).
Embree 3.9.0 - Binary: Pathtracer - Model: Crown (Frames Per Second, More Is Better): Run 2: 24.62 (SE +/- 0.22, N = 3; MIN: 23.95 / MAX: 25.61); Run 3: 24.36 (SE +/- 0.31, N = 3; MIN: 23 / MAX: 25.59); Run 1: 24.02 (SE +/- 0.33, N = 15; MIN: 19.33 / MAX: 25.81).
Embree 3.9.0 - Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better): Run 1: 22.73 (SE +/- 0.21, N = 3; MIN: 21.97 / MAX: 23.66); Run 3: 22.65 (SE +/- 0.22, N = 3; MIN: 21.43 / MAX: 23.62); Run 2: 22.24 (SE +/- 0.38, N = 3; MIN: 20.82 / MAX: 23.48).
Embree 3.9.0 - Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, More Is Better): Run 2: 22.50 (SE +/- 0.25, N = 15; MIN: 20.59 / MAX: 25.14); Run 3: 22.27 (SE +/- 0.26, N = 15; MIN: 20.37 / MAX: 25.26); Run 1: 21.96 (SE +/- 0.31, N = 15; MIN: 19.03 / MAX: 24.66).
Embree 3.9.0 - Binary: Pathtracer - Model: Asian Dragon Obj (Frames Per Second, More Is Better): Run 3: 19.40 (SE +/- 0.44, N = 12; MIN: 16.59 / MAX: 22.2); Run 2: 19.30 (SE +/- 0.37, N = 15; MIN: 16.68 / MAX: 22.31); Run 1: 18.24 (SE +/- 0.25, N = 15; MIN: 16.42 / MAX: 21.1).
Embree 3.9.0 - Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better): Run 2: 21.67 (SE +/- 0.26, N = 15; MIN: 19.41 / MAX: 23.55); Run 3: 21.56 (SE +/- 0.19, N = 15; MIN: 19.33 / MAX: 23.88); Run 1: 21.32 (SE +/- 0.09, N = 3; MIN: 20.36 / MAX: 22.38).
Embree 3.9.0 - Binary: Pathtracer ISPC - Model: Asian Dragon Obj (Frames Per Second, More Is Better): Run 1: 17.68 (SE +/- 0.39, N = 12; MIN: 15.66 / MAX: 20.6); Run 3: 17.40 (SE +/- 0.30, N = 15; MIN: 15.12 / MAX: 19.79); Run 2: 17.24 (SE +/- 0.39, N = 12; MIN: 14.95 / MAX: 20.29).
IndigoBench 4.4 - Acceleration: CPU - Scene: Bedroom (M samples/s, More Is Better): Run 3: 5.057 (SE +/- 0.019, N = 3); Run 1: 5.036 (SE +/- 0.025, N = 3); Run 2: 4.994 (SE +/- 0.054, N = 12).
IndigoBench 4.4 - Acceleration: CPU - Scene: Supercar (M samples/s, More Is Better): Run 3: 11.04 (SE +/- 0.03, N = 3); Run 1: 11.04 (SE +/- 0.01, N = 3); Run 2: 10.96 (SE +/- 0.05, N = 3).
Build2 0.13 - Time To Compile (Seconds, Fewer Is Better): Run 1: 96.20 (SE +/- 0.24, N = 3); Run 2: 96.34 (SE +/- 0.21, N = 3).
Timed Eigen Compilation 3.3.9 - Time To Compile (Seconds, Fewer Is Better): Run 2: 92.98 (SE +/- 0.04, N = 3); Run 3: 93.06 (SE +/- 0.04, N = 3); Run 1: 93.25 (SE +/- 0.06, N = 3).
Libplacebo 2.72.2 - Test: deband_heavy (FPS, More Is Better): Run 1: 192.54 (SE +/- 0.36, N = 3); Run 2: 192.43 (SE +/- 0.37, N = 3); Run 3: 192.22 (SE +/- 0.29, N = 3). 1. (CXX) g++ options: -lm -lglslang -lHLSL -lOGLCompiler -lOSDependent -lSPIRV -lSPVRemapper -lSPIRV-Tools -lSPIRV-Tools-opt -lpthread -pthread -pipe -std=c++11 -fvisibility=hidden -fPIC -MD -MQ -MF
Libplacebo 2.72.2 - Test: polar_nocompute (FPS, More Is Better): Run 1: 268.72 (SE +/- 0.83, N = 3); Run 2: 268.36 (SE +/- 0.53, N = 3); Run 3: 267.90 (SE +/- 0.38, N = 3). 1. (CXX) g++ options: -lm -lglslang -lHLSL -lOGLCompiler -lOSDependent -lSPIRV -lSPVRemapper -lSPIRV-Tools -lSPIRV-Tools-opt -lpthread -pthread -pipe -std=c++11 -fvisibility=hidden -fPIC -MD -MQ -MF
Libplacebo 2.72.2 - Test: hdr_peakdetect (FPS, More Is Better): Run 1: 1762.98 (SE +/- 0.03, N = 3); Run 2: 1762.78 (SE +/- 0.18, N = 3); Run 3: 1762.55 (SE +/- 0.10, N = 3). 1. (CXX) g++ options: -lm -lglslang -lHLSL -lOGLCompiler -lOSDependent -lSPIRV -lSPVRemapper -lSPIRV-Tools -lSPIRV-Tools-opt -lpthread -pthread -pipe -std=c++11 -fvisibility=hidden -fPIC -MD -MQ -MF
Libplacebo 2.72.2 - Test: av1_grain_lap (FPS, More Is Better): Run 1: 448.24 (SE +/- 3.42, N = 3); Run 3: 446.38 (SE +/- 1.25, N = 3); Run 2: 441.37 (SE +/- 3.25, N = 3). 1. (CXX) g++ options: -lm -lglslang -lHLSL -lOGLCompiler -lOSDependent -lSPIRV -lSPVRemapper -lSPIRV-Tools -lSPIRV-Tools-opt -lpthread -pthread -pipe -std=c++11 -fvisibility=hidden -fPIC -MD -MQ -MF
VkFFT 1.1.1 (Benchmark Score, More Is Better): Run 2: 9451 (SE +/- 49.17, N = 3); Run 3: 9417 (SE +/- 6.74, N = 3); Run 1: 9412 (SE +/- 13.38, N = 3). 1. (CXX) g++ options: -O3 -pthread
VkResample 1.0 - Upscale: 2x - Precision: Single (ms, Fewer Is Better): Run 2: 49.00 (SE +/- 0.02, N = 3); Run 1: 49.01 (SE +/- 0.04, N = 3); Run 3: 49.02 (SE +/- 0.02, N = 3). 1. (CXX) g++ options: -O3 -pthread
Waifu2x-NCNN Vulkan 20200818 - Scale: 2x - Denoise: 3 - TAA: No (Seconds, Fewer Is Better): Run 1: 2.052.
Waifu2x-NCNN Vulkan 20200818 - Scale: 2x - Denoise: 3 - TAA: Yes (Seconds, Fewer Is Better): Run 1: 10.04 (SE +/- 0.00, N = 3); Run 2: 10.06 (SE +/- 0.00, N = 3); Run 3: 10.07 (SE +/- 0.01, N = 3).
PHPBench 0.8.1 - PHP Benchmark Suite (Score, More Is Better): Run 2: 579360 (SE +/- 145.08, N = 3); Run 1: 577229 (SE +/- 2547.76, N = 3).
Redis 6.0.9 - Test: LPOP (Requests Per Second, More Is Better): Run 1: 2431710.35 (SE +/- 44065.81, N = 15); Run 2: 2058658.41 (SE +/- 132410.87, N = 12); Run 3: 1998586.16 (SE +/- 139985.77, N = 12). 1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9 - Test: SADD (Requests Per Second, More Is Better): Run 3: 1974193.46 (SE +/- 16528.72, N = 3); Run 1: 1892562.21 (SE +/- 20792.00, N = 15); Run 2: 1826296.85 (SE +/- 25792.40, N = 15). 1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9 - Test: LPUSH (Requests Per Second, More Is Better): Run 1: 1368828.88 (SE +/- 18400.26, N = 3); Run 3: 1366164.67 (SE +/- 13973.47, N = 3); Run 2: 1340079.29 (SE +/- 19080.84, N = 4). 1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9 - Test: GET (Requests Per Second, More Is Better): Run 1: 2201882.03 (SE +/- 31653.62, N = 15); Run 2: 2099666.52 (SE +/- 31097.88, N = 15); Run 3: 2091130.67 (SE +/- 27660.05, N = 15). 1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9 - Test: SET (Requests Per Second, More Is Better): Run 1: 1675529.38 (SE +/- 19958.53, N = 3); Run 2: 1654163.60 (SE +/- 23714.82, N = 4); Run 3: 1577289.79 (SE +/- 22941.42, N = 3). 1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
SQLite Speedtest 3.30 - Timed Time - Size 1,000 (Seconds, Fewer Is Better): Run 1: 67.21 (SE +/- 0.47, N = 3); Run 3: 67.25 (SE +/- 0.13, N = 3); Run 2: 67.46 (SE +/- 0.25, N = 3). 1. (CC) gcc options: -O2 -ldl -lz -lpthread
Node.js V8 Web Tooling Benchmark (runs/s, More Is Better): Run 1: 8.47 (SE +/- 0.03, N = 3); Run 2: 8.25 (SE +/- 0.12, N = 4); Run 3: 8.21 (SE +/- 0.04, N = 3). 1. Nodejs v12.18.2
simdjson 0.7.1 - Throughput Test: Kostya (GB/s, More Is Better): Run 3: 0.44 (SE +/- 0.00, N = 3); Run 2: 0.44 (SE +/- 0.00, N = 3); Run 1: 0.44 (SE +/- 0.00, N = 3). 1. (CXX) g++ options: -O3 -pthread
simdjson 0.7.1 - Throughput Test: LargeRandom (GB/s, More Is Better): Run 3: 0.40 (SE +/- 0.00, N = 3); Run 2: 0.39 (SE +/- 0.00, N = 3); Run 1: 0.39 (SE +/- 0.00, N = 3). 1. (CXX) g++ options: -O3 -pthread
simdjson 0.7.1 - Throughput Test: PartialTweets (GB/s, More Is Better): Run 3: 0.51 (SE +/- 0.00, N = 3); Run 1: 0.51 (SE +/- 0.00, N = 3); Run 2: 0.50 (SE +/- 0.01, N = 3). 1. (CXX) g++ options: -O3 -pthread
simdjson 0.7.1 - Throughput Test: DistinctUserID (GB/s, More Is Better): Run 3: 0.52 (SE +/- 0.00, N = 3); Run 2: 0.52 (SE +/- 0.00, N = 3); Run 1: 0.52 (SE +/- 0.00, N = 3). 1. (CXX) g++ options: -O3 -pthread
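The means and standard errors (SE) listed for each test can be compared with simple arithmetic. The following is a minimal sketch, not part of the Phoronix Test Suite: the helper names `percent_slower` and `se_gap` are hypothetical, and the inputs are the fastest and slowest Timed FFmpeg Compilation means listed earlier (30.92 s with SE 0.15, and 46.08 s with SE 0.18). Treating the two results as independent is an assumption.

```python
import math


def percent_slower(fastest: float, other: float) -> float:
    """Percent by which `other` exceeds `fastest` (fewer-is-better metrics)."""
    return (other - fastest) / fastest * 100.0


def se_gap(mean_a: float, se_a: float, mean_b: float, se_b: float) -> float:
    """Absolute difference of two means, in units of their combined SE.

    Assumes the two results are independent, so the standard errors
    combine as sqrt(se_a**2 + se_b**2).
    """
    return abs(mean_a - mean_b) / math.hypot(se_a, se_b)


# Timed FFmpeg Compilation: fastest and slowest reported means (seconds).
slowdown = percent_slower(30.92, 46.08)
gap = se_gap(30.92, 0.15, 46.08, 0.18)
print(f"{slowdown:.1f}% slower, {gap:.1f} combined standard errors apart")
```

A gap of only one or two combined standard errors, as in many of the results above, suggests the runs are not clearly distinguishable; a much larger gap suggests a real difference between them.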
Phoronix Test Suite v10.8.5