Ryzen 7 1800X Ubuntu 2020

AMD Ryzen 7 1800X Eight-Core testing with an MSI X370 XPOWER GAMING TITANIUM (MS-7A31) v1.0 (1.F0 BIOS) and AMD Radeon RX 460/560D / Pro 450/455/460/555/555X/560/560X 2GB on Ubuntu 20.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2012221-HA-RYZEN718094&grr .
Ryzen 7 1800X Ubuntu 2020: system details (configurations 1, 2, 3, 4)
Processor: AMD Ryzen 7 1800X Eight-Core @ 3.60GHz (8 Cores / 16 Threads)
Motherboard: MSI X370 XPOWER GAMING TITANIUM (MS-7A31) v1.0 (1.F0 BIOS)
Chipset: AMD 17h
Memory: 8GB
Disk: Samsung SSD 950 PRO 256GB
Graphics: AMD Radeon RX 460/560D / Pro 450/455/460/555/555X/560/560X 2GB (1212/1750MHz)
Audio: AMD Baffin HDMI/DP
Monitor: LG Ultra HD
Network: Intel I211
OS: Ubuntu 20.10
Kernel: 5.8.0-21-generic (x86_64)
Desktop: GNOME Shell 3.38.0
Display Server: X Server 1.20.8
Display Driver: modesetting 1.20.8
OpenGL: 4.6 Mesa 20.2.0 (LLVM 11.0.0)
Vulkan: 1.2.131
Compiler: GCC 10.2.0
File-System: ext4
Screen Resolution: 3840x2160
Changed in a later configuration: Desktop: GNOME Shell 3.38.1; OpenGL: 4.6 Mesa 20.2.1 (LLVM 11.0.0)

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled)
- CPU Microcode: 0x8001137

Java Details
- OpenJDK Runtime Environment (build 11.0.9.1+1-Ubuntu-0ubuntu1.20.10)

Python Details
- Python 3.8.6

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected
Ryzen 7 1800X Ubuntu 2020: result overview
Test | 1 | 2 | 3 | 4
basis: UASTC Level 2 + RDO Post-Processing | 749.561 | 749.116 | 749.465 | 750.052
hpcc: G-HPL | 64.57053 | 65.50565 | 66.14843 | 67.87867
build-clash: Time To Compile | 311.809 | 310.718 | 310.671 | 312.511
gromacs: Water Benchmark | 0.597 | 0.599 | 0.597 | 0.597
numpy: | 275.19 | 278.37 | 274.84 | 272.72
brl-cad: VGR Performance Metric | 94326 | 92840 | 92276 | 93406
astcenc: Exhaustive | 265.75 | 265.51 | 270.54 | 270.67
build2: Time To Compile | 203.706 | 203.427 | 203.144 | 203.498
asmfish: 1024 Hash Memory, 26 Depth | 22191924 | 22669744 | 22499028 | 22355945
clomp: Static OMP Speedup | 8.2 | 7.7 | 7.9 | 8.1
kvazaar: Bosphorus 4K - Slow | 3.65 | 3.66 | 3.66 | 3.63
kvazaar: Bosphorus 4K - Medium | 3.72 | 3.72 | 3.74 | 3.69
hmmer: Pfam Database Search | 129.777 | 129.686 | 129.866 | 131.038
simdjson: LargeRand | 0.38 | 0.38 | 0.38 | 0.37
simdjson: PartialTweets | 0.50 | 0.50 | 0.5 | 0.48
embree: Pathtracer ISPC - Crown | 7.5511 | 7.7130 | 7.6994 | 7.6548
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 7668.24 | 7632.59 | 7632.29 | 7629.66
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 7691.66 | 7622.54 | 7602.27 | 7637.10
onednn: Recurrent Neural Network Training - f32 - CPU | 7647.51 | 7614.36 | 7601.15 | 7615.86
build-eigen: Time To Compile | 100.165 | 99.913 | 99.638 | 99.869
ncnn: CPU - regnety_400m | 28.24 | 27.10 | 27.46 | 27.08
ncnn: CPU - squeezenet_ssd | 31.39 | 31.01 | 30.91 | 31.29
ncnn: CPU - yolov4-tiny | 40.78 | 40.33 | 40.99 | 40.94
ncnn: CPU - resnet50 | 49.40 | 48.68 | 49.47 | 50.43
ncnn: CPU - alexnet | 15.20 | 15.05 | 15.29 | 15.16
ncnn: CPU - resnet18 | 21.11 | 21.31 | 21.33 | 21.09
ncnn: CPU - vgg16 | 73.59 | 73.31 | 74.36 | 74.16
ncnn: CPU - googlenet | 26.13 | 25.55 | 26.04 | 26.35
ncnn: CPU - blazeface | 3.06 | 3.06 | 3.21 | 3.25
ncnn: CPU - efficientnet-b0 | 12.64 | 12.46 | 13.38 | 12.41
ncnn: CPU - mnasnet | 8.80 | 8.69 | 9.05 | 8.36
ncnn: CPU - shufflenet-v2 | 9.27 | 9.06 | 9.36 | 9.22
ncnn: CPU-v3-v3 - mobilenet-v3 | 8.68 | 8.30 | 8.45 | 8.19
ncnn: CPU-v2-v2 - mobilenet-v2 | 9.69 | 9.47 | 9.74 | 9.49
ncnn: CPU - mobilenet | 32.98 | 33.18 | 32.41 | 32.46
node-web-tooling: | 8.65 | 8.62 | 8.59 | 8.58
embree: Pathtracer ISPC - Asian Dragon Obj | 8.0929 | 8.1077 | 8.1225 | 8.1055
compress-lz4: 9 - Decompression Speed | 8632.6 | 8609.2 | 8540.5 | 8581.1
compress-lz4: 9 - Compression Speed | 41.76 | 42.74 | 41.46 | 39.95
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 13.3026 | 12.8109 | 12.8327 | 11.7175
embree: Pathtracer - Asian Dragon Obj | 8.7299 | 8.7041 | 8.6989 | 8.7484
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 4159.90 | 4128.18 | 4033.64 | 4036.80
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 4131.16 | 4123.22 | 4041.05 | 4050.96
onednn: Recurrent Neural Network Inference - f32 - CPU | 4140.23 | 4109.42 | 4019.42 | 4053.79
compress-lz4: 3 - Decompression Speed | 8678.0 | 8644.6 | 8604.3 | 8527.7
compress-lz4: 3 - Compression Speed | 43.21 | 42.62 | 42.42 | 42.91
build-ffmpeg: Time To Compile | 73.965 | 73.725 | 73.803 | 73.947
embree: Pathtracer - Crown | 8.2428 | 8.3572 | 8.3161 | 8.3815
x265: Bosphorus 4K | 8.32 | 8.35 | 8.20 | 8.41
sqlite-speedtest: Timed Time - Size 1,000 | 72.358 | 71.637 | 71.529 | 71.692
basis: UASTC Level 3 | 71.202 | 71.124 | 71.485 | 71.591
embree: Pathtracer ISPC - Asian Dragon | 8.8661 | 8.9993 | 9.0116 | 8.8984
rav1e: 1 | 0.294 | 0.296 | 0.296 | 0.296
rav1e: 5 | 0.886 | 0.887 | 0.890 | 0.887
stockfish: Total Time | 15806704 | 15878386 | 16120996 | 15865176
embree: Pathtracer - Asian Dragon | 9.3790 | 9.3104 | 9.4291 | 9.4739
indigobench: CPU - Bedroom | 1.542 | 1.544 | 1.541 | 1.540
indigobench: CPU - Supercar | 3.271 | 3.251 | 3.267 | 3.259
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 5.42581 | 5.49371 | 5.47616 | 5.46446
kvazaar: Bosphorus 4K - Very Fast | 10.25 | 10.24 | 10.20 | 10.17
kvazaar: Bosphorus 4K - Ultra Fast | 18.17 | 18.21 | 18.26 | 17.75
sunflow: Global Illumination + Image Synthesis | 1.483 | 1.468 | 1.489 | 1.504
simdjson: DistinctUserID | 0.51 | 0.51 | 0.51 | 0.51
basis: ETC1S | 55.890 | 56.226 | 55.800 | 56.245
espeak: Text-To-Speech Synthesis | 32.944 | 32.917 | 33.226 | 33.113
rav1e: 6 | 1.189 | 1.192 | 1.193 | 1.188
simdjson: Kostya | 0.42 | 0.42 | 0.41 | 0.41
redis: GET | 2249907.25 | 2005461.67 | 1958857.87 | 1987133.40
onednn: IP Shapes 1D - f32 - CPU | 10.71635 | 9.14346 | 7.90181 | 8.13194
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 12.3645 | 10.38859 | 9.63577 | 9.79079
basis: UASTC Level 2 | 37.904 | 37.911 | 38.053 | 38.136
astcenc: Thorough | 32.29 | 32.32 | 32.82 | 32.82
kvazaar: Bosphorus 1080p - Slow | 16.12 | 16.16 | 16.18 | 16.04
phpbench: PHP Benchmark Suite | 546954 | 551754 | 546637 | 544820
kvazaar: Bosphorus 1080p - Medium | 16.49 | 16.53 | 16.55 | 16.41
compress-lz4: 1 - Decompression Speed | 9086.1 | 8892.2 | 8941.1 | 8886.7
compress-lz4: 1 - Compression Speed | 8097.82 | 8113.28 | 8087.54 | 8150.13
rav1e: 10 | 2.742 | 2.740 | 2.748 | 2.724
crafty: Elapsed Time | 6427875 | 6677773 | 6815751 | 6942505
redis: LPOP | 2375150.83 | 2023932.30 | 1253261.46 | 1981072.63
encode-ape: WAV To APE | 15.048 | 15.023 | 14.960 | 15.019
coremark: CoreMark Size 666 - Iterations Per Second | 310974.493565 | 311216.666941 | 311939.821068 | 278575.832132
encode-wavpack: WAV To WavPack | 14.073 | 14.086 | 14.045 | 13.996
astcenc: Medium | 9.82 | 9.88 | 9.05 | 9.94
x265: Bosphorus 1080p | 36.04 | 36.59 | 36.92 | 36.18
kvazaar: Bosphorus 1080p - Very Fast | 37.80 | 37.98 | 38.01 | 37.63
onednn: IP Shapes 1D - u8s8f32 - CPU | 6.17482 | 5.99918 | 5.94456 | 5.97066
encode-opus: WAV To Opus Encode | 8.280 | 8.274 | 8.271 | 8.430
lammps: Rhodopsin Protein | 5.082 | 5.035 | 5.181 | 4.983
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 3.95326 | 4.01853 | 4.76518 | 4.72633
redis: LPUSH | 1219147.04 | 1235368.71 | 1263884.50 | 1273184.81
redis: SET | 1530872.96 | 1506982.37 | 1529187.75 | 1515400.67
redis: SADD | 1809735.13 | 1742844.0 | 1799244.42 | 1795470.25
astcenc: Fast | 6.68 | 6.65 | 6.86 | 6.82
onednn: IP Shapes 3D - u8s8f32 - CPU | 3.82872 | 3.21759 | 2.49334 | 2.49930
onednn: IP Shapes 3D - f32 - CPU | 11.1318 | 10.9921 | 10.1379 | 9.80423
basis: UASTC Level 0 | 8.938 | 8.975 | 8.999 | 8.979
kvazaar: Bosphorus 1080p - Ultra Fast | 68.41 | 68.78 | 68.99 | 68.49
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 20.2789 | 20.7838 | 21.7239 | 21.5680
onednn: Convolution Batch Shapes Auto - f32 - CPU | 19.8740 | 20.1134 | 21.1536 | 20.9694
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 11.5219 | 11.6473 | 11.5394 | 11.4645
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 16.7372 | 15.5972 | 14.7606 | 14.7377
hpcc: Max Ping Pong Bandwidth | 11544.769 | 11544.788 | 11926.906 | 11599.971
hpcc: Rand Ring Bandwidth | 1.45770 | 1.46282 | 1.46626 | 1.46081
hpcc: Rand Ring Latency | 0.56116 | 0.56775 | 0.55880 | 0.56703
hpcc: G-Rand Access | 0.02620 | 0.02615 | 0.02623 | 0.02621
hpcc: EP-STREAM Triad | 3.43263 | 3.47112 | 3.46886 | 3.46879
hpcc: G-Ptrans | 2.48824 | 2.83742 | 2.84596 | 2.77808
hpcc: EP-DGEMM | 19.76267 | 20.09887 | 20.31133 | 20.32227
hpcc: G-Ffte | 4.44974 | 4.46749 | 4.46806 | 4.41886

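Most tests in the overview differ by well under one percent across the four configurations, while a few (for example redis: GET and onednn: IP Shapes 1D - f32) move much more. As a rough illustration, the sketch below computes the spread across configurations for a handful of rows copied from the overview. The function name `spread_percent` and the choice of (max - min) / mean as the spread measure are assumptions made for this example, not something the Phoronix Test Suite reports.

```python
# Values copied from the result overview (configurations 1-4).
# The units in parentheses come from the per-test sections.
results = {
    "hpcc: G-HPL (GFLOPS)": [64.57053, 65.50565, 66.14843, 67.87867],
    "redis: GET (requests/s)": [2249907.25, 2005461.67, 1958857.87, 1987133.40],
    "onednn: IP Shapes 1D - f32 - CPU (ms)": [10.71635, 9.14346, 7.90181, 8.13194],
    "build2: Time To Compile (s)": [203.706, 203.427, 203.144, 203.498],
}


def spread_percent(values):
    """Return (max - min) / mean * 100 for one test across configurations.

    Hypothetical helper: this is one simple way to express run-to-run
    variation, chosen for illustration only.
    """
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean * 100.0


for name, values in results.items():
    print(f"{name}: {spread_percent(values):.2f}% spread")
```

A spread figure like this ignores the direction of each metric (more or fewer is better) and the reported standard errors, so it is only a first filter for rows worth a closer look.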
Basis Universal 1.12
Settings: UASTC Level 2 + RDO Post-Processing
Seconds, Fewer Is Better
1: 749.56 (SE +/- 0.61, N = 3)
2: 749.12 (SE +/- 0.30, N = 3)
3: 749.47 (SE +/- 0.52, N = 3)
4: 750.05 (SE +/- 0.62, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

HPC Challenge 1.5.0
Test / Class: G-HPL
GFLOPS, More Is Better
1: 64.57 (SE +/- 0.51, N = 3)
2: 65.51 (SE +/- 0.93, N = 4)
3: 66.15 (SE +/- 0.03, N = 3)
4: 67.88 (SE +/- 0.39, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 4.0.3

Timed Clash Compilation
Time To Compile
Seconds, Fewer Is Better
1: 311.81 (SE +/- 0.57, N = 3)
2: 310.72 (SE +/- 0.85, N = 3)
3: 310.67 (SE +/- 0.68, N = 3)
4: 312.51 (SE +/- 0.75, N = 3)

GROMACS 2020.3
Water Benchmark
Ns Per Day, More Is Better
1: 0.597 (SE +/- 0.000, N = 3)
2: 0.599 (SE +/- 0.002, N = 3)
3: 0.597 (SE +/- 0.001, N = 3)
4: 0.597 (SE +/- 0.001, N = 3)
1. (CXX) g++ options: -O3 -pthread -lrt -lpthread -lm

Numpy Benchmark
Score, More Is Better
1: 275.19 (SE +/- 1.13, N = 3)
2: 278.37 (SE +/- 0.56, N = 3)
3: 274.84 (SE +/- 0.53, N = 3)
4: 272.72 (SE +/- 0.70, N = 3)

BRL-CAD 7.30.8
VGR Performance Metric
VGR Performance Metric, More Is Better
1: 94326
2: 92840
3: 92276
4: 93406
1. (CXX) g++ options: -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -rdynamic -lSM -lICE -lGLU -lGL -lGLdispatch -lX11 -lXext -lXrender -lpthread -ldl -luuid -lm

ASTC Encoder 2.0
Preset: Exhaustive
Seconds, Fewer Is Better
1: 265.75 (SE +/- 0.31, N = 3)
2: 265.51 (SE +/- 0.04, N = 3)
3: 270.54 (SE +/- 0.07, N = 3)
4: 270.67 (SE +/- 0.70, N = 3)
1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread

Build2 0.13
Time To Compile
Seconds, Fewer Is Better
1: 203.71 (SE +/- 0.68, N = 3)
2: 203.43 (SE +/- 1.04, N = 3)
3: 203.14 (SE +/- 1.32, N = 3)
4: 203.50 (SE +/- 0.38, N = 3)

asmFish 2018-07-23
1024 Hash Memory, 26 Depth
Nodes/second, More Is Better
1: 22191924 (SE +/- 126857.99, N = 3)
2: 22669744 (SE +/- 181612.11, N = 3)
3: 22499028 (SE +/- 139738.13, N = 3)
4: 22355945 (SE +/- 107724.99, N = 3)

CLOMP 1.2
Static OMP Speedup
Speedup, More Is Better
1: 8.2 (SE +/- 0.12, N = 3)
2: 7.7 (SE +/- 0.07, N = 15)
3: 7.9 (SE +/- 0.12, N = 3)
4: 8.1 (SE +/- 0.07, N = 10)
1. (CC) gcc options: -fopenmp -O3 -lm

Kvazaar 2.0
Video Input: Bosphorus 4K - Video Preset: Slow
Frames Per Second, More Is Better
1: 3.65 (SE +/- 0.00, N = 3)
2: 3.66 (SE +/- 0.00, N = 3)
3: 3.66 (SE +/- 0.00, N = 3)
4: 3.63 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

Kvazaar 2.0
Video Input: Bosphorus 4K - Video Preset: Medium
Frames Per Second, More Is Better
1: 3.72 (SE +/- 0.01, N = 3)
2: 3.72 (SE +/- 0.00, N = 3)
3: 3.74 (SE +/- 0.00, N = 3)
4: 3.69 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

Timed HMMer Search 3.3.1
Pfam Database Search
Seconds, Fewer Is Better
1: 129.78 (SE +/- 0.12, N = 3)
2: 129.69 (SE +/- 0.30, N = 3)
3: 129.87 (SE +/- 0.29, N = 3)
4: 131.04 (SE +/- 0.35, N = 3)
1. (CC) gcc options: -O3 -pthread -lhmmer -leasel -lm

simdjson 0.7.1
Throughput Test: LargeRandom
GB/s, More Is Better
1: 0.38 (SE +/- 0.00, N = 3)
2: 0.38 (SE +/- 0.00, N = 3)
3: 0.38 (SE +/- 0.00, N = 3)
4: 0.37 (SE +/- 0.00, N = 15)
1. (CXX) g++ options: -O3 -pthread

simdjson 0.7.1
Throughput Test: PartialTweets
GB/s, More Is Better
1: 0.50 (SE +/- 0.00, N = 3)
2: 0.50 (SE +/- 0.00, N = 3)
3: 0.50 (SE +/- 0.00, N = 3)
4: 0.48 (SE +/- 0.01, N = 15)
1. (CXX) g++ options: -O3 -pthread

Embree 3.9.0
Binary: Pathtracer ISPC - Model: Crown
Frames Per Second, More Is Better
1: 7.5511 (SE +/- 0.0837, N = 7; MIN: 6.99 / MAX: 7.77)
2: 7.7130 (SE +/- 0.0357, N = 3; MIN: 7.63 / MAX: 7.89)
3: 7.6994 (SE +/- 0.0561, N = 3; MIN: 7.43 / MAX: 7.87)
4: 7.6548 (SE +/- 0.0273, N = 3; MIN: 7.54 / MAX: 7.79)

oneDNN 2.0
Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
1: 7668.24 (SE +/- 13.57, N = 3; MIN: 7615.61)
2: 7632.59 (SE +/- 10.03, N = 3; MIN: 7582.41)
3: 7632.29 (SE +/- 5.63, N = 3; MIN: 7596.46)
4: 7629.66 (SE +/- 17.72, N = 3; MIN: 7565.32)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
1: 7691.66 (SE +/- 62.42, N = 3; MIN: 7573.71)
2: 7622.54 (SE +/- 12.97, N = 3; MIN: 7569.78)
3: 7602.27 (SE +/- 5.57, N = 3; MIN: 7559.13)
4: 7637.10 (SE +/- 3.89, N = 3; MIN: 7599.54)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
1: 7647.51 (SE +/- 6.51, N = 3; MIN: 7608.91)
2: 7614.36 (SE +/- 18.50, N = 3; MIN: 7557.67)
3: 7601.15 (SE +/- 16.63, N = 3; MIN: 7549.66)
4: 7615.86 (SE +/- 20.02, N = 3; MIN: 7564.68)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

Timed Eigen Compilation 3.3.9
Time To Compile
Seconds, Fewer Is Better
1: 100.17 (SE +/- 0.17, N = 3)
2: 99.91 (SE +/- 0.02, N = 3)
3: 99.64 (SE +/- 0.29, N = 3)
4: 99.87 (SE +/- 0.14, N = 3)

NCNN 20201218
Target: CPU - Model: regnety_400m
ms, Fewer Is Better
1: 28.24 (SE +/- 1.01, N = 3; MIN: 25.09 / MAX: 90.56)
2: 27.10 (SE +/- 0.33, N = 3; MIN: 25.08 / MAX: 69.15)
3: 27.46 (SE +/- 0.17, N = 3; MIN: 25.03 / MAX: 86.63)
4: 27.08 (SE +/- 0.16, N = 3; MIN: 25.17 / MAX: 62.23)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: squeezenet_ssd
ms, Fewer Is Better
1: 31.39 (SE +/- 0.55, N = 3; MIN: 27.72 / MAX: 93.62)
2: 31.01 (SE +/- 0.20, N = 3; MIN: 27.42 / MAX: 92.87)
3: 30.91 (SE +/- 0.30, N = 3; MIN: 27.74 / MAX: 74.48)
4: 31.29 (SE +/- 0.28, N = 3; MIN: 27.91 / MAX: 91.73)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: yolov4-tiny
ms, Fewer Is Better
1: 40.78 (SE +/- 0.14, N = 3; MIN: 34.36 / MAX: 65.77)
2: 40.33 (SE +/- 0.26, N = 3; MIN: 33.98 / MAX: 69.99)
3: 40.99 (SE +/- 0.12, N = 3; MIN: 34.83 / MAX: 65.41)
4: 40.94 (SE +/- 0.08, N = 3; MIN: 34.04 / MAX: 100.31)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: resnet50
ms, Fewer Is Better
1: 49.40 (SE +/- 0.49, N = 3; MIN: 41.46 / MAX: 109.15)
2: 48.68 (SE +/- 0.27, N = 3; MIN: 40.99 / MAX: 104.67)
3: 49.47 (SE +/- 0.19, N = 3; MIN: 41.96 / MAX: 101.85)
4: 50.43 (SE +/- 0.27, N = 3; MIN: 42.18 / MAX: 103.04)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: alexnet
ms, Fewer Is Better
1: 15.20 (SE +/- 0.05, N = 3; MIN: 14.19 / MAX: 53.26)
2: 15.05 (SE +/- 0.08, N = 3; MIN: 14.16 / MAX: 39.79)
3: 15.29 (SE +/- 0.04, N = 3; MIN: 14.23 / MAX: 41.93)
4: 15.16 (SE +/- 0.08, N = 3; MIN: 14.26 / MAX: 42.34)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: resnet18
ms, Fewer Is Better
1: 21.11 (SE +/- 0.12, N = 3; MIN: 17.27 / MAX: 53.65)
2: 21.31 (SE +/- 0.27, N = 3; MIN: 17.11 / MAX: 64)
3: 21.33 (SE +/- 0.14, N = 3; MIN: 17.27 / MAX: 74.94)
4: 21.09 (SE +/- 0.10, N = 3; MIN: 17.35 / MAX: 50.27)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: vgg16
ms, Fewer Is Better
1: 73.59 (SE +/- 0.24, N = 3; MIN: 68.98 / MAX: 102.41)
2: 73.31 (SE +/- 0.06, N = 3; MIN: 69.22 / MAX: 103.56)
3: 74.36 (SE +/- 0.31, N = 3; MIN: 69.28 / MAX: 98.86)
4: 74.16 (SE +/- 0.33, N = 3; MIN: 68.87 / MAX: 113.5)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: googlenet
ms, Fewer Is Better
1: 26.13 (SE +/- 0.13, N = 3; MIN: 20.41 / MAX: 71.52)
2: 25.55 (SE +/- 0.14, N = 3; MIN: 20.25 / MAX: 56.58)
3: 26.04 (SE +/- 0.08, N = 3; MIN: 20.27 / MAX: 72.59)
4: 26.35 (SE +/- 0.48, N = 3; MIN: 20.56 / MAX: 67.76)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: blazeface
ms, Fewer Is Better
1: 3.06 (SE +/- 0.01, N = 3; MIN: 2.77 / MAX: 13.58)
2: 3.06 (SE +/- 0.03, N = 3; MIN: 2.78 / MAX: 9.98)
3: 3.21 (SE +/- 0.09, N = 3; MIN: 2.79 / MAX: 31.43)
4: 3.25 (SE +/- 0.11, N = 3; MIN: 2.84 / MAX: 31.38)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: efficientnet-b0
ms, Fewer Is Better
1: 12.64 (SE +/- 0.40, N = 3; MIN: 11.17 / MAX: 59.1)
2: 12.46 (SE +/- 0.29, N = 3; MIN: 11.22 / MAX: 52.06)
3: 13.38 (SE +/- 0.40, N = 3; MIN: 11.27 / MAX: 55.65)
4: 12.41 (SE +/- 0.16, N = 3; MIN: 11.19 / MAX: 47.97)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: mnasnet
ms, Fewer Is Better
1: 8.80 (SE +/- 0.19, N = 3; MIN: 7.59 / MAX: 67.87)
2: 8.69 (SE +/- 0.22, N = 2; MIN: 7.71 / MAX: 44.91)
3: 9.05 (SE +/- 0.05, N = 3; MIN: 7.58 / MAX: 97.5)
4: 8.36 (SE +/- 0.07, N = 2; MIN: 7.74 / MAX: 26.07)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: shufflenet-v2
ms, Fewer Is Better
1: 9.27 (SE +/- 0.36, N = 3; MIN: 8.28 / MAX: 67.75)
2: 9.06 (SE +/- 0.07, N = 3; MIN: 8.42 / MAX: 39.79)
3: 9.36 (SE +/- 0.19, N = 3; MIN: 8.57 / MAX: 69.21)
4: 9.22 (SE +/- 0.02, N = 3; MIN: 8.63 / MAX: 31.05)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU-v3-v3 - Model: mobilenet-v3
ms, Fewer Is Better
1: 8.68 (SE +/- 0.41, N = 3; MIN: 7.2 / MAX: 61.54)
2: 8.30 (SE +/- 0.29, N = 3; MIN: 7.15 / MAX: 68.15)
3: 8.45 (SE +/- 0.28, N = 3; MIN: 7.19 / MAX: 61.48)
4: 8.19 (SE +/- 0.09, N = 3; MIN: 7.06 / MAX: 34.19)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU-v2-v2 - Model: mobilenet-v2
ms, Fewer Is Better
1: 9.69 (SE +/- 0.26, N = 3; MIN: 8.2 / MAX: 69.64)
2: 9.47 (SE +/- 0.17, N = 3; MIN: 8.22 / MAX: 59.67)
3: 9.74 (SE +/- 0.24, N = 3; MIN: 8.42 / MAX: 52.88)
4: 9.49 (SE +/- 0.26, N = 3; MIN: 8.37 / MAX: 32.97)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218
Target: CPU - Model: mobilenet
ms, Fewer Is Better
1: 32.98 (SE +/- 0.19, N = 3; MIN: 28.26 / MAX: 78.43)
2: 33.18 (SE +/- 0.39, N = 3; MIN: 28.2 / MAX: 91.27)
3: 32.41 (SE +/- 0.33, N = 3; MIN: 28.41 / MAX: 63.13)
4: 32.46 (SE +/- 0.19, N = 3; MIN: 28.48 / MAX: 81)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Node.js V8 Web Tooling Benchmark
runs/s, More Is Better
1: 8.65 (SE +/- 0.10, N = 3)
2: 8.62 (SE +/- 0.05, N = 3)
3: 8.59 (SE +/- 0.01, N = 3)
4: 8.58 (SE +/- 0.03, N = 3)
1. Nodejs v12.18.2

Embree 3.9.0
Binary: Pathtracer ISPC - Model: Asian Dragon Obj
Frames Per Second, More Is Better
1: 8.0929 (SE +/- 0.0098, N = 3; MIN: 8.04 / MAX: 8.18)
2: 8.1077 (SE +/- 0.0066, N = 3; MIN: 8.07 / MAX: 8.19)
3: 8.1225 (SE +/- 0.0169, N = 3; MIN: 8.06 / MAX: 8.22)
4: 8.1055 (SE +/- 0.0021, N = 3; MIN: 8.07 / MAX: 8.21)

LZ4 Compression 1.9.3
Compression Level: 9 - Decompression Speed
MB/s, More Is Better
1: 8632.6 (SE +/- 35.17, N = 3)
2: 8609.2 (SE +/- 48.72, N = 3)
3: 8540.5 (SE +/- 3.95, N = 3)
4: 8581.1 (SE +/- 8.05, N = 5)
1. (CC) gcc options: -O3

LZ4 Compression 1.9.3
Compression Level: 9 - Compression Speed
MB/s, More Is Better
1: 41.76 (SE +/- 0.18, N = 3)
2: 42.74 (SE +/- 0.59, N = 3)
3: 41.46 (SE +/- 0.68, N = 3)
4: 39.95 (SE +/- 0.48, N = 5)
1. (CC) gcc options: -O3

oneDNN 2.0
Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
1: 13.30 (SE +/- 0.30, N = 15; MIN: 11.83)
2: 12.81 (SE +/- 0.35, N = 15; MIN: 11.43)
3: 12.83 (SE +/- 0.33, N = 15; MIN: 11.13)
4: 11.72 (SE +/- 0.03, N = 3; MIN: 11.31)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

Embree 3.9.0
Binary: Pathtracer - Model: Asian Dragon Obj
Frames Per Second, More Is Better
1: 8.7299 (SE +/- 0.0257, N = 3; MIN: 8.66 / MAX: 8.87)
2: 8.7041 (SE +/- 0.0331, N = 3; MIN: 8.61 / MAX: 8.85)
3: 8.6989 (SE +/- 0.0181, N = 3; MIN: 8.63 / MAX: 8.81)
4: 8.7484 (SE +/- 0.0146, N = 3; MIN: 8.69 / MAX: 8.87)

oneDNN 2.0
Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
1: 4159.90 (SE +/- 18.33, N = 3; MIN: 4097.03)
2: 4128.18 (SE +/- 5.21, N = 3; MIN: 4094.43)
3: 4033.64 (SE +/- 9.61, N = 3; MIN: 3993.31)
4: 4036.80 (SE +/- 2.27, N = 3; MIN: 4003.09)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
1: 4131.16 (SE +/- 7.69, N = 3; MIN: 4094.86)
2: 4123.22 (SE +/- 4.74, N = 3; MIN: 4081.81)
3: 4041.05 (SE +/- 4.50, N = 3; MIN: 4008.21)
4: 4050.96 (SE +/- 8.55, N = 3; MIN: 4004.81)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0
Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
1: 4140.23 (SE +/- 6.45, N = 3; MIN: 4108.52)
2: 4109.42 (SE +/- 11.44, N = 3; MIN: 4057.07)
3: 4019.42 (SE +/- 10.88, N = 3; MIN: 3984.28)
4: 4053.79 (SE +/- 12.86, N = 3; MIN: 4006.16)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

LZ4 Compression 1.9.3
Compression Level: 3 - Decompression Speed
MB/s, More Is Better
1: 8678.0 (SE +/- 4.85, N = 3)
2: 8644.6 (SE +/- 26.35, N = 3)
3: 8604.3 (SE +/- 42.93, N = 3)
4: 8527.7 (SE +/- 3.80, N = 4)
1. (CC) gcc options: -O3

LZ4 Compression 1.9.3
Compression Level: 3 - Compression Speed
MB/s, More Is Better
1: 43.21 (SE +/- 0.64, N = 3)
2: 42.62 (SE +/- 0.12, N = 3)
3: 42.42 (SE +/- 0.18, N = 3)
4: 42.91 (SE +/- 0.60, N = 4)
1. (CC) gcc options: -O3

Timed FFmpeg Compilation 4.2.2
Time To Compile
Seconds, Fewer Is Better
1: 73.97 (SE +/- 0.06, N = 3)
2: 73.73 (SE +/- 0.07, N = 3)
3: 73.80 (SE +/- 0.05, N = 3)
4: 73.95 (SE +/- 0.11, N = 3)

Embree 3.9.0
Binary: Pathtracer - Model: Crown
Frames Per Second, More Is Better
1: 8.2428 (SE +/- 0.0132, N = 3; MIN: 8.17 / MAX: 8.38)
2: 8.3572 (SE +/- 0.0330, N = 3; MIN: 8.25 / MAX: 8.51)
3: 8.3161 (SE +/- 0.0343, N = 3; MIN: 8.2 / MAX: 8.47)
4: 8.3815 (SE +/- 0.0144, N = 3; MIN: 8.31 / MAX: 8.52)

x265 3.4
Video Input: Bosphorus 4K
Frames Per Second, More Is Better
1: 8.32 (SE +/- 0.05, N = 3)
2: 8.35 (SE +/- 0.04, N = 3)
3: 8.20 (SE +/- 0.08, N = 3)
4: 8.41 (SE +/- 0.06, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

SQLite Speedtest 3.30
Timed Time - Size 1,000
Seconds, Fewer Is Better
1: 72.36 (SE +/- 0.03, N = 3)
2: 71.64 (SE +/- 0.31, N = 3)
3: 71.53 (SE +/- 0.35, N = 3)
4: 71.69 (SE +/- 0.14, N = 3)
1. (CC) gcc options: -O2 -ldl -lz -lpthread

Basis Universal 1.12
Settings: UASTC Level 3
Seconds, Fewer Is Better
1: 71.20 (SE +/- 0.02, N = 3)
2: 71.12 (SE +/- 0.04, N = 3)
3: 71.49 (SE +/- 0.02, N = 3)
4: 71.59 (SE +/- 0.07, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Embree 3.9.0
Binary: Pathtracer ISPC - Model: Asian Dragon
Frames Per Second, More Is Better
1: 8.8661 (SE +/- 0.0541, N = 3; MIN: 8.75 / MAX: 9.05)
2: 8.9993 (SE +/- 0.0651, N = 3; MIN: 8.87 / MAX: 9.22)
3: 9.0116 (SE +/- 0.0605, N = 3; MIN: 8.87 / MAX: 9.2)
4: 8.8984 (SE +/- 0.0112, N = 3; MIN: 8.84 / MAX: 9.01)

rav1e 0.4 Alpha
Speed: 1
Frames Per Second, More Is Better
1: 0.294 (SE +/- 0.000, N = 3)
2: 0.296 (SE +/- 0.000, N = 3)
3: 0.296 (SE +/- 0.001, N = 3)
4: 0.296 (SE +/- 0.001, N = 3)

rav1e 0.4 Alpha
Speed: 5
Frames Per Second, More Is Better
1: 0.886 (SE +/- 0.001, N = 3)
2: 0.887 (SE +/- 0.002, N = 3)
3: 0.890 (SE +/- 0.002, N = 3)
4: 0.887 (SE +/- 0.001, N = 3)

Stockfish 12
Total Time
Nodes Per Second, More Is Better
1: 15806704 (SE +/- 155452.58, N = 3)
2: 15878386 (SE +/- 45505.69, N = 3)
3: 16120996 (SE +/- 133376.42, N = 3)
4: 15865176 (SE +/- 251354.29, N = 3)
1. (CXX) g++ options: -m64 -lpthread -fno-exceptions -std=c++17 -pedantic -O3 -msse -msse3 -mpopcnt -msse4.1 -mssse3 -msse2 -flto -flto=jobserver

Embree 3.9.0
Binary: Pathtracer - Model: Asian Dragon
Frames Per Second, More Is Better
1: 9.3790 (SE +/- 0.1023, N = 3; MIN: 9.18 / MAX: 9.67)
2: 9.3104 (SE +/- 0.0676, N = 3; MIN: 9.15 / MAX: 9.54)
3: 9.4291 (SE +/- 0.0053, N = 3; MIN: 9.36 / MAX: 9.56)
4: 9.4739 (SE +/- 0.0870, N = 3; MIN: 9.3 / MAX: 9.76)

IndigoBench 4.4
Acceleration: CPU - Scene: Bedroom
M samples/s, More Is Better
1: 1.542 (SE +/- 0.006, N = 3)
2: 1.544 (SE +/- 0.001, N = 3)
3: 1.541 (SE +/- 0.003, N = 3)
4: 1.540 (SE +/- 0.002, N = 3)

IndigoBench 4.4
Acceleration: CPU - Scene: Supercar
M samples/s, More Is Better
1: 3.271 (SE +/- 0.006, N = 3)
2: 3.251 (SE +/- 0.026, N = 3)
3: 3.267 (SE +/- 0.005, N = 3)
4: 3.259 (SE +/- 0.020, N = 3)

oneDNN 2.0
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
1: 5.42581 (SE +/- 0.06192, N = 15; MIN: 4.84)
2: 5.49371 (SE +/- 0.05422, N = 15; MIN: 4.8)
3: 5.47616 (SE +/- 0.06592, N = 15; MIN: 4.66)
4: 5.46446 (SE +/- 0.18172, N = 15; MIN: 4.66)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

Kvazaar 2.0
Video Input: Bosphorus 4K - Video Preset: Very Fast
Frames Per Second, More Is Better
1: 10.25 (SE +/- 0.02, N = 3)
2: 10.24 (SE +/- 0.01, N = 3)
3: 10.20 (SE +/- 0.02, N = 3)
4: 10.17 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

Kvazaar 2.0
Video Input: Bosphorus 4K - Video Preset: Ultra Fast
Frames Per Second, More Is Better
1: 18.17 (SE +/- 0.02, N = 3)
2: 18.21 (SE +/- 0.05, N = 3)
3: 18.26 (SE +/- 0.04, N = 3)
4: 17.75 (SE +/- 0.32, N = 12)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

Sunflow Rendering System 0.07.2
Global Illumination + Image Synthesis
Seconds, Fewer Is Better
1: 1.483 (SE +/- 0.025, N = 12; MIN: 1.2 / MAX: 2.51)
2: 1.468 (SE +/- 0.015, N = 3; MIN: 1.3 / MAX: 2.03)
3: 1.489 (SE +/- 0.021, N = 15; MIN: 1.19 / MAX: 2.55)
4: 1.504 (SE +/- 0.019, N = 15; MIN: 1.25 / MAX: 2.55)

simdjson 0.7.1
Throughput Test: DistinctUserID
GB/s, More Is Better
1: 0.51 (SE +/- 0.00, N = 3)
2: 0.51 (SE +/- 0.00, N = 3)
3: 0.51 (SE +/- 0.00, N = 3)
4: 0.51 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -pthread

Basis Universal 1.12
Settings: ETC1S
Seconds, Fewer Is Better
1: 55.89 (SE +/- 0.31, N = 3)
2: 56.23 (SE +/- 0.28, N = 3)
3: 55.80 (SE +/- 0.21, N = 3)
4: 56.25 (SE +/- 0.39, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

eSpeak-NG Speech Engine 20200907 - Text-To-Speech Synthesis (Seconds, Fewer Is Better). 1: 32.94 (SE +/- 0.21, N = 4); 2: 32.92 (SE +/- 0.07, N = 4); 3: 33.23 (SE +/- 0.36, N = 7); 4: 33.11 (SE +/- 0.10, N = 4). (CC) gcc options: -O2 -std=c99
rav1e 0.4 Alpha - Speed: 6 (Frames Per Second, More Is Better). 1: 1.189 (SE +/- 0.000, N = 3); 2: 1.192 (SE +/- 0.000, N = 3); 3: 1.193 (SE +/- 0.002, N = 3); 4: 1.188 (SE +/- 0.001, N = 3).
simdjson 0.7.1 - Throughput Test: Kostya (GB/s, More Is Better). 1: 0.42 (SE +/- 0.00, N = 3); 2: 0.42 (SE +/- 0.00, N = 3); 3: 0.41 (SE +/- 0.00, N = 3); 4: 0.41 (SE +/- 0.00, N = 3). (CXX) g++ options: -O3 -pthread
Redis 6.0.9 - Test: GET (Requests Per Second, More Is Better). 1: 2249907.25 (SE +/- 28538.84, N = 3); 2: 2005461.67 (SE +/- 22781.06, N = 15); 3: 1958857.87 (SE +/- 17321.00, N = 15); 4: 1987133.40 (SE +/- 23864.60, N = 15). (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). 1: 10.71635 (SE +/- 0.21181, N = 15, MIN: 9.13); 2: 9.14346 (SE +/- 0.18072, N = 12, MIN: 7.91); 3: 7.90181 (SE +/- 0.01281, N = 3, MIN: 7.35); 4: 8.13194 (SE +/- 0.04220, N = 3, MIN: 7.45). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). 1: 12.36450 (SE +/- 0.14975, N = 3, MIN: 11.22); 2: 10.38859 (SE +/- 0.11415, N = 13, MIN: 9.55); 3: 9.63577 (SE +/- 0.02059, N = 3, MIN: 9.24); 4: 9.79079 (SE +/- 0.02113, N = 3, MIN: 9.41). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Basis Universal 1.12 - Settings: UASTC Level 2 (Seconds, Fewer Is Better). 1: 37.90 (SE +/- 0.03, N = 3); 2: 37.91 (SE +/- 0.02, N = 3); 3: 38.05 (SE +/- 0.03, N = 3); 4: 38.14 (SE +/- 0.12, N = 3). (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
ASTC Encoder 2.0 - Preset: Thorough (Seconds, Fewer Is Better). 1: 32.29 (SE +/- 0.03, N = 3); 2: 32.32 (SE +/- 0.08, N = 3); 3: 32.82 (SE +/- 0.05, N = 3); 4: 32.82 (SE +/- 0.06, N = 3). (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
Kvazaar 2.0 - Video Input: Bosphorus 1080p - Video Preset: Slow (Frames Per Second, More Is Better). 1: 16.12 (SE +/- 0.01, N = 3); 2: 16.16 (SE +/- 0.01, N = 3); 3: 16.18 (SE +/- 0.03, N = 3); 4: 16.04 (SE +/- 0.02, N = 3). (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
PHPBench 0.8.1 - PHP Benchmark Suite (Score, More Is Better). 1: 546954 (SE +/- 2554.27, N = 3); 2: 551754 (SE +/- 2501.46, N = 3); 3: 546637 (SE +/- 2838.45, N = 3); 4: 544820 (SE +/- 4775.98, N = 3).
Kvazaar 2.0 - Video Input: Bosphorus 1080p - Video Preset: Medium (Frames Per Second, More Is Better). 1: 16.49 (SE +/- 0.01, N = 3); 2: 16.53 (SE +/- 0.01, N = 3); 3: 16.55 (SE +/- 0.02, N = 3); 4: 16.41 (SE +/- 0.01, N = 3). (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
LZ4 Compression 1.9.3 - Compression Level: 1 - Decompression Speed (MB/s, More Is Better). 1: 9086.1 (SE +/- 24.79, N = 3); 2: 8892.2 (SE +/- 60.99, N = 3); 3: 8941.1 (SE +/- 52.54, N = 3); 4: 8886.7 (SE +/- 98.41, N = 3). (CC) gcc options: -O3
LZ4 Compression 1.9.3 - Compression Level: 1 - Compression Speed (MB/s, More Is Better). 1: 8097.82 (SE +/- 28.48, N = 3); 2: 8113.28 (SE +/- 17.78, N = 3); 3: 8087.54 (SE +/- 32.31, N = 3); 4: 8150.13 (SE +/- 131.10, N = 3). (CC) gcc options: -O3
rav1e 0.4 Alpha - Speed: 10 (Frames Per Second, More Is Better). 1: 2.742 (SE +/- 0.004, N = 3); 2: 2.740 (SE +/- 0.005, N = 3); 3: 2.748 (SE +/- 0.001, N = 3); 4: 2.724 (SE +/- 0.008, N = 3).
Crafty 25.2 - Elapsed Time (Nodes Per Second, More Is Better). 1: 6427875 (SE +/- 5253.17, N = 3); 2: 6677773 (SE +/- 16676.43, N = 3); 3: 6815751 (SE +/- 36420.36, N = 3); 4: 6942505 (SE +/- 14408.91, N = 3). (CC) gcc options: -pthread -lstdc++ -fprofile-use -lm
Redis 6.0.9 - Test: LPOP (Requests Per Second, More Is Better). 1: 2375150.83 (SE +/- 38987.55, N = 3); 2: 2023932.30 (SE +/- 130082.76, N = 12); 3: 1253261.46 (SE +/- 13558.64, N = 3); 4: 1981072.63 (SE +/- 128187.02, N = 12). (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
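The Redis LPOP figures above vary far more between results than most tests in this file. As a rough way to judge whether such a gap is larger than the reported noise, the sketch below divides the difference between two means by their combined standard error. The helper name se_separation is hypothetical and not part of the Phoronix Test Suite; it assumes the two results are independent and that the "SE +/-" figures are standard errors of the mean. It is a quick heuristic, not a formal significance test.

```python
import math


def se_separation(mean_a, se_a, mean_b, se_b):
    """Return |mean_a - mean_b| in units of the combined standard error.

    Assumes independent results, so the standard errors add in quadrature.
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


# Redis 6.0.9 LPOP, results 1 and 3, copied from the line above.
lpop_1 = (2375150.83, 38987.55)
lpop_3 = (1253261.46, 13558.64)

ratio = se_separation(*lpop_1, *lpop_3)
print(f"Results 1 and 3 differ by {ratio:.1f} combined standard errors")
```

A ratio well above 2 or 3 suggests the gap is unlikely to be run-to-run noise alone, although with N = 3 samples the standard errors themselves are only loosely estimated.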
Monkey Audio Encoding 3.99.6 - WAV To APE (Seconds, Fewer Is Better). 1: 15.05 (SE +/- 0.04, N = 5); 2: 15.02 (SE +/- 0.05, N = 5); 3: 14.96 (SE +/- 0.05, N = 5); 4: 15.02 (SE +/- 0.03, N = 5). (CXX) g++ options: -O3 -pedantic -rdynamic -lrt
Coremark 1.0 - CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better). 1: 310974.49 (SE +/- 687.11, N = 3); 2: 311216.67 (SE +/- 712.12, N = 3); 3: 311939.82 (SE +/- 726.18, N = 3); 4: 278575.83 (SE +/- 754.42, N = 3). (CC) gcc options: -O2 -lrt" -lrt
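To put the Coremark figures above on a common scale, each result can be expressed as a percent change from result 1. The sketch below is illustrative only: the helper percent_change is a hypothetical name, and treating result 1 as the baseline is an assumption made here, not something the export specifies.

```python
def percent_change(baseline, value):
    """Percent change of value relative to baseline (negative means lower)."""
    return (value - baseline) / baseline * 100.0


# Coremark 1.0, Iterations/Sec (More Is Better), copied from the line above.
coremark = {
    "1": 310974.49,
    "2": 311216.67,
    "3": 311939.82,
    "4": 278575.83,
}

changes = {name: percent_change(coremark["1"], score)
           for name, score in coremark.items()}

for name, change in changes.items():
    print(f"Result {name}: {change:+.2f}% vs result 1")
```

For "Fewer Is Better" tests the sign reads the other way round: a negative change there is an improvement.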
WavPack Audio Encoding 5.3 - WAV To WavPack (Seconds, Fewer Is Better). 1: 14.07 (SE +/- 0.04, N = 5); 2: 14.09 (SE +/- 0.04, N = 5); 3: 14.05 (SE +/- 0.06, N = 5); 4: 14.00 (SE +/- 0.02, N = 5). (CXX) g++ options: -rdynamic
ASTC Encoder 2.0 - Preset: Medium (Seconds, Fewer Is Better). 1: 9.82 (SE +/- 0.01, N = 3); 2: 9.88 (SE +/- 0.01, N = 3); 3: 9.05 (SE +/- 0.48, N = 15); 4: 9.94 (SE +/- 0.03, N = 3). (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
x265 3.4 - Video Input: Bosphorus 1080p (Frames Per Second, More Is Better). 1: 36.04 (SE +/- 0.11, N = 3); 2: 36.59 (SE +/- 0.41, N = 3); 3: 36.92 (SE +/- 0.53, N = 3); 4: 36.18 (SE +/- 0.02, N = 3). (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Kvazaar 2.0 - Video Input: Bosphorus 1080p - Video Preset: Very Fast (Frames Per Second, More Is Better). 1: 37.80 (SE +/- 0.02, N = 3); 2: 37.98 (SE +/- 0.04, N = 3); 3: 38.01 (SE +/- 0.01, N = 3); 4: 37.63 (SE +/- 0.01, N = 3). (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). 1: 6.17482 (SE +/- 0.02396, N = 3, MIN: 5.91); 2: 5.99918 (SE +/- 0.03310, N = 3, MIN: 5.73); 3: 5.94456 (SE +/- 0.01457, N = 3, MIN: 5.67); 4: 5.97066 (SE +/- 0.01932, N = 3, MIN: 5.67). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Opus Codec Encoding 1.3.1 - WAV To Opus Encode (Seconds, Fewer Is Better). 1: 8.280 (SE +/- 0.032, N = 5); 2: 8.274 (SE +/- 0.073, N = 5); 3: 8.271 (SE +/- 0.072, N = 5); 4: 8.430 (SE +/- 0.085, N = 5). (CXX) g++ options: -fvisibility=hidden -logg -lm
LAMMPS Molecular Dynamics Simulator 29Oct2020 - Model: Rhodopsin Protein (ns/day, More Is Better). 1: 5.082 (SE +/- 0.048, N = 15); 2: 5.035 (SE +/- 0.084, N = 3); 3: 5.181 (SE +/- 0.028, N = 3); 4: 4.983 (SE +/- 0.067, N = 15). (CXX) g++ options: -O3 -pthread -lm
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). 1: 3.95326 (SE +/- 0.00294, N = 3, MIN: 3.76); 2: 4.01853 (SE +/- 0.01890, N = 3, MIN: 3.77); 3: 4.76518 (SE +/- 0.00744, N = 3, MIN: 4.45); 4: 4.72633 (SE +/- 0.00471, N = 3, MIN: 4.43). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Redis 6.0.9 - Test: LPUSH (Requests Per Second, More Is Better). 1: 1219147.04 (SE +/- 4917.09, N = 3); 2: 1235368.71 (SE +/- 20084.29, N = 3); 3: 1263884.50 (SE +/- 1897.80, N = 3); 4: 1273184.81 (SE +/- 18142.13, N = 4). (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9 - Test: SET (Requests Per Second, More Is Better). 1: 1530872.96 (SE +/- 8681.25, N = 3); 2: 1506982.37 (SE +/- 24558.39, N = 3); 3: 1529187.75 (SE +/- 5357.49, N = 3); 4: 1515400.67 (SE +/- 23926.24, N = 3). (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9 - Test: SADD (Requests Per Second, More Is Better). 1: 1809735.13 (SE +/- 5744.03, N = 3); 2: 1742844.00 (SE +/- 19144.33, N = 3); 3: 1799244.42 (SE +/- 18395.63, N = 3); 4: 1795470.25 (SE +/- 1945.31, N = 3). (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
ASTC Encoder 2.0 - Preset: Fast (Seconds, Fewer Is Better). 1: 6.68 (SE +/- 0.01, N = 3); 2: 6.65 (SE +/- 0.02, N = 3); 3: 6.86 (SE +/- 0.11, N = 3); 4: 6.82 (SE +/- 0.07, N = 8). (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). 1: 3.82872 (SE +/- 0.02757, N = 3, MIN: 3.54); 2: 3.21759 (SE +/- 0.03800, N = 3, MIN: 2.82); 3: 2.49334 (SE +/- 0.00288, N = 3, MIN: 2.24); 4: 2.49930 (SE +/- 0.00658, N = 3, MIN: 2.22). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). 1: 11.13180 (SE +/- 0.02924, N = 3, MIN: 10.5); 2: 10.99210 (SE +/- 0.05948, N = 3, MIN: 10.44); 3: 10.13790 (SE +/- 0.01279, N = 3, MIN: 9.63); 4: 9.80423 (SE +/- 0.00941, N = 3, MIN: 9.3). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Basis Universal 1.12 - Settings: UASTC Level 0 (Seconds, Fewer Is Better). 1: 8.938 (SE +/- 0.057, N = 3); 2: 8.975 (SE +/- 0.010, N = 3); 3: 8.999 (SE +/- 0.013, N = 3); 4: 8.979 (SE +/- 0.020, N = 3). (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Kvazaar 2.0 - Video Input: Bosphorus 1080p - Video Preset: Ultra Fast (Frames Per Second, More Is Better). 1: 68.41 (SE +/- 0.05, N = 3); 2: 68.78 (SE +/- 0.14, N = 3); 3: 68.99 (SE +/- 0.23, N = 3); 4: 68.49 (SE +/- 0.09, N = 3). (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). 1: 20.28 (SE +/- 0.07, N = 3, MIN: 19.49); 2: 20.78 (SE +/- 0.03, N = 3, MIN: 19.98); 3: 21.72 (SE +/- 0.03, N = 3, MIN: 20.68); 4: 21.57 (SE +/- 0.03, N = 3, MIN: 20.54). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). 1: 19.87 (SE +/- 0.09, N = 3, MIN: 19.26); 2: 20.11 (SE +/- 0.07, N = 3, MIN: 19.41); 3: 21.15 (SE +/- 0.02, N = 3, MIN: 20.01); 4: 20.97 (SE +/- 0.02, N = 3, MIN: 19.86). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). 1: 11.52 (SE +/- 0.12, N = 3, MIN: 11.08); 2: 11.65 (SE +/- 0.21, N = 14, MIN: 11.1); 3: 11.54 (SE +/- 0.00, N = 3, MIN: 11.26); 4: 11.46 (SE +/- 0.02, N = 3, MIN: 11.2). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). 1: 16.74 (SE +/- 0.22, N = 4, MIN: 16); 2: 15.60 (SE +/- 0.11, N = 3, MIN: 15.32); 3: 14.76 (SE +/- 0.04, N = 3, MIN: 14.61); 4: 14.74 (SE +/- 0.02, N = 3, MIN: 14.63). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
HPC Challenge 1.5.0 - Test / Class: Max Ping Pong Bandwidth (MB/s, More Is Better). 1: 11544.77 (SE +/- 77.45, N = 3); 2: 11544.79 (SE +/- 76.15, N = 3); 3: 11926.91 (SE +/- 142.05, N = 3); 4: 11599.97 (SE +/- 414.56, N = 3). (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops; ATLAS + Open MPI 4.0.3
HPC Challenge 1.5.0 - Test / Class: Random Ring Bandwidth (GB/s, More Is Better). 1: 1.45770 (SE +/- 0.00651, N = 3); 2: 1.46282 (SE +/- 0.00883, N = 3); 3: 1.46626 (SE +/- 0.01126, N = 3); 4: 1.46081 (SE +/- 0.00683, N = 3). (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops; ATLAS + Open MPI 4.0.3
HPC Challenge 1.5.0 - Test / Class: Random Ring Latency (usecs, Fewer Is Better). 1: 0.56116 (SE +/- 0.00487, N = 3); 2: 0.56775 (SE +/- 0.00616, N = 3); 3: 0.55880 (SE +/- 0.00273, N = 3); 4: 0.56703 (SE +/- 0.00200, N = 3). (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops; ATLAS + Open MPI 4.0.3
HPC Challenge 1.5.0 - Test / Class: G-Random Access (GUP/s, More Is Better). 1: 0.02620 (SE +/- 0.00003, N = 3); 2: 0.02615 (SE +/- 0.00005, N = 3); 3: 0.02623 (SE +/- 0.00003, N = 3); 4: 0.02621 (SE +/- 0.00002, N = 3). (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops; ATLAS + Open MPI 4.0.3
HPC Challenge 1.5.0 - Test / Class: EP-STREAM Triad (GB/s, More Is Better). 1: 3.43263 (SE +/- 0.01066, N = 3); 2: 3.47112 (SE +/- 0.00723, N = 3); 3: 3.46886 (SE +/- 0.00101, N = 3); 4: 3.46879 (SE +/- 0.00653, N = 3). (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops; ATLAS + Open MPI 4.0.3
HPC Challenge 1.5.0 - Test / Class: G-Ptrans (GB/s, More Is Better). 1: 2.48824 (SE +/- 0.16798, N = 3); 2: 2.83742 (SE +/- 0.01878, N = 3); 3: 2.84596 (SE +/- 0.03257, N = 3); 4: 2.77808 (SE +/- 0.05255, N = 3). (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops; ATLAS + Open MPI 4.0.3
HPC Challenge 1.5.0 - Test / Class: EP-DGEMM (GFLOPS, More Is Better). 1: 19.76 (SE +/- 0.37, N = 3); 2: 20.10 (SE +/- 0.20, N = 3); 3: 20.31 (SE +/- 0.17, N = 3); 4: 20.32 (SE +/- 0.10, N = 3). (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops; ATLAS + Open MPI 4.0.3
HPC Challenge 1.5.0 - Test / Class: G-Ffte (GFLOPS, More Is Better). 1: 4.44974 (SE +/- 0.11339, N = 3); 2: 4.46749 (SE +/- 0.07093, N = 3); 3: 4.46806 (SE +/- 0.04593, N = 3); 4: 4.41886 (SE +/- 0.05917, N = 3). (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops; ATLAS + Open MPI 4.0.3
Phoronix Test Suite v10.8.4