7773x
Tests for a future article. 2 x AMD EPYC 7573X 32-Core testing with an AMD DAYTONA_X (RYM1009B BIOS) and ASPEED on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2305044-NE-7773X849132&grr&sro .
Configurations: a, b, 5 a, 5 b, 5 2p a, 5 2p b

Processor: AMD EPYC 7773X 64-Core @ 2.20GHz (64 Cores / 128 Threads); AMD EPYC 7573X 32-Core @ 2.80GHz (32 Cores / 64 Threads); 2 x AMD EPYC 7573X 32-Core @ 2.80GHz (64 Cores / 128 Threads)
Motherboard: AMD DAYTONA_X (RYM1009B BIOS)
Chipset: AMD Starship/Matisse
Memory: 256GB; 512GB
Disk: 3841GB Micron_9300_MTFDHAL3T8TDP
Graphics: ASPEED
Monitor: VE228
Network: 2 x Mellanox MT27710
OS: Ubuntu 22.04
Kernel: 5.15.0-47-generic (x86_64)
Desktop: GNOME Shell 42.4
Display Server: X Server 1.21.1.3
Vulkan: 1.2.204
Compiler: GCC 11.2.0
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details: Transparent Huge Pages: madvise
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: acpi-cpufreq performance (Boost: Enabled); CPU Microcode: 0xa001229
Python Details: Python 3.10.6
Security Details: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Results overview: one row per test and one column per configuration (a, b, 5 a, 5 b, 5 2p a, 5 2p b).
PETSc 3.19 - Test: Streams (MB/s, More Is Better)
  5 2p b: 74130.18
  5 a: 31992.61
  b: 56185.75
  1. (CC) gcc options: -fPIC -O3 -O2 -lpthread -ludev -lpciaccess -lm
OpenVKL 1.3.1 - Benchmark: vklBenchmark ISPC (Items / Sec, More Is Better)
  5 2p a: 452 (MIN: 98 / MAX: 1875)
  5 2p b: 452 (MIN: 99 / MAX: 2013)
  5 a: 340 (MIN: 55 / MAX: 2309)
  5 b: 339 (MIN: 54 / MAX: 2307)
  a: 470 (MIN: 84 / MAX: 2616)
  b: 469 (MIN: 84 / MAX: 2565)
  SE +/- 0.33, N = 3
OpenFOAM 10 - Input: drivaerFastback, Medium Mesh Size - Execution Time (Seconds, Fewer Is Better)
  5 2p a: 205.46
  5 2p b: 203.65
  5 a: 428.66
  5 b: 427.71
  a: 40.48
  b: 363.74
  1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm
OpenFOAM 10 - Input: drivaerFastback, Medium Mesh Size - Mesh Time (Seconds, Fewer Is Better)
  5 2p a: 99.38
  5 2p b: 99.93
  5 a: 117.44
  5 b: 117.60
  a: 25.35
  b: 120.39
  1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm
Blender 3.5 - Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better)
  5 2p a: 213.60
  5 2p b: 213.46
  5 a: 408.79
  a: 260.83
  b: 259.87
  SE +/- 0.09, N = 3
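As a worked example of reading a "Fewer Is Better" result, the sketch below computes how many times faster one run finishes than another, using the Barbershop render times above. The `speedup` helper and the `barbershop_seconds` name are illustrative assumptions and not part of the Phoronix Test Suite; the numbers are copied from the result, and the choice of "5 a" as the baseline is arbitrary.

```python
# Barbershop render times in seconds, copied from the result above.
barbershop_seconds = {
    "5 2p a": 213.60,
    "5 2p b": 213.46,
    "5 a": 408.79,
    "a": 260.83,
    "b": 259.87,
}


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """Return how many times faster the candidate is for a time-based metric."""
    return baseline_seconds / candidate_seconds


# Hypothetical comparison: run "5 a" as the baseline against run "5 2p a".
ratio = speedup(barbershop_seconds["5 a"], barbershop_seconds["5 2p a"])
print(f"5 2p a vs 5 a: {ratio:.2f}x")  # prints "5 2p a vs 5 a: 1.91x"
```

For "More Is Better" metrics such as FPS, the ratio would be inverted (candidate divided by baseline).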
LeelaChessZero 0.28 - Backend: Eigen (Nodes Per Second, More Is Better)
  5 2p a: 8286
  5 2p b: 7816
  5 a: 1238
  b: 5142
  1. (CXX) g++ options: -flto -pthread
LeelaChessZero 0.28 - Backend: BLAS (Nodes Per Second, More Is Better)
  5 2p a: 8284
  5 2p b: 7991
  5 a: 1419
  b: 5554
  1. (CXX) g++ options: -flto -pthread
Timed LLVM Compilation 16.0 - Build System: Unix Makefiles (Seconds, Fewer Is Better)
  5 2p a: 219.40
  5 2p b: 221.79
  5 a: 290.54
  5 b: 297.22
  a: 246.96
  b: 248.48
  SE +/- 0.96, N = 3
OpenCV 4.7 - Test: Graph API (ms, Fewer Is Better)
  5 2p b: 419702
  5 a: 206970
  b: 235810
  1. (CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared
FFmpeg 6.0 - Encoder: libx265 - Scenario: Platform (FPS, More Is Better)
  5 2p a: 44.31
  5 2p b: 43.64
  5 a: 46.70
  a: 43.73
  b: 43.90
  SE +/- 0.02, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg 6.0 - Encoder: libx265 - Scenario: Platform (Seconds, Fewer Is Better)
  5 2p a: 170.94
  5 2p b: 173.56
  5 a: 162.19
  a: 173.23
  b: 172.57
  SE +/- 0.08, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
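The FFmpeg results are reported twice, once as FPS and once as elapsed seconds. A quick consistency check, sketched below, multiplies the two figures for each run of the "Platform" scenario. If both charts describe the same encode, the product should be nearly identical across runs. The implied frame count is an inference from the reported numbers, not something the result file states, and the variable names are illustrative.

```python
# (FPS, seconds) pairs for libx265 "Platform", copied from the results above.
platform = {
    "5 2p a": (44.31, 170.94),
    "5 2p b": (43.64, 173.56),
    "5 a": (46.70, 162.19),
    "a": (43.73, 173.23),
    "b": (43.90, 172.57),
}

# FPS x seconds gives an implied frame count per run (an inference, not a
# value reported by the test profile).
implied_frames = {run: fps * secs for run, (fps, secs) in platform.items()}

for run, frames in implied_frames.items():
    print(f"{run}: {frames:.0f} implied frames")

# Relative spread between the largest and smallest implied frame count.
spread = max(implied_frames.values()) / min(implied_frames.values()) - 1.0
print(f"spread: {spread:.4%}")
```

A spread well below one percent suggests the two charts are two views of the same measurement, so either one can be used for comparisons.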
FFmpeg 6.0 - Encoder: libx265 - Scenario: Video On Demand (FPS, More Is Better)
  5 2p a: 44.97
  5 2p b: 43.77
  5 a: 46.89
  a: 43.67
  b: 43.90
  SE +/- 0.04, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg 6.0 - Encoder: libx265 - Scenario: Video On Demand (Seconds, Fewer Is Better)
  5 2p a: 168.46
  5 2p b: 173.05
  5 a: 161.54
  a: 173.47
  b: 172.53
  SE +/- 0.15, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
NCNN 20220729 - Target: CPU - Model: FastestDet (ms, Fewer Is Better)
  5 2p a: 40.45 (MIN: 27.4 / MAX: 462.07)
  5 2p b: 53.25 (MIN: 30.07 / MAX: 66.77)
  5 a: 9.25 (MIN: 9.12 / MAX: 9.81)
  b: 19.08 (MIN: 13.65 / MAX: 21.7)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729 - Target: CPU - Model: vision_transformer (ms, Fewer Is Better)
  5 2p a: 144.31 (MIN: 140.5 / MAX: 157.63)
  5 2p b: 145.49 (MIN: 141.27 / MAX: 245.34)
  5 a: 126.26 (MIN: 125.42 / MAX: 132.03)
  b: 133.39 (MIN: 129.7 / MAX: 252.77)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729 - Target: CPU - Model: regnety_400m (ms, Fewer Is Better)
  5 2p a: 100.65 (MIN: 97.81 / MAX: 136.12)
  5 2p b: 128.35 (MIN: 111.43 / MAX: 240.27)
  5 a: 26.34 (MIN: 25.99 / MAX: 28.26)
  b: 53.52 (MIN: 50.93 / MAX: 71.06)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729 - Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better)
  5 2p a: 37.17 (MIN: 30.62 / MAX: 51.54)
  5 2p b: 35.81 (MIN: 29.14 / MAX: 58.45)
  5 a: 14.85 (MIN: 14.49 / MAX: 25.32)
  b: 19.32 (MIN: 18.86 / MAX: 22.39)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729 - Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better)
  5 2p a: 38.10 (MIN: 29.08 / MAX: 101.85)
  5 2p b: 45.89 (MIN: 34.08 / MAX: 64.66)
  5 a: 19.98 (MIN: 19.53 / MAX: 23.19)
  b: 23.92 (MIN: 23.17 / MAX: 30.48)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729 - Target: CPU - Model: resnet50 (ms, Fewer Is Better)
  5 2p a: 80.81 (MIN: 62.51 / MAX: 112.38)
  5 2p b: 120.27 (MIN: 42.99 / MAX: 192.41)
  5 a: 14.80 (MIN: 14.6 / MAX: 16.84)
  b: 19.58 (MIN: 19.15 / MAX: 44.79)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729 - Target: CPU - Model: alexnet (ms, Fewer Is Better)
  5 2p a: 18.60 (MIN: 17.59 / MAX: 34.64)
  5 2p b: 28.55 (MIN: 12.6 / MAX: 62.18)
  5 a: 5.48 (MIN: 5.36 / MAX: 6.34)
  b: 7.13 (MIN: 6.97 / MAX: 7.76)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729 - Target: CPU - Model: resnet18 (ms, Fewer Is Better)
  5 2p a: 38.94 (MIN: 16.05 / MAX: 124.98)
  5 2p b: 51.80 (MIN: 16.09 / MAX: 93.56)
  5 a: 8.12 (MIN: 8.01 / MAX: 10.04)
  b: 10.89 (MIN: 10.68 / MAX: 11.74)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729 - Target: CPU - Model: vgg16 (ms, Fewer Is Better)
  5 2p a: 33.55 (MIN: 28.65 / MAX: 42.53)
  5 2p b: 32.81 (MIN: 30.06 / MAX: 44.76)
  5 a: 21.08 (MIN: 20.78 / MAX: 24.35)
  b: 24.04 (MIN: 23.49 / MAX: 30.66)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729 - Target: CPU - Model: googlenet (ms, Fewer Is Better)
  5 2p a: 93.29 (MIN: 52.46 / MAX: 137.19)
  5 2p b: 112.43 (MIN: 29.99 / MAX: 148.59)
  5 a: 14.11 (MIN: 13.96 / MAX: 17.22)
  b: 19.09 (MIN: 18.77 / MAX: 25.76)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729 - Target: CPU - Model: blazeface (ms, Fewer Is Better)
  5 2p a: 14.65 (MIN: 11.34 / MAX: 73.36)
  5 2p b: 24.56 (MIN: 20.64 / MAX: 91.2)
  5 a: 3.82 (MIN: 3.45 / MAX: 79.74)
  b: 8.12 (MIN: 6.94 / MAX: 11.1)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729 - Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  5 2p a: 35.12 (MIN: 33.83 / MAX: 41.96)
  5 2p b: 60.79 (MIN: 47.67 / MAX: 141.85)
  5 a: 9.03 (MIN: 8.93 / MAX: 11.07)
  b: 15.44 (MIN: 13.63 / MAX: 18.67)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729 - Target: CPU - Model: mnasnet (ms, Fewer Is Better)
  5 2p a: 38.49 (MIN: 25.51 / MAX: 75.24)
  5 2p b: 49.46 (MIN: 36.42 / MAX: 176.46)
  5 a: 6.08 (MIN: 6.01 / MAX: 6.54)
  b: 11.71 (MIN: 9.43 / MAX: 20.81)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729 - Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  5 2p a: 33.27 (MIN: 29.37 / MAX: 96.69)
  5 2p b: 37.89 (MIN: 34.63 / MAX: 113.32)
  5 a: 7.73 (MIN: 7.58 / MAX: 9.71)
  b: 15.60 (MIN: 12.95 / MAX: 19.73)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729 - Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  5 2p a: 21.26 (MIN: 20.79 / MAX: 28.25)
  5 2p b: 28.88 (MIN: 23.66 / MAX: 168.43)
  5 a: 6.19 (MIN: 6.05 / MAX: 6.93)
  b: 10.29 (MIN: 9.51 / MAX: 11.93)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729 - Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  5 2p a: 27.70 (MIN: 23.18 / MAX: 43.54)
  5 2p b: 38.67 (MIN: 28.02 / MAX: 119.64)
  5 a: 6.46 (MIN: 6.35 / MAX: 8.73)
  b: 11.30 (MIN: 9.74 / MAX: 14.96)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729 - Target: CPU - Model: mobilenet (ms, Fewer Is Better)
  5 2p a: 77.77 (MIN: 67.11 / MAX: 156)
  5 2p b: 115.68 (MIN: 64.45 / MAX: 159.93)
  5 a: 13.70 (MIN: 13.55 / MAX: 14.38)
  b: 17.84 (MIN: 17.55 / MAX: 25.78)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
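With this many NCNN models, a single summary figure can be easier to compare than the individual charts. The sketch below takes the latency ratio between two runs for three of the models above and combines the ratios with a geometric mean. Using a geometric mean here is an assumption made for this illustration (it is a common way to average ratios), not something the export computes for NCNN, and the three-model subset and variable names are arbitrary choices.

```python
from statistics import geometric_mean

# NCNN latencies in ms for runs "5 2p a" and "5 a", copied from the results
# above. Only three of the models are included to keep the sketch short.
latency_ms = {
    "FastestDet": {"5 2p a": 40.45, "5 a": 9.25},
    "resnet50": {"5 2p a": 80.81, "5 a": 14.80},
    "vgg16": {"5 2p a": 33.55, "5 a": 21.08},
}

# Latency is "Fewer Is Better", so a ratio above 1 means "5 2p a" was slower.
ratios = [runs["5 2p a"] / runs["5 a"] for runs in latency_ms.values()]

summary = geometric_mean(ratios)
print(f"geometric mean latency ratio (5 2p a / 5 a): {summary:.2f}")
```

Extending `latency_ms` with the remaining models would give a summary over the full NCNN set.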
FFmpeg 6.0 - Encoder: libx265 - Scenario: Upload (FPS, More Is Better)
  5 2p a: 22.38
  5 2p b: 22.10
  5 a: 22.95
  5 b: 22.93
  a: 21.59
  b: 21.62
  SE +/- 0.01, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg 6.0 - Encoder: libx265 - Scenario: Upload (Seconds, Fewer Is Better)
  5 2p a: 112.82
  5 2p b: 114.26
  5 a: 110.01
  5 b: 110.14
  a: 116.93
  b: 116.81
  SE +/- 0.05, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Timed LLVM Compilation 16.0 - Build System: Ninja (Seconds, Fewer Is Better)
  5 2p a: 138.72
  5 2p b: 138.57
  5 a: 230.38
  5 b: 231.09
  a: 165.71
  b: 164.45
  SE +/- 0.08, N = 3
OpenCV 4.7 - Test: Stitching (ms, Fewer Is Better)
  5 2p b: 287492
  5 a: 184032
  b: 201160
  1. (CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared
ASKAP 1.0 - Test: tConvolve MT - Degridding (Million Grid Points Per Second, More Is Better)
  5 2p a: 12506.70
  5 2p b: 11068.80
  5 a: 9611.05
  b: 8308.33
  1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP 1.0 - Test: tConvolve MT - Gridding (Million Grid Points Per Second, More Is Better)
  5 2p a: 7738.59
  5 2p b: 7176.41
  5 a: 6599.68
  b: 5493.35
  1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ClickHouse 22.12.3.5 - 100M Rows Hits Dataset, Third Run (Queries Per Minute, Geo Mean, More Is Better)
  5 2p a: 461.53 (MIN: 41.49 / MAX: 3000)
  5 2p b: 448.03 (MIN: 41.38 / MAX: 2608.7)
  5 a: 456.30 (MIN: 24.65 / MAX: 5454.55)
  b: 437.35 (MIN: 35.59 / MAX: 5454.55)
ClickHouse 22.12.3.5 - 100M Rows Hits Dataset, Second Run (Queries Per Minute, Geo Mean, More Is Better)
  5 2p a: 463.68 (MIN: 41.64 / MAX: 4615.38)
  5 2p b: 467.46 (MIN: 40.6 / MAX: 4615.38)
  5 a: 457.81 (MIN: 24.13 / MAX: 5454.55)
  b: 439.07 (MIN: 35.82 / MAX: 6000)
ClickHouse 22.12.3.5 - 100M Rows Hits Dataset, First Run / Cold Cache (Queries Per Minute, Geo Mean, More Is Better)
  5 2p a: 445.18 (MIN: 41.18 / MAX: 3157.89)
  5 2p b: 440.56 (MIN: 40.98 / MAX: 4000)
  5 a: 437.20 (MIN: 24.3 / MAX: 6000)
  b: 428.51 (MIN: 34.8 / MAX: 5454.55)
VVenC 1.8 - Video Input: Bosphorus 4K - Video Preset: Fast (Frames Per Second, More Is Better)
  5 2p a: 5.548
  5 2p b: 5.768
  5 a: 6.406
  5 b: 6.390
  a: 6.286
  b: 6.128
  SE +/- 0.010, N = 3
  1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
Blender 3.5 - Blend File: Pabellon Barcelona - Compute: CPU-Only (Seconds, Fewer Is Better)
  5 2p a: 71.64
  5 2p b: 71.73
  5 a: 136.58
  a: 88.76
  b: 88.73
  SE +/- 0.10, N = 3
OpenCV 4.7 - Test: Core (ms, Fewer Is Better)
  5 2p b: 236445
  5 a: 68343
  b: 77061
  1. (CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared
Blender 3.5 - Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better)
  5 2p a: 58.55
  5 2p b: 58.57
  5 a: 113.31
  a: 72.45
  b: 72.02
  SE +/- 0.17, N = 3
FFmpeg 6.0 - Encoder: libx265 - Scenario: Live (FPS, More Is Better)
  5 2p a: 92.52
  5 2p b: 107.29
  5 a: 110.49
  5 b: 110.29
  a: 105.78
  b: 106.98
  SE +/- 0.29, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg 6.0 - Encoder: libx265 - Scenario: Live (Seconds, Fewer Is Better)
  5 2p a: 54.58
  5 2p b: 47.07
  5 a: 45.71
  5 b: 45.79
  a: 47.74
  b: 47.21
  SE +/- 0.13, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
oneDNN 3.1 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  5 2p a: 1390.36 (MIN: 1287.43)
  5 2p b: 1385.33 (MIN: 1314.76)
  5 a: 1182.61 (MIN: 1161.83)
  b: 1183.63 (MIN: 1162.73)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.1 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  5 2p a: 1212.84 (MIN: 1070.52)
  5 2p b: 1305.96 (MIN: 1060.61)
  5 a: 679.65 (MIN: 664.64)
  b: 749.62 (MIN: 731.75)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
John The Ripper 2023.03.14 - Test: MD5 (Real C/S, More Is Better)
  5 2p a: 6845000
  5 2p b: 6683000
  5 a: 3621000
  5 b: 3605000
  a: 5534000
  b: 5543000
  SE +/- 1000.00, N = 3
  1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2
John The Ripper 2023.03.14 - Test: HMAC-SHA512 (Real C/S, More Is Better)
  5 2p a: 79002000
  5 2p b: 73260000
  5 a: 96059000
  5 b: 94924000
  a: 136796000
  b: 135165000
  SE +/- 266743.32, N = 3
  1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2
Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Decompression Speed (MB/s, More Is Better)
  5 2p a: 1222.9
  5 2p b: 1227.1
  5 a: 1229.1
  b: 1190.8
  1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Compression Speed (MB/s, More Is Better)
  5 2p a: 9.57
  5 2p b: 9.50
  5 a: 9.54
  b: 9.30
  1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
VVenC 1.8 - Video Input: Bosphorus 4K - Video Preset: Faster (Frames Per Second, More Is Better)
  5 2p a: 9.574
  5 2p b: 9.906
  5 a: 11.961
  5 b: 12.020
  a: 11.367
  b: 11.364
  SE +/- 0.022, N = 3
  1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
OpenVINO 2022.3 - Model: Person Detection FP16 - Device: CPU (ms, Fewer Is Better)
  5 2p a: 2975.15 (MIN: 2241.37 / MAX: 3616.82)
  5 2p b: 2948.39 (MIN: 1547.54 / MAX: 3537.3)
  5 a: 2881.04 (MIN: 1536.21 / MAX: 3142.35)
  b: 3911.97 (MIN: 3337.4 / MAX: 4451.63)
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 - Model: Person Detection FP16 - Device: CPU (FPS, More Is Better)
  5 2p a: 10.54
  5 2p b: 10.63
  5 a: 5.48
  b: 8.05
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenFOAM 10 - Input: drivaerFastback, Small Mesh Size - Execution Time (Seconds, Fewer Is Better)
  5 2p a: 33.00
  5 2p b: 32.97
  5 a: 50.75
  5 b: 50.70
  a: 40.45
  b: 40.24
  1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm
OpenFOAM 10 - Input: drivaerFastback, Small Mesh Size - Mesh Time (Seconds, Fewer Is Better)
  5 2p a: 22.56
  5 2p b: 23.34
  5 a: 22.34
  5 b: 22.71
  a: 24.73
  b: 25.60
  1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm
OpenVINO 2022.3 - Model: Person Detection FP32 - Device: CPU (ms, Fewer Is Better)
  5 2p a: 2946.97 (MIN: 2004.15 / MAX: 3534.38)
  5 2p b: 2952.76 (MIN: 2193.59 / MAX: 3652.32)
  5 a: 2886.12 (MIN: 1694.68 / MAX: 3104.67)
  b: 3927.00 (MIN: 3402.58 / MAX: 4474.47)
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 - Model: Person Detection FP32 - Device: CPU (FPS, More Is Better)
  5 2p a: 10.64
  5 2p b: 10.64
  5 a: 5.47
  b: 8.04
  1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
Zstd Compression 1.5.4 - Compression Level: 19 - Decompression Speed (MB/s, More Is Better)
  5 2p a: 1315.6
  5 2p b: 1312.0
  5 a: 1304.2
  b: 1276.8
  1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4 - Compression Level: 19 - Compression Speed (MB/s, More Is Better)
  5 2p a: 18.0
  5 2p b: 18.0
  5 a: 18.2
  b: 17.6
  1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
OpenVINO 2022.3 Model: Face Detection FP16 - Device: CPU (ms, Fewer Is Better). 5 2p a: 2013.52 (MIN: 1890.96 / MAX: 2802.82); 5 2p b: 2020.21 (MIN: 1823.96 / MAX: 3111.51); 5 a: 1935.60 (MIN: 1852.05 / MAX: 1974.19); b: 2608.34 (MIN: 2421.38 / MAX: 2754.89). (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Face Detection FP16 - Device: CPU (FPS, More Is Better). 5 2p a: 15.86; 5 2p b: 15.76; 5 a: 8.23; b: 12.16. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Face Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better). 5 2p a: 914.24 (MIN: 797.83 / MAX: 966.84); 5 2p b: 912.42 (MIN: 878.63 / MAX: 988.74); 5 a: 886.41 (MIN: 851.09 / MAX: 900.44); b: 1175.01 (MIN: 982.69 / MAX: 1202.21). (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Face Detection FP16-INT8 - Device: CPU (FPS, More Is Better). 5 2p a: 34.83; 5 2p b: 34.91; 5 a: 17.92; b: 27.03. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Machine Translation EN To DE FP16 - Device: CPU (ms, Fewer Is Better). 5 2p a: 182.10 (MIN: 117.14 / MAX: 548.18); 5 2p b: 180.91 (MIN: 124.49 / MAX: 288.01); 5 a: 172.53 (MIN: 81.35 / MAX: 207.95); b: 230.27 (MIN: 166.99 / MAX: 311.89). (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Machine Translation EN To DE FP16 - Device: CPU (FPS, More Is Better). 5 2p a: 175.48; 5 2p b: 176.76; 5 a: 92.61; b: 138.84. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
Zstd Compression 1.5.4 Compression Level: 3, Long Mode - Decompression Speed (MB/s, More Is Better). 5 2p a: 1332.3; 5 2p b: 1336.5; 5 a: 1361.5; b: 1300.8. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4 Compression Level: 3, Long Mode - Compression Speed (MB/s, More Is Better). 5 2p a: 808.5; 5 2p b: 741.5; 5 a: 854.2; b: 752.6. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4 Compression Level: 8, Long Mode - Decompression Speed (MB/s, More Is Better). 5 2p a: 1454.7; 5 2p b: 1445.1; 5 a: 1460.1; b: 1434.2. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4 Compression Level: 8, Long Mode - Compression Speed (MB/s, More Is Better). 5 2p a: 702.8; 5 2p b: 702.2; 5 a: 792.1; b: 744.5. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4 Compression Level: 12 - Decompression Speed (MB/s, More Is Better). 5 2p a: 1482.7; 5 2p b: 1498.9; 5 a: 1478.1; b: 1445.9. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4 Compression Level: 12 - Compression Speed (MB/s, More Is Better). 5 2p a: 283.6; 5 2p b: 282.8; 5 a: 289.1; b: 270.8. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4 Compression Level: 8 - Decompression Speed (MB/s, More Is Better). 5 2p a: 1437.5; 5 2p b: 1441.1; 5 a: 1438.8; b: 1390.0. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4 Compression Level: 8 - Compression Speed (MB/s, More Is Better). 5 2p a: 1024.0; 5 2p b: 1013.4; 5 a: 1067.8; b: 988.5. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4 Compression Level: 3 - Decompression Speed (MB/s, More Is Better). 5 2p a: 1306.7; 5 2p b: 1309.5; 5 a: 1326.5; b: 1277.3. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
Zstd Compression 1.5.4 Compression Level: 3 - Compression Speed (MB/s, More Is Better). 5 2p a: 2783.7; 5 2p b: 2775.9; 5 a: 2856.6; b: 2819.0. (CC) gcc options: -O3 -pthread -lz -llzma -llz4
OpenVINO 2022.3 Model: Person Vehicle Bike Detection FP16 - Device: CPU (ms, Fewer Is Better). 5 2p a: 12.66 (MIN: 7.58 / MAX: 53.45); 5 2p b: 12.66 (MIN: 8.38 / MAX: 48.88); 5 a: 12.77 (MIN: 7.39 / MAX: 23.62); b: 16.93 (MIN: 13.88 / MAX: 33.11). (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Person Vehicle Bike Detection FP16 - Device: CPU (FPS, More Is Better). 5 2p a: 2523.57; 5 2p b: 2524.05; 5 a: 1252.05; b: 1888.20. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Vehicle Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better). 5 2p a: 12.76 (MIN: 7.65 / MAX: 43.58); 5 2p b: 12.79 (MIN: 7.68 / MAX: 43.18); 5 a: 12.84 (MIN: 6.96 / MAX: 23.28); b: 16.93 (MIN: 10.73 / MAX: 31.24). (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Vehicle Detection FP16-INT8 - Device: CPU (FPS, More Is Better). 5 2p a: 2504.36; 5 2p b: 2498.74; 5 a: 1244.86; b: 1889.05. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Vehicle Detection FP16 - Device: CPU (ms, Fewer Is Better). 5 2p a: 19.58 (MIN: 11.25 / MAX: 73.87); 5 2p b: 19.69 (MIN: 11.03 / MAX: 75.83); 5 a: 19.00 (MIN: 13.62 / MAX: 32.75); b: 26.54 (MIN: 14.19 / MAX: 63.19). (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Vehicle Detection FP16 - Device: CPU (FPS, More Is Better). 5 2p a: 1632.81; 5 2p b: 1623.83; 5 a: 841.49; b: 1205.11. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Weld Porosity Detection FP16 - Device: CPU (ms, Fewer Is Better). 5 2p a: 20.06 (MIN: 11.41 / MAX: 47.48); 5 2p b: 20.07 (MIN: 13.26 / MAX: 75.83); 5 a: 19.45 (MIN: 11.98 / MAX: 28.4); b: 26.68 (MIN: 15.32 / MAX: 46.54). (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Weld Porosity Detection FP16 - Device: CPU (FPS, More Is Better). 5 2p a: 1594.14; 5 2p b: 1592.71; 5 a: 822.19; b: 1198.45. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Weld Porosity Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better). 5 2p a: 18.26 (MIN: 10.53 / MAX: 60.19); 5 2p b: 18.25 (MIN: 8.84 / MAX: 40.73); 5 a: 18.10 (MIN: 9.23 / MAX: 28.11); b: 23.92 (MIN: 15.02 / MAX: 35.74). (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Weld Porosity Detection FP16-INT8 - Device: CPU (FPS, More Is Better). 5 2p a: 3501.69; 5 2p b: 3504.15; 5 a: 1767.13; b: 2674.29. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, Fewer Is Better). 5 2p a: 1.37 (MIN: 0.67 / MAX: 28.86); 5 2p b: 1.40 (MIN: 0.68 / MAX: 42.1); 5 a: 1.26 (MIN: 0.69 / MAX: 12.96); b: 1.74 (MIN: 0.84 / MAX: 14.57). (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS, More Is Better). 5 2p a: 39455.86; 5 2p b: 38800.34; 5 a: 25209.62; b: 35737.82. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (ms, Fewer Is Better). 5 2p b: 1.33 (MIN: 0.64 / MAX: 26.34); 5 a: 1.16 (MIN: 0.66 / MAX: 12.23); b: 1.62 (MIN: 0.69 / MAX: 13.34). (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVINO 2022.3 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (FPS, More Is Better). 5 2p b: 40352.48; 5 a: 27340.96; b: 38091.46. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenCV 4.7 Test: DNN - Deep Neural Network (ms, Fewer Is Better). 5 2p b: 87133; 5 a: 39997; b: 39756. (CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared
VVenC 1.8 Video Input: Bosphorus 1080p - Video Preset: Fast (Frames Per Second, More Is Better). SE +/- 0.10, N = 3. 5 2p a: 14.81; 5 2p b: 15.04; 5 a: 16.85; 5 b: 17.25; a: 16.55; b: 16.37. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
Blender 3.5 Blend File: Fishy Cat - Compute: CPU-Only (Seconds, Fewer Is Better). SE +/- 0.04, N = 3. 5 2p a: 27.96; 5 2p b: 27.90; 5 a: 52.68; a: 34.77; b: 34.61.
SVT-AV1 1.5 Encoder Mode: Preset 4 - Input: Bosphorus 4K (Frames Per Second, More Is Better). SE +/- 0.021, N = 3. 5 2p a: 4.784; 5 2p b: 4.842; 5 a: 4.743; 5 b: 4.760; a: 4.863; b: 4.826. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenCV 4.7 Test: Object Detection (ms, Fewer Is Better). 5 2p b: 88553; 5 a: 27739; b: 31910. (CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared
SPECFEM3D 4.0 Model: Layered Halfspace (Seconds, Fewer Is Better). SE +/- 0.29, N = 3. 5 2p a: 25.22; 5 2p b: 25.47; 5 a: 49.14; 5 b: 48.74; a: 30.52; b: 30.12. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
GROMACS 2023 Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better). SE +/- 0.026, N = 3. 5 2p a: 8.285; 5 2p b: 8.222; 5 a: 5.022; a: 7.290; b: 7.299. (CXX) g++ options: -O3
SPECFEM3D 4.0 Model: Water-layered Halfspace (Seconds, Fewer Is Better). SE +/- 0.13, N = 3. 5 2p a: 23.94; 5 2p b: 23.27; 5 a: 45.71; 5 b: 44.33; a: 28.08; b: 27.47. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Blender 3.5 Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better). SE +/- 0.03, N = 3. 5 2p a: 22.59; 5 2p b: 22.51; 5 a: 42.52; a: 27.61; b: 27.60.
John The Ripper 2023.03.14 Test: WPA PSK (Real C/S, More Is Better). SE +/- 297.59, N = 3. 5 2p a: 259413; 5 2p b: 258457; 5 a: 130848; 5 b: 131072; a: 201388; b: 202957. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2
John The Ripper 2023.03.14 Test: bcrypt (Real C/S, More Is Better). SE +/- 99.80, N = 3. 5 2p a: 118502; 5 2p b: 117542; 5 a: 60192; 5 b: 60152; a: 86230; b: 91276. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2
John The Ripper 2023.03.14 Test: Blowfish (Real C/S, More Is Better). SE +/- 185.66, N = 3. 5 2p a: 118579; 5 2p b: 117771; 5 a: 60325; 5 b: 60345; a: 86460; b: 87360. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2
QuantLib 1.30 (MFLOPS, More Is Better). SE +/- 25.52, N = 3. 5 2p a: 2760.9; 5 2p b: 2850.8; 5 a: 2756.9; 5 b: 2820.4; a: 2746.5; b: 2692.4. (CXX) g++ options: -O3 -march=native -fPIE -pie
SVT-AV1 1.5 Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better). SE +/- 0.56, N = 12. 5 2p a: 62.84; 5 2p b: 69.54; 5 a: 65.34; 5 b: 70.86; a: 73.44; b: 67.55. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
7-Zip Compression 22.01 Test: Decompression Rating (MIPS, More Is Better). 5 2p a: 453996; 5 2p b: 440740; 5 a: 244220; b: 411213. (CXX) g++ options: -lpthread -ldl -O2 -fPIC
7-Zip Compression 22.01 Test: Compression Rating (MIPS, More Is Better). 5 2p a: 403739; 5 2p b: 396844; 5 a: 271918; b: 396928. (CXX) g++ options: -lpthread -ldl -O2 -fPIC
VVenC 1.8 Video Input: Bosphorus 1080p - Video Preset: Faster (Frames Per Second, More Is Better). SE +/- 0.03, N = 3. 5 2p a: 24.49; 5 2p b: 24.27; 5 a: 31.84; 5 b: 31.80; a: 30.01; b: 30.11. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
Embree 4.0.1 Binary: Pathtracer - Model: Asian Dragon Obj (Frames Per Second, More Is Better). SE +/- 0.08, N = 3. 5 2p a: 68.69 (MIN: 68.03 / MAX: 70.15); 5 2p b: 68.70 (MIN: 67.96 / MAX: 69.64); 5 a: 43.96 (MIN: 43.73 / MAX: 44.3); 5 b: 44.19 (MIN: 43.94 / MAX: 44.51); a: 70.52 (MIN: 69.67 / MAX: 71.94).
Embree 4.0.1 Binary: Pathtracer ISPC - Model: Asian Dragon Obj (Frames Per Second, More Is Better). SE +/- 0.03, N = 3. 5 2p a: 64.41 (MIN: 63.78 / MAX: 65.49); 5 2p b: 64.04 (MIN: 63.29 / MAX: 64.92); 5 a: 39.19 (MIN: 38.97 / MAX: 39.59); 5 b: 39.22 (MIN: 39 / MAX: 39.63); a: 64.51 (MIN: 63.96 / MAX: 66.18); b: 64.64 (MIN: 64.13 / MAX: 65.46).
eSpeak-NG Speech Engine 20200907 Text-To-Speech Synthesis (Seconds, Fewer Is Better). 5 2p a: 27.87; 5 2p b: 28.14; 5 a: 27.83; b: 28.16. (CC) gcc options: -O2 -std=c99
ACES DGEMM 1.0 Sustained Floating-Point Rate (GFLOP/s, More Is Better). SE +/- 0.25, N = 15. 5 2p a: 31.34; 5 2p b: 31.76; 5 a: 17.87; 5 b: 18.64; a: 29.38; b: 29.05. (CC) gcc options: -O3 -march=native -fopenmp
SPECFEM3D 4.0 Model: Homogeneous Halfspace (Seconds, Fewer Is Better). SE +/- 0.20, N = 3. 5 2p a: 14.49; 5 2p b: 14.42; 5 a: 24.99; 5 b: 25.09; a: 17.21; b: 16.82. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Timed FFmpeg Compilation 6.0 Time To Compile (Seconds, Fewer Is Better). SE +/- 0.03, N = 3. 5 2p a: 15.46; 5 2p b: 15.34; 5 a: 20.38; 5 b: 20.64; a: 16.93; b: 17.04.
SVT-AV1 1.5 Encoder Mode: Preset 4 - Input: Bosphorus 1080p (Frames Per Second, More Is Better). SE +/- 0.02, N = 3. 5 2p a: 10.41; 5 2p b: 10.50; 5 a: 10.76; 5 b: 10.83; a: 10.30; b: 10.23. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SPECFEM3D 4.0 Model: Tomographic Model (Seconds, Fewer Is Better). SE +/- 0.15, N = 3. 5 2p a: 11.22; 5 2p b: 11.80; 5 a: 19.09; 5 b: 19.63; a: 13.86; b: 13.53. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
oneDNN 3.1 Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). 5 2p a: 10.52210 (MIN: 8.88); 5 2p b: 10.36850 (MIN: 8.54); 5 a: 4.49565 (MIN: 3.9); b: 6.92145 (MIN: 6.32). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
ASKAP 1.0 Test: tConvolve MPI - Gridding (Mpix/sec, More Is Better). 5 2p a: 49980.1; 5 2p b: 51199.1; 5 a: 24990.1; b: 36827.5. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP 1.0 Test: tConvolve MPI - Degridding (Mpix/sec, More Is Better). 5 2p a: 41983.3; 5 2p b: 41983.3; 5 a: 20991.7; b: 32799.5. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
SPECFEM3D 4.0 Model: Mount St. Helens (Seconds, Fewer Is Better). SE +/- 0.038048290, N = 3. 5 2p a: 9.609453868; 5 2p b: 9.594769350; 5 a: 18.862199070; 5 b: 19.355553092; a: 11.678414616; b: 11.753321460. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Embree 4.0.1 Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better). SE +/- 0.11, N = 3. 5 2p a: 73.33 (MIN: 71.99 / MAX: 75.06); 5 2p b: 73.26 (MIN: 72.13 / MAX: 74.82); 5 a: 40.24 (MIN: 39.64 / MAX: 40.95); 5 b: 40.12 (MIN: 39.64 / MAX: 40.56); a: 63.37 (MIN: 62.18 / MAX: 66.58); b: 63.36 (MIN: 62.44 / MAX: 66.52).
Xcompact3d Incompact3d 2021-03-11 Input: input.i3d 193 Cells Per Direction (Seconds, Fewer Is Better). 5 2p a: 10.62; 5 2p b: 10.76; 5 a: 19.63; b: 17.21. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Embree 4.0.1 Binary: Pathtracer - Model: Crown (Frames Per Second, More Is Better). SE +/- 0.05, N = 3. 5 2p a: 81.45 (MIN: 80.49 / MAX: 83.06); 5 2p b: 81.09 (MIN: 80.03 / MAX: 82.56); 5 a: 43.95 (MIN: 43.49 / MAX: 44.42); 5 b: 43.98 (MIN: 43.29 / MAX: 44.67); a: 69.11 (MIN: 68.18 / MAX: 71.75).
oneDNN 3.1 Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). 5 2p a: 2.67801 (MIN: 1.89); 5 2p b: 2.88451 (MIN: 1.79); 5 a: 1.37996 (MIN: 1.25); b: 1.26107 (MIN: 1.06). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Embree 4.0.1 Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, More Is Better). SE +/- 0.14, N = 3. 5 2p a: 76.81 (MIN: 76.07 / MAX: 77.73); 5 2p b: 76.63 (MIN: 75.97 / MAX: 78.17); 5 a: 48.52 (MIN: 48.32 / MAX: 48.91); 5 b: 48.51 (MIN: 48.29 / MAX: 48.79); a: 78.48 (MIN: 77.64 / MAX: 80.46).
Embree 4.0.1 Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better). SE +/- 0.01, N = 3. 5 2p a: 73.96 (MIN: 73.12 / MAX: 75.27); 5 2p b: 73.78 (MIN: 72.91 / MAX: 74.82); 5 a: 44.91 (MIN: 44.68 / MAX: 45.26); 5 b: 45.11 (MIN: 44.88 / MAX: 45.48); a: 73.84 (MIN: 73.31 / MAX: 75.9); b: 73.83 (MIN: 73.34 / MAX: 74.66).
ASKAP 1.0 Test: Hogbom Clean OpenMP (Iterations Per Second, More Is Better). 5 2p a: 436.68; 5 2p b: 434.78; 5 a: 869.57; b: 806.45. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
CloverLeaf Lagrangian-Eulerian Hydrodynamics (Seconds, Fewer Is Better). 5 2p a: 15.67; 5 2p b: 15.51; 5 a: 12.09; b: 11.34. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
LULESH 2.0.3 (z/s, More Is Better). 5 2p a: 42030.04; 5 2p b: 42551.90; 5 a: 20920.49; b: 22257.14. (CXX) g++ options: -O3 -fopenmp -lm -lmpi_cxx -lmpi
SVT-AV1 1.5 Encoder Mode: Preset 8 - Input: Bosphorus 1080p (Frames Per Second, More Is Better). SE +/- 1.23, N = 5. 5 2p a: 107.10; 5 2p b: 104.17; 5 a: 107.38; 5 b: 108.36; a: 109.98; b: 109.71. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Pennant 1.0.1 Test: sedovbig (Hydro Cycle Time - Seconds, Fewer Is Better). 5 2p a: 7.656369; 5 2p b: 7.793325; 5 a: 14.001110; b: 9.563610. (CXX) g++ options: -fopenmp -lmpi_cxx -lmpi
Google Draco 1.5.6 Model: Church Facade (ms, Fewer Is Better). SE +/- 3.21, N = 3. 5 2p a: 5682; 5 2p b: 5729; 5 a: 5718; a: 5793; b: 5833. (CXX) g++ options: -O3
oneDNN 3.1 Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). 5 2p a: 0.911806 (MIN: 0.78); 5 2p b: 0.917226 (MIN: 0.77); 5 a: 0.625930 (MIN: 0.57); b: 1.246180 (MIN: 1.13). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
ASKAP 1.0 Test: tConvolve OpenMP - Degridding (Million Grid Points Per Second, More Is Better). 5 2p a: 16641.0; 5 2p b: 15662.1; 5 a: 19018.3; b: 19018.3. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP 1.0 Test: tConvolve OpenMP - Gridding (Million Grid Points Per Second, More Is Better). 5 2p a: 12102.5; 5 2p b: 12678.9; 5 a: 17750.4; b: 20481.2. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
Google Draco 1.5.6 Model: Lion (ms, Fewer Is Better). SE +/- 7.02, N = 3. 5 2p a: 4734; 5 2p b: 4818; 5 a: 4760; a: 4867; b: 4885. (CXX) g++ options: -O3
SVT-AV1 1.5 Encoder Mode: Preset 12 - Input: Bosphorus 4K (Frames Per Second, More Is Better). SE +/- 0.73, N = 3. 5 2p a: 169.11; 5 2p b: 176.11; 5 a: 220.81; 5 b: 226.73; a: 222.84; b: 214.93. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 1.5 Encoder Mode: Preset 13 - Input: Bosphorus 4K (Frames Per Second, More Is Better). SE +/- 0.70, N = 3. 5 2p a: 169.66; 5 2p b: 172.57; 5 a: 199.69; 5 b: 196.06; a: 195.92; b: 198.14. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
oneDNN 3.1 Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). 5 2p a: 0.674336 (MIN: 0.62); 5 2p b: 0.669349 (MIN: 0.6); 5 a: 1.146560 (MIN: 1.09); b: 0.827622 (MIN: 0.78). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Pennant 1.0.1 Test: leblancbig (Hydro Cycle Time - Seconds, Fewer Is Better). 5 2p a: 4.294731; 5 2p b: 4.418012; 5 a: 7.974022; b: 5.104078. (CXX) g++ options: -fopenmp -lmpi_cxx -lmpi
Xcompact3d Incompact3d 2021-03-11 Input: input.i3d 129 Cells Per Direction (Seconds, Fewer Is Better). 5 2p a: 2.48654294; 5 2p b: 2.48386908; 5 a: 5.06756401; b: 4.43804121. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
oneDNN 3.1 Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). 5 2p a: 1.85167 (MIN: 1.61); 5 2p b: 1.89859 (MIN: 1.58); 5 a: 2.76234 (MIN: 2.59); b: 3.12487 (MIN: 2.06). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
SVT-AV1 1.5 Encoder Mode: Preset 12 - Input: Bosphorus 1080p (Frames Per Second, More Is Better). SE +/- 2.97, N = 3. 5 2p a: 540.73; 5 2p b: 565.06; 5 a: 629.30; 5 b: 622.73; a: 604.84; b: 601.98. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 1.5 Encoder Mode: Preset 13 - Input: Bosphorus 1080p (Frames Per Second, More Is Better). SE +/- 5.77, N = 3. 5 2p a: 525.15; 5 2p b: 546.03; 5 a: 571.08; 5 b: 560.57; a: 552.24; b: 549.83. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
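Relative scaling between configurations can be worked out from the values listed in the results. The Python sketch below is illustrative only and is not part of the Phoronix Test Suite output. It assumes, going by the system table in the header, that "5 a" is a single EPYC 7573X run and "5 2p a" is the dual-socket 7573X run. The `speedup` helper and the `samples` list are hypothetical names, and the three sample values are copied from the OpenFOAM, LULESH and John The Ripper bcrypt results.

```python
def speedup(baseline: float, candidate: float, lower_is_better: bool) -> float:
    """Return how many times faster `candidate` is than `baseline`.

    For "Fewer Is Better" metrics (seconds, ms) the ratio is baseline / candidate;
    for "More Is Better" metrics (FPS, MB/s, C/S) it is candidate / baseline.
    """
    if baseline <= 0 or candidate <= 0:
        raise ValueError("benchmark values must be positive")
    return baseline / candidate if lower_is_better else candidate / baseline


# (test, lower_is_better, "5 a" value, "5 2p a" value), copied from the results above.
samples = [
    ("OpenFOAM drivaerFastback Small, Execution Time (s)", True, 50.75, 33.00),
    ("LULESH (z/s)", False, 20920.49, 42030.04),
    ("John The Ripper bcrypt (Real C/S)", False, 60192, 118502),
]

for name, lower_is_better, one_socket, two_socket in samples:
    ratio = speedup(one_socket, two_socket, lower_is_better)
    print(f"{name}: 2P vs 1P = {ratio:.2f}x")
```

A ratio near 2.0 suggests the test scales almost linearly with the second socket. A ratio near or below 1.0, as the Zstd and SVT-AV1 values would give, suggests it does not benefit from the additional cores.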
Phoronix Test Suite v10.8.5