Benchmarks by Michael Larabel for a future article.
Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2212240-NE-AMDEPYCGE62

AMD EPYC Genoa Memory Scaling - Phoronix Test Suite
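The comparison command above can be run as shown below. The result identifier is taken from this file; the package-installation step is an assumption (install methods vary by distribution, and the suite can also be run from a git checkout):

```shell
# Assumption: phoronix-test-suite is packaged for your distribution
# (shown for Debian/Ubuntu); otherwise run it from a source checkout.
sudo apt-get install phoronix-test-suite

# Re-run the same tests locally and compare against this result file:
phoronix-test-suite benchmark 2212240-NE-AMDEPYCGE62
```

The suite will prompt to download and build each test profile before running, then overlay your numbers on the 12c/10c/8c/6c results.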
HTML result view exported from: https://openbenchmarking.org/result/2212240-NE-AMDEPYCGE62&grr.
System configurations under test - the result identifiers 12c, 10c, 8c, and 6c correspond to the number of DDR5 memory channels populated (and hence total memory capacity); all other hardware and software is identical:

    Processor: 2 x AMD EPYC 9654 96-Core @ 2.40GHz (192 Cores / 384 Threads)
    Motherboard: AMD Titanite_4G (RTI1002E BIOS)
    Chipset: AMD Device 14a4
    Memory: 1520GB (12c) / 1264GB (10c) / 1008GB (8c) / 768GB (6c)
    Disk: 800GB INTEL SSDPF21Q800GB
    Graphics: ASPEED
    Monitor: VGA HDMI
    Network: Broadcom NetXtreme BCM5720 PCIe
    OS: Ubuntu 22.10
    Kernel: 6.1.0-phx (x86_64)
    Desktop: GNOME Shell 43.0
    Display Server: X Server 1.21.1.4
    Vulkan: 1.3.224
    Compiler: GCC 12.2.0 + Clang 15.0.2-1
    File-System: ext4
    Screen Resolution: 1920x1080

Kernel Details: Transparent Huge Pages: madvise
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: acpi-cpufreq performance (Boost: Enabled); CPU Microcode: 0xa10110d
Java Details: OpenJDK Runtime Environment (build 11.0.17+8-post-Ubuntu-1ubuntu2)
Python Details: Python 3.10.7
Security Details: itlb_multihit: Not affected; l1tf: Not affected; mds: Not affected; meltdown: Not affected; mmio_stale_data: Not affected; retbleed: Not affected; spec_store_bypass: Mitigation of SSB disabled via prctl; spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization; spectre_v2: Mitigation of Retpolines, IBPB: conditional, IBRS_FW, STIBP: always-on, RSB filling, PBRSB-eIBRS: Not affected; srbds: Not affected; tsx_async_abort: Not affected
[Flattened results-overview table omitted: the export collapsed the full test index and all raw 12c/10c/8c/6c result values into one unreadable block. The individual results are presented per-test below, and the complete data set is available from the OpenBenchmarking.org result link above.]
WRF 4.2.2 - Input: conus 2.5km (Seconds, Fewer Is Better)
    12c: 4070.19  |  10c: 4563.18  |  8c: 6551.88  |  6c: 7432.66
    1. (F9X) gfortran options: -O2 -ftree-vectorize -funroll-loops -ffree-form -fconvert=big-endian -frecord-marker=4 -fallow-invalid-boz -lesmf_time -lwrfio_nf -lnetcdff -lnetcdf -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

High Performance Conjugate Gradient 3.1 (GFLOP/s, More Is Better)
    12c: 86.81 (SE +/- 1.12, N = 12)
    10c: 48.29 (SE +/- 3.31, N = 9)
    8c:  45.00 (SE +/- 0.49, N = 9)
    6c:  36.54 (SE +/- 0.99, N = 9)
    1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

OpenVKL 1.3.1 - Benchmark: vklBenchmark ISPC (Items / Sec, More Is Better)
    12c: 1325 (SE +/- 6.93, N = 3; MIN: 329 / MAX: 4553)
    10c: 1317 (SE +/- 11.03, N = 9; MIN: 327 / MAX: 5660)
    8c:  1325 (SE +/- 8.82, N = 3; MIN: 330 / MAX: 5664)
    6c:  1212 (SE +/- 15.59, N = 3; MIN: 328 / MAX: 4115)

Xcompact3d Incompact3d 2021-03-11 - Input: X3D-benchmarking input.i3d (Seconds, Fewer Is Better)
    12c: 125.53 (SE +/- 0.14, N = 3)
    10c: 146.29 (SE +/- 0.11, N = 3)
    8c:  270.09 (SE +/- 2.69, N = 9)
    6c:  348.88 (SE +/- 4.79, N = 9)
    1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

NWChem 7.0.2 - Input: C240 Buckyball (Seconds, Fewer Is Better)
    12c: 1537.1  |  10c: 1531.0  |  8c: 1519.6  |  6c: 1517.9
    1. (F9X) gfortran options: -lnwctask -lccsd -lmcscf -lselci -lmp2 -lmoints -lstepper -ldriver -loptim -lnwdft -lgradients -lcphf -lesp -lddscf -ldangchang -lguess -lhessian -lvib -lnwcutil -lrimp2 -lproperty -lsolvation -lnwints -lprepar -lnwmd -lnwpw -lofpw -lpaw -lpspw -lband -lnwpwlib -lcafe -lspace -lanalyze -lqhop -lpfft -ldplot -ldrdy -lvscf -lqmmm -lqmd -letrans -ltce -lbq -lmm -lcons -lperfm -ldntmc -lccca -ldimqm -lga -larmci -lpeigs -l64to32 -lopenblas -lpthread -lrt -llapack -lnwcblas -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz -lcomex -m64 -ffast-math -std=legacy -fdefault-integer-8 -finline-functions -O2
CockroachDB 22.2 - Workload: KV, 10% Reads - Concurrency: 512 (ops/s, More Is Better)
    12c: 35970.0 (SE +/- 343.66, N = 15)
    10c: 35993.1 (SE +/- 270.36, N = 15)
    8c:  34832.9 (SE +/- 351.71, N = 6)
    6c:  35742.3 (SE +/- 438.30, N = 15)

CockroachDB 22.2 - Workload: KV, 95% Reads - Concurrency: 1024 (ops/s, More Is Better)
    12c: 64661.8 (SE +/- 575.30, N = 3)
    10c: 62029.8 (SE +/- 1142.40, N = 15)
    8c:  58195.5 (SE +/- 1317.65, N = 15)
    6c:  60137.3 (SE +/- 1310.27, N = 15)

CockroachDB 22.2 - Workload: KV, 60% Reads - Concurrency: 512 (ops/s, More Is Better)
    12c: 52330.1 (SE +/- 268.61, N = 3)
    10c: 51748.8 (SE +/- 620.92, N = 15)
    8c:  52515.2 (SE +/- 411.73, N = 13)
    6c:  51275.1 (SE +/- 555.56, N = 15)

CockroachDB 22.2 - Workload: KV, 50% Reads - Concurrency: 1024 (ops/s, More Is Better)
    12c: 47465.5 (SE +/- 366.75, N = 15)
    10c: 48449.0 (SE +/- 380.16, N = 3)
    8c:  47498.1 (SE +/- 468.66, N = 15)
    6c:  47593.9 (SE +/- 391.13, N = 9)

OSPRay 2.10 - Benchmark: particle_volume/scivis/real_time (Items Per Second, More Is Better)
    12c: 42.80 (SE +/- 0.05, N = 3)
    10c: 43.00 (SE +/- 0.01, N = 3)
    8c:  43.84 (SE +/- 0.03, N = 3)
    6c:  43.24 (SE +/- 0.06, N = 3)

CockroachDB 22.2 - Workload: KV, 50% Reads - Concurrency: 512 (ops/s, More Is Better)
    12c: 47621.9 (SE +/- 464.03, N = 15)
    10c: 49102.7 (SE +/- 514.54, N = 3)
    8c:  47596.6 (SE +/- 454.84, N = 15)
    6c:  47428.0 (SE +/- 32.88, N = 3)

CockroachDB 22.2 - Workload: KV, 95% Reads - Concurrency: 512 (ops/s, More Is Better)
    12c: 64467.6 (SE +/- 702.29, N = 3)
    10c: 60769.7 (SE +/- 1044.13, N = 15)
    8c:  64111.9 (SE +/- 890.57, N = 3)
    6c:  62666.5 (SE +/- 813.26, N = 15)

OSPRay 2.10 - Benchmark: particle_volume/pathtracer/real_time (Items Per Second, More Is Better)
    12c: 229.27 (SE +/- 1.54, N = 3)
    10c: 230.28 (SE +/- 1.94, N = 3)
    8c:  228.58 (SE +/- 1.74, N = 3)
    6c:  230.44 (SE +/- 0.59, N = 3)
RELION 3.1.1 - Test: Basic - Device: CPU (Seconds, Fewer Is Better)
    12c: 128.10 (SE +/- 1.38, N = 5)
    10c: 151.40 (SE +/- 1.86, N = 4)
    8c:  221.34 (SE +/- 2.88, N = 3)
    6c:  258.50 (SE +/- 2.59, N = 6)
    1. (CXX) g++ options: -fopenmp -std=c++0x -O3 -rdynamic -ldl -ltiff -lfftw3f -lfftw3 -lpng -lmpi_cxx -lmpi

TensorFlow 2.10 - Device: CPU - Batch Size: 256 - Model: ResNet-50 (images/sec, More Is Better)
    12c: 109.13 (SE +/- 0.48, N = 3)
    10c: 105.91 (SE +/- 0.36, N = 3)
    8c:  105.01 (SE +/- 0.48, N = 3)
    6c:  95.67 (SE +/- 0.26, N = 3)

LuxCoreRender 2.6 - Scene: Danish Mood - Acceleration: CPU (M samples/sec, More Is Better)
    12c: 9.69 (SE +/- 0.09, N = 15; MIN: 4 / MAX: 12.39)
    10c: 9.62 (SE +/- 0.17, N = 12; MIN: 3.97 / MAX: 12.9)
    8c:  9.56 (SE +/- 0.11, N = 15; MIN: 3.94 / MAX: 12.41)
    6c:  9.49 (SE +/- 0.14, N = 12; MIN: 3.85 / MAX: 12.15)

ONNX Runtime 1.11 - Model: fcn-resnet101-11 - Device: CPU - Executor: Standard (Inferences Per Minute, More Is Better)
    12c: 254 (SE +/- 2.33, N = 7)
    10c: 255 (SE +/- 3.09, N = 3)
    8c:  257 (SE +/- 2.84, N = 5)
    6c:  253 (SE +/- 2.17, N = 12)
    1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt

LuxCoreRender 2.6 - Scene: Orange Juice - Acceleration: CPU (M samples/sec, More Is Better)
    12c: 28.82 (SE +/- 0.63, N = 15; MIN: 23.01 / MAX: 45.86)
    10c: 28.19 (SE +/- 0.29, N = 3; MIN: 23.3 / MAX: 45.65)
    8c:  29.04 (SE +/- 0.72, N = 15; MIN: 22.62 / MAX: 45.48)
    6c:  28.90 (SE +/- 0.71, N = 15; MIN: 22.4 / MAX: 44.91)

oneDNN 3.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
    12c: 2275.86 (SE +/- 24.22, N = 3; MIN: 2213.34)
    10c: 2325.71 (SE +/- 25.04, N = 15; MIN: 2171.69)
    8c:  2371.78 (SE +/- 25.14, N = 15; MIN: 2234.23)
    6c:  2471.57 (SE +/- 31.16, N = 3; MIN: 2410.73)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenRadioss 2022.10.13 - Model: Bird Strike on Windshield (Seconds, Fewer Is Better)
    12c: 216.88 (SE +/- 0.38, N = 3)
    10c: 218.22 (SE +/- 0.54, N = 3)
    8c:  219.45 (SE +/- 0.19, N = 3)
    6c:  219.10 (SE +/- 0.14, N = 3)

Apache Cassandra 4.0 - Test: Writes (Op/s, More Is Better)
    12c: 251793 (SE +/- 3742.45, N = 12)
    10c: 243603 (SE +/- 2429.87, N = 3)
    8c:  240854 (SE +/- 1899.17, N = 3)
    6c:  246882 (SE +/- 2957.03, N = 3)
oneDNN 3.0 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
    12c: 1968.70 (SE +/- 31.84, N = 15; MIN: 1632.62)
    10c: 2030.72 (SE +/- 14.89, N = 3; MIN: 1981.15)
    8c:  1982.15 (SE +/- 28.30, N = 3; MIN: 1911.33)
    6c:  2072.57 (SE +/- 16.27, N = 10; MIN: 1942.14)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OSPRay 2.10 - Benchmark: particle_volume/ao/real_time (Items Per Second, More Is Better)
    12c: 43.71 (SE +/- 0.04, N = 3)
    10c: 43.03 (SE +/- 0.04, N = 3)
    8c:  43.97 (SE +/- 0.01, N = 3)
    6c:  43.36 (SE +/- 0.04, N = 3)

CockroachDB 22.2 - Workload: KV, 60% Reads - Concurrency: 1024 (ops/s, More Is Better)
    12c: 52573.3 (SE +/- 239.52, N = 3)
    10c: 51959.5 (SE +/- 400.61, N = 10)
    8c:  52559.0 (SE +/- 447.89, N = 3)
    6c:  52626.4 (SE +/- 448.33, N = 3)

oneDNN 3.0 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
    12c: 2344.29 (SE +/- 21.01, N = 3; MIN: 2288.85)
    10c: 2438.00 (SE +/- 30.76, N = 3; MIN: 2353.97)
    8c:  2375.45 (SE +/- 21.41, N = 3; MIN: 2319.45)
    6c:  2479.62 (SE +/- 25.74, N = 15; MIN: 2293.49)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

Graph500 3.0 - Scale: 26 (sssp median_TEPS, More Is Better)
    12c: 565152000  |  10c: 574018000  |  8c: 531854000  |  6c: 392496000
    1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

Timed Linux Kernel Compilation 6.1 - Build: allmodconfig (Seconds, Fewer Is Better)
    12c: 147.15 (SE +/- 0.90, N = 3)
    10c: 145.41 (SE +/- 0.72, N = 3)
    8c:  147.38 (SE +/- 1.03, N = 3)
    6c:  145.77 (SE +/- 0.14, N = 3)

Timed Gem5 Compilation 21.2 - Time To Compile (Seconds, Fewer Is Better)
    12c: 139.24 (SE +/- 0.16, N = 3)
    10c: 134.37 (SE +/- 0.36, N = 3)
    8c:  136.79 (SE +/- 0.77, N = 3)
    6c:  134.70 (SE +/- 0.57, N = 3)

OSPRay 2.10 - Benchmark: gravity_spheres_volume/dim_512/scivis/real_time (Items Per Second, More Is Better)
    12c: 43.13 (SE +/- 0.15, N = 3)
    10c: 43.33 (SE +/- 0.12, N = 3)
    8c:  43.43 (SE +/- 0.13, N = 3)
    6c:  43.29 (SE +/- 0.15, N = 3)

OSPRay 2.10 - Benchmark: gravity_spheres_volume/dim_512/ao/real_time (Items Per Second, More Is Better)
    12c: 43.98 (SE +/- 0.13, N = 3)
    10c: 44.00 (SE +/- 0.04, N = 3)
    8c:  44.23 (SE +/- 0.10, N = 3)
    6c:  44.27 (SE +/- 0.07, N = 3)

OSPRay 2.10 - Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time (Items Per Second, More Is Better)
    12c: 53.77 (SE +/- 0.50, N = 3)
    10c: 54.41 (SE +/- 0.12, N = 3)
    8c:  54.51 (SE +/- 0.08, N = 3)
    6c:  54.61 (SE +/- 0.04, N = 3)
CockroachDB 22.2 - Workload: KV, 10% Reads - Concurrency: 1024 (ops/s, More Is Better)
    12c: 36846.9 (SE +/- 155.07, N = 3)
    10c: 35776.8 (SE +/- 346.25, N = 3)
    8c:  36685.7 (SE +/- 322.68, N = 3)
    6c:  36329.6 (SE +/- 206.35, N = 3)

Neural Magic DeepSparse 1.1 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
    12c: 125.72 (SE +/- 0.11, N = 3)
    10c: 128.92 (SE +/- 0.43, N = 3)
    8c:  135.62 (SE +/- 0.38, N = 3)
    6c:  166.43 (SE +/- 1.66, N = 15)

Neural Magic DeepSparse 1.1 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
    12c: 761.49 (SE +/- 0.72, N = 3)
    10c: 742.80 (SE +/- 2.41, N = 3)
    8c:  705.71 (SE +/- 2.11, N = 3)
    6c:  575.75 (SE +/- 6.13, N = 15)

Timed Node.js Compilation 18.8 - Time To Compile (Seconds, Fewer Is Better)
    12c: 101.47 (SE +/- 0.26, N = 3)
    10c: 101.94 (SE +/- 0.29, N = 3)
    8c:  101.15 (SE +/- 0.22, N = 3)
    6c:  102.78 (SE +/- 0.06, N = 3)

OpenRadioss 2022.10.13 - Model: INIVOL and Fluid Structure Interaction Drop Container (Seconds, Fewer Is Better)
    12c: 81.57 (SE +/- 0.14, N = 3)
    10c: 81.15 (SE +/- 0.08, N = 3)
    8c:  81.09 (SE +/- 0.12, N = 3)
    6c:  80.81 (SE +/- 0.08, N = 3)

OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, Fewer Is Better)
    12c: 0.55 (SE +/- 0.00, N = 3; MIN: 0.5 / MAX: 34.71)
    10c: 0.55 (SE +/- 0.00, N = 10; MIN: 0.5 / MAX: 41.23)
    8c:  0.55 (SE +/- 0.00, N = 3; MIN: 0.5 / MAX: 30.68)
    6c:  0.54 (SE +/- 0.00, N = 3; MIN: 0.5 / MAX: 34.19)
    1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS, More Is Better)
    12c: 147769.26 (SE +/- 745.28, N = 3)
    10c: 147717.32 (SE +/- 1134.97, N = 10)
    8c:  152292.39 (SE +/- 994.61, N = 3)
    6c:  151213.17 (SE +/- 365.43, N = 3)
    1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

CockroachDB 22.2 - Workload: MoVR - Concurrency: 512 (ops/s, More Is Better)
    12c: 948.5 (SE +/- 3.38, N = 3)
    10c: 949.6 (SE +/- 3.66, N = 3)
    8c:  960.3 (SE +/- 9.03, N = 3)
    6c:  954.7 (SE +/- 4.87, N = 3)

CockroachDB 22.2 - Workload: MoVR - Concurrency: 1024 (ops/s, More Is Better)
    12c: 953.8 (SE +/- 1.42, N = 3)
    10c: 949.5 (SE +/- 0.58, N = 3)
    8c:  946.9 (SE +/- 3.18, N = 3)
    6c:  952.7 (SE +/- 1.56, N = 3)
OpenFOAM 10 - Input: drivaerFastback, Medium Mesh Size - Execution Time (Seconds, Fewer Is Better)
    12c: 109.54  |  10c: 117.94  |  8c: 166.15  |  6c: 227.90
    1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

nginx 1.23.2 - Connections: 500 (Requests Per Second, More Is Better)
    12c: 201032.06 (SE +/- 291.63, N = 3)
    10c: 198858.66 (SE +/- 335.64, N = 3)
    8c:  197081.98 (SE +/- 453.48, N = 3)
    6c:  196805.30 (SE +/- 113.87, N = 3)
    1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

Timed Linux Kernel Compilation 6.1 - Build: defconfig (Seconds, Fewer Is Better)
    12c: 25.50 (SE +/- 0.19, N = 11)
    10c: 25.41 (SE +/- 0.21, N = 14)
    8c:  25.53 (SE +/- 0.21, N = 9)
    6c:  24.75 (SE +/- 0.22, N = 7)

simdjson 2.0 - Throughput Test: TopTweet (GB/s, More Is Better)
    12c: 6.59 (SE +/- 0.01, N = 3)
    10c: 6.49 (SE +/- 0.07, N = 6)
    8c:  6.57 (SE +/- 0.01, N = 3)
    6c:  6.55 (SE +/- 0.00, N = 3)
    1. (CXX) g++ options: -O3

OpenRadioss 2022.10.13 - Model: Bumper Beam (Seconds, Fewer Is Better)
    12c: 79.86 (SE +/- 0.79, N = 3)
    10c: 79.70 (SE +/- 0.75, N = 3)
    8c:  79.20 (SE +/- 0.70, N = 3)
    6c:  79.62 (SE +/- 0.71, N = 3)

Blender 3.4 - Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better)
    12c: 81.03 (SE +/- 0.21, N = 3)
    10c: 80.37 (SE +/- 0.15, N = 3)
    8c:  80.18 (SE +/- 0.24, N = 3)
    6c:  79.93 (SE +/- 0.31, N = 3)

Neural Magic DeepSparse 1.1 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
    12c: 111.89 (SE +/- 0.07, N = 3)
    10c: 113.41 (SE +/- 0.08, N = 3)
    8c:  123.86 (SE +/- 0.19, N = 3)
    6c:  150.92 (SE +/- 1.56, N = 15)

Neural Magic DeepSparse 1.1 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
    12c: 856.02 (SE +/- 0.57, N = 3)
    10c: 844.43 (SE +/- 0.53, N = 3)
    8c:  773.07 (SE +/- 1.22, N = 3)
    6c:  635.02 (SE +/- 6.69, N = 15)
OpenVINO 2022.3 - Model: Machine Translation EN To DE FP16 - Device: CPU (ms, Fewer Is Better)
    12c: 49.98 (SE +/- 0.12, N = 3; MIN: 38.24 / MAX: 187.97)
    10c: 51.29 (SE +/- 0.08, N = 3; MIN: 40.28 / MAX: 292.83)
    8c:  54.80 (SE +/- 0.57, N = 6; MIN: 40.7 / MAX: 276.86)
    6c:  58.67 (SE +/- 0.37, N = 3; MIN: 43.56 / MAX: 315.05)
    1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO 2022.3 - Model: Machine Translation EN To DE FP16 - Device: CPU (FPS, More Is Better)
    12c: 959.16 (SE +/- 2.32, N = 3)
    10c: 934.71 (SE +/- 1.48, N = 3)
    8c:  875.39 (SE +/- 8.79, N = 6)
    6c:  817.27 (SE +/- 5.14, N = 3)
    1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

Timed LLVM Compilation 13.0 - Build System: Ninja (Seconds, Fewer Is Better)
    12c: 75.66 (SE +/- 0.23, N = 3)
    10c: 75.44 (SE +/- 0.21, N = 3)
    8c:  75.73 (SE +/- 0.09, N = 3)
    6c:  76.75 (SE +/- 0.06, N = 3)

simdjson 2.0 - Throughput Test: DistinctUserID (GB/s, More Is Better)
    12c: 6.86 (SE +/- 0.02, N = 3)
    10c: 6.84 (SE +/- 0.02, N = 3)
    8c:  6.86 (SE +/- 0.01, N = 3)
    6c:  6.83 (SE +/- 0.02, N = 3)
    1. (CXX) g++ options: -O3

simdjson 2.0 - Throughput Test: PartialTweets (GB/s, More Is Better)
    12c: 5.65 (SE +/- 0.01, N = 3)
    10c: 5.67 (SE +/- 0.02, N = 3)
    8c:  5.66 (SE +/- 0.01, N = 3)
    6c:  5.69 (SE +/- 0.01, N = 3)
    1. (CXX) g++ options: -O3
OpenVINO 2022.3 - Model: Person Detection FP16 - Device: CPU (ms, Fewer Is Better)
12c: 1109.45 (SE +/- 3.30, N = 3, MIN 810.74 / MAX 1835.01) | 10c: 1110.44 (SE +/- 2.71, N = 3, MIN 769.04 / MAX 1860.23) | 8c: 1119.79 (SE +/- 3.60, N = 3, MIN 808.33 / MAX 1875.91) | 6c: 1153.70 (SE +/- 4.42, N = 3, MIN 853.88 / MAX 1939.06)

OpenVINO 2022.3 - Model: Person Detection FP16 - Device: CPU (FPS, More Is Better)
12c: 42.98 (SE +/- 0.13, N = 3) | 10c: 42.94 (SE +/- 0.12, N = 3) | 8c: 42.59 (SE +/- 0.15, N = 3) | 6c: 41.33 (SE +/- 0.17, N = 3)

OpenVINO 2022.3 - Model: Person Detection FP32 - Device: CPU (ms, Fewer Is Better)
12c: 1110.68 (SE +/- 8.73, N = 3, MIN 833.53 / MAX 1865.19) | 10c: 1104.59 (SE +/- 5.36, N = 3, MIN 807.38 / MAX 1818.79) | 8c: 1129.01 (SE +/- 0.54, N = 3, MIN 850.94 / MAX 1870.94) | 6c: 1150.54 (SE +/- 1.87, N = 3, MIN 870.26 / MAX 1902.46)

OpenVINO 2022.3 - Model: Person Detection FP32 - Device: CPU (FPS, More Is Better)
12c: 42.95 (SE +/- 0.32, N = 3) | 10c: 43.18 (SE +/- 0.20, N = 3) | 8c: 42.22 (SE +/- 0.01, N = 3) | 6c: 41.44 (SE +/- 0.07, N = 3)

OpenVINO 2022.3 - Model: Face Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better)
12c: 250.34 (SE +/- 0.32, N = 3, MIN 222.95 / MAX 301.42) | 10c: 249.12 (SE +/- 0.03, N = 3, MIN 209.28 / MAX 311.3) | 8c: 249.26 (SE +/- 0.69, N = 3, MIN 207.76 / MAX 340.53) | 6c: 250.49 (SE +/- 0.13, N = 3, MIN 213.3 / MAX 307.84)

OpenVINO 2022.3 - Model: Face Detection FP16-INT8 - Device: CPU (FPS, More Is Better)
12c: 191.43 (SE +/- 0.21, N = 3) | 10c: 192.30 (SE +/- 0.03, N = 3) | 8c: 192.25 (SE +/- 0.48, N = 3) | 6c: 191.29 (SE +/- 0.09, N = 3)

OpenVINO 2022.3 - Model: Face Detection FP16 - Device: CPU (ms, Fewer Is Better)
12c: 470.98 (SE +/- 0.21, N = 3, MIN 451.07 / MAX 556.04) | 10c: 469.43 (SE +/- 0.10, N = 3, MIN 432.92 / MAX 555.25) | 8c: 472.84 (SE +/- 0.27, N = 3, MIN 394.37 / MAX 553.15) | 6c: 473.69 (SE +/- 0.14, N = 3, MIN 423.34 / MAX 579.41)

OpenVINO 2022.3 - Model: Face Detection FP16 - Device: CPU (FPS, More Is Better)
12c: 101.74 (SE +/- 0.08, N = 3) | 10c: 102.01 (SE +/- 0.03, N = 3) | 8c: 101.26 (SE +/- 0.04, N = 3) | 6c: 101.08 (SE +/- 0.06, N = 3)
libavif avifenc 0.11 - Encoder Speed: 0 (Seconds, Fewer Is Better)
12c: 63.25 (SE +/- 0.18, N = 3) | 10c: 63.25 (SE +/- 0.27, N = 3) | 8c: 62.96 (SE +/- 0.03, N = 3) | 6c: 63.80 (SE +/- 0.47, N = 3)
1. (CXX) g++ options: -O3 -fPIC -lm

NAMD 2.14 - ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better)
12c: 0.12783 (SE +/- 0.00009, N = 3) | 10c: 0.12759 (SE +/- 0.00007, N = 3) | 8c: 0.12768 (SE +/- 0.00046, N = 3) | 6c: 0.12820 (SE +/- 0.00009, N = 3)
OpenVINO 2022.3 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (ms, Fewer Is Better)
12c: 5.30 (SE +/- 0.01, N = 3, MIN 4.42 / MAX 40.66) | 10c: 5.28 (SE +/- 0.00, N = 3, MIN 4.37 / MAX 41.23) | 8c: 5.26 (SE +/- 0.00, N = 3, MIN 4.42 / MAX 42.93) | 6c: 5.28 (SE +/- 0.00, N = 3, MIN 4.34 / MAX 38.93)

OpenVINO 2022.3 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (FPS, More Is Better)
12c: 9038.47 (SE +/- 9.96, N = 3) | 10c: 9063.84 (SE +/- 5.19, N = 3) | 8c: 9113.11 (SE +/- 2.85, N = 3) | 6c: 9081.73 (SE +/- 7.67, N = 3)

OpenVINO 2022.3 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better)
12c: 9.95 (SE +/- 0.00, N = 3, MIN 8.42 / MAX 52.38) | 10c: 9.91 (SE +/- 0.02, N = 3, MIN 8.4 / MAX 50.42) | 8c: 9.90 (SE +/- 0.02, N = 3, MIN 8.39 / MAX 56.99) | 6c: 9.89 (SE +/- 0.02, N = 3, MIN 8.35 / MAX 32.16)

OpenVINO 2022.3 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (FPS, More Is Better)
12c: 19171.51 (SE +/- 12.43, N = 3) | 10c: 19254.08 (SE +/- 30.88, N = 3) | 8c: 19278.93 (SE +/- 31.30, N = 3) | 6c: 19314.04 (SE +/- 33.95, N = 3)

OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (ms, Fewer Is Better)
12c: 0.97 (SE +/- 0.00, N = 3, MIN 0.85 / MAX 22.9) | 10c: 0.98 (SE +/- 0.00, N = 3, MIN 0.85 / MAX 39.82) | 8c: 0.98 (SE +/- 0.00, N = 3, MIN 0.86 / MAX 39.58) | 6c: 0.97 (SE +/- 0.00, N = 3, MIN 0.86 / MAX 33.82)

OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (FPS, More Is Better)
12c: 119606.21 (SE +/- 1214.59, N = 3) | 10c: 122938.23 (SE +/- 815.42, N = 3) | 8c: 123571.68 (SE +/- 1158.58, N = 3) | 6c: 121027.25 (SE +/- 681.80, N = 3)
Stargate Digital Audio Workstation 22.11.5 - Sample Rate: 192000 - Buffer Size: 1024 (Render Ratio, More Is Better)
12c: 2.829061 (SE +/- 0.001919, N = 3) | 10c: 2.806190 (SE +/- 0.017291, N = 3) | 8c: 2.811555 (SE +/- 0.019484, N = 3) | 6c: 2.824814 (SE +/- 0.004057, N = 3)
1. (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

OpenVINO 2022.3 - Model: Vehicle Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better)
12c: 4.35 (SE +/- 0.00, N = 3, MIN 3.52 / MAX 41.44) | 10c: 4.33 (SE +/- 0.00, N = 3, MIN 3.51 / MAX 41.25) | 8c: 4.31 (SE +/- 0.00, N = 3, MIN 3.51 / MAX 43.89) | 6c: 4.30 (SE +/- 0.00, N = 3, MIN 3.52 / MAX 43.57)

OpenVINO 2022.3 - Model: Vehicle Detection FP16-INT8 - Device: CPU (FPS, More Is Better)
12c: 11018.37 (SE +/- 1.42, N = 3) | 10c: 11066.16 (SE +/- 3.30, N = 3) | 8c: 11108.16 (SE +/- 1.79, N = 3) | 6c: 11150.32 (SE +/- 1.79, N = 3)

OpenVINO 2022.3 - Model: Vehicle Detection FP16 - Device: CPU (ms, Fewer Is Better)
12c: 6.48 (SE +/- 0.00, N = 3, MIN 5.06 / MAX 59.88) | 10c: 6.45 (SE +/- 0.01, N = 3, MIN 4.97 / MAX 59.86) | 8c: 6.49 (SE +/- 0.01, N = 3, MIN 4.93 / MAX 59.51) | 6c: 6.56 (SE +/- 0.00, N = 3, MIN 4.99 / MAX 59.46)

OpenVINO 2022.3 - Model: Vehicle Detection FP16 - Device: CPU (FPS, More Is Better)
12c: 7394.65 (SE +/- 2.30, N = 3) | 10c: 7425.10 (SE +/- 13.32, N = 3) | 8c: 7389.00 (SE +/- 6.27, N = 3) | 6c: 7306.47 (SE +/- 4.59, N = 3)
Neural Magic DeepSparse 1.1 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
12c: 1133.28 (SE +/- 0.82, N = 3) | 10c: 1133.18 (SE +/- 0.20, N = 3) | 8c: 1136.85 (SE +/- 0.88, N = 3) | 6c: 1148.50 (SE +/- 0.67, N = 3)

Neural Magic DeepSparse 1.1 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
12c: 84.35 (SE +/- 0.18, N = 3) | 10c: 84.48 (SE +/- 0.04, N = 3) | 8c: 84.21 (SE +/- 0.04, N = 3) | 6c: 82.49 (SE +/- 0.31, N = 3)

OpenVINO 2022.3 - Model: Weld Porosity Detection FP16 - Device: CPU (ms, Fewer Is Better)
12c: 4.85 (SE +/- 0.00, N = 3, MIN 4.06 / MAX 28.62) | 10c: 4.83 (SE +/- 0.00, N = 3, MIN 4.08 / MAX 28.68) | 8c: 4.82 (SE +/- 0.00, N = 3, MIN 3.98 / MAX 28.83) | 6c: 4.81 (SE +/- 0.00, N = 3, MIN 4.14 / MAX 27.29)

OpenVINO 2022.3 - Model: Weld Porosity Detection FP16 - Device: CPU (FPS, More Is Better)
12c: 9867.41 (SE +/- 2.57, N = 3) | 10c: 9900.47 (SE +/- 2.08, N = 3) | 8c: 9931.49 (SE +/- 7.50, N = 3) | 6c: 9959.38 (SE +/- 3.42, N = 3)

Neural Magic DeepSparse 1.1 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
12c: 1133.48 (SE +/- 1.25, N = 3) | 10c: 1135.18 (SE +/- 1.00, N = 3) | 8c: 1137.51 (SE +/- 1.67, N = 3) | 6c: 1148.33 (SE +/- 1.05, N = 3)

Neural Magic DeepSparse 1.1 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
12c: 84.25 (SE +/- 0.21, N = 3) | 10c: 84.27 (SE +/- 0.03, N = 3) | 8c: 84.15 (SE +/- 0.16, N = 3) | 6c: 82.26 (SE +/- 0.25, N = 3)

simdjson 2.0 - Throughput Test: Kostya (GB/s, More Is Better)
12c: 4.11 (SE +/- 0.01, N = 3) | 10c: 4.11 (SE +/- 0.01, N = 3) | 8c: 4.11 (SE +/- 0.00, N = 3) | 6c: 4.11 (SE +/- 0.01, N = 3)

Neural Magic DeepSparse 1.1 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
12c: 155.48 (SE +/- 0.46, N = 3) | 10c: 156.54 (SE +/- 0.55, N = 3) | 8c: 155.82 (SE +/- 0.27, N = 3) | 6c: 157.22 (SE +/- 0.58, N = 3)

Neural Magic DeepSparse 1.1 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
12c: 615.45 (SE +/- 1.72, N = 3) | 10c: 611.29 (SE +/- 2.48, N = 3) | 8c: 614.61 (SE +/- 1.32, N = 3) | 6c: 608.53 (SE +/- 2.24, N = 3)

DaCapo Benchmark 9.12-MR1 - Java Test: H2 (msec, Fewer Is Better)
12c: 4802 (SE +/- 53.17, N = 20) | 10c: 4832 (SE +/- 39.79, N = 20) | 8c: 4731 (SE +/- 40.50, N = 20) | 6c: 4830 (SE +/- 36.16, N = 20)

simdjson 2.0 - Throughput Test: LargeRandom (GB/s, More Is Better)
12c: 1.25 (SE +/- 0.00, N = 3) | 10c: 1.25 (SE +/- 0.00, N = 3) | 8c: 1.25 (SE +/- 0.00, N = 3) | 6c: 1.24 (SE +/- 0.00, N = 3)
Build2 0.13 - Time To Compile (Seconds, Fewer Is Better)
12c: 49.92 (SE +/- 0.04, N = 3) | 10c: 49.80 (SE +/- 0.02, N = 3) | 8c: 49.87 (SE +/- 0.20, N = 3) | 6c: 50.08 (SE +/- 0.28, N = 3)

Timed PHP Compilation 8.1.9 - Time To Compile (Seconds, Fewer Is Better)
12c: 44.52 (SE +/- 0.06, N = 3) | 10c: 44.61 (SE +/- 0.08, N = 3) | 8c: 44.58 (SE +/- 0.04, N = 3) | 6c: 44.70 (SE +/- 0.07, N = 3)

Neural Magic DeepSparse 1.1 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
12c: 80.08 (SE +/- 0.27, N = 3) | 10c: 79.71 (SE +/- 0.03, N = 3) | 8c: 79.69 (SE +/- 0.20, N = 3) | 6c: 80.44 (SE +/- 0.07, N = 3)

Neural Magic DeepSparse 1.1 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
12c: 1195.91 (SE +/- 4.04, N = 3) | 10c: 1201.14 (SE +/- 0.69, N = 3) | 8c: 1201.98 (SE +/- 3.22, N = 3) | 6c: 1190.53 (SE +/- 1.21, N = 3)

Timed GDB GNU Debugger Compilation 10.2 - Time To Compile (Seconds, Fewer Is Better)
12c: 41.71 (SE +/- 0.17, N = 3) | 10c: 42.41 (SE +/- 0.08, N = 3) | 8c: 42.41 (SE +/- 0.03, N = 3) | 6c: 43.25 (SE +/- 0.12, N = 3)

Neural Magic DeepSparse 1.1 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
12c: 48.77 (SE +/- 0.12, N = 3) | 10c: 48.74 (SE +/- 0.04, N = 3) | 8c: 49.00 (SE +/- 0.04, N = 3) | 6c: 49.63 (SE +/- 0.21, N = 3)

Neural Magic DeepSparse 1.1 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
12c: 1964.27 (SE +/- 4.95, N = 3) | 10c: 1965.56 (SE +/- 1.61, N = 3) | 8c: 1954.12 (SE +/- 1.56, N = 3) | 6c: 1930.33 (SE +/- 8.40, N = 3)

7-Zip Compression 22.01 - Test: Decompression Rating (MIPS, More Is Better)
12c: 1181435 (SE +/- 3305.67, N = 3) | 10c: 1171627 (SE +/- 5138.86, N = 3) | 8c: 1159901 (SE +/- 9235.88, N = 3) | 6c: 1177484 (SE +/- 2020.82, N = 3)
1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC

7-Zip Compression 22.01 - Test: Compression Rating (MIPS, More Is Better)
12c: 923176 (SE +/- 6636.11, N = 3) | 10c: 893433 (SE +/- 2580.44, N = 3) | 8c: 879430 (SE +/- 3797.71, N = 3) | 6c: 824926 (SE +/- 7292.38, N = 3)
nekRS 22.0 - Input: TurboPipe Periodic (FLOP/s, More Is Better)
12c: 821462000000 (SE +/- 9551971733.63, N = 3) | 10c: 786258000000 (SE +/- 7825985326.68, N = 3) | 8c: 740247000000 (SE +/- 5892587066.25, N = 3) | 6c: 659554333333 (SE +/- 1934071468.29, N = 3)
1. (CXX) g++ options: -fopenmp -O2 -march=native -mtune=native -ftree-vectorize -lmpi_cxx -lmpi

Stargate Digital Audio Workstation 22.11.5 - Sample Rate: 96000 - Buffer Size: 1024 (Render Ratio, More Is Better)
12c: 4.345890 (SE +/- 0.023689, N = 3) | 10c: 4.354556 (SE +/- 0.010431, N = 3) | 8c: 4.351402 (SE +/- 0.008144, N = 3) | 6c: 4.364767 (SE +/- 0.002133, N = 3)

Timed Godot Game Engine Compilation 3.2.3 - Time To Compile (Seconds, Fewer Is Better)
12c: 34.03 (SE +/- 0.40, N = 4) | 10c: 33.62 (SE +/- 0.04, N = 3) | 8c: 33.91 (SE +/- 0.19, N = 3) | 6c: 33.67 (SE +/- 0.11, N = 3)

libavif avifenc 0.11 - Encoder Speed: 2 (Seconds, Fewer Is Better)
12c: 34.85 (SE +/- 0.14, N = 3) | 10c: 34.91 (SE +/- 0.08, N = 3) | 8c: 34.69 (SE +/- 0.10, N = 3) | 6c: 34.87 (SE +/- 0.14, N = 3)

GPAW 22.1 - Input: Carbon Nanotube (Seconds, Fewer Is Better)
12c: 23.15 (SE +/- 0.23, N = 5) | 10c: 23.37 (SE +/- 0.13, N = 3) | 8c: 24.60 (SE +/- 0.18, N = 3) | 6c: 26.31 (SE +/- 0.20, N = 3)
1. (CC) gcc options: -shared -fwrapv -O2 -lxc -lblas -lmpi
SVT-AV1 1.4 - Encoder Mode: Preset 12 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
12c: 251.77 (SE +/- 7.35, N = 15) | 10c: 241.37 (SE +/- 7.16, N = 15) | 8c: 227.90 (SE +/- 7.53, N = 15) | 6c: 221.16 (SE +/- 9.18, N = 13)

Rodinia 3.1 - Test: OpenMP Streamcluster (Seconds, Fewer Is Better)
12c: 6.001 (SE +/- 0.089, N = 15) | 10c: 6.285 (SE +/- 0.079, N = 15) | 8c: 6.018 (SE +/- 0.078, N = 15) | 6c: 6.409 (SE +/- 0.050, N = 3)
1. (CXX) g++ options: -O2 -lOpenCL

GROMACS 2022.1 - Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better)
12c: 18.71 (SE +/- 0.03, N = 3) | 10c: 18.68 (SE +/- 0.01, N = 3) | 8c: 18.68 (SE +/- 0.01, N = 3) | 6c: 17.94 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -O3

oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
12c: 3.95471 (SE +/- 0.02537, N = 3, MIN 3.05) | 10c: 4.00938 (SE +/- 0.05885, N = 12, MIN 2.96) | 8c: 3.99305 (SE +/- 0.08932, N = 12, MIN 2.67) | 6c: 3.96488 (SE +/- 0.01788, N = 3, MIN 2.99)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

NAS Parallel Benchmarks 3.4 - Test / Class: IS.D (Total Mop/s, More Is Better)
12c: 8491.01 (SE +/- 84.88, N = 3) | 10c: 7124.92 (SE +/- 206.91, N = 12) | 8c: 6675.71 (SE +/- 134.50, N = 15) | 6c: 5690.01 (SE +/- 158.57, N = 12)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.4

Blender 3.4 - Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better)
12c: 20.92 (SE +/- 0.00, N = 3) | 10c: 20.76 (SE +/- 0.09, N = 3) | 8c: 20.68 (SE +/- 0.06, N = 3) | 6c: 20.71 (SE +/- 0.04, N = 3)

Timed Apache Compilation 2.4.41 - Time To Compile (Seconds, Fewer Is Better)
12c: 20.46 (SE +/- 0.01, N = 3) | 10c: 20.48 (SE +/- 0.01, N = 3) | 8c: 20.59 (SE +/- 0.01, N = 3) | 6c: 20.72 (SE +/- 0.01, N = 3)

Timed Mesa Compilation 21.0 - Time To Compile (Seconds, Fewer Is Better)
12c: 20.12 (SE +/- 0.07, N = 3) | 10c: 20.21 (SE +/- 0.06, N = 3) | 8c: 20.11 (SE +/- 0.07, N = 3) | 6c: 20.16 (SE +/- 0.05, N = 3)
Liquid-DSP 2021.01.31 - Threads: 384 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
12c: 10347000000 (SE +/- 4582575.69, N = 3) | 10c: 10352666667 (SE +/- 4409585.52, N = 3) | 8c: 10349666667 (SE +/- 5783117.19, N = 3) | 6c: 10349000000 (SE +/- 3214550.25, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP 2021.01.31 - Threads: 256 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
12c: 10347000000 (SE +/- 4618802.15, N = 3) | 10c: 10340000000 (SE +/- 5196152.42, N = 3) | 8c: 10337666667 (SE +/- 4333333.33, N = 3) | 6c: 10340333333 (SE +/- 3844187.53, N = 3)

Intel Open Image Denoise 1.4.0 - Run: RTLightmap.hdr.4096x4096 (Images / Sec, More Is Better)
12c: 1.65 (SE +/- 0.00, N = 3) | 10c: 1.63 (SE +/- 0.00, N = 3) | 8c: 1.64 (SE +/- 0.01, N = 3) | 6c: 1.54 (SE +/- 0.01, N = 3)

miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM2 (Billion Interactions/s, More Is Better)
12c: 345.61 (SE +/- 1.09, N = 3) | 10c: 346.68 (SE +/- 1.26, N = 3) | 8c: 344.64 (SE +/- 2.53, N = 3) | 6c: 346.08 (SE +/- 3.87, N = 3)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM2 (GFInst/s, More Is Better)
12c: 8640.31 (SE +/- 27.15, N = 3) | 10c: 8666.98 (SE +/- 31.49, N = 3) | 8c: 8615.97 (SE +/- 63.13, N = 3) | 6c: 8651.92 (SE +/- 96.81, N = 3)

oneDNN 3.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
12c: 0.446930 (SE +/- 0.005042, N = 3, MIN 0.38) | 10c: 0.463454 (SE +/- 0.005241, N = 4, MIN 0.38) | 8c: 0.465796 (SE +/- 0.006374, N = 3, MIN 0.38) | 6c: 0.465059 (SE +/- 0.005815, N = 3, MIN 0.38)

Kvazaar 2.1 - Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better)
12c: 73.44 (SE +/- 0.58, N = 10) | 10c: 75.35 (SE +/- 0.74, N = 3) | 8c: 73.04 (SE +/- 1.04, N = 3) | 6c: 71.41 (SE +/- 0.77, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

Xmrig 6.18.1 - Variant: Monero - Hash Count: 1M (H/s, More Is Better)
12c: 104604.6 (SE +/- 328.13, N = 3) | 10c: 102599.6 (SE +/- 152.19, N = 3) | 8c: 101953.5 (SE +/- 383.60, N = 3) | 6c: 100446.2 (SE +/- 214.10, N = 3)
1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
NAS Parallel Benchmarks 3.4 - Test / Class: CG.C (Total Mop/s, More Is Better)
12c: 80225.01 (SE +/- 812.04, N = 15) | 10c: 81179.00 (SE +/- 899.80, N = 15) | 8c: 79784.15 (SE +/- 907.72, N = 15) | 6c: 71662.28 (SE +/- 554.69, N = 3)

Kvazaar 2.1 - Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better)
12c: 62.56 (SE +/- 0.68, N = 3) | 10c: 62.23 (SE +/- 0.11, N = 3) | 8c: 61.81 (SE +/- 0.73, N = 3) | 6c: 61.40 (SE +/- 0.53, N = 3)

Xmrig 6.18.1 - Variant: Wownero - Hash Count: 1M (H/s, More Is Better)
12c: 126465.6 (SE +/- 849.90, N = 3) | 10c: 127226.6 (SE +/- 70.55, N = 3) | 8c: 127081.2 (SE +/- 122.05, N = 3) | 6c: 126057.7 (SE +/- 349.73, N = 3)

Intel Open Image Denoise 1.4.0 - Run: RT.hdr_alb_nrm.3840x2160 (Images / Sec, More Is Better)
12c: 3.52 (SE +/- 0.02, N = 3) | 10c: 3.44 (SE +/- 0.02, N = 3) | 8c: 3.47 (SE +/- 0.02, N = 3) | 6c: 3.29 (SE +/- 0.02, N = 3)

Blender 3.4 - Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better)
12c: 8.58 (SE +/- 0.06, N = 3) | 10c: 8.42 (SE +/- 0.04, N = 3) | 8c: 8.34 (SE +/- 0.02, N = 3) | 6c: 8.33 (SE +/- 0.06, N = 3)

NAS Parallel Benchmarks 3.4 - Test / Class: MG.C (Total Mop/s, More Is Better)
12c: 209846.76 (SE +/- 2393.90, N = 3) | 10c: 177097.42 (SE +/- 2631.10, N = 15) | 8c: 153458.78 (SE +/- 2089.98, N = 15) | 6c: 117733.57 (SE +/- 1626.80, N = 15)

NAS Parallel Benchmarks 3.4 - Test / Class: SP.C (Total Mop/s, More Is Better)
12c: 260471.50 (SE +/- 1589.72, N = 3) | 10c: 239496.01 (SE +/- 726.36, N = 3) | 8c: 208535.23 (SE +/- 1630.30, N = 3) | 6c: 167474.70 (SE +/- 1838.44, N = 3)

Kvazaar 2.1 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, More Is Better)
12c: 77.83 (SE +/- 0.66, N = 3) | 10c: 77.30 (SE +/- 1.02, N = 3) | 8c: 76.84 (SE +/- 0.71, N = 3) | 6c: 75.86 (SE +/- 0.63, N = 3)
ASTC Encoder 4.0 - Preset: Exhaustive (MT/s, More Is Better)
12c: 11.73 (SE +/- 0.03, N = 3) | 10c: 11.76 (SE +/- 0.00, N = 3) | 8c: 11.81 (SE +/- 0.01, N = 3) | 6c: 11.82 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -flto -pthread

Timed MPlayer Compilation 1.5 - Time To Compile (Seconds, Fewer Is Better)
12c: 7.777 (SE +/- 0.033, N = 3) | 10c: 7.755 (SE +/- 0.034, N = 3) | 8c: 7.808 (SE +/- 0.023, N = 3) | 6c: 7.773 (SE +/- 0.010, N = 3)

NAS Parallel Benchmarks 3.4 - Test / Class: LU.C (Total Mop/s, More Is Better)
12c: 489164.65 (SE +/- 5489.08, N = 4) | 10c: 489995.20 (SE +/- 2546.14, N = 3) | 8c: 466769.54 (SE +/- 5095.33, N = 5) | 6c: 454360.62 (SE +/- 4680.97, N = 5)

ASTC Encoder 4.0 - Preset: Thorough (MT/s, More Is Better)
12c: 106.57 (SE +/- 0.05, N = 3) | 10c: 106.85 (SE +/- 0.05, N = 3) | 8c: 107.11 (SE +/- 0.04, N = 3) | 6c: 106.51 (SE +/- 0.10, N = 3)

DaCapo Benchmark 9.12-MR1 - Java Test: Jython (msec, Fewer Is Better)
12c: 3380 (SE +/- 29.26, N = 4) | 10c: 3329 (SE +/- 18.49, N = 4) | 8c: 3369 (SE +/- 35.24, N = 4) | 6c: 3345 (SE +/- 21.34, N = 4)

Rodinia 3.1 - Test: OpenMP CFD Solver (Seconds, Fewer Is Better)
12c: 6.050 (SE +/- 0.031, N = 3) | 10c: 6.074 (SE +/- 0.014, N = 3) | 8c: 5.970 (SE +/- 0.016, N = 3) | 6c: 6.152 (SE +/- 0.024, N = 3)

libavif avifenc 0.11 - Encoder Speed: 6, Lossless (Seconds, Fewer Is Better)
12c: 5.287 (SE +/- 0.076, N = 3) | 10c: 5.286 (SE +/- 0.044, N = 3) | 8c: 5.270 (SE +/- 0.034, N = 3) | 6c: 5.330 (SE +/- 0.055, N = 3)

libavif avifenc 0.11 - Encoder Speed: 10, Lossless (Seconds, Fewer Is Better)
12c: 4.241 (SE +/- 0.024, N = 3) | 10c: 4.337 (SE +/- 0.055, N = 3) | 8c: 4.252 (SE +/- 0.009, N = 3) | 6c: 4.250 (SE +/- 0.043, N = 3)
Embree 3.13 - Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better)
12c: 182.45 (SE +/- 1.01, N = 3, MIN 128.42 / MAX 209.42) | 10c: 184.73 (SE +/- 0.47, N = 3, MIN 137.82 / MAX 210.21) | 8c: 185.49 (SE +/- 0.36, N = 3, MIN 134.45 / MAX 211.64) | 6c: 187.61 (SE +/- 0.33, N = 3, MIN 146.69 / MAX 208.25)

Embree 3.13 - Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better)
12c: 213.75 (SE +/- 0.13, N = 3, MIN 209.16 / MAX 225.43) | 10c: 214.31 (SE +/- 0.47, N = 3, MIN 209.11 / MAX 223.97) | 8c: 217.41 (SE +/- 0.39, N = 3, MIN 211.73 / MAX 230.1) | 6c: 221.29 (SE +/- 0.46, N = 3, MIN 215.19 / MAX 233.21)

ACES DGEMM 1.0 - Sustained Floating-Point Rate (GFLOP/s, More Is Better)
12c: 70.41 (SE +/- 0.33, N = 3) | 10c: 70.61 (SE +/- 0.02, N = 3) | 8c: 71.01 (SE +/- 0.05, N = 3) | 6c: 70.90 (SE +/- 0.13, N = 3)
1. (CC) gcc options: -O3 -march=native -fopenmp
Phoronix Test Suite v10.8.4