Benchmarks by Michael Larabel for a future article.
Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2212240-NE-AMDEPYCGE62
HTML result view exported from: https://openbenchmarking.org/result/2212240-NE-AMDEPYCGE62&grt .
AMD EPYC Genoa Memory Scaling

Configurations: 12c, 10c, 8c, 6c

Processor: 2 x AMD EPYC 9654 96-Core @ 2.40GHz (192 Cores / 384 Threads)
Motherboard: AMD Titanite_4G (RTI1002E BIOS)
Chipset: AMD Device 14a4
Memory: 1520GB (12c), 1264GB (10c), 1008GB (8c), 768GB (6c)
Disk: 800GB INTEL SSDPF21Q800GB
Graphics: ASPEED
Monitor: VGA HDMI
Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 22.10
Kernel: 6.1.0-phx (x86_64)
Desktop: GNOME Shell 43.0
Display Server: X Server 1.21.1.4
Vulkan: 1.3.224
Compiler: GCC 12.2.0 + Clang 15.0.2-1
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details - Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10110d
Java Details - OpenJDK Runtime Environment (build 11.0.17+8-post-Ubuntu-1ubuntu2)
Python Details - Python 3.10.7
Security Details - itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
AMD EPYC Genoa Memory Scaling

Test | 12c | 10c | 8c | 6c
compress-7zip: Compression Rating | 923176 | 893433 | 879430 | 824926
compress-7zip: Decompression Rating | 1181435 | 1171627 | 1159901 | 1177484
mt-dgemm: Sustained Floating-Point Rate | 70.407733 | 70.613775 | 71.010323 | 70.898312
cassandra: Writes | 251793 | 243603 | 240854 | 246882
astcenc: Thorough | 106.5663 | 106.8542 | 107.1108 | 106.5095
astcenc: Exhaustive | 11.7250 | 11.7637 | 11.8090 | 11.8207
blender: BMW27 - CPU-Only | 8.58 | 8.42 | 8.34 | 8.33
blender: Classroom - CPU-Only | 20.92 | 20.76 | 20.68 | 20.71
blender: Barbershop - CPU-Only | 81.03 | 80.37 | 80.18 | 79.93
build2: Time To Compile | 49.917 | 49.800 | 49.871 | 50.084
cockroach: MoVR - 512 | 948.5 | 949.6 | 960.3 | 954.7
cockroach: MoVR - 1024 | 953.8 | 949.5 | 946.9 | 952.7
cockroach: KV, 10% Reads - 512 | 35970.0 | 35993.1 | 34832.9 | 35742.3
cockroach: KV, 50% Reads - 512 | 47621.9 | 49102.7 | 47596.6 | 47428.0
cockroach: KV, 60% Reads - 512 | 52330.1 | 51748.8 | 52515.2 | 51275.1
cockroach: KV, 95% Reads - 512 | 64467.6 | 60769.7 | 64111.9 | 62666.5
cockroach: KV, 10% Reads - 1024 | 36846.9 | 35776.8 | 36685.7 | 36329.6
cockroach: KV, 50% Reads - 1024 | 47465.5 | 48449.0 | 47498.1 | 47593.9
cockroach: KV, 60% Reads - 1024 | 52573.3 | 51959.5 | 52559.0 | 52626.4
cockroach: KV, 95% Reads - 1024 | 64661.8 | 62029.8 | 58195.5 | 60137.3
dacapobench: H2 | 4802 | 4832 | 4731 | 4830
dacapobench: Jython | 3380 | 3329 | 3369 | 3345
embree: Pathtracer ISPC - Crown | 182.4498 | 184.7346 | 185.4907 | 187.6107
embree: Pathtracer ISPC - Asian Dragon | 213.7507 | 214.3093 | 217.4060 | 221.2898
gpaw: Carbon Nanotube | 23.151 | 23.373 | 24.598 | 26.308
graph500: 26 | 565152000 | 574018000 | 531854000 | 392496000
gromacs: MPI CPU - water_GMX50_bare | 18.706 | 18.677 | 18.678 | 17.940
hpcg: | 86.8143 | 48.2945 | 45.0005 | 36.5411
oidn: RT.hdr_alb_nrm.3840x2160 | 3.52 | 3.44 | 3.47 | 3.29
oidn: RTLightmap.hdr.4096x4096 | 1.65 | 1.63 | 1.64 | 1.54
kvazaar: Bosphorus 4K - Medium | 62.56 | 62.23 | 61.81 | 61.40
kvazaar: Bosphorus 4K - Very Fast | 73.44 | 75.35 | 73.04 | 71.41
kvazaar: Bosphorus 4K - Ultra Fast | 77.83 | 77.30 | 76.84 | 75.86
avifenc: 0 | 63.247 | 63.246 | 62.961 | 63.803
avifenc: 2 | 34.848 | 34.909 | 34.687 | 34.874
avifenc: 6 | 2.459 | 2.411 | 2.420 | 2.435
avifenc: 6, Lossless | 5.287 | 5.286 | 5.270 | 5.330
avifenc: 10, Lossless | 4.241 | 4.337 | 4.252 | 4.250
liquid-dsp: 256 - 256 - 57 | 10347000000 | 10340000000 | 10337666667 | 10340333333
liquid-dsp: 384 - 256 - 57 | 10347000000 | 10352666667 | 10349666667 | 10349000000
luxcorerender: Danish Mood - CPU | 9.69 | 9.62 | 9.56 | 9.49
luxcorerender: Orange Juice - CPU | 28.82 | 28.19 | 29.04 | 28.90
minibude: OpenMP - BM2 | 8640.310 | 8666.980 | 8615.967 | 8651.924
minibude: OpenMP - BM2 | 345.612 | 346.679 | 344.639 | 346.077
namd: ATPase Simulation - 327,506 Atoms | 0.12783 | 0.12759 | 0.12768 | 0.12820
npb: CG.C | 80225.01 | 81179.00 | 79784.15 | 71662.28
npb: IS.D | 8491.01 | 7124.92 | 6675.71 | 5690.01
npb: LU.C | 489164.65 | 489995.20 | 466769.54 | 454360.62
npb: MG.C | 209846.76 | 177097.42 | 153458.78 | 117733.57
npb: SP.C | 260471.50 | 239496.01 | 208535.23 | 167474.70
nekrs: TurboPipe Periodic | 821462000000 | 786258000000 | 740247000000 | 659554333333
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream | 84.3506 | 84.4822 | 84.2115 | 82.4869
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream | 1133.2823 | 1133.1821 | 1136.8544 | 1148.4964
deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream | 761.4853 | 742.7956 | 705.7116 | 575.7518
deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream | 125.7196 | 128.9249 | 135.6212 | 166.4322
deepsparse: CV Detection,YOLOv5s COCO - Asynchronous Multi-Stream | 856.0186 | 844.4268 | 773.0686 | 635.0246
deepsparse: CV Detection,YOLOv5s COCO - Asynchronous Multi-Stream | 111.8909 | 113.4079 | 123.8576 | 150.9167
deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream | 1964.2730 | 1965.5606 | 1954.1227 | 1930.3277
deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream | 48.7673 | 48.7431 | 48.9982 | 49.6278
deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream | 1195.9102 | 1201.1391 | 1201.9839 | 1190.5286
deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream | 80.0790 | 79.7139 | 79.6865 | 80.4399
deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream | 615.4474 | 611.2926 | 614.6105 | 608.5336
deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream | 155.4797 | 156.5374 | 155.8191 | 157.2164
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream | 84.2470 | 84.2657 | 84.1546 | 82.2613
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream | 1133.4774 | 1135.1845 | 1137.5119 | 1148.3278
nginx: 500 | 201032.06 | 198858.66 | 197081.98 | 196805.30
nwchem: C240 Buckyball | 1537.1 | 1531 | 1519.6 | 1517.9
onednn: IP Shapes 3D - bf16bf16bf16 - CPU | 3.95471 | 4.00938 | 3.99305 | 3.96488
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 1968.70 | 2030.72 | 1982.15 | 2072.57
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 2344.29 | 2438.00 | 2375.45 | 2479.62
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 2275.86 | 2325.71 | 2371.78 | 2471.57
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 0.446930 | 0.463454 | 0.465796 | 0.465059
onnx: fcn-resnet101-11 - CPU - Standard | 254 | 255 | 257 | 253
openfoam: drivaerFastback, Medium Mesh Size - Execution Time | 109.53721 | 117.94003 | 166.14971 | 227.89595
openradioss: Bumper Beam | 79.86 | 79.70 | 79.20 | 79.62
openradioss: Bird Strike on Windshield | 216.88 | 218.22 | 219.45 | 219.10
openradioss: INIVOL and Fluid Structure Interaction Drop Container | 81.57 | 81.15 | 81.09 | 80.81
openvino: Face Detection FP16 - CPU | 101.74 | 102.01 | 101.26 | 101.08
openvino: Face Detection FP16 - CPU | 470.98 | 469.43 | 472.84 | 473.69
openvino: Person Detection FP16 - CPU | 42.98 | 42.94 | 42.59 | 41.33
openvino: Person Detection FP16 - CPU | 1109.45 | 1110.44 | 1119.79 | 1153.70
openvino: Person Detection FP32 - CPU | 42.95 | 43.18 | 42.22 | 41.44
openvino: Person Detection FP32 - CPU | 1110.68 | 1104.59 | 1129.01 | 1150.54
openvino: Vehicle Detection FP16 - CPU | 7394.65 | 7425.10 | 7389.00 | 7306.47
openvino: Vehicle Detection FP16 - CPU | 6.48 | 6.45 | 6.49 | 6.56
openvino: Face Detection FP16-INT8 - CPU | 191.43 | 192.30 | 192.25 | 191.29
openvino: Face Detection FP16-INT8 - CPU | 250.34 | 249.12 | 249.26 | 250.49
openvino: Vehicle Detection FP16-INT8 - CPU | 11018.37 | 11066.16 | 11108.16 | 11150.32
openvino: Vehicle Detection FP16-INT8 - CPU | 4.35 | 4.33 | 4.31 | 4.30
openvino: Weld Porosity Detection FP16 - CPU | 9867.41 | 9900.47 | 9931.49 | 9959.38
openvino: Weld Porosity Detection FP16 - CPU | 4.85 | 4.83 | 4.82 | 4.81
openvino: Machine Translation EN To DE FP16 - CPU | 959.16 | 934.71 | 875.39 | 817.27
openvino: Machine Translation EN To DE FP16 - CPU | 49.98 | 51.29 | 54.80 | 58.67
openvino: Weld Porosity Detection FP16-INT8 - CPU | 19171.51 | 19254.08 | 19278.93 | 19314.04
openvino: Weld Porosity Detection FP16-INT8 - CPU | 9.95 | 9.91 | 9.90 | 9.89
openvino: Person Vehicle Bike Detection FP16 - CPU | 9038.47 | 9063.84 | 9113.11 | 9081.73
openvino: Person Vehicle Bike Detection FP16 - CPU | 5.30 | 5.28 | 5.26 | 5.28
openvino: Age Gender Recognition Retail 0013 FP16 - CPU | 147769.26 | 147717.32 | 152292.39 | 151213.17
openvino: Age Gender Recognition Retail 0013 FP16 - CPU | 0.55 | 0.55 | 0.55 | 0.54
openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU | 119606.21 | 122938.23 | 123571.68 | 121027.25
openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU | 0.97 | 0.98 | 0.98 | 0.97
openvkl: vklBenchmark ISPC | 1325 | 1317 | 1325 | 1212
ospray: particle_volume/ao/real_time | 43.7061 | 43.0316 | 43.9700 | 43.3575
ospray: particle_volume/scivis/real_time | 42.7970 | 42.9999 | 43.8442 | 43.2396
ospray: particle_volume/pathtracer/real_time | 229.269 | 230.282 | 228.581 | 230.440
ospray: gravity_spheres_volume/dim_512/ao/real_time | 43.9775 | 43.9969 | 44.2302 | 44.2716
ospray: gravity_spheres_volume/dim_512/scivis/real_time | 43.1262 | 43.3287 | 43.4310 | 43.2899
ospray: gravity_spheres_volume/dim_512/pathtracer/real_time | 53.7748 | 54.4083 | 54.5087 | 54.6054
relion: Basic - CPU | 128.101 | 151.398 | 221.336 | 258.500
rodinia: OpenMP CFD Solver | 6.050 | 6.074 | 5.970 | 6.152
rodinia: OpenMP Streamcluster | 6.001 | 6.285 | 6.018 | 6.409
simdjson: Kostya | 4.11 | 4.11 | 4.11 | 4.11
simdjson: TopTweet | 6.59 | 6.49 | 6.57 | 6.55
simdjson: LargeRand | 1.25 | 1.25 | 1.25 | 1.24
simdjson: PartialTweets | 5.65 | 5.67 | 5.66 | 5.69
simdjson: DistinctUserID | 6.86 | 6.84 | 6.86 | 6.83
stargate: 96000 - 1024 | 4.345890 | 4.354556 | 4.351402 | 4.364767
stargate: 192000 - 1024 | 2.829061 | 2.806190 | 2.811555 | 2.824814
svt-av1: Preset 12 - Bosphorus 4K | 251.769 | 241.369 | 227.898 | 221.161
tensorflow: CPU - 256 - ResNet-50 | 109.13 | 105.91 | 105.01 | 95.67
build-apache: Time To Compile | 20.461 | 20.480 | 20.589 | 20.720
build-gdb: Time To Compile | 41.709 | 42.412 | 42.409 | 43.245
build-gem5: Time To Compile | 139.238 | 134.373 | 136.793 | 134.695
build-godot: Time To Compile | 34.032 | 33.616 | 33.905 | 33.671
build-linux-kernel: defconfig | 25.501 | 25.407 | 25.528 | 24.747
build-linux-kernel: allmodconfig | 147.147 | 145.410 | 147.377 | 145.766
build-llvm: Ninja | 75.655 | 75.440 | 75.725 | 76.747
build-mesa: Time To Compile | 20.117 | 20.205 | 20.107 | 20.157
build-mplayer: Time To Compile | 7.777 | 7.755 | 7.808 | 7.773
build-nodejs: Time To Compile | 101.465 | 101.941 | 101.149 | 102.776
build-php: Time To Compile | 44.519 | 44.608 | 44.583 | 44.698
wrf: conus 2.5km | 4070.19 | 4563.183 | 6551.876 | 7432.655
incompact3d: X3D-benchmarking input.i3d | 125.526248 | 146.289830 | 270.091271 | 348.880025
xmrig: Monero - 1M | 104604.6 | 102599.6 | 101953.5 | 100446.2
xmrig: Wownero - 1M | 126465.6 | 127226.6 | 127081.2 | 126057.7
7-Zip Compression 22.01 - Test: Compression Rating (MIPS, More Is Better)
12c: 923176 (SE +/- 6636.11, N = 3)
10c: 893433 (SE +/- 2580.44, N = 3)
8c: 879430 (SE +/- 3797.71, N = 3)
6c: 824926 (SE +/- 7292.38, N = 3)

7-Zip Compression 22.01 - Test: Decompression Rating (MIPS, More Is Better)
12c: 1181435 (SE +/- 3305.67, N = 3)
10c: 1171627 (SE +/- 5138.86, N = 3)
8c: 1159901 (SE +/- 9235.88, N = 3)
6c: 1177484 (SE +/- 2020.82, N = 3)
1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC
ACES DGEMM 1.0 - Sustained Floating-Point Rate (GFLOP/s, More Is Better)
12c: 70.41 (SE +/- 0.33, N = 3)
10c: 70.61 (SE +/- 0.02, N = 3)
8c: 71.01 (SE +/- 0.05, N = 3)
6c: 70.90 (SE +/- 0.13, N = 3)
1. (CC) gcc options: -O3 -march=native -fopenmp
Apache Cassandra 4.0 - Test: Writes (Op/s, More Is Better)
12c: 251793 (SE +/- 3742.45, N = 12)
10c: 243603 (SE +/- 2429.87, N = 3)
8c: 240854 (SE +/- 1899.17, N = 3)
6c: 246882 (SE +/- 2957.03, N = 3)
ASTC Encoder 4.0 - Preset: Thorough (MT/s, More Is Better)
12c: 106.57 (SE +/- 0.05, N = 3)
10c: 106.85 (SE +/- 0.05, N = 3)
8c: 107.11 (SE +/- 0.04, N = 3)
6c: 106.51 (SE +/- 0.10, N = 3)

ASTC Encoder 4.0 - Preset: Exhaustive (MT/s, More Is Better)
12c: 11.73 (SE +/- 0.03, N = 3)
10c: 11.76 (SE +/- 0.00, N = 3)
8c: 11.81 (SE +/- 0.01, N = 3)
6c: 11.82 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -flto -pthread
Blender 3.4 - Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better)
12c: 8.58 (SE +/- 0.06, N = 3)
10c: 8.42 (SE +/- 0.04, N = 3)
8c: 8.34 (SE +/- 0.02, N = 3)
6c: 8.33 (SE +/- 0.06, N = 3)

Blender 3.4 - Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better)
12c: 20.92 (SE +/- 0.00, N = 3)
10c: 20.76 (SE +/- 0.09, N = 3)
8c: 20.68 (SE +/- 0.06, N = 3)
6c: 20.71 (SE +/- 0.04, N = 3)

Blender 3.4 - Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better)
12c: 81.03 (SE +/- 0.21, N = 3)
10c: 80.37 (SE +/- 0.15, N = 3)
8c: 80.18 (SE +/- 0.24, N = 3)
6c: 79.93 (SE +/- 0.31, N = 3)
Build2 0.13 - Time To Compile (Seconds, Fewer Is Better)
12c: 49.92 (SE +/- 0.04, N = 3)
10c: 49.80 (SE +/- 0.02, N = 3)
8c: 49.87 (SE +/- 0.20, N = 3)
6c: 50.08 (SE +/- 0.28, N = 3)
CockroachDB 22.2 - Workload: MoVR - Concurrency: 512 (ops/s, More Is Better)
12c: 948.5 (SE +/- 3.38, N = 3)
10c: 949.6 (SE +/- 3.66, N = 3)
8c: 960.3 (SE +/- 9.03, N = 3)
6c: 954.7 (SE +/- 4.87, N = 3)

CockroachDB 22.2 - Workload: MoVR - Concurrency: 1024 (ops/s, More Is Better)
12c: 953.8 (SE +/- 1.42, N = 3)
10c: 949.5 (SE +/- 0.58, N = 3)
8c: 946.9 (SE +/- 3.18, N = 3)
6c: 952.7 (SE +/- 1.56, N = 3)

CockroachDB 22.2 - Workload: KV, 10% Reads - Concurrency: 512 (ops/s, More Is Better)
12c: 35970.0 (SE +/- 343.66, N = 15)
10c: 35993.1 (SE +/- 270.36, N = 15)
8c: 34832.9 (SE +/- 351.71, N = 6)
6c: 35742.3 (SE +/- 438.30, N = 15)

CockroachDB 22.2 - Workload: KV, 50% Reads - Concurrency: 512 (ops/s, More Is Better)
12c: 47621.9 (SE +/- 464.03, N = 15)
10c: 49102.7 (SE +/- 514.54, N = 3)
8c: 47596.6 (SE +/- 454.84, N = 15)
6c: 47428.0 (SE +/- 32.88, N = 3)

CockroachDB 22.2 - Workload: KV, 60% Reads - Concurrency: 512 (ops/s, More Is Better)
12c: 52330.1 (SE +/- 268.61, N = 3)
10c: 51748.8 (SE +/- 620.92, N = 15)
8c: 52515.2 (SE +/- 411.73, N = 13)
6c: 51275.1 (SE +/- 555.56, N = 15)

CockroachDB 22.2 - Workload: KV, 95% Reads - Concurrency: 512 (ops/s, More Is Better)
12c: 64467.6 (SE +/- 702.29, N = 3)
10c: 60769.7 (SE +/- 1044.13, N = 15)
8c: 64111.9 (SE +/- 890.57, N = 3)
6c: 62666.5 (SE +/- 813.26, N = 15)

CockroachDB 22.2 - Workload: KV, 10% Reads - Concurrency: 1024 (ops/s, More Is Better)
12c: 36846.9 (SE +/- 155.07, N = 3)
10c: 35776.8 (SE +/- 346.25, N = 3)
8c: 36685.7 (SE +/- 322.68, N = 3)
6c: 36329.6 (SE +/- 206.35, N = 3)

CockroachDB 22.2 - Workload: KV, 50% Reads - Concurrency: 1024 (ops/s, More Is Better)
12c: 47465.5 (SE +/- 366.75, N = 15)
10c: 48449.0 (SE +/- 380.16, N = 3)
8c: 47498.1 (SE +/- 468.66, N = 15)
6c: 47593.9 (SE +/- 391.13, N = 9)

CockroachDB 22.2 - Workload: KV, 60% Reads - Concurrency: 1024 (ops/s, More Is Better)
12c: 52573.3 (SE +/- 239.52, N = 3)
10c: 51959.5 (SE +/- 400.61, N = 10)
8c: 52559.0 (SE +/- 447.89, N = 3)
6c: 52626.4 (SE +/- 448.33, N = 3)

CockroachDB 22.2 - Workload: KV, 95% Reads - Concurrency: 1024 (ops/s, More Is Better)
12c: 64661.8 (SE +/- 575.30, N = 3)
10c: 62029.8 (SE +/- 1142.40, N = 15)
8c: 58195.5 (SE +/- 1317.65, N = 15)
6c: 60137.3 (SE +/- 1310.27, N = 15)
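The SE values reported with each result give a rough way to judge whether two configurations actually differ. The sketch below is only an illustration: it assumes the runs are independent, reads the SE values in the same 12c/10c/8c/6c order as the results, and treats a gap of about two combined standard errors as the noise threshold. None of that comes from the result file, and `se_gap` is a hypothetical helper name.

```python
import math


def se_gap(mean_a, se_a, mean_b, se_b):
    """Difference between two means, in units of their combined standard error.

    Assumes the two sets of runs are independent (an assumption, not a
    property stated by the result file).
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return (mean_a - mean_b) / combined_se


# CockroachDB MoVR, concurrency 512: 12c vs 6c (ops/s and SE from the results above)
movr = se_gap(948.5, 3.38, 954.7, 4.87)

# CockroachDB KV 95% reads, concurrency 1024: 12c vs 8c
kv95 = se_gap(64661.8, 575.30, 58195.5, 1317.65)

THRESHOLD = 2.0  # illustrative cut-off, chosen for this sketch only

for label, gap in (("MoVR 512, 12c vs 6c", movr),
                   ("KV 95% reads 1024, 12c vs 8c", kv95)):
    verdict = "likely within noise" if abs(gap) < THRESHOLD else "larger than run-to-run noise"
    print(f"{label}: {gap:+.2f} combined SE ({verdict})")
```

With only N = 3 runs for some results, this is a coarse screen rather than a significance test.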
DaCapo Benchmark 9.12-MR1 - Java Test: H2 (msec, Fewer Is Better)
12c: 4802 (SE +/- 53.17, N = 20)
10c: 4832 (SE +/- 39.79, N = 20)
8c: 4731 (SE +/- 40.50, N = 20)
6c: 4830 (SE +/- 36.16, N = 20)

DaCapo Benchmark 9.12-MR1 - Java Test: Jython (msec, Fewer Is Better)
12c: 3380 (SE +/- 29.26, N = 4)
10c: 3329 (SE +/- 18.49, N = 4)
8c: 3369 (SE +/- 35.24, N = 4)
6c: 3345 (SE +/- 21.34, N = 4)
Embree 3.13 - Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better)
12c: 182.45 (SE +/- 1.01, N = 3; MIN: 128.42 / MAX: 209.42)
10c: 184.73 (SE +/- 0.47, N = 3; MIN: 137.82 / MAX: 210.21)
8c: 185.49 (SE +/- 0.36, N = 3; MIN: 134.45 / MAX: 211.64)
6c: 187.61 (SE +/- 0.33, N = 3; MIN: 146.69 / MAX: 208.25)

Embree 3.13 - Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better)
12c: 213.75 (SE +/- 0.13, N = 3; MIN: 209.16 / MAX: 225.43)
10c: 214.31 (SE +/- 0.47, N = 3; MIN: 209.11 / MAX: 223.97)
8c: 217.41 (SE +/- 0.39, N = 3; MIN: 211.73 / MAX: 230.1)
6c: 221.29 (SE +/- 0.46, N = 3; MIN: 215.19 / MAX: 233.21)
GPAW 22.1 - Input: Carbon Nanotube (Seconds, Fewer Is Better)
12c: 23.15 (SE +/- 0.23, N = 5)
10c: 23.37 (SE +/- 0.13, N = 3)
8c: 24.60 (SE +/- 0.18, N = 3)
6c: 26.31 (SE +/- 0.20, N = 3)
1. (CC) gcc options: -shared -fwrapv -O2 -lxc -lblas -lmpi
Graph500 3.0 - Scale: 26 (sssp median_TEPS, More Is Better)
12c: 565152000
10c: 574018000
8c: 531854000
6c: 392496000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
GROMACS 2022.1 - Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better)
12c: 18.71 (SE +/- 0.03, N = 3)
10c: 18.68 (SE +/- 0.01, N = 3)
8c: 18.68 (SE +/- 0.01, N = 3)
6c: 17.94 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -O3
High Performance Conjugate Gradient 3.1 (GFLOP/s, More Is Better)
12c: 86.81 (SE +/- 1.12, N = 12)
10c: 48.29 (SE +/- 3.31, N = 9)
8c: 45.00 (SE +/- 0.49, N = 9)
6c: 36.54 (SE +/- 0.99, N = 9)
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi
Intel Open Image Denoise 1.4.0 - Run: RT.hdr_alb_nrm.3840x2160 (Images / Sec, More Is Better)
12c: 3.52 (SE +/- 0.02, N = 3)
10c: 3.44 (SE +/- 0.02, N = 3)
8c: 3.47 (SE +/- 0.02, N = 3)
6c: 3.29 (SE +/- 0.02, N = 3)

Intel Open Image Denoise 1.4.0 - Run: RTLightmap.hdr.4096x4096 (Images / Sec, More Is Better)
12c: 1.65 (SE +/- 0.00, N = 3)
10c: 1.63 (SE +/- 0.00, N = 3)
8c: 1.64 (SE +/- 0.01, N = 3)
6c: 1.54 (SE +/- 0.01, N = 3)
Kvazaar 2.1 - Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better)
12c: 62.56 (SE +/- 0.68, N = 3)
10c: 62.23 (SE +/- 0.11, N = 3)
8c: 61.81 (SE +/- 0.73, N = 3)
6c: 61.40 (SE +/- 0.53, N = 3)

Kvazaar 2.1 - Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better)
12c: 73.44 (SE +/- 0.58, N = 10)
10c: 75.35 (SE +/- 0.74, N = 3)
8c: 73.04 (SE +/- 1.04, N = 3)
6c: 71.41 (SE +/- 0.77, N = 3)

Kvazaar 2.1 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, More Is Better)
12c: 77.83 (SE +/- 0.66, N = 3)
10c: 77.30 (SE +/- 1.02, N = 3)
8c: 76.84 (SE +/- 0.71, N = 3)
6c: 75.86 (SE +/- 0.63, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
libavif avifenc 0.11 - Encoder Speed: 0 (Seconds, Fewer Is Better)
12c: 63.25 (SE +/- 0.18, N = 3)
10c: 63.25 (SE +/- 0.27, N = 3)
8c: 62.96 (SE +/- 0.03, N = 3)
6c: 63.80 (SE +/- 0.47, N = 3)

libavif avifenc 0.11 - Encoder Speed: 2 (Seconds, Fewer Is Better)
12c: 34.85 (SE +/- 0.14, N = 3)
10c: 34.91 (SE +/- 0.08, N = 3)
8c: 34.69 (SE +/- 0.10, N = 3)
6c: 34.87 (SE +/- 0.14, N = 3)

libavif avifenc 0.11 - Encoder Speed: 6 (Seconds, Fewer Is Better)
12c: 2.459 (SE +/- 0.016, N = 3)
10c: 2.411 (SE +/- 0.003, N = 3)
8c: 2.420 (SE +/- 0.017, N = 3)
6c: 2.435 (SE +/- 0.004, N = 3)

libavif avifenc 0.11 - Encoder Speed: 6, Lossless (Seconds, Fewer Is Better)
12c: 5.287 (SE +/- 0.076, N = 3)
10c: 5.286 (SE +/- 0.044, N = 3)
8c: 5.270 (SE +/- 0.034, N = 3)
6c: 5.330 (SE +/- 0.055, N = 3)

libavif avifenc 0.11 - Encoder Speed: 10, Lossless (Seconds, Fewer Is Better)
12c: 4.241 (SE +/- 0.024, N = 3)
10c: 4.337 (SE +/- 0.055, N = 3)
8c: 4.252 (SE +/- 0.009, N = 3)
6c: 4.250 (SE +/- 0.043, N = 3)
1. (CXX) g++ options: -O3 -fPIC -lm
Liquid-DSP 2021.01.31 - Threads: 256 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
12c: 10347000000 (SE +/- 4618802.15, N = 3)
10c: 10340000000 (SE +/- 5196152.42, N = 3)
8c: 10337666667 (SE +/- 4333333.33, N = 3)
6c: 10340333333 (SE +/- 3844187.53, N = 3)

Liquid-DSP 2021.01.31 - Threads: 384 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
12c: 10347000000 (SE +/- 4582575.69, N = 3)
10c: 10352666667 (SE +/- 4409585.52, N = 3)
8c: 10349666667 (SE +/- 5783117.19, N = 3)
6c: 10349000000 (SE +/- 3214550.25, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
LuxCoreRender 2.6 - Scene: Danish Mood - Acceleration: CPU (M samples/sec, More Is Better)
12c: 9.69 (SE +/- 0.09, N = 15; MIN: 4 / MAX: 12.39)
10c: 9.62 (SE +/- 0.17, N = 12; MIN: 3.97 / MAX: 12.9)
8c: 9.56 (SE +/- 0.11, N = 15; MIN: 3.94 / MAX: 12.41)
6c: 9.49 (SE +/- 0.14, N = 12; MIN: 3.85 / MAX: 12.15)

LuxCoreRender 2.6 - Scene: Orange Juice - Acceleration: CPU (M samples/sec, More Is Better)
12c: 28.82 (SE +/- 0.63, N = 15; MIN: 23.01 / MAX: 45.86)
10c: 28.19 (SE +/- 0.29, N = 3; MIN: 23.3 / MAX: 45.65)
8c: 29.04 (SE +/- 0.72, N = 15; MIN: 22.62 / MAX: 45.48)
6c: 28.90 (SE +/- 0.71, N = 15; MIN: 22.4 / MAX: 44.91)
miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM2 (GFInst/s, More Is Better)
12c: 8640.31 (SE +/- 27.15, N = 3)
10c: 8666.98 (SE +/- 31.49, N = 3)
8c: 8615.97 (SE +/- 63.13, N = 3)
6c: 8651.92 (SE +/- 96.81, N = 3)

miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM2 (Billion Interactions/s, More Is Better)
12c: 345.61 (SE +/- 1.09, N = 3)
10c: 346.68 (SE +/- 1.26, N = 3)
8c: 344.64 (SE +/- 2.53, N = 3)
6c: 346.08 (SE +/- 3.87, N = 3)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm
NAMD 2.14 - ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better)
12c: 0.12783 (SE +/- 0.00009, N = 3)
10c: 0.12759 (SE +/- 0.00007, N = 3)
8c: 0.12768 (SE +/- 0.00046, N = 3)
6c: 0.12820 (SE +/- 0.00009, N = 3)
NAS Parallel Benchmarks 3.4 - Test / Class: CG.C (Total Mop/s, More Is Better)
12c: 80225.01 (SE +/- 812.04, N = 15)
10c: 81179.00 (SE +/- 899.80, N = 15)
8c: 79784.15 (SE +/- 907.72, N = 15)
6c: 71662.28 (SE +/- 554.69, N = 3)

NAS Parallel Benchmarks 3.4 - Test / Class: IS.D (Total Mop/s, More Is Better)
12c: 8491.01 (SE +/- 84.88, N = 3)
10c: 7124.92 (SE +/- 206.91, N = 12)
8c: 6675.71 (SE +/- 134.50, N = 15)
6c: 5690.01 (SE +/- 158.57, N = 12)

NAS Parallel Benchmarks 3.4 - Test / Class: LU.C (Total Mop/s, More Is Better)
12c: 489164.65 (SE +/- 5489.08, N = 4)
10c: 489995.20 (SE +/- 2546.14, N = 3)
8c: 466769.54 (SE +/- 5095.33, N = 5)
6c: 454360.62 (SE +/- 4680.97, N = 5)

NAS Parallel Benchmarks 3.4 - Test / Class: MG.C (Total Mop/s, More Is Better)
12c: 209846.76 (SE +/- 2393.90, N = 3)
10c: 177097.42 (SE +/- 2631.10, N = 15)
8c: 153458.78 (SE +/- 2089.98, N = 15)
6c: 117733.57 (SE +/- 1626.80, N = 15)

NAS Parallel Benchmarks 3.4 - Test / Class: SP.C (Total Mop/s, More Is Better)
12c: 260471.50 (SE +/- 1589.72, N = 3)
10c: 239496.01 (SE +/- 726.36, N = 3)
8c: 208535.23 (SE +/- 1630.30, N = 3)
6c: 167474.70 (SE +/- 1838.44, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.4
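To read the NAS Parallel Benchmarks results in relative terms, each value can be normalized to the 12c configuration. The sketch below does that with the Total Mop/s figures reported above. `relative_to_baseline` is a hypothetical helper name, not part of the Phoronix Test Suite, and using 12c as the baseline is a choice made for this illustration.

```python
# Total Mop/s values copied from the NAS Parallel Benchmarks 3.4 results above.
npb_results = {
    "CG.C": {"12c": 80225.01, "10c": 81179.00, "8c": 79784.15, "6c": 71662.28},
    "IS.D": {"12c": 8491.01, "10c": 7124.92, "8c": 6675.71, "6c": 5690.01},
    "LU.C": {"12c": 489164.65, "10c": 489995.20, "8c": 466769.54, "6c": 454360.62},
    "MG.C": {"12c": 209846.76, "10c": 177097.42, "8c": 153458.78, "6c": 117733.57},
    "SP.C": {"12c": 260471.50, "10c": 239496.01, "8c": 208535.23, "6c": 167474.70},
}


def relative_to_baseline(values, baseline="12c"):
    """Express each configuration's throughput as a percentage of the baseline."""
    base = values[baseline]
    return {config: round(100.0 * value / base, 1) for config, value in values.items()}


for test, values in npb_results.items():
    percentages = relative_to_baseline(values)
    row = ", ".join(f"{config}: {pct}%" for config, pct in percentages.items())
    print(f"{test}: {row}")
```

The same helper works for any "More Is Better" result in this file; for "Fewer Is Better" results a percentage above 100 means a slower run.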
nekRS 22.0 - Input: TurboPipe Periodic (FLOP/s, More Is Better)
12c: 821462000000 (SE +/- 9551971733.63, N = 3)
10c: 786258000000 (SE +/- 7825985326.68, N = 3)
8c: 740247000000 (SE +/- 5892587066.25, N = 3)
6c: 659554333333 (SE +/- 1934071468.29, N = 3)
1. (CXX) g++ options: -fopenmp -O2 -march=native -mtune=native -ftree-vectorize -lmpi_cxx -lmpi
Neural Magic DeepSparse 1.1 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
12c: 84.35 (SE +/- 0.18, N = 3)
10c: 84.48 (SE +/- 0.04, N = 3)
8c: 84.21 (SE +/- 0.04, N = 3)
6c: 82.49 (SE +/- 0.31, N = 3)

Neural Magic DeepSparse 1.1 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
12c: 1133.28 (SE +/- 0.82, N = 3)
10c: 1133.18 (SE +/- 0.20, N = 3)
8c: 1136.85 (SE +/- 0.88, N = 3)
6c: 1148.50 (SE +/- 0.67, N = 3)

Neural Magic DeepSparse 1.1 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
12c: 761.49 (SE +/- 0.72, N = 3)
10c: 742.80 (SE +/- 2.41, N = 3)
8c: 705.71 (SE +/- 2.11, N = 3)
6c: 575.75 (SE +/- 6.13, N = 15)

Neural Magic DeepSparse 1.1 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
12c: 125.72 (SE +/- 0.11, N = 3)
10c: 128.92 (SE +/- 0.43, N = 3)
8c: 135.62 (SE +/- 0.38, N = 3)
6c: 166.43 (SE +/- 1.66, N = 15)

Neural Magic DeepSparse 1.1 - Model: CV Detection,YOLOv5s COCO - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
12c: 856.02 (SE +/- 0.57, N = 3)
10c: 844.43 (SE +/- 0.53, N = 3)
8c: 773.07 (SE +/- 1.22, N = 3)
6c: 635.02 (SE +/- 6.69, N = 15)

Neural Magic DeepSparse 1.1 - Model: CV Detection,YOLOv5s COCO - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
12c: 111.89 (SE +/- 0.07, N = 3)
10c: 113.41 (SE +/- 0.08, N = 3)
8c: 123.86 (SE +/- 0.19, N = 3)
6c: 150.92 (SE +/- 1.56, N = 15)

Neural Magic DeepSparse 1.1 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
12c: 1964.27 (SE +/- 4.95, N = 3)
10c: 1965.56 (SE +/- 1.61, N = 3)
8c: 1954.12 (SE +/- 1.56, N = 3)
6c: 1930.33 (SE +/- 8.40, N = 3)

Neural Magic DeepSparse 1.1 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
12c: 48.77 (SE +/- 0.12, N = 3)
10c: 48.74 (SE +/- 0.04, N = 3)
8c: 49.00 (SE +/- 0.04, N = 3)
6c: 49.63 (SE +/- 0.21, N = 3)

Neural Magic DeepSparse 1.1 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
12c: 1195.91 (SE +/- 4.04, N = 3)
10c: 1201.14 (SE +/- 0.69, N = 3)
8c: 1201.98 (SE +/- 3.22, N = 3)
6c: 1190.53 (SE +/- 1.21, N = 3)

Neural Magic DeepSparse 1.1 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
12c: 80.08 (SE +/- 0.27, N = 3)
10c: 79.71 (SE +/- 0.03, N = 3)
8c: 79.69 (SE +/- 0.20, N = 3)
6c: 80.44 (SE +/- 0.07, N = 3)

Neural Magic DeepSparse 1.1 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
12c: 615.45 (SE +/- 1.72, N = 3)
10c: 611.29 (SE +/- 2.48, N = 3)
8c: 614.61 (SE +/- 1.32, N = 3)
6c: 608.53 (SE +/- 2.24, N = 3)

Neural Magic DeepSparse 1.1 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
12c: 155.48 (SE +/- 0.46, N = 3)
10c: 156.54 (SE +/- 0.55, N = 3)
8c: 155.82 (SE +/- 0.27, N = 3)
6c: 157.22 (SE +/- 0.58, N = 3)
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.1 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream 12c 10c 8c 6c 20 40 60 80 100 SE +/- 0.21, N = 3 SE +/- 0.03, N = 3 SE +/- 0.16, N = 3 SE +/- 0.25, N = 3 84.25 84.27 84.15 82.26
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.1 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream 12c 10c 8c 6c 200 400 600 800 1000 SE +/- 1.25, N = 3 SE +/- 1.00, N = 3 SE +/- 1.67, N = 3 SE +/- 1.05, N = 3 1133.48 1135.18 1137.51 1148.33
nginx 1.23.2 - Connections: 500 (Requests Per Second, More Is Better): 12c: 201032.06 (SE +/- 291.63, N = 3); 10c: 198858.66 (SE +/- 335.64, N = 3); 8c: 197081.98 (SE +/- 453.48, N = 3); 6c: 196805.30 (SE +/- 113.87, N = 3). (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
NWChem 7.0.2 - Input: C240 Buckyball (Seconds, Fewer Is Better): 12c: 1537.1; 10c: 1531.0; 8c: 1519.6; 6c: 1517.9. (F9X) gfortran options: -lnwctask -lccsd -lmcscf -lselci -lmp2 -lmoints -lstepper -ldriver -loptim -lnwdft -lgradients -lcphf -lesp -lddscf -ldangchang -lguess -lhessian -lvib -lnwcutil -lrimp2 -lproperty -lsolvation -lnwints -lprepar -lnwmd -lnwpw -lofpw -lpaw -lpspw -lband -lnwpwlib -lcafe -lspace -lanalyze -lqhop -lpfft -ldplot -ldrdy -lvscf -lqmmm -lqmd -letrans -ltce -lbq -lmm -lcons -lperfm -ldntmc -lccca -ldimqm -lga -larmci -lpeigs -l64to32 -lopenblas -lpthread -lrt -llapack -lnwcblas -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz -lcomex -m64 -ffast-math -std=legacy -fdefault-integer-8 -finline-functions -O2
oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): 12c: 3.95471 (SE +/- 0.02537, N = 3, MIN: 3.05); 10c: 4.00938 (SE +/- 0.05885, N = 12, MIN: 2.96); 8c: 3.99305 (SE +/- 0.08932, N = 12, MIN: 2.67); 6c: 3.96488 (SE +/- 0.01788, N = 3, MIN: 2.99)
oneDNN 3.0 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): 12c: 1968.70 (SE +/- 31.84, N = 15, MIN: 1632.62); 10c: 2030.72 (SE +/- 14.89, N = 3, MIN: 1981.15); 8c: 1982.15 (SE +/- 28.30, N = 3, MIN: 1911.33); 6c: 2072.57 (SE +/- 16.27, N = 10, MIN: 1942.14)
oneDNN 3.0 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): 12c: 2344.29 (SE +/- 21.01, N = 3, MIN: 2288.85); 10c: 2438.00 (SE +/- 30.76, N = 3, MIN: 2353.97); 8c: 2375.45 (SE +/- 21.41, N = 3, MIN: 2319.45); 6c: 2479.62 (SE +/- 25.74, N = 15, MIN: 2293.49)
oneDNN 3.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): 12c: 2275.86 (SE +/- 24.22, N = 3, MIN: 2213.34); 10c: 2325.71 (SE +/- 25.04, N = 15, MIN: 2171.69); 8c: 2371.78 (SE +/- 25.14, N = 15, MIN: 2234.23); 6c: 2471.57 (SE +/- 31.16, N = 3, MIN: 2410.73)
oneDNN 3.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): 12c: 0.446930 (SE +/- 0.005042, N = 3, MIN: 0.38); 10c: 0.463454 (SE +/- 0.005241, N = 4, MIN: 0.38); 8c: 0.465796 (SE +/- 0.006374, N = 3, MIN: 0.38); 6c: 0.465059 (SE +/- 0.005815, N = 3, MIN: 0.38)
oneDNN 3.0 (CXX) g++ options, all oneDNN results above: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
ONNX Runtime 1.11 - Model: fcn-resnet101-11 - Device: CPU - Executor: Standard (Inferences Per Minute, More Is Better): 12c: 254 (SE +/- 2.33, N = 7); 10c: 255 (SE +/- 3.09, N = 3); 8c: 257 (SE +/- 2.84, N = 5); 6c: 253 (SE +/- 2.17, N = 12). (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
OpenFOAM 10 - Input: drivaerFastback, Medium Mesh Size - Execution Time (Seconds, Fewer Is Better): 12c: 109.54; 10c: 117.94; 8c: 166.15; 6c: 227.90. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm
OpenRadioss 2022.10.13 - Model: Bumper Beam (Seconds, Fewer Is Better): 12c: 79.86 (SE +/- 0.79, N = 3); 10c: 79.70 (SE +/- 0.75, N = 3); 8c: 79.20 (SE +/- 0.70, N = 3); 6c: 79.62 (SE +/- 0.71, N = 3)
OpenRadioss 2022.10.13 - Model: Bird Strike on Windshield (Seconds, Fewer Is Better): 12c: 216.88 (SE +/- 0.38, N = 3); 10c: 218.22 (SE +/- 0.54, N = 3); 8c: 219.45 (SE +/- 0.19, N = 3); 6c: 219.10 (SE +/- 0.14, N = 3)
OpenRadioss 2022.10.13 - Model: INIVOL and Fluid Structure Interaction Drop Container (Seconds, Fewer Is Better): 12c: 81.57 (SE +/- 0.14, N = 3); 10c: 81.15 (SE +/- 0.08, N = 3); 8c: 81.09 (SE +/- 0.12, N = 3); 6c: 80.81 (SE +/- 0.08, N = 3)
OpenVINO 2022.3 - Model: Face Detection FP16 - Device: CPU (FPS, More Is Better): 12c: 101.74 (SE +/- 0.08, N = 3); 10c: 102.01 (SE +/- 0.03, N = 3); 8c: 101.26 (SE +/- 0.04, N = 3); 6c: 101.08 (SE +/- 0.06, N = 3)
OpenVINO 2022.3 - Model: Face Detection FP16 - Device: CPU (ms, Fewer Is Better): 12c: 470.98 (SE +/- 0.21, N = 3, MIN: 451.07 / MAX: 556.04); 10c: 469.43 (SE +/- 0.10, N = 3, MIN: 432.92 / MAX: 555.25); 8c: 472.84 (SE +/- 0.27, N = 3, MIN: 394.37 / MAX: 553.15); 6c: 473.69 (SE +/- 0.14, N = 3, MIN: 423.34 / MAX: 579.41)
OpenVINO 2022.3 - Model: Person Detection FP16 - Device: CPU (FPS, More Is Better): 12c: 42.98 (SE +/- 0.13, N = 3); 10c: 42.94 (SE +/- 0.12, N = 3); 8c: 42.59 (SE +/- 0.15, N = 3); 6c: 41.33 (SE +/- 0.17, N = 3)
OpenVINO 2022.3 - Model: Person Detection FP16 - Device: CPU (ms, Fewer Is Better): 12c: 1109.45 (SE +/- 3.30, N = 3, MIN: 810.74 / MAX: 1835.01); 10c: 1110.44 (SE +/- 2.71, N = 3, MIN: 769.04 / MAX: 1860.23); 8c: 1119.79 (SE +/- 3.60, N = 3, MIN: 808.33 / MAX: 1875.91); 6c: 1153.70 (SE +/- 4.42, N = 3, MIN: 853.88 / MAX: 1939.06)
OpenVINO 2022.3 - Model: Person Detection FP32 - Device: CPU (FPS, More Is Better): 12c: 42.95 (SE +/- 0.32, N = 3); 10c: 43.18 (SE +/- 0.20, N = 3); 8c: 42.22 (SE +/- 0.01, N = 3); 6c: 41.44 (SE +/- 0.07, N = 3)
OpenVINO 2022.3 - Model: Person Detection FP32 - Device: CPU (ms, Fewer Is Better): 12c: 1110.68 (SE +/- 8.73, N = 3, MIN: 833.53 / MAX: 1865.19); 10c: 1104.59 (SE +/- 5.36, N = 3, MIN: 807.38 / MAX: 1818.79); 8c: 1129.01 (SE +/- 0.54, N = 3, MIN: 850.94 / MAX: 1870.94); 6c: 1150.54 (SE +/- 1.87, N = 3, MIN: 870.26 / MAX: 1902.46)
OpenVINO 2022.3 - Model: Vehicle Detection FP16 - Device: CPU (FPS, More Is Better): 12c: 7394.65 (SE +/- 2.30, N = 3); 10c: 7425.10 (SE +/- 13.32, N = 3); 8c: 7389.00 (SE +/- 6.27, N = 3); 6c: 7306.47 (SE +/- 4.59, N = 3)
OpenVINO 2022.3 - Model: Vehicle Detection FP16 - Device: CPU (ms, Fewer Is Better): 12c: 6.48 (SE +/- 0.00, N = 3, MIN: 5.06 / MAX: 59.88); 10c: 6.45 (SE +/- 0.01, N = 3, MIN: 4.97 / MAX: 59.86); 8c: 6.49 (SE +/- 0.01, N = 3, MIN: 4.93 / MAX: 59.51); 6c: 6.56 (SE +/- 0.00, N = 3, MIN: 4.99 / MAX: 59.46)
OpenVINO 2022.3 - Model: Face Detection FP16-INT8 - Device: CPU (FPS, More Is Better): 12c: 191.43 (SE +/- 0.21, N = 3); 10c: 192.30 (SE +/- 0.03, N = 3); 8c: 192.25 (SE +/- 0.48, N = 3); 6c: 191.29 (SE +/- 0.09, N = 3)
OpenVINO 2022.3 - Model: Face Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): 12c: 250.34 (SE +/- 0.32, N = 3, MIN: 222.95 / MAX: 301.42); 10c: 249.12 (SE +/- 0.03, N = 3, MIN: 209.28 / MAX: 311.3); 8c: 249.26 (SE +/- 0.69, N = 3, MIN: 207.76 / MAX: 340.53); 6c: 250.49 (SE +/- 0.13, N = 3, MIN: 213.3 / MAX: 307.84)
OpenVINO 2022.3 - Model: Vehicle Detection FP16-INT8 - Device: CPU (FPS, More Is Better): 12c: 11018.37 (SE +/- 1.42, N = 3); 10c: 11066.16 (SE +/- 3.30, N = 3); 8c: 11108.16 (SE +/- 1.79, N = 3); 6c: 11150.32 (SE +/- 1.79, N = 3)
OpenVINO 2022.3 - Model: Vehicle Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): 12c: 4.35 (SE +/- 0.00, N = 3, MIN: 3.52 / MAX: 41.44); 10c: 4.33 (SE +/- 0.00, N = 3, MIN: 3.51 / MAX: 41.25); 8c: 4.31 (SE +/- 0.00, N = 3, MIN: 3.51 / MAX: 43.89); 6c: 4.30 (SE +/- 0.00, N = 3, MIN: 3.52 / MAX: 43.57)
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16 - Device: CPU (FPS, More Is Better): 12c: 9867.41 (SE +/- 2.57, N = 3); 10c: 9900.47 (SE +/- 2.08, N = 3); 8c: 9931.49 (SE +/- 7.50, N = 3); 6c: 9959.38 (SE +/- 3.42, N = 3)
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16 - Device: CPU (ms, Fewer Is Better): 12c: 4.85 (SE +/- 0.00, N = 3, MIN: 4.06 / MAX: 28.62); 10c: 4.83 (SE +/- 0.00, N = 3, MIN: 4.08 / MAX: 28.68); 8c: 4.82 (SE +/- 0.00, N = 3, MIN: 3.98 / MAX: 28.83); 6c: 4.81 (SE +/- 0.00, N = 3, MIN: 4.14 / MAX: 27.29)
OpenVINO 2022.3 - Model: Machine Translation EN To DE FP16 - Device: CPU (FPS, More Is Better): 12c: 959.16 (SE +/- 2.32, N = 3); 10c: 934.71 (SE +/- 1.48, N = 3); 8c: 875.39 (SE +/- 8.79, N = 6); 6c: 817.27 (SE +/- 5.14, N = 3)
OpenVINO 2022.3 - Model: Machine Translation EN To DE FP16 - Device: CPU (ms, Fewer Is Better): 12c: 49.98 (SE +/- 0.12, N = 3, MIN: 38.24 / MAX: 187.97); 10c: 51.29 (SE +/- 0.08, N = 3, MIN: 40.28 / MAX: 292.83); 8c: 54.80 (SE +/- 0.57, N = 6, MIN: 40.7 / MAX: 276.86); 6c: 58.67 (SE +/- 0.37, N = 3, MIN: 43.56 / MAX: 315.05)
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (FPS, More Is Better): 12c: 19171.51 (SE +/- 12.43, N = 3); 10c: 19254.08 (SE +/- 30.88, N = 3); 8c: 19278.93 (SE +/- 31.30, N = 3); 6c: 19314.04 (SE +/- 33.95, N = 3)
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): 12c: 9.95 (SE +/- 0.00, N = 3, MIN: 8.42 / MAX: 52.38); 10c: 9.91 (SE +/- 0.02, N = 3, MIN: 8.4 / MAX: 50.42); 8c: 9.90 (SE +/- 0.02, N = 3, MIN: 8.39 / MAX: 56.99); 6c: 9.89 (SE +/- 0.02, N = 3, MIN: 8.35 / MAX: 32.16)
OpenVINO 2022.3 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (FPS, More Is Better): 12c: 9038.47 (SE +/- 9.96, N = 3); 10c: 9063.84 (SE +/- 5.19, N = 3); 8c: 9113.11 (SE +/- 2.85, N = 3); 6c: 9081.73 (SE +/- 7.67, N = 3)
OpenVINO 2022.3 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (ms, Fewer Is Better): 12c: 5.30 (SE +/- 0.01, N = 3, MIN: 4.42 / MAX: 40.66); 10c: 5.28 (SE +/- 0.00, N = 3, MIN: 4.37 / MAX: 41.23); 8c: 5.26 (SE +/- 0.00, N = 3, MIN: 4.42 / MAX: 42.93); 6c: 5.28 (SE +/- 0.00, N = 3, MIN: 4.34 / MAX: 38.93)
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS, More Is Better): 12c: 147769.26 (SE +/- 745.28, N = 3); 10c: 147717.32 (SE +/- 1134.97, N = 10); 8c: 152292.39 (SE +/- 994.61, N = 3); 6c: 151213.17 (SE +/- 365.43, N = 3)
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, Fewer Is Better): 12c: 0.55 (SE +/- 0.00, N = 3, MIN: 0.5 / MAX: 34.71); 10c: 0.55 (SE +/- 0.00, N = 10, MIN: 0.5 / MAX: 41.23); 8c: 0.55 (SE +/- 0.00, N = 3, MIN: 0.5 / MAX: 30.68); 6c: 0.54 (SE +/- 0.00, N = 3, MIN: 0.5 / MAX: 34.19)
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (FPS, More Is Better): 12c: 119606.21 (SE +/- 1214.59, N = 3); 10c: 122938.23 (SE +/- 815.42, N = 3); 8c: 123571.68 (SE +/- 1158.58, N = 3); 6c: 121027.25 (SE +/- 681.80, N = 3)
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (ms, Fewer Is Better): 12c: 0.97 (SE +/- 0.00, N = 3, MIN: 0.85 / MAX: 22.9); 10c: 0.98 (SE +/- 0.00, N = 3, MIN: 0.85 / MAX: 39.82); 8c: 0.98 (SE +/- 0.00, N = 3, MIN: 0.86 / MAX: 39.58); 6c: 0.97 (SE +/- 0.00, N = 3, MIN: 0.86 / MAX: 33.82)
OpenVINO 2022.3 (CXX) g++ options, all OpenVINO results above: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF
OpenVKL 1.3.1 - Benchmark: vklBenchmark ISPC (Items / Sec, More Is Better): 12c: 1325 (SE +/- 6.93, N = 3, MIN: 329 / MAX: 4553); 10c: 1317 (SE +/- 11.03, N = 9, MIN: 327 / MAX: 5660); 8c: 1325 (SE +/- 8.82, N = 3, MIN: 330 / MAX: 5664); 6c: 1212 (SE +/- 15.59, N = 3, MIN: 328 / MAX: 4115)
OSPRay 2.10 - Benchmark: particle_volume/ao/real_time (Items Per Second, More Is Better): 12c: 43.71 (SE +/- 0.04, N = 3); 10c: 43.03 (SE +/- 0.04, N = 3); 8c: 43.97 (SE +/- 0.01, N = 3); 6c: 43.36 (SE +/- 0.04, N = 3)
OSPRay 2.10 - Benchmark: particle_volume/scivis/real_time (Items Per Second, More Is Better): 12c: 42.80 (SE +/- 0.05, N = 3); 10c: 43.00 (SE +/- 0.01, N = 3); 8c: 43.84 (SE +/- 0.03, N = 3); 6c: 43.24 (SE +/- 0.06, N = 3)
OSPRay 2.10 - Benchmark: particle_volume/pathtracer/real_time (Items Per Second, More Is Better): 12c: 229.27 (SE +/- 1.54, N = 3); 10c: 230.28 (SE +/- 1.94, N = 3); 8c: 228.58 (SE +/- 1.74, N = 3); 6c: 230.44 (SE +/- 0.59, N = 3)
OSPRay 2.10 - Benchmark: gravity_spheres_volume/dim_512/ao/real_time (Items Per Second, More Is Better): 12c: 43.98 (SE +/- 0.13, N = 3); 10c: 44.00 (SE +/- 0.04, N = 3); 8c: 44.23 (SE +/- 0.10, N = 3); 6c: 44.27 (SE +/- 0.07, N = 3)
OSPRay 2.10 - Benchmark: gravity_spheres_volume/dim_512/scivis/real_time (Items Per Second, More Is Better): 12c: 43.13 (SE +/- 0.15, N = 3); 10c: 43.33 (SE +/- 0.12, N = 3); 8c: 43.43 (SE +/- 0.13, N = 3); 6c: 43.29 (SE +/- 0.15, N = 3)
OSPRay 2.10 - Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time (Items Per Second, More Is Better): 12c: 53.77 (SE +/- 0.50, N = 3); 10c: 54.41 (SE +/- 0.12, N = 3); 8c: 54.51 (SE +/- 0.08, N = 3); 6c: 54.61 (SE +/- 0.04, N = 3)
RELION 3.1.1 - Test: Basic - Device: CPU (Seconds, Fewer Is Better): 12c: 128.10 (SE +/- 1.38, N = 5); 10c: 151.40 (SE +/- 1.86, N = 4); 8c: 221.34 (SE +/- 2.88, N = 3); 6c: 258.50 (SE +/- 2.59, N = 6). (CXX) g++ options: -fopenmp -std=c++0x -O3 -rdynamic -ldl -ltiff -lfftw3f -lfftw3 -lpng -lmpi_cxx -lmpi
Rodinia 3.1 - Test: OpenMP CFD Solver (Seconds, Fewer Is Better): 12c: 6.050 (SE +/- 0.031, N = 3); 10c: 6.074 (SE +/- 0.014, N = 3); 8c: 5.970 (SE +/- 0.016, N = 3); 6c: 6.152 (SE +/- 0.024, N = 3)
Rodinia 3.1 - Test: OpenMP Streamcluster (Seconds, Fewer Is Better): 12c: 6.001 (SE +/- 0.089, N = 15); 10c: 6.285 (SE +/- 0.079, N = 15); 8c: 6.018 (SE +/- 0.078, N = 15); 6c: 6.409 (SE +/- 0.050, N = 3)
Rodinia 3.1 (CXX) g++ options, both Rodinia results above: -O2 -lOpenCL
simdjson 2.0 - Throughput Test: Kostya (GB/s, More Is Better): 12c: 4.11 (SE +/- 0.01, N = 3); 10c: 4.11 (SE +/- 0.01, N = 3); 8c: 4.11 (SE +/- 0.00, N = 3); 6c: 4.11 (SE +/- 0.01, N = 3)
simdjson 2.0 - Throughput Test: TopTweet (GB/s, More Is Better): 12c: 6.59 (SE +/- 0.01, N = 3); 10c: 6.49 (SE +/- 0.07, N = 6); 8c: 6.57 (SE +/- 0.01, N = 3); 6c: 6.55 (SE +/- 0.00, N = 3)
simdjson 2.0 - Throughput Test: LargeRandom (GB/s, More Is Better): 12c: 1.25 (SE +/- 0.00, N = 3); 10c: 1.25 (SE +/- 0.00, N = 3); 8c: 1.25 (SE +/- 0.00, N = 3); 6c: 1.24 (SE +/- 0.00, N = 3)
simdjson 2.0 - Throughput Test: PartialTweets (GB/s, More Is Better): 12c: 5.65 (SE +/- 0.01, N = 3); 10c: 5.67 (SE +/- 0.02, N = 3); 8c: 5.66 (SE +/- 0.01, N = 3); 6c: 5.69 (SE +/- 0.01, N = 3)
simdjson 2.0 - Throughput Test: DistinctUserID (GB/s, More Is Better): 12c: 6.86 (SE +/- 0.02, N = 3); 10c: 6.84 (SE +/- 0.02, N = 3); 8c: 6.86 (SE +/- 0.01, N = 3); 6c: 6.83 (SE +/- 0.02, N = 3)
simdjson 2.0 (CXX) g++ options, all simdjson results above: -O3
Stargate Digital Audio Workstation 22.11.5 - Sample Rate: 96000 - Buffer Size: 1024 (Render Ratio, More Is Better): 12c: 4.345890 (SE +/- 0.023689, N = 3); 10c: 4.354556 (SE +/- 0.010431, N = 3); 8c: 4.351402 (SE +/- 0.008144, N = 3); 6c: 4.364767 (SE +/- 0.002133, N = 3)
Stargate Digital Audio Workstation 22.11.5 - Sample Rate: 192000 - Buffer Size: 1024 (Render Ratio, More Is Better): 12c: 2.829061 (SE +/- 0.001919, N = 3); 10c: 2.806190 (SE +/- 0.017291, N = 3); 8c: 2.811555 (SE +/- 0.019484, N = 3); 6c: 2.824814 (SE +/- 0.004057, N = 3)
Stargate Digital Audio Workstation 22.11.5 (CXX) g++ options, both Stargate results above: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions
SVT-AV1 1.4 - Encoder Mode: Preset 12 - Input: Bosphorus 4K (Frames Per Second, More Is Better): 12c: 251.77 (SE +/- 7.35, N = 15); 10c: 241.37 (SE +/- 7.16, N = 15); 8c: 227.90 (SE +/- 7.53, N = 15); 6c: 221.16 (SE +/- 9.18, N = 13)
TensorFlow 2.10 - Device: CPU - Batch Size: 256 - Model: ResNet-50 (images/sec, More Is Better): 12c: 109.13 (SE +/- 0.48, N = 3); 10c: 105.91 (SE +/- 0.36, N = 3); 8c: 105.01 (SE +/- 0.48, N = 3); 6c: 95.67 (SE +/- 0.26, N = 3)
Timed Apache Compilation 2.4.41 - Time To Compile (Seconds, Fewer Is Better): 12c: 20.46 (SE +/- 0.01, N = 3); 10c: 20.48 (SE +/- 0.01, N = 3); 8c: 20.59 (SE +/- 0.01, N = 3); 6c: 20.72 (SE +/- 0.01, N = 3)
Timed GDB GNU Debugger Compilation 10.2 - Time To Compile (Seconds, Fewer Is Better): 12c: 41.71 (SE +/- 0.17, N = 3); 10c: 42.41 (SE +/- 0.08, N = 3); 8c: 42.41 (SE +/- 0.03, N = 3); 6c: 43.25 (SE +/- 0.12, N = 3)
Timed Gem5 Compilation 21.2 - Time To Compile (Seconds, Fewer Is Better): 12c: 139.24 (SE +/- 0.16, N = 3); 10c: 134.37 (SE +/- 0.36, N = 3); 8c: 136.79 (SE +/- 0.77, N = 3); 6c: 134.70 (SE +/- 0.57, N = 3)
Timed Godot Game Engine Compilation 3.2.3 - Time To Compile (Seconds, Fewer Is Better): 12c: 34.03 (SE +/- 0.40, N = 4); 10c: 33.62 (SE +/- 0.04, N = 3); 8c: 33.91 (SE +/- 0.19, N = 3); 6c: 33.67 (SE +/- 0.11, N = 3)
Timed Linux Kernel Compilation 6.1 - Build: defconfig (Seconds, Fewer Is Better): 12c: 25.50 (SE +/- 0.19, N = 11); 10c: 25.41 (SE +/- 0.21, N = 14); 8c: 25.53 (SE +/- 0.21, N = 9); 6c: 24.75 (SE +/- 0.22, N = 7)
Timed Linux Kernel Compilation 6.1 - Build: allmodconfig (Seconds, Fewer Is Better): 12c: 147.15 (SE +/- 0.90, N = 3); 10c: 145.41 (SE +/- 0.72, N = 3); 8c: 147.38 (SE +/- 1.03, N = 3); 6c: 145.77 (SE +/- 0.14, N = 3)
Timed LLVM Compilation 13.0 - Build System: Ninja (Seconds, Fewer Is Better): 12c: 75.66 (SE +/- 0.23, N = 3); 10c: 75.44 (SE +/- 0.21, N = 3); 8c: 75.73 (SE +/- 0.09, N = 3); 6c: 76.75 (SE +/- 0.06, N = 3)
Timed Mesa Compilation 21.0 - Time To Compile (Seconds, Fewer Is Better): 12c: 20.12 (SE +/- 0.07, N = 3); 10c: 20.21 (SE +/- 0.06, N = 3); 8c: 20.11 (SE +/- 0.07, N = 3); 6c: 20.16 (SE +/- 0.05, N = 3)
Timed MPlayer Compilation 1.5 - Time To Compile (Seconds, Fewer Is Better): 12c: 7.777 (SE +/- 0.033, N = 3); 10c: 7.755 (SE +/- 0.034, N = 3); 8c: 7.808 (SE +/- 0.023, N = 3); 6c: 7.773 (SE +/- 0.010, N = 3)
Timed Node.js Compilation 18.8 - Time To Compile (Seconds, Fewer Is Better): 12c: 101.47 (SE +/- 0.26, N = 3); 10c: 101.94 (SE +/- 0.29, N = 3); 8c: 101.15 (SE +/- 0.22, N = 3); 6c: 102.78 (SE +/- 0.06, N = 3)
Timed PHP Compilation 8.1.9 - Time To Compile (Seconds, Fewer Is Better): 12c: 44.52 (SE +/- 0.06, N = 3); 10c: 44.61 (SE +/- 0.08, N = 3); 8c: 44.58 (SE +/- 0.04, N = 3); 6c: 44.70 (SE +/- 0.07, N = 3)
WRF 4.2.2 - Input: conus 2.5km (Seconds, Fewer Is Better): 12c: 4070.19; 10c: 4563.18; 8c: 6551.88; 6c: 7432.66. (F9X) gfortran options: -O2 -ftree-vectorize -funroll-loops -ffree-form -fconvert=big-endian -frecord-marker=4 -fallow-invalid-boz -lesmf_time -lwrfio_nf -lnetcdff -lnetcdf -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Xcompact3d Incompact3d 2021-03-11 - Input: X3D-benchmarking input.i3d (Seconds, Fewer Is Better): 12c: 125.53 (SE +/- 0.14, N = 3); 10c: 146.29 (SE +/- 0.11, N = 3); 8c: 270.09 (SE +/- 2.69, N = 9); 6c: 348.88 (SE +/- 4.79, N = 9). (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Xmrig 6.18.1 - Variant: Monero - Hash Count: 1M (H/s, More Is Better): 12c: 104604.6 (SE +/- 328.13, N = 3); 10c: 102599.6 (SE +/- 152.19, N = 3); 8c: 101953.5 (SE +/- 383.60, N = 3); 6c: 100446.2 (SE +/- 214.10, N = 3)
Xmrig 6.18.1 - Variant: Wownero - Hash Count: 1M (H/s, More Is Better): 12c: 126465.6 (SE +/- 849.90, N = 3); 10c: 127226.6 (SE +/- 70.55, N = 3); 8c: 127081.2 (SE +/- 122.05, N = 3); 6c: 126057.7 (SE +/- 349.73, N = 3)
Xmrig 6.18.1 (CXX) g++ options, both Xmrig results above: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
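Several of the time-based results above (OpenFOAM, RELION, WRF, Xcompact3d) vary noticeably across the four configurations. They are easier to compare once each timing is normalized to the 12c run. The following is a minimal Python sketch of that arithmetic and is not part of the Phoronix Test Suite output. The dictionary layout and the function name `slowdown_vs_12c` are hypothetical, and the timings are copied from the results above on the assumption that values appear in the column order 12c, 10c, 8c, 6c.

```python
# Execution times in seconds (fewer is better), copied from the results above.
# Assumption: each result lists its values in the column order 12c, 10c, 8c, 6c.
TIMES = {
    "OpenFOAM 10, drivaerFastback Medium": {
        "12c": 109.54, "10c": 117.94, "8c": 166.15, "6c": 227.90,
    },
    "RELION 3.1.1, Basic": {
        "12c": 128.10, "10c": 151.40, "8c": 221.34, "6c": 258.50,
    },
    "WRF 4.2.2, conus 2.5km": {
        "12c": 4070.19, "10c": 4563.18, "8c": 6551.88, "6c": 7432.66,
    },
    "Xcompact3d Incompact3d, input.i3d": {
        "12c": 125.53, "10c": 146.29, "8c": 270.09, "6c": 348.88,
    },
}


def slowdown_vs_12c(times):
    """Return each configuration's time divided by the 12c time."""
    base = times["12c"]
    return {config: seconds / base for config, seconds in times.items()}


if __name__ == "__main__":
    for name, times in TIMES.items():
        ratios = slowdown_vs_12c(times)
        row = "  ".join(f"{config}: {ratio:.2f}x" for config, ratio in ratios.items())
        print(f"{name:<38} {row}")
```

A ratio above 1.00 means the configuration took longer than the 12c run. For "More Is Better" results the same idea applies with the division inverted.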
Phoronix Test Suite v10.8.4