AMD EPYC 9655P memory benchmarks by Michael Larabel for a future article.
12c DDR5-6000
Processor: AMD EPYC 9655P 96-Core @ 2.60GHz (96 Cores / 192 Threads), Motherboard: Supermicro Super Server H13SSL-N v1.01 (3.0 BIOS), Chipset: AMD 1Ah, Memory: 12 x 64GB DDR5-6000MT/s Micron MTC40F2046S1RC64BDY QSFF, Disk: 3201GB Micron_7450_MTFDKCB3T2TFS, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 24.10, Kernel: 6.12.0-rc7-phx (x86_64), Desktop: GNOME Shell 47.0, Display Server: X Server, Compiler: GCC 14.2.0, File-System: ext4, Screen Resolution: 1024x768
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xb002116
Python Notes: Python 3.12.7
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
8c DDR5-6000
Changed Memory to 8 x 64GB DDR5-6000MT/s.
Java Notes: OpenJDK Runtime Environment (build 21.0.5+11-Ubuntu-1ubuntu124.10)
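For context on what the two configurations can deliver on paper, the following is a minimal sketch of the theoretical peak bandwidth arithmetic. It assumes one DIMM per memory channel (so 12 and 8 populated channels) and a 64-bit (8-byte) data path per channel; the function name and its parameters are illustrative and not part of the Phoronix Test Suite.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s (decimal) for a DDR memory configuration.

    Assumes every channel is populated and moves `bus_bytes` bytes per transfer.
    """
    return channels * mega_transfers_per_s * 1e6 * bus_bytes / 1e9


bw_12c = peak_bandwidth_gbs(12, 6000)  # 12 x DDR5-6000
bw_8c = peak_bandwidth_gbs(8, 6000)    # 8 x DDR5-6000
print(f"12c: {bw_12c:.0f} GB/s, 8c: {bw_8c:.0f} GB/s, ratio: {bw_12c / bw_8c:.2f}x")
```

Under these assumptions the 12-channel configuration has 1.5x the theoretical peak of the 8-channel one, which gives a rough upper bound for how much a purely bandwidth-bound test could gain.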
AMD EPYC Turin 8c vs. 12c Memory Channel DDR5 Comparison (OpenBenchmarking.org, Phoronix Test Suite)
12c DDR5-6000 vs. 8c DDR5-6000 Comparison (Phoronix Test Suite)
[Figure: bar chart of the per-test percentage difference between the two configurations, relative to the baseline; axis ticks at +13.8%, +27.6% and +41.4%. Test labels are abbreviated as in the chart. Entries marked "8c ahead" favor the 8c DDR5-6000 configuration; all others favor 12c DDR5-6000.]
Stream - Copy: 55.1%
OpenVINO - P.D.F - CPU: 53.9%
Stream - Scale: 53.9%
OpenVINO - P.D.F - CPU: 53.5%
Stream - Add: 52.5%
Stream - Triad: 51.7%
High Performance Conjugate Gradient - 104 104 104 - 60: 51.4%
High Performance Conjugate Gradient - 144 144 144 - 60: 50.6%
Algebraic Multi-Grid Benchmark: 47.5%
Xcompact3d Incompact3d - X.b.i.i: 43.5%
NAS Parallel Benchmarks - SP.C: 42.7%
OpenRadioss - Chrysler Neon 1M: 42.3%
NAS Parallel Benchmarks - MG.C: 39.8%
NAS Parallel Benchmarks - FT.C: 37.2%
OpenFOAM - d.M.M.S - Execution Time: 32.3%
Xcompact3d Incompact3d - i.i.1.C.P.D: 27.2%
NAS Parallel Benchmarks - BT.C: 26.8%
MBW - M.C.F.B.S - 8192 MiB: 23.4%
NAS Parallel Benchmarks - IS.D: 23.3%
TensorFlow - CPU - 512 - ResNet-50: 22.9%
MBW - M.C.F.B.S - 4096 MiB: 22.8%
RAMspeed SMP - Add - Integer: 20.9%
RAMspeed SMP - Copy - Integer: 20.7%
NAMD - S.w.1.0.6.A: 18.5%
RAMspeed SMP - Scale - Integer: 18.4%
ClickHouse - 1.R.H.D.S.R: 18.2%
RAMspeed SMP - Average - Integer: 18%
NAS Parallel Benchmarks - SP.B: 17.6%
ClickHouse - 1.R.H.D.T.R: 17.4%
RAMspeed SMP - Triad - Integer: 17%
ClickHouse - 1.R.H.D.F.R.C.C: 16.1%
OpenVINO - M.T.E.T.D.F - CPU: 14.2%
OpenVINO - M.T.E.T.D.F - CPU: 14.2%
NAS Parallel Benchmarks - LU.C: 13.4%
Graph500 - 26: 11.5%
GPAW - Carbon Nanotube: 11.3%
Graph500 - 26: 10.4%
ONNX Runtime - CaffeNet 12-int8 - CPU - Standard: 8.7%
OpenVINO - N.S.P.L.F - CPU: 8.5%
Graph500 - 26: 8.2%
Graph500 - 26: 8%
NAS Parallel Benchmarks - CG.C: 7.9%
LiteRT - Mobilenet Quant: 7.9%
MBW - Memory Copy - 8192 MiB: 7.6%
OpenVINO - N.S.P.L.F - CPU: 7.3%
MBW - Memory Copy - 4096 MiB: 7.3%
7-Zip Compression - Compression Rating: 7.2%
NAS Parallel Benchmarks - EP.D: 7.1% (8c ahead)
GROMACS - MPI CPU - water_GMX50_bare: 6.8%
ONNX Runtime - ZFNet-512 - CPU - Standard: 6.7%
Apache HTTP Server - 500: 6.6%
Whisperfile - Medium: 6.6%
OpenVINO - R.S.A.F.I - CPU: 6.4%
OpenVINO - R.S.A.F.I - CPU: 6.3%
Whisperfile - Small: 6.2%
Apache Cassandra - Writes: 5.9%
NAMD - A.w.3.5.A: 5.7%
Epoch - Cone: 4.9%
ONNX Runtime - GPT-2 - CPU - Standard: 4.4%
LuxCoreRender - LuxCore Benchmark - CPU: 4.3% (8c ahead)
XNNPACK - FP32MobileNetV3Large: 4%
nginx - 500: 3.6% (8c ahead)
ASTC Encoder - Fast: 3.5%
LiteRT - I.R.V: 3.3%
ONNX Runtime - T5 Encoder - CPU - Standard: 3.3%
LiteRT - DeepLab V3: 3.2%
LuxCoreRender - DLSC - CPU: 3.2% (8c ahead)
OpenFOAM - d.S.M.S - Mesh Time: 2.9% (8c ahead)
ONNX Runtime - fcn-resnet101-11 - CPU - Standard: 2.8%
OpenFOAM - d.M.M.S - Mesh Time: 2.7%
uvg266 - Bosphorus 4K - Super Fast: 2.7%
XNNPACK - FP16MobileNetV3Small: 2.4%
SPECFEM3D - Tomographic Model: 2.3% (8c ahead)
OpenVKL - v.I: 2.3%
SVT-AV1 - Preset 8 - Bosphorus 4K: 2.2%
Kvazaar - Bosphorus 4K - Super Fast: 2% (8c ahead)
ONNX Runtime - GPT-2 - CPU - Standard: 4.4%
ONNX Runtime - ZFNet-512 - CPU - Standard: 6.7%
ONNX Runtime - T5 Encoder - CPU - Standard: 3.3%
ONNX Runtime - CaffeNet 12-int8 - CPU - Standard: 9.2%
ONNX Runtime - fcn-resnet101-11 - CPU - Standard: 2.6%
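The percentages in the comparison above can be reproduced from the paired results. The sketch below assumes the chart reports how much larger the bigger of the two values is relative to the smaller one, which matches the listed figures for both "More Is Better" and "Fewer Is Better" tests; the helper name `percent_gain` is illustrative and not a Phoronix Test Suite function.

```python
def percent_gain(a: float, b: float) -> float:
    """Return how much larger the bigger result is, as a percentage of the smaller."""
    hi, lo = max(a, b), min(a, b)
    return (hi / lo - 1.0) * 100.0


# (12c DDR5-6000, 8c DDR5-6000) pairs taken from the results overview.
pairs = {
    "stream: Copy": (343229.3, 221234.4),
    "hpcg: 104 104 104 - 60": (62.5441, 41.3186),
    "openradioss: Chrysler Neon 1M": (90.77, 129.17),  # seconds, fewer is better
}

for name, (r12, r8) in pairs.items():
    print(f"{name}: {percent_gain(r12, r8):.1f}%")
```

The helper only yields the magnitude of the gap; which configuration is ahead still depends on whether the test is "More Is Better" or "Fewer Is Better".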
Test | 12c DDR5-6000 | 8c DDR5-6000
compress-7zip: Compression Rating | 779242 | 727045
compress-7zip: Decompression Rating | 627215 | 626553
amg: | 3076286333 | 2085218000
cassandra: Writes | 447176 | 422312
apache: 500 | 170467.37 | 159864.99
astcenc: Fast | 1109.9501 | 1072.7819
astcenc: Medium | 568.8262 | 568.3363
astcenc: Thorough | 82.2795 | 82.1896
astcenc: Exhaustive | 7.1334 | 7.1339
astcenc: Very Thorough | 11.6277 | 11.6303
blender: BMW27 - CPU-Only | 13.14 | 13.09
blender: Junkshop - CPU-Only | 16.75 | 16.80
blender: Classroom - CPU-Only | 35.43 | 35.51
blender: Fishy Cat - CPU-Only | 18.46 | 18.55
blender: Barbershop - CPU-Only | 120.83 | 121.15
blender: Pabellon Barcelona - CPU-Only | 41.25 | 41.18
build2: Time To Compile | 53.439 | 53.551
v-ray: CPU | 199769 | 198145
clickhouse: 100M Rows Hits Dataset, First Run / Cold Cache | 740.15 | 637.67
clickhouse: 100M Rows Hits Dataset, Second Run | 771.05 | 652.53
clickhouse: 100M Rows Hits Dataset, Third Run | 765.07 | 651.48
epoch: Cone | 186.85 | 196.03
gpaw: Carbon Nanotube | 25.660 | 28.560
graph500: 26 (bfs median_TEPS) | 1478180000 | 1365800000
graph500: 26 (bfs max_TEPS) | 1505710000 | 1394450000
graph500: 26 (sssp median_TEPS) | 554936000 | 497798000
graph500: 26 (sssp max_TEPS) | 747138000 | 676709000
gromacs: MPI CPU - water_GMX50_bare | 17.787 | 16.654
hpcg: 104 104 104 - 60 | 62.5441 | 41.3186
hpcg: 144 144 144 - 60 | 61.7722 | 41.0157
john-the-ripper: bcrypt | 237542 | 237055
john-the-ripper: WPA PSK | 1023333 | 1019333
john-the-ripper: Blowfish | 237515 | 237458
john-the-ripper: HMAC-SHA512 | 413474333 | 413326333
john-the-ripper: MD5 | 21681000 | 21408333
kvazaar: Bosphorus 4K - Slow | 44.79 | 44.76
kvazaar: Bosphorus 4K - Medium | 45.30 | 45.38
kvazaar: Bosphorus 4K - Very Fast | 90.46 | 89.33
kvazaar: Bosphorus 4K - Super Fast | 91.54 | 93.40
kvazaar: Bosphorus 4K - Ultra Fast | 96.07 | 96.72
liquid-dsp: 64 - 256 - 57 | 3195900000 | 3214133333
liquid-dsp: 128 - 256 - 57 | 4963433333 | 5053200000
liquid-dsp: 192 - 256 - 57 | 5958600000 | 5955433333
liquid-dsp: 64 - 256 - 512 | 1412966667 | 1417633333
liquid-dsp: 128 - 256 - 512 | 1952100000 | 1956333333
liquid-dsp: 192 - 256 - 512 | 2149200000 | 2136766667
litert: DeepLab V3 | 11450.9 | 11817.9
litert: SqueezeNet | 7051.59 | 7048.39
litert: Inception V4 | 44088.3 | 43959.9
litert: NASNet Mobile | 99408.3 | 100519.4
litert: Mobilenet Float | 4399.48 | 4441.27
litert: Mobilenet Quant | 5800.65 | 6260.21
litert: Inception ResNet V2 | 58054.3 | 59977.8
litert: Quantized COCO SSD MobileNet v1 | 7211.99 | 7110.12
luxcorerender: DLSC - CPU | 16.49 | 17.01
luxcorerender: Danish Mood - CPU | 12.03 | 12.06
luxcorerender: Orange Juice - CPU | 24.19 | 24.15
luxcorerender: LuxCore Benchmark - CPU | 12.39 | 12.92
luxcorerender: Rainbow Colors and Prism - CPU | 28.23 | 27.85
m-queens: Time To Solve | 7.214 | 7.229
mbw: Memory Copy - 4096 MiB | 25160.632 | 23448.633
mbw: Memory Copy - 8192 MiB | 25126.492 | 23357.889
mbw: Memory Copy, Fixed Block Size - 4096 MiB | 22520.674 | 18332.409
mbw: Memory Copy, Fixed Block Size - 8192 MiB | 22449.826 | 18188.111
memcached: 1:5 | 3701189.23 | 3711825.75
memcached: 1:10 | 6670118.94 | 6752055.12
memcached: 1:100 | 11923788.43 | 12031156.03
minibude: OpenMP - BM2 | 7171.451 | 7208.779
minibude: OpenMP - BM2 | 286.858 | 288.351
namd: ATPase with 327,506 Atoms | 12.28998 | 11.62457
namd: STMV with 1,066,628 Atoms | 4.21717 | 3.55953
npb: BT.C | 361153.06 | 284718.53
npb: CG.C | 65380.58 | 60572.20
npb: EP.C | 11230.81 | 11375.83
npb: EP.D | 13685.97 | 14658.25
npb: FT.C | 164903.45 | 120202.20
npb: IS.D | 6970.89 | 5653.35
npb: LU.C | 311883.80 | 275011.41
npb: MG.C | 147097.41 | 105206.58
npb: SP.B | 199255.89 | 169436.81
npb: SP.C | 154965.84 | 108558.21
nginx: 500 | 505484.86 | 523766.78
onednn: IP Shapes 1D - CPU | 0.531984 | 0.529052
onednn: IP Shapes 3D - CPU | 0.266791 | 0.265558
onednn: Convolution Batch Shapes Auto - CPU | 0.343593 | 0.341374
onednn: Deconvolution Batch shapes_1d - CPU | 6.76736 | 6.84402
onednn: Deconvolution Batch shapes_3d - CPU | 0.716373 | 0.715363
onednn: Recurrent Neural Network Training - CPU | 425.784 | 429.684
onednn: Recurrent Neural Network Inference - CPU | 277.643 | 277.996
onnx: GPT-2 - CPU - Standard | 195.276 | 187.125
onnx: GPT-2 - CPU - Standard | 5.11928 | 5.34359
onnx: yolov4 - CPU - Standard | 12.2149 | 12.0605
onnx: yolov4 - CPU - Standard | 81.8816 | 82.9434
onnx: ZFNet-512 - CPU - Standard | 163.976 | 153.637
onnx: ZFNet-512 - CPU - Standard | 6.09718 | 6.50761
onnx: T5 Encoder - CPU - Standard | 243.328 | 235.596
onnx: T5 Encoder - CPU - Standard | 4.10880 | 4.24372
onnx: bertsquad-12 - CPU - Standard | 22.0238 | 21.6478
onnx: bertsquad-12 - CPU - Standard | 45.4067 | 46.2026
onnx: CaffeNet 12-int8 - CPU - Standard | 1041.95 | 958.576
onnx: CaffeNet 12-int8 - CPU - Standard | 0.959531 | 1.047505
onnx: fcn-resnet101-11 - CPU - Standard | 9.44076 | 9.18611
onnx: fcn-resnet101-11 - CPU - Standard | 106.2133 | 109.005
openfoam: drivaerFastback, Small Mesh Size - Mesh Time | 22.944605 | 22.297072
openfoam: drivaerFastback, Small Mesh Size - Execution Time | 22.408568 | 22.648378
openfoam: drivaerFastback, Medium Mesh Size - Mesh Time | 106.31177 | 109.19543
openfoam: drivaerFastback, Medium Mesh Size - Execution Time | 192.22721 | 254.29552
openradioss: Chrysler Neon 1M | 90.77 | 129.17
openradioss: INIVOL and Fluid Structure Interaction Drop Container | 79.42 | 79.99
openvino: Person Detection FP16 - CPU | 689.50 | 449.23
openvino: Person Detection FP16 - CPU | 69.48 | 106.96
openvino: Face Detection FP16-INT8 - CPU | 145.10 | 145.11
openvino: Face Detection FP16-INT8 - CPU | 330.01 | 330.13
openvino: Vehicle Detection FP16-INT8 - CPU | 8240.40 | 8240.98
openvino: Vehicle Detection FP16-INT8 - CPU | 5.80 | 5.80
openvino: Face Detection Retail FP16-INT8 - CPU | 23427.50 | 23443.86
openvino: Face Detection Retail FP16-INT8 - CPU | 3.96 | 3.97
openvino: Road Segmentation ADAS FP16-INT8 - CPU | 2437.93 | 2291.09
openvino: Road Segmentation ADAS FP16-INT8 - CPU | 19.61 | 20.85
openvino: Machine Translation EN To DE FP16 - CPU | 844.93 | 740.13
openvino: Machine Translation EN To DE FP16 - CPU | 56.71 | 64.75
openvino: Weld Porosity Detection FP16-INT8 - CPU | 14006.15 | 14026.36
openvino: Weld Porosity Detection FP16-INT8 - CPU | 6.71 | 6.70
openvino: Person Vehicle Bike Detection FP16 - CPU | 6380.58 | 6329.60
openvino: Person Vehicle Bike Detection FP16 - CPU | 7.48 | 7.53
openvino: Noise Suppression Poconet-Like FP16 - CPU | 7934.40 | 7314.92
openvino: Noise Suppression Poconet-Like FP16 - CPU | 11.57 | 12.42
openvino: Person Re-Identification Retail FP16 - CPU | 10462.56 | 10448.92
openvino: Person Re-Identification Retail FP16 - CPU | 4.56 | 4.56
openvino: Handwritten English Recognition FP16-INT8 - CPU | 3551.35 | 3543.79
openvino: Handwritten English Recognition FP16-INT8 - CPU | 27.01 | 27.06
openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU | 166983.33 | 166363.91
openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU | 0.33 | 0.33
openvkl: vklBenchmarkCPU ISPC | 2825 | 2762
pgbench: 1000 - 800 - Read Only | 3155036 | 3149277
pgbench: 1000 - 800 - Read Only - Average Latency | 0.254 | 0.254
pgbench: 1000 - 800 - Read Write | 113745 | 112175
pgbench: 1000 - 800 - Read Write - Average Latency | 7.037 | 7.136
ramspeed: Add - Integer | 120526.51 | 99666.50
ramspeed: Copy - Integer | 139450.46 | 115580.47
ramspeed: Scale - Integer | 139548.85 | 117864.04
ramspeed: Triad - Integer | 122052.34 | 104337.42
ramspeed: Average - Integer | 130318.77 | 110436.65
rawtherapee: Total Benchmark Time | 35.854 | 36.228
specfem3d: Mount St. Helens | 5.778714613 | 5.855350933
specfem3d: Layered Halfspace | 13.700075229 | 13.861252099
specfem3d: Tomographic Model | 6.777958300 | 6.626369356
specfem3d: Homogeneous Halfspace | 8.194862498 | 8.198005837
specfem3d: Water-layered Halfspace | 14.275555136 | 14.355377904
srsran: PDSCH Processor Benchmark, Throughput Total | 25941.9 | 26209.9
srsran: PUSCH Processor Benchmark, Throughput Total | 6237.7 | 6238.1
stargate: 96000 - 512 | 5.943979 | 5.918595
stargate: 192000 - 512 | 3.932179 | 3.896311
stargate: 96000 - 1024 | 6.539322 | 6.517349
stargate: 192000 - 1024 | 4.429570 | 4.388974
stream: Copy | 343229.3 | 221234.4
stream: Scale | 314946.5 | 204677.8
stream: Triad | 348347.3 | 229604.0
stream: Add | 355858.4 | 233276.2
svt-av1: Preset 3 - Bosphorus 4K | 15.684 | 15.603
svt-av1: Preset 5 - Bosphorus 4K | 56.438 | 56.088
svt-av1: Preset 8 - Bosphorus 4K | 186.687 | 182.608
svt-av1: Preset 13 - Bosphorus 4K | 435.382 | 430.868
svt-av1: Preset 3 - Beauty 4K 10-bit | 1.852 | 1.846
svt-av1: Preset 5 - Beauty 4K 10-bit | 7.699 | 7.687
svt-av1: Preset 8 - Beauty 4K 10-bit | 13.554 | 13.512
svt-av1: Preset 13 - Beauty 4K 10-bit | 17.145 | 17.110
tensorflow: CPU - 512 - ResNet-50 | 241.55 | 196.47
build-godot: Time To Compile | 80.552 | 80.874
build-linux-kernel: defconfig | 22.224 | 22.555
build-linux-kernel: allmodconfig | 193.346 | 196.240
uvg266: Bosphorus 4K - Slow | 30.40 | 30.30
uvg266: Bosphorus 4K - Medium | 33.04 | 32.92
uvg266: Bosphorus 4K - Very Fast | 67.24 | 66.12
uvg266: Bosphorus 4K - Super Fast | 67.26 | 65.52
uvg266: Bosphorus 4K - Ultra Fast | 67.65 | 66.86
vvenc: Bosphorus 4K - Fast | 9.643 | 9.598
vvenc: Bosphorus 4K - Faster | 22.960 | 22.635
warpx: Uniform Plasma | 16.27457152 | 16.45373580
warpx: Plasma Acceleration | 25.85109772 | 25.76282272
whisperfile: Small | 91.05176 | 96.68799
whisperfile: Medium | 202.15909 | 215.52289
incompact3d: X3D-benchmarking input.i3d | 203.937052 | 292.578786
incompact3d: input.i3d 193 Cells Per Direction | 7.08492692 | 9.01416429
xnnpack: FP32MobileNetV1 | 4629 | 4596
xnnpack: FP32MobileNetV2 | 9242 | 9236
xnnpack: FP32MobileNetV3Large | 13643 | 14189
xnnpack: FP32MobileNetV3Small | 10418 | 10534
xnnpack: FP16MobileNetV1 | 4611 | 4570
xnnpack: FP16MobileNetV2 | 8982 | 9094
xnnpack: FP16MobileNetV3Large | 13074 | 13257
xnnpack: FP16MobileNetV3Small | 10559 | 10810
xnnpack: QS8MobileNetV2 | 10095 | 9968
7-Zip Compression

7-Zip Compression 24.05, Test: Compression Rating (MIPS, More Is Better)
  12c DDR5-6000: 779242 (SE +/- 879.82, N = 3)
  8c DDR5-6000: 727045 (SE +/- 1850.34, N = 3)
  1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC

7-Zip Compression 24.05, Test: Decompression Rating (MIPS, More Is Better)
  12c DDR5-6000: 627215 (SE +/- 169.10, N = 3)
  8c DDR5-6000: 626553 (SE +/- 340.29, N = 3)
  1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC
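Each result is reported with "SE +/-" and "N", the standard error and the number of runs. As a minimal sketch of how such a figure is commonly derived, the following assumes the standard error is the sample standard deviation divided by the square root of N; whether the Phoronix Test Suite uses exactly this estimator is an assumption here, and the run values are hypothetical rather than taken from these results.

```python
import math
import statistics


def standard_error(samples: list[float]) -> float:
    """Standard error of the mean: sample standard deviation over sqrt(N)."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical scores from three runs of one test (not from the results above).
runs = [10.0, 12.0, 14.0]
mean = statistics.mean(runs)
se = standard_error(runs)
print(f"{mean:.2f} (SE +/- {se:.2f}, N = {len(runs)})")
```

A difference between the two configurations that is smaller than the reported standard errors, as with the 7-Zip decompression rating, is best read as noise rather than a memory-channel effect.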
Apache HTTP Server
This is a test of the Apache HTTPD web server. This Apache HTTPD web server benchmark test profile makes use of the wrk program for facilitating the HTTP requests over a fixed period of time with a configurable number of concurrent clients. Learn more via the OpenBenchmarking.org test page.

Apache HTTP Server 2.4.56, Concurrent Requests: 500 (Requests Per Second, More Is Better)
  12c DDR5-6000: 170467.37 (SE +/- 413.18, N = 3)
  8c DDR5-6000: 159864.99 (SE +/- 177.80, N = 3)
  1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
ASTC Encoder

ASTC Encoder 5.0, Preset: Fast (MT/s, More Is Better)
  12c DDR5-6000: 1109.95 (SE +/- 14.18, N = 3)
  8c DDR5-6000: 1072.78 (SE +/- 7.37, N = 13)
  1. (CXX) g++ options: -O3 -flto -pthread

ASTC Encoder 5.0, Preset: Medium (MT/s, More Is Better)
  12c DDR5-6000: 568.83 (SE +/- 0.76, N = 3)
  8c DDR5-6000: 568.34 (SE +/- 0.83, N = 3)
  1. (CXX) g++ options: -O3 -flto -pthread

ASTC Encoder 5.0, Preset: Thorough (MT/s, More Is Better)
  12c DDR5-6000: 82.28 (SE +/- 0.02, N = 3)
  8c DDR5-6000: 82.19 (SE +/- 0.04, N = 3)
  1. (CXX) g++ options: -O3 -flto -pthread

ASTC Encoder 5.0, Preset: Exhaustive (MT/s, More Is Better)
  12c DDR5-6000: 7.1334 (SE +/- 0.0018, N = 3)
  8c DDR5-6000: 7.1339 (SE +/- 0.0004, N = 3)
  1. (CXX) g++ options: -O3 -flto -pthread

ASTC Encoder 5.0, Preset: Very Thorough (MT/s, More Is Better)
  12c DDR5-6000: 11.63 (SE +/- 0.00, N = 3)
  8c DDR5-6000: 11.63 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -flto -pthread
Blender

Blender 4.2, Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better)
  12c DDR5-6000: 13.14 (SE +/- 0.05, N = 3)
  8c DDR5-6000: 13.09 (SE +/- 0.05, N = 3)

Blender 4.2, Blend File: Junkshop - Compute: CPU-Only (Seconds, Fewer Is Better)
  12c DDR5-6000: 16.75 (SE +/- 0.01, N = 3)
  8c DDR5-6000: 16.80 (SE +/- 0.02, N = 3)

Blender 4.2, Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better)
  12c DDR5-6000: 35.43 (SE +/- 0.04, N = 3)
  8c DDR5-6000: 35.51 (SE +/- 0.08, N = 3)

Blender 4.2, Blend File: Fishy Cat - Compute: CPU-Only (Seconds, Fewer Is Better)
  12c DDR5-6000: 18.46 (SE +/- 0.05, N = 3)
  8c DDR5-6000: 18.55 (SE +/- 0.03, N = 3)

Blender 4.2, Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better)
  12c DDR5-6000: 120.83 (SE +/- 0.11, N = 3)
  8c DDR5-6000: 121.15 (SE +/- 0.12, N = 3)

Blender 4.2, Blend File: Pabellon Barcelona - Compute: CPU-Only (Seconds, Fewer Is Better)
  12c DDR5-6000: 41.25 (SE +/- 0.14, N = 3)
  8c DDR5-6000: 41.18 (SE +/- 0.15, N = 3)
Build2

Build2 0.17, Time To Compile (Seconds, Fewer Is Better)
  12c DDR5-6000: 53.44 (SE +/- 0.01, N = 3)
  8c DDR5-6000: 53.55 (SE +/- 0.16, N = 3)
Chaos Group V-RAY
This is a test of Chaos Group's V-RAY benchmark. V-RAY is a commercial renderer that can integrate with various creator software products like SketchUp and 3ds Max. The V-RAY benchmark is standalone and supports CPU and NVIDIA CUDA/RTX based rendering. Learn more via the OpenBenchmarking.org test page.

Chaos Group V-RAY 6.0, Mode: CPU (vsamples, More Is Better)
  12c DDR5-6000: 199769 (SE +/- 252.42, N = 3)
  8c DDR5-6000: 198145 (SE +/- 357.23, N = 3)
ClickHouse
ClickHouse is an open-source, high performance OLAP data management system. This test profile uses ClickHouse's standard benchmark recommendations per https://clickhouse.com/docs/en/operations/performance-test/ / https://github.com/ClickHouse/ClickBench/tree/main/clickhouse with the 100 million rows web analytics dataset. The reported value is the query processing time using the geometric mean of all separate queries performed as an aggregate. Learn more via the OpenBenchmarking.org test page.

ClickHouse 22.12.3.5, 100M Rows Hits Dataset, First Run / Cold Cache (Queries Per Minute, Geo Mean, More Is Better)
  12c DDR5-6000: 740.15 (SE +/- 3.26, N = 3; MIN: 69.2 / MAX: 7500)
  8c DDR5-6000: 637.67 (SE +/- 1.98, N = 3; MIN: 67.95 / MAX: 7500)

ClickHouse 22.12.3.5, 100M Rows Hits Dataset, Second Run (Queries Per Minute, Geo Mean, More Is Better)
  12c DDR5-6000: 771.05 (SE +/- 2.95, N = 3; MIN: 69.77 / MAX: 8571.43)
  8c DDR5-6000: 652.53 (SE +/- 0.96, N = 3; MIN: 69.28 / MAX: 7500)

ClickHouse 22.12.3.5, 100M Rows Hits Dataset, Third Run (Queries Per Minute, Geo Mean, More Is Better)
  12c DDR5-6000: 765.07 (SE +/- 4.74, N = 3; MIN: 69.77 / MAX: 8571.43)
  8c DDR5-6000: 651.48 (SE +/- 1.81, N = 3; MIN: 70.92 / MAX: 6666.67)
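The ClickHouse figures are geometric means over the individual queries. The sketch below illustrates why a geometric mean suits per-query rates that span orders of magnitude, as the MIN/MAX ranges above do; the three rates are hypothetical and are not the actual per-query ClickHouse data.

```python
import statistics

# Hypothetical per-query rates in queries per minute (illustrative values only).
rates = [60.0, 600.0, 6000.0]

arithmetic = statistics.mean(rates)
geometric = statistics.geometric_mean(rates)

# The arithmetic mean is pulled toward the fastest query, while the geometric
# mean weights each query's relative speed equally.
print(f"arithmetic mean: {arithmetic:.1f}, geometric mean: {geometric:.1f}")
```

With a geometric mean, making one slow query 10% faster moves the aggregate as much as making one fast query 10% faster, so a handful of very quick queries cannot dominate the score.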
Epoch

Epoch 4.19.4, Epoch3D Deck: Cone (Seconds, Fewer Is Better)
  12c DDR5-6000: 186.85 (SE +/- 0.55, N = 3)
  8c DDR5-6000: 196.03 (SE +/- 0.52, N = 3)
  1. (F9X) gfortran options: -O3 -std=f2003 -Jobj -lsdf -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
GPAW
GPAW is a density-functional theory (DFT) Python code based on the projector-augmented wave (PAW) method and the atomic simulation environment (ASE). Learn more via the OpenBenchmarking.org test page.

GPAW 23.6, Input: Carbon Nanotube (Seconds, Fewer Is Better)
  12c DDR5-6000: 25.66 (SE +/- 0.23, N = 3)
  8c DDR5-6000: 28.56 (SE +/- 0.24, N = 3)
  1. (CC) gcc options: -shared -fwrapv -O2 -lxc -lblas -lmpi
Graph500
This is a benchmark of the reference implementation of Graph500, an HPC benchmark focused on data intensive loads and commonly tested on supercomputers for complex data problems. Graph500 primarily stresses the communication subsystem of the hardware under test. Learn more via the OpenBenchmarking.org test page.

Graph500 3.0, Scale: 26 (bfs median_TEPS, More Is Better)
  12c DDR5-6000: 1478180000
  8c DDR5-6000: 1365800000
  1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

Graph500 3.0, Scale: 26 (bfs max_TEPS, More Is Better)
  12c DDR5-6000: 1505710000
  8c DDR5-6000: 1394450000
  1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

Graph500 3.0, Scale: 26 (sssp median_TEPS, More Is Better)
  12c DDR5-6000: 554936000
  8c DDR5-6000: 497798000
  1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

Graph500 3.0, Scale: 26 (sssp max_TEPS, More Is Better)
  12c DDR5-6000: 747138000
  8c DDR5-6000: 676709000
  1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
GROMACS
This tests the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

GROMACS 2024, Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better)
  12c DDR5-6000: 17.79 (SE +/- 0.02, N = 3)
  8c DDR5-6000: 16.65 (SE +/- 0.03, N = 3)
  1. (CXX) g++ options: -O3 -lm
John The Ripper 2023.03.14, Test: WPA PSK (Real C/S, More Is Better)
  12c DDR5-6000: 1023333 (SE +/- 2185.81, N = 3)
  8c DDR5-6000: 1019333 (SE +/- 1666.67, N = 3)
  1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2

John The Ripper 2023.03.14, Test: Blowfish (Real C/S, More Is Better)
  12c DDR5-6000: 237515 (SE +/- 16.90, N = 3)
  8c DDR5-6000: 237458 (SE +/- 16.64, N = 3)
  1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2

John The Ripper 2023.03.14, Test: HMAC-SHA512 (Real C/S, More Is Better)
  12c DDR5-6000: 413474333 (SE +/- 2811379.81, N = 3)
  8c DDR5-6000: 413326333 (SE +/- 211622.57, N = 3)
  1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2

John The Ripper 2023.03.14, Test: MD5 (Real C/S, More Is Better)
  12c DDR5-6000: 21681000 (SE +/- 20808.65, N = 3)
  8c DDR5-6000: 21408333 (SE +/- 32518.37, N = 3)
  1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt -lbz2
Kvazaar
This is a test of Kvazaar as a CPU-based H.265/HEVC video encoder written in the C programming language and optimized in Assembly. Kvazaar is the winner of the 2016 ACM Open-Source Software Competition and developed at the Ultra Video Group, Tampere University, Finland. Learn more via the OpenBenchmarking.org test page.

Kvazaar 2.2, Video Input: Bosphorus 4K - Video Preset: Slow (Frames Per Second, More Is Better)
  12c DDR5-6000: 44.79 (SE +/- 0.08, N = 3)
  8c DDR5-6000: 44.76 (SE +/- 0.11, N = 3)
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

Kvazaar 2.2, Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better)
  12c DDR5-6000: 45.30 (SE +/- 0.06, N = 3)
  8c DDR5-6000: 45.38 (SE +/- 0.09, N = 3)
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

Kvazaar 2.2, Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better)
  12c DDR5-6000: 90.46 (SE +/- 0.61, N = 3)
  8c DDR5-6000: 89.33 (SE +/- 0.20, N = 3)
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

Kvazaar 2.2, Video Input: Bosphorus 4K - Video Preset: Super Fast (Frames Per Second, More Is Better)
  12c DDR5-6000: 91.54 (SE +/- 0.52, N = 3)
  8c DDR5-6000: 93.40 (SE +/- 0.53, N = 3)
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

Kvazaar 2.2, Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, More Is Better)
  12c DDR5-6000: 96.07 (SE +/- 0.29, N = 3)
  8c DDR5-6000: 96.72 (SE +/- 0.36, N = 3)
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Liquid-DSP
LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.

Liquid-DSP 1.6, Threads: 64 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
  12c DDR5-6000: 3195900000 (SE +/- 1814754.35, N = 3)
  8c DDR5-6000: 3214133333 (SE +/- 3636084.59, N = 3)
  1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP 1.6, Threads: 128 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
  12c DDR5-6000: 4963433333 (SE +/- 8326730.72, N = 3)
  8c DDR5-6000: 5053200000 (SE +/- 8868107.65, N = 3)
  1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP 1.6, Threads: 192 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
  12c DDR5-6000: 5958600000 (SE +/- 9761830.43, N = 3)
  8c DDR5-6000: 5955433333 (SE +/- 8738103.02, N = 3)
  1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP 1.6, Threads: 64 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better)
  12c DDR5-6000: 1412966667 (SE +/- 1419311.26, N = 3)
  8c DDR5-6000: 1417633333 (SE +/- 2356079.61, N = 3)
  1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP 1.6, Threads: 128 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better)
  12c DDR5-6000: 1952100000 (SE +/- 6251666.44, N = 3)
  8c DDR5-6000: 1956333333 (SE +/- 1072898.46, N = 3)
  1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP 1.6, Threads: 192 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better)
  12c DDR5-6000: 2149200000 (SE +/- 1582192.57, N = 3)
  8c DDR5-6000: 2136766667 (SE +/- 821245.67, N = 3)
  1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
LiteRT 2024-10-15, Model: NASNet Mobile (Microseconds, Fewer Is Better)
  12c DDR5-6000: 99408.3 (SE +/- 1271.15, N = 3)
  8c DDR5-6000: 100519.4 (SE +/- 1096.54, N = 15)

LiteRT 2024-10-15, Model: Mobilenet Float (Microseconds, Fewer Is Better)
  12c DDR5-6000: 4399.48 (SE +/- 16.88, N = 3)
  8c DDR5-6000: 4441.27 (SE +/- 18.31, N = 3)

LiteRT 2024-10-15, Model: Mobilenet Quant (Microseconds, Fewer Is Better)
  12c DDR5-6000: 5800.65 (SE +/- 109.66, N = 15)
  8c DDR5-6000: 6260.21 (SE +/- 155.36, N = 15)

LiteRT 2024-10-15, Model: Inception ResNet V2 (Microseconds, Fewer Is Better)
  12c DDR5-6000: 58054.3 (SE +/- 598.16, N = 3)
  8c DDR5-6000: 59977.8 (SE +/- 811.71, N = 3)

LiteRT 2024-10-15, Model: Quantized COCO SSD MobileNet v1 (Microseconds, Fewer Is Better)
  12c DDR5-6000: 7211.99 (SE +/- 28.16, N = 3)
  8c DDR5-6000: 7110.12 (SE +/- 68.84, N = 15)
LuxCoreRender
LuxCoreRender is an open-source 3D physically based renderer formerly known as LuxRender. LuxCoreRender supports CPU-based rendering as well as GPU acceleration via OpenCL, NVIDIA CUDA, and NVIDIA OptiX interfaces. Learn more via the OpenBenchmarking.org test page.

LuxCoreRender 2.6, Scene: DLSC - Acceleration: CPU (M samples/sec, More Is Better)
  12c DDR5-6000: 16.49 (SE +/- 0.19, N = 3; MIN: 15.79 / MAX: 20.03)
  8c DDR5-6000: 17.01 (SE +/- 0.21, N = 15; MIN: 15.67 / MAX: 20.15)

LuxCoreRender 2.6, Scene: Danish Mood - Acceleration: CPU (M samples/sec, More Is Better)
  12c DDR5-6000: 12.03 (SE +/- 0.09, N = 15; MIN: 5.72 / MAX: 14.2)
  8c DDR5-6000: 12.06 (SE +/- 0.11, N = 15; MIN: 5.76 / MAX: 14.38)

LuxCoreRender 2.6, Scene: Orange Juice - Acceleration: CPU (M samples/sec, More Is Better)
  12c DDR5-6000: 24.19 (SE +/- 0.09, N = 3; MIN: 21.48 / MAX: 33.13)
  8c DDR5-6000: 24.15 (SE +/- 0.09, N = 3; MIN: 21.4 / MAX: 32.92)

LuxCoreRender 2.6, Scene: LuxCore Benchmark - Acceleration: CPU (M samples/sec, More Is Better)
  12c DDR5-6000: 12.39 (SE +/- 0.12, N = 15; MIN: 5.62 / MAX: 14.88)
  8c DDR5-6000: 12.92 (SE +/- 0.17, N = 3; MIN: 6.3 / MAX: 15.04)

LuxCoreRender 2.6, Scene: Rainbow Colors and Prism - Acceleration: CPU (M samples/sec, More Is Better)
  12c DDR5-6000: 28.23 (SE +/- 0.19, N = 3; MIN: 25.05 / MAX: 28.69)
  8c DDR5-6000: 27.85 (SE +/- 0.02, N = 3; MIN: 24.8 / MAX: 28.48)
MBW 2018-09-08 - Test: Memory Copy - Array Size: 8192 MiB (MiB/s, More Is Better): 12c DDR5-6000 = 25126.49 (SE +/- 31.99, N = 3); 8c DDR5-6000 = 23357.89 (SE +/- 19.02, N = 3)
MBW 2018-09-08 - Test: Memory Copy, Fixed Block Size - Array Size: 4096 MiB (MiB/s, More Is Better): 12c DDR5-6000 = 22520.67 (SE +/- 55.98, N = 3); 8c DDR5-6000 = 18332.41 (SE +/- 54.43, N = 3)
MBW 2018-09-08 - Test: Memory Copy, Fixed Block Size - Array Size: 8192 MiB (MiB/s, More Is Better): 12c DDR5-6000 = 22449.83 (SE +/- 64.32, N = 3); 8c DDR5-6000 = 18188.11 (SE +/- 60.40, N = 3)
Notes: (CC) gcc options: -O3 -march=native
Memcached Memcached is a high performance, distributed memory object caching system. This Memcached test profile makes use of memtier_benchmark for executing this CPU/memory-focused server benchmark. Learn more via the OpenBenchmarking.org test page.
Memcached 1.6.19 - Set To Get Ratio: 1:5 (Ops/sec, More Is Better): 12c DDR5-6000 = 3701189.23 (SE +/- 16589.87, N = 3); 8c DDR5-6000 = 3711825.75 (SE +/- 3725.22, N = 3)
Memcached 1.6.19 - Set To Get Ratio: 1:10 (Ops/sec, More Is Better): 12c DDR5-6000 = 6670118.94 (SE +/- 23004.60, N = 3); 8c DDR5-6000 = 6752055.12 (SE +/- 5728.08, N = 3)
Memcached 1.6.19 - Set To Get Ratio: 1:100 (Ops/sec, More Is Better): 12c DDR5-6000 = 11923788.43 (SE +/- 20796.91, N = 3); 8c DDR5-6000 = 12031156.03 (SE +/- 48609.95, N = 3)
Notes: (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
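Several results on this page differ by less than a few standard errors. The following is a minimal sketch of one rough way to gauge that: it expresses the gap between the two configurations in units of their combined standard error, using the Memcached figures above. The function name `z_score` and the dictionary layout are illustrative choices, not part of the Phoronix Test Suite, and with only N = 3 runs per result the SE values are themselves noisy, so this is a heuristic rather than a formal significance test.

```python
import math

def z_score(mean_a, se_a, mean_b, se_b):
    """Difference of two means in units of the combined standard error."""
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)

# (12c mean, 12c SE, 8c mean, 8c SE) in Ops/sec, copied from the Memcached results above
memcached = {
    "1:5": (3701189.23, 16589.87, 3711825.75, 3725.22),
    "1:10": (6670118.94, 23004.60, 6752055.12, 5728.08),
    "1:100": (11923788.43, 20796.91, 12031156.03, 48609.95),
}

for ratio, (m12, se12, m8, se8) in memcached.items():
    z = z_score(m12, se12, m8, se8)
    # Negative z means the 8-channel run scored higher on this More-Is-Better metric.
    print(f"Set To Get Ratio {ratio}: z = {z:+.2f}")
```

A magnitude well below about 2 suggests the two runs are hard to tell apart at this sample size; larger magnitudes are worth a second look but do not by themselves prove a real difference.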
miniBUDE MiniBUDE is a mini application for the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking. Learn more via the OpenBenchmarking.org test page.
miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM2 (GFInst/s, More Is Better): 12c DDR5-6000 = 7171.45 (SE +/- 56.00, N = 3); 8c DDR5-6000 = 7208.78 (SE +/- 12.59, N = 3)
miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM2 (Billion Interactions/s, More Is Better): 12c DDR5-6000 = 286.86 (SE +/- 2.24, N = 3); 8c DDR5-6000 = 288.35 (SE +/- 0.50, N = 3)
Notes: (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm
NAMD
NAMD 3.0 - Input: ATPase with 327,506 Atoms (ns/day, More Is Better): 12c DDR5-6000 = 12.29 (SE +/- 0.03, N = 3); 8c DDR5-6000 = 11.62 (SE +/- 0.03, N = 3)
NAMD 3.0 - Input: STMV with 1,066,628 Atoms (ns/day, More Is Better): 12c DDR5-6000 = 4.21717 (SE +/- 0.00356, N = 3); 8c DDR5-6000 = 3.55953 (SE +/- 0.00613, N = 3)
NAS Parallel Benchmarks NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB. This test profile offers selecting the different NPB tests/problems and varying problem sizes. Learn more via the OpenBenchmarking.org test page.
NAS Parallel Benchmarks 3.4 - Test / Class: BT.C (Total Mop/s, More Is Better): 12c DDR5-6000 = 361153.06 (SE +/- 2078.86, N = 3); 8c DDR5-6000 = 284718.53 (SE +/- 377.10, N = 3)
NAS Parallel Benchmarks 3.4 - Test / Class: CG.C (Total Mop/s, More Is Better): 12c DDR5-6000 = 65380.58 (SE +/- 928.22, N = 15); 8c DDR5-6000 = 60572.20 (SE +/- 631.87, N = 15)
NAS Parallel Benchmarks 3.4 - Test / Class: EP.C (Total Mop/s, More Is Better): 12c DDR5-6000 = 11230.81 (SE +/- 85.90, N = 3); 8c DDR5-6000 = 11375.83 (SE +/- 91.88, N = 3)
NAS Parallel Benchmarks 3.4 - Test / Class: EP.D (Total Mop/s, More Is Better): 12c DDR5-6000 = 13685.97 (SE +/- 614.47, N = 12); 8c DDR5-6000 = 14658.25 (SE +/- 133.00, N = 3)
NAS Parallel Benchmarks 3.4 - Test / Class: FT.C (Total Mop/s, More Is Better): 12c DDR5-6000 = 164903.45 (SE +/- 1877.63, N = 4); 8c DDR5-6000 = 120202.20 (SE +/- 1124.88, N = 3)
NAS Parallel Benchmarks 3.4 - Test / Class: IS.D (Total Mop/s, More Is Better): 12c DDR5-6000 = 6970.89 (SE +/- 26.78, N = 3); 8c DDR5-6000 = 5653.35 (SE +/- 31.49, N = 3)
NAS Parallel Benchmarks 3.4 - Test / Class: LU.C (Total Mop/s, More Is Better): 12c DDR5-6000 = 311883.80 (SE +/- 2348.10, N = 3); 8c DDR5-6000 = 275011.41 (SE +/- 1080.76, N = 3)
NAS Parallel Benchmarks 3.4 - Test / Class: MG.C (Total Mop/s, More Is Better): 12c DDR5-6000 = 147097.41 (SE +/- 2056.93, N = 3); 8c DDR5-6000 = 105206.58 (SE +/- 1082.06, N = 15)
NAS Parallel Benchmarks 3.4 - Test / Class: SP.B (Total Mop/s, More Is Better): 12c DDR5-6000 = 199255.89 (SE +/- 2749.03, N = 3); 8c DDR5-6000 = 169436.81 (SE +/- 1524.43, N = 15)
NAS Parallel Benchmarks 3.4 - Test / Class: SP.C (Total Mop/s, More Is Better): 12c DDR5-6000 = 154965.84 (SE +/- 622.44, N = 3); 8c DDR5-6000 = 108558.21 (SE +/- 459.55, N = 3)
Notes: (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz; Open MPI 4.1.6
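The NPB kernels respond very differently to the extra memory channels. As a small sketch, the script below ranks them by relative gain of the 12c configuration over the 8c one, using the Total Mop/s values listed above. The helper name `percent_gain` is an illustrative choice, and the script ignores the standard errors, so small gains or losses (for example on the EP tests) should not be over-read.

```python
# (12c, 8c) Total Mop/s, copied from the NAS Parallel Benchmarks results above
npb = {
    "BT.C": (361153.06, 284718.53),
    "CG.C": (65380.58, 60572.20),
    "EP.C": (11230.81, 11375.83),
    "EP.D": (13685.97, 14658.25),
    "FT.C": (164903.45, 120202.20),
    "IS.D": (6970.89, 5653.35),
    "LU.C": (311883.80, 275011.41),
    "MG.C": (147097.41, 105206.58),
    "SP.B": (199255.89, 169436.81),
    "SP.C": (154965.84, 108558.21),
}

def percent_gain(with_12c, with_8c):
    """Relative gain of the 12c result over the 8c result, for More-Is-Better metrics."""
    return (with_12c / with_8c - 1.0) * 100.0

# Print the tests from largest to smallest gain.
ranked = sorted(npb.items(), key=lambda item: percent_gain(*item[1]), reverse=True)
for test, (v12, v8) in ranked:
    print(f"{test}: {percent_gain(v12, v8):+.1f}%")
```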
nginx This is a benchmark of the lightweight Nginx HTTP(S) web-server. This Nginx web server benchmark test profile makes use of the wrk program for facilitating the HTTP requests over a fixed period of time with a configurable number of concurrent clients/connections. HTTPS with a self-signed OpenSSL certificate is used by this test for local benchmarking. Learn more via the OpenBenchmarking.org test page.
nginx 1.23.2 - Connections: 500 (Requests Per Second, More Is Better): 12c DDR5-6000 = 505484.86 (SE +/- 328.79, N = 3); 8c DDR5-6000 = 523766.78 (SE +/- 3145.55, N = 3)
Notes: (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
oneDNN
oneDNN 3.6 - Harness: IP Shapes 1D - Engine: CPU (ms, Fewer Is Better): 12c DDR5-6000 = 0.531984 (SE +/- 0.003116, N = 3; MIN: 0.48); 8c DDR5-6000 = 0.529052 (SE +/- 0.004111, N = 3; MIN: 0.47)
oneDNN 3.6 - Harness: IP Shapes 3D - Engine: CPU (ms, Fewer Is Better): 12c DDR5-6000 = 0.266791 (SE +/- 0.000721, N = 3; MIN: 0.25); 8c DDR5-6000 = 0.265558 (SE +/- 0.000649, N = 3; MIN: 0.25)
oneDNN 3.6 - Harness: Convolution Batch Shapes Auto - Engine: CPU (ms, Fewer Is Better): 12c DDR5-6000 = 0.343593 (SE +/- 0.000927, N = 3; MIN: 0.32); 8c DDR5-6000 = 0.341374 (SE +/- 0.000677, N = 3; MIN: 0.33)
oneDNN 3.6 - Harness: Deconvolution Batch shapes_1d - Engine: CPU (ms, Fewer Is Better): 12c DDR5-6000 = 6.76736 (SE +/- 0.02120, N = 3; MIN: 4.07); 8c DDR5-6000 = 6.84402 (SE +/- 0.01476, N = 3; MIN: 5.14)
oneDNN 3.6 - Harness: Deconvolution Batch shapes_3d - Engine: CPU (ms, Fewer Is Better): 12c DDR5-6000 = 0.716373 (SE +/- 0.002304, N = 3; MIN: 0.62); 8c DDR5-6000 = 0.715363 (SE +/- 0.001800, N = 3; MIN: 0.62)
oneDNN 3.6 - Harness: Recurrent Neural Network Training - Engine: CPU (ms, Fewer Is Better): 12c DDR5-6000 = 425.78 (SE +/- 0.20, N = 3; MIN: 420.18); 8c DDR5-6000 = 429.68 (SE +/- 0.28, N = 3; MIN: 423.83)
oneDNN 3.6 - Harness: Recurrent Neural Network Inference - Engine: CPU (ms, Fewer Is Better): 12c DDR5-6000 = 277.64 (SE +/- 0.81, N = 3; MIN: 269.94); 8c DDR5-6000 = 278.00 (SE +/- 0.67, N = 3; MIN: 271.73)
Notes: (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
OpenFOAM OpenFOAM is the leading free, open-source software for computational fluid dynamics (CFD). This test profile currently uses the drivaerFastback test case for analyzing automotive aerodynamics or alternatively the older motorBike input. Learn more via the OpenBenchmarking.org test page.
OpenFOAM 10 - Input: drivaerFastback, Small Mesh Size - Mesh Time (Seconds, Fewer Is Better): 12c DDR5-6000 = 22.94; 8c DDR5-6000 = 22.30
OpenFOAM 10 - Input: drivaerFastback, Small Mesh Size - Execution Time (Seconds, Fewer Is Better): 12c DDR5-6000 = 22.41; 8c DDR5-6000 = 22.65
OpenFOAM 10 - Input: drivaerFastback, Medium Mesh Size - Mesh Time (Seconds, Fewer Is Better): 12c DDR5-6000 = 106.31; 8c DDR5-6000 = 109.20
OpenFOAM 10 - Input: drivaerFastback, Medium Mesh Size - Execution Time (Seconds, Fewer Is Better): 12c DDR5-6000 = 192.23; 8c DDR5-6000 = 254.30
Notes: (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm
OpenRadioss OpenRadioss is an open-source AGPL-licensed finite element solver for dynamic event analysis. OpenRadioss is based on Altair Radioss and was open-sourced in 2022. This open-source finite element solver is benchmarked with various example models available from https://www.openradioss.org/models/ and https://github.com/OpenRadioss/ModelExchange/tree/main/Examples. This test is currently using a reference OpenRadioss binary build offered via GitHub. Learn more via the OpenBenchmarking.org test page.
OpenRadioss 2023.09.15 - Model: Chrysler Neon 1M (Seconds, Fewer Is Better): 12c DDR5-6000 = 90.77 (SE +/- 0.26, N = 3); 8c DDR5-6000 = 129.17 (SE +/- 0.94, N = 11)
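For Fewer-Is-Better timings such as the OpenFOAM and OpenRadioss results above, "percent less time" and "speedup" are different numbers and are easy to mix up. The sketch below computes both for the two largest gaps in this part of the results; the helper names are illustrative and not taken from any benchmarking tool.

```python
def time_saved_percent(t_12c, t_8c):
    """Percent less wall time for the 12c run relative to the 8c run."""
    return (1.0 - t_12c / t_8c) * 100.0

def speedup(t_12c, t_8c):
    """How many times faster the 12c run finished than the 8c run."""
    return t_8c / t_12c

# (12c seconds, 8c seconds), copied from the results above
timings = {
    "OpenFOAM drivaerFastback Medium, Execution Time": (192.23, 254.30),
    "OpenRadioss Chrysler Neon 1M": (90.77, 129.17),
}

for name, (t12, t8) in timings.items():
    print(f"{name}: {time_saved_percent(t12, t8):.1f}% less time, "
          f"{speedup(t12, t8):.2f}x speedup")
```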
OpenVINO This is a test of the Intel OpenVINO, a toolkit around neural networks, using its built-in benchmarking support and analyzing the throughput and latency for various models. Learn more via the OpenBenchmarking.org test page.
OpenVINO 2024.0 - Model: Person Detection FP16 - Device: CPU (FPS, More Is Better): 12c DDR5-6000 = 689.50 (SE +/- 1.81, N = 3); 8c DDR5-6000 = 449.23 (SE +/- 6.83, N = 15)
OpenVINO 2024.0 - Model: Person Detection FP16 - Device: CPU (ms, Fewer Is Better): 12c DDR5-6000 = 69.48 (SE +/- 0.19, N = 3; MIN: 34.01 / MAX: 134.55); 8c DDR5-6000 = 106.96 (SE +/- 1.46, N = 15; MIN: 37.41 / MAX: 380.85)
OpenVINO 2024.0 - Model: Face Detection FP16-INT8 - Device: CPU (FPS, More Is Better): 12c DDR5-6000 = 145.10 (SE +/- 0.26, N = 3); 8c DDR5-6000 = 145.11 (SE +/- 0.12, N = 3)
OpenVINO 2024.0 - Model: Face Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): 12c DDR5-6000 = 330.01 (SE +/- 0.62, N = 3; MIN: 242.42 / MAX: 353.46); 8c DDR5-6000 = 330.13 (SE +/- 0.26, N = 3; MIN: 257.44 / MAX: 357.54)
OpenVINO 2024.0 - Model: Vehicle Detection FP16-INT8 - Device: CPU (FPS, More Is Better): 12c DDR5-6000 = 8240.40 (SE +/- 10.65, N = 3); 8c DDR5-6000 = 8240.98 (SE +/- 5.48, N = 3)
OpenVINO 2024.0 - Model: Vehicle Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): 12c DDR5-6000 = 5.80 (SE +/- 0.01, N = 3; MIN: 2.41 / MAX: 27.54); 8c DDR5-6000 = 5.80 (SE +/- 0.00, N = 3; MIN: 2.26 / MAX: 26.37)
OpenVINO 2024.0 - Model: Face Detection Retail FP16-INT8 - Device: CPU (FPS, More Is Better): 12c DDR5-6000 = 23427.50 (SE +/- 29.34, N = 3); 8c DDR5-6000 = 23443.86 (SE +/- 11.10, N = 3)
OpenVINO 2024.0 - Model: Face Detection Retail FP16-INT8 - Device: CPU (ms, Fewer Is Better): 12c DDR5-6000 = 3.96 (SE +/- 0.01, N = 3; MIN: 1.68 / MAX: 20.25); 8c DDR5-6000 = 3.97 (SE +/- 0.00, N = 3; MIN: 1.61 / MAX: 20.64)
OpenVINO 2024.0 - Model: Road Segmentation ADAS FP16-INT8 - Device: CPU (FPS, More Is Better): 12c DDR5-6000 = 2437.93 (SE +/- 1.15, N = 3); 8c DDR5-6000 = 2291.09 (SE +/- 2.70, N = 3)
OpenVINO 2024.0 - Model: Road Segmentation ADAS FP16-INT8 - Device: CPU (ms, Fewer Is Better): 12c DDR5-6000 = 19.61 (SE +/- 0.01, N = 3; MIN: 10.37 / MAX: 41.25); 8c DDR5-6000 = 20.85 (SE +/- 0.02, N = 3; MIN: 10.51 / MAX: 40.51)
OpenVINO 2024.0 - Model: Machine Translation EN To DE FP16 - Device: CPU (FPS, More Is Better): 12c DDR5-6000 = 844.93 (SE +/- 1.30, N = 3); 8c DDR5-6000 = 740.13 (SE +/- 4.52, N = 3)
OpenVINO 2024.0 - Model: Machine Translation EN To DE FP16 - Device: CPU (ms, Fewer Is Better): 12c DDR5-6000 = 56.71 (SE +/- 0.09, N = 3; MIN: 34.97 / MAX: 96.61); 8c DDR5-6000 = 64.75 (SE +/- 0.40, N = 3; MIN: 28.81 / MAX: 99.11)
OpenVINO 2024.0 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (FPS, More Is Better): 12c DDR5-6000 = 14006.15 (SE +/- 11.52, N = 3); 8c DDR5-6000 = 14026.36 (SE +/- 1.42, N = 3)
OpenVINO 2024.0 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): 12c DDR5-6000 = 6.71 (SE +/- 0.01, N = 3; MIN: 2.58 / MAX: 23.11); 8c DDR5-6000 = 6.70 (SE +/- 0.00, N = 3; MIN: 2.45 / MAX: 24)
OpenVINO 2024.0 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (FPS, More Is Better): 12c DDR5-6000 = 6380.58 (SE +/- 12.38, N = 3); 8c DDR5-6000 = 6329.60 (SE +/- 17.72, N = 3)
OpenVINO 2024.0 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (ms, Fewer Is Better): 12c DDR5-6000 = 7.48 (SE +/- 0.01, N = 3; MIN: 4.42 / MAX: 23.35); 8c DDR5-6000 = 7.53 (SE +/- 0.02, N = 3; MIN: 4.35 / MAX: 22.92)
OpenVINO 2024.0 - Model: Noise Suppression Poconet-Like FP16 - Device: CPU (FPS, More Is Better): 12c DDR5-6000 = 7934.40 (SE +/- 20.16, N = 3); 8c DDR5-6000 = 7314.92 (SE +/- 10.57, N = 3)
OpenVINO 2024.0 - Model: Noise Suppression Poconet-Like FP16 - Device: CPU (ms, Fewer Is Better): 12c DDR5-6000 = 11.57 (SE +/- 0.03, N = 3; MIN: 6.95 / MAX: 30.79); 8c DDR5-6000 = 12.42 (SE +/- 0.02, N = 3; MIN: 6.11 / MAX: 28.55)
OpenVINO 2024.0 - Model: Person Re-Identification Retail FP16 - Device: CPU (FPS, More Is Better): 12c DDR5-6000 = 10462.56 (SE +/- 5.49, N = 3); 8c DDR5-6000 = 10448.92 (SE +/- 7.25, N = 3)
OpenVINO 2024.0 - Model: Person Re-Identification Retail FP16 - Device: CPU (ms, Fewer Is Better): 12c DDR5-6000 = 4.56 (SE +/- 0.00, N = 3; MIN: 2.45 / MAX: 18.95); 8c DDR5-6000 = 4.56 (SE +/- 0.01, N = 3; MIN: 2.23 / MAX: 20.98)
OpenVINO 2024.0 - Model: Handwritten English Recognition FP16-INT8 - Device: CPU (FPS, More Is Better): 12c DDR5-6000 = 3551.35 (SE +/- 4.91, N = 3); 8c DDR5-6000 = 3543.79 (SE +/- 1.42, N = 3)
OpenVINO 2024.0 - Model: Handwritten English Recognition FP16-INT8 - Device: CPU (ms, Fewer Is Better): 12c DDR5-6000 = 27.01 (SE +/- 0.04, N = 3; MIN: 16.35 / MAX: 47.96); 8c DDR5-6000 = 27.06 (SE +/- 0.01, N = 3; MIN: 16.36 / MAX: 45.56)
OpenVINO 2024.0 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (FPS, More Is Better): 12c DDR5-6000 = 166983.33 (SE +/- 251.97, N = 3); 8c DDR5-6000 = 166363.91 (SE +/- 234.32, N = 3)
OpenVINO 2024.0 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (ms, Fewer Is Better): 12c DDR5-6000 = 0.33 (SE +/- 0.00, N = 3; MIN: 0.12 / MAX: 23.43); 8c DDR5-6000 = 0.33 (SE +/- 0.00, N = 3; MIN: 0.13 / MAX: 24.63)
Notes: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVKL OpenVKL is the Intel Open Volume Kernel Library that offers high-performance volume computation kernels and is part of the Intel oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.
OpenVKL 2.0.0 - Benchmark: vklBenchmarkCPU ISPC (Items / Sec, More Is Better): 12c DDR5-6000 = 2825 (SE +/- 1.76, N = 3; MIN: 217 / MAX: 36373); 8c DDR5-6000 = 2762 (SE +/- 1.53, N = 3; MIN: 216 / MAX: 33882)
PostgreSQL
PostgreSQL 17 - Scaling Factor: 1000 - Clients: 800 - Mode: Read Only (TPS, More Is Better): 12c DDR5-6000 = 3155036 (SE +/- 1685.88, N = 3); 8c DDR5-6000 = 3149277 (SE +/- 8342.05, N = 3)
PostgreSQL 17 - Scaling Factor: 1000 - Clients: 800 - Mode: Read Only - Average Latency (ms, Fewer Is Better): 12c DDR5-6000 = 0.254 (SE +/- 0.000, N = 3); 8c DDR5-6000 = 0.254 (SE +/- 0.001, N = 3)
PostgreSQL 17 - Scaling Factor: 1000 - Clients: 800 - Mode: Read Write (TPS, More Is Better): 12c DDR5-6000 = 113745 (SE +/- 1257.14, N = 5); 8c DDR5-6000 = 112175 (SE +/- 836.14, N = 11)
PostgreSQL 17 - Scaling Factor: 1000 - Clients: 800 - Mode: Read Write - Average Latency (ms, Fewer Is Better): 12c DDR5-6000 = 7.037 (SE +/- 0.079, N = 5); 8c DDR5-6000 = 7.136 (SE +/- 0.053, N = 11)
Notes: (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpq -lpgcommon -lpgport -lm
RAMspeed SMP 3.5.0 - Type: Copy - Benchmark: Integer (MB/s, More Is Better): 12c DDR5-6000 = 139450.46 (SE +/- 266.61, N = 3); 8c DDR5-6000 = 115580.47 (SE +/- 779.65, N = 3)
RAMspeed SMP 3.5.0 - Type: Scale - Benchmark: Integer (MB/s, More Is Better): 12c DDR5-6000 = 139548.85 (SE +/- 283.77, N = 3); 8c DDR5-6000 = 117864.04 (SE +/- 291.38, N = 3)
RAMspeed SMP 3.5.0 - Type: Triad - Benchmark: Integer (MB/s, More Is Better): 12c DDR5-6000 = 122052.34 (SE +/- 253.90, N = 3); 8c DDR5-6000 = 104337.42 (SE +/- 795.56, N = 3)
RAMspeed SMP 3.5.0 - Type: Average - Benchmark: Integer (MB/s, More Is Better): 12c DDR5-6000 = 130318.77 (SE +/- 252.96, N = 3); 8c DDR5-6000 = 110436.65 (SE +/- 807.29, N = 3)
Notes: (CC) gcc options: -O3 -march=native
SPECFEM3D simulates acoustic (fluid), elastic (solid), coupled acoustic/elastic, poroelastic or seismic wave propagation in any type of conforming mesh of hexahedra. This test profile currently relies on CPU-based execution for SPECFEM3D and using a variety of their built-in examples/models for benchmarking. Learn more via the OpenBenchmarking.org test page.
SPECFEM3D 4.1.1 - Model: Mount St. Helens (Seconds, Fewer Is Better): 12c DDR5-6000 = 5.778714613 (SE +/- 0.055041694, N = 6); 8c DDR5-6000 = 5.855350933 (SE +/- 0.034598254, N = 3)
SPECFEM3D 4.1.1 - Model: Layered Halfspace (Seconds, Fewer Is Better): 12c DDR5-6000 = 13.70 (SE +/- 0.07, N = 3); 8c DDR5-6000 = 13.86 (SE +/- 0.11, N = 3)
SPECFEM3D 4.1.1 - Model: Tomographic Model (Seconds, Fewer Is Better): 12c DDR5-6000 = 6.777958300 (SE +/- 0.086144824, N = 3); 8c DDR5-6000 = 6.626369356 (SE +/- 0.080343793, N = 3)
SPECFEM3D 4.1.1 - Model: Homogeneous Halfspace (Seconds, Fewer Is Better): 12c DDR5-6000 = 8.194862498 (SE +/- 0.106640607, N = 3); 8c DDR5-6000 = 8.198005837 (SE +/- 0.043840287, N = 3)
SPECFEM3D 4.1.1 - Model: Water-layered Halfspace (Seconds, Fewer Is Better): 12c DDR5-6000 = 14.28 (SE +/- 0.08, N = 3); 8c DDR5-6000 = 14.36 (SE +/- 0.07, N = 3)
Notes: (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
srsRAN Project srsRAN Project is a complete ORAN-native 5G RAN solution created by Software Radio Systems (SRS). The srsRAN Project radio suite was formerly known as srsLTE and can be used for building your own software-defined radio (SDR) 4G/5G mobile network. Learn more via the OpenBenchmarking.org test page.
srsRAN Project 23.10.1-20240325 - Test: PDSCH Processor Benchmark, Throughput Total (Mbps, More Is Better): 12c DDR5-6000 = 25941.9 (SE +/- 311.92, N = 3); 8c DDR5-6000 = 26209.9 (SE +/- 185.11, N = 12)
srsRAN Project 23.10.1-20240325 - Test: PUSCH Processor Benchmark, Throughput Total (Mbps, More Is Better): 12c DDR5-6000 = 6237.7 (SE +/- 0.38, N = 3; MIN: 3818.9 / MAX: 6238.4); 8c DDR5-6000 = 6238.1 (SE +/- 0.12, N = 3; MIN: 3819.5 / MAX: 6238.3)
Notes: (CXX) g++ options: -O3 -march=native -mavx2 -mavx -msse4.1 -mfma -mavx512f -mavx512cd -mavx512bw -mavx512dq -fno-trapping-math -fno-math-errno -ldl
Stargate Digital Audio Workstation Stargate is an open-source, cross-platform digital audio workstation (DAW) software package with "a unique and carefully curated experience" with scalability from old systems up through modern multi-core systems. Stargate is GPLv3 licensed and makes use of Qt5 (PyQt5) for its user-interface. Learn more via the OpenBenchmarking.org test page.
Stargate Digital Audio Workstation 22.11.5 - Sample Rate: 96000 - Buffer Size: 512 (Render Ratio, More Is Better): 12c DDR5-6000 = 5.943979 (SE +/- 0.018441, N = 3); 8c DDR5-6000 = 5.918595 (SE +/- 0.033577, N = 3)
Stargate Digital Audio Workstation 22.11.5 - Sample Rate: 192000 - Buffer Size: 512 (Render Ratio, More Is Better): 12c DDR5-6000 = 3.932179 (SE +/- 0.003414, N = 3); 8c DDR5-6000 = 3.896311 (SE +/- 0.040462, N = 3)
Stargate Digital Audio Workstation 22.11.5 - Sample Rate: 96000 - Buffer Size: 1024 (Render Ratio, More Is Better): 12c DDR5-6000 = 6.539322 (SE +/- 0.024623, N = 3); 8c DDR5-6000 = 6.517349 (SE +/- 0.002866, N = 3)
Stargate Digital Audio Workstation 22.11.5 - Sample Rate: 192000 - Buffer Size: 1024 (Render Ratio, More Is Better): 12c DDR5-6000 = 4.429570 (SE +/- 0.018753, N = 3); 8c DDR5-6000 = 4.388974 (SE +/- 0.014254, N = 3)
Notes: (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions
Stream 2013-01-17 - Type: Scale (MB/s, More Is Better): 12c DDR5-6000 = 314946.5 (SE +/- 478.12, N = 5); 8c DDR5-6000 = 204677.8 (SE +/- 630.12, N = 5)
Stream 2013-01-17 - Type: Triad (MB/s, More Is Better): 12c DDR5-6000 = 348347.3 (SE +/- 1895.03, N = 5); 8c DDR5-6000 = 229604.0 (SE +/- 936.98, N = 5)
Stream 2013-01-17 - Type: Add (MB/s, More Is Better): 12c DDR5-6000 = 355858.4 (SE +/- 1639.97, N = 5); 8c DDR5-6000 = 233276.2 (SE +/- 1797.63, N = 5)
Notes: (CC) gcc options: -mcmodel=medium -O3 -march=native -fopenmp
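To put the Stream numbers in context, the sketch below compares them with a theoretical peak for each configuration. It rests on several assumptions that the results page does not state: that "12c" and "8c" mean 12 and 8 populated memory channels, that each DDR5-6000 channel moves 8 bytes of data per transfer (a 64-bit data path, ECC bits excluded), and that Stream's MB/s figures are decimal megabytes. The constant and function names are illustrative.

```python
MT_PER_S = 6000e6        # DDR5-6000: 6000 mega-transfers per second (assumed per channel)
BYTES_PER_TRANSFER = 8   # assumption: 64-bit data path per channel, ECC bits excluded

def peak_mb_per_s(channels):
    """Theoretical peak bandwidth in decimal MB/s for the given channel count."""
    return channels * MT_PER_S * BYTES_PER_TRANSFER / 1e6

# (12c, 8c) MB/s, copied from the Stream results above
stream = {
    "Scale": (314946.5, 204677.8),
    "Triad": (348347.3, 229604.0),
    "Add": (355858.4, 233276.2),
}

for kernel, (v12, v8) in stream.items():
    # Fraction of the assumed theoretical peak that each configuration reached.
    print(f"{kernel}: 12c {v12 / peak_mb_per_s(12):.0%} of peak, "
          f"8c {v8 / peak_mb_per_s(8):.0%} of peak")
```

Under these assumptions the peaks are 576000 MB/s for 12 channels and 384000 MB/s for 8 channels, so the ratio of measured to peak bandwidth can be compared directly between the two configurations.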
SVT-AV1
SVT-AV1 2.3 - Encoder Mode: Preset 3 - Input: Bosphorus 4K (Frames Per Second, More Is Better): 12c DDR5-6000 = 15.68 (SE +/- 0.03, N = 3); 8c DDR5-6000 = 15.60 (SE +/- 0.00, N = 3)
SVT-AV1 2.3 - Encoder Mode: Preset 5 - Input: Bosphorus 4K (Frames Per Second, More Is Better): 12c DDR5-6000 = 56.44 (SE +/- 0.28, N = 3); 8c DDR5-6000 = 56.09 (SE +/- 0.10, N = 3)
SVT-AV1 2.3 - Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better): 12c DDR5-6000 = 186.69 (SE +/- 1.50, N = 3); 8c DDR5-6000 = 182.61 (SE +/- 0.19, N = 3)
SVT-AV1 2.3 - Encoder Mode: Preset 13 - Input: Bosphorus 4K (Frames Per Second, More Is Better): 12c DDR5-6000 = 435.38 (SE +/- 1.17, N = 3); 8c DDR5-6000 = 430.87 (SE +/- 0.57, N = 3)
SVT-AV1 2.3 - Encoder Mode: Preset 3 - Input: Beauty 4K 10-bit (Frames Per Second, More Is Better): 12c DDR5-6000 = 1.852 (SE +/- 0.007, N = 3); 8c DDR5-6000 = 1.846 (SE +/- 0.002, N = 3)
SVT-AV1 2.3 - Encoder Mode: Preset 5 - Input: Beauty 4K 10-bit (Frames Per Second, More Is Better): 12c DDR5-6000 = 7.699 (SE +/- 0.039, N = 3); 8c DDR5-6000 = 7.687 (SE +/- 0.026, N = 3)
SVT-AV1 2.3 - Encoder Mode: Preset 8 - Input: Beauty 4K 10-bit (Frames Per Second, More Is Better): 12c DDR5-6000 = 13.55 (SE +/- 0.00, N = 3); 8c DDR5-6000 = 13.51 (SE +/- 0.02, N = 3)
SVT-AV1 2.3 - Encoder Mode: Preset 13 - Input: Beauty 4K 10-bit (Frames Per Second, More Is Better): 12c DDR5-6000 = 17.15 (SE +/- 0.01, N = 3); 8c DDR5-6000 = 17.11 (SE +/- 0.00, N = 3)
Notes: (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.16.1 - Device: CPU - Batch Size: 512 - Model: ResNet-50 (images/sec, More Is Better)
12c DDR5-6000: 241.55 (SE +/- 0.11, N = 3)
8c DDR5-6000: 196.47 (SE +/- 0.27, N = 3)
uvg266

uvg266 0.8.0 - Video Input: Bosphorus 4K - Video Preset: Slow (Frames Per Second, More Is Better)
12c DDR5-6000: 30.40 (SE +/- 0.01, N = 3)
8c DDR5-6000: 30.30 (SE +/- 0.04, N = 3)

uvg266 0.8.0 - Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better)
12c DDR5-6000: 33.04 (SE +/- 0.06, N = 3)
8c DDR5-6000: 32.92 (SE +/- 0.02, N = 3)

uvg266 0.8.0 - Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better)
12c DDR5-6000: 67.24 (SE +/- 0.21, N = 3)
8c DDR5-6000: 66.12 (SE +/- 0.33, N = 3)

uvg266 0.8.0 - Video Input: Bosphorus 4K - Video Preset: Super Fast (Frames Per Second, More Is Better)
12c DDR5-6000: 67.26 (SE +/- 0.11, N = 3)
8c DDR5-6000: 65.52 (SE +/- 0.32, N = 3)

uvg266 0.8.0 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, More Is Better)
12c DDR5-6000: 67.65 (SE +/- 0.18, N = 3)
8c DDR5-6000: 66.86 (SE +/- 0.79, N = 3)
VVenC

VVenC is the Fraunhofer Versatile Video Encoder as a fast/efficient H.266/VVC encoder. The vvenc encoder makes use of SIMD Everywhere (SIMDe). The vvenc software is published under the Clear BSD License. Learn more via the OpenBenchmarking.org test page.

VVenC 1.11 - Video Input: Bosphorus 4K - Video Preset: Fast (Frames Per Second, More Is Better)
12c DDR5-6000: 9.643 (SE +/- 0.082, N = 3)
8c DDR5-6000: 9.598 (SE +/- 0.064, N = 3)

VVenC 1.11 - Video Input: Bosphorus 4K - Video Preset: Faster (Frames Per Second, More Is Better)
12c DDR5-6000: 22.96 (SE +/- 0.06, N = 3)
8c DDR5-6000: 22.64 (SE +/- 0.02, N = 3)

1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
WarpX

WarpX 24.10 - Input: Uniform Plasma (Seconds, Fewer Is Better)
12c DDR5-6000: 16.27 (SE +/- 0.23, N = 15)
8c DDR5-6000: 16.45 (SE +/- 0.24, N = 12)

WarpX 24.10 - Input: Plasma Acceleration (Seconds, Fewer Is Better)
12c DDR5-6000: 25.85 (SE +/- 0.22, N = 3)
8c DDR5-6000: 25.76 (SE +/- 0.17, N = 3)

1. (CXX) g++ options: -O3 -lm
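Where the two configurations land this close together, one informal way to judge whether the gap exceeds run-to-run noise is to express the difference in units of the combined standard error. The sketch below applies that to the two WarpX results. The helper name `difference_in_se_units` and the threshold of 2 are assumptions for illustration, and with so few runs this is only a rule of thumb, not a formal significance test.

```python
import math


def difference_in_se_units(a: float, se_a: float, b: float, se_b: float) -> float:
    """Return |a - b| divided by the combined standard error sqrt(se_a^2 + se_b^2)."""
    return abs(a - b) / math.sqrt(se_a ** 2 + se_b ** 2)


# (12c mean, 12c SE, 8c mean, 8c SE) in seconds, copied from the WarpX charts.
warpx = {
    "Uniform Plasma": (16.27, 0.23, 16.45, 0.24),
    "Plasma Acceleration": (25.85, 0.22, 25.76, 0.17),
}

THRESHOLD = 2.0  # assumed rule-of-thumb cutoff, not part of the benchmark output

if __name__ == "__main__":
    for name, (m12, se12, m8, se8) in warpx.items():
        score = difference_in_se_units(m12, se12, m8, se8)
        verdict = "likely a real difference" if score > THRESHOLD else "within noise"
        print(f"{name}: {score:.2f} combined SEs apart ({verdict})")
```

Both WarpX inputs come out well under one combined standard error apart, which is consistent with reading these two results as a tie between the 12-channel and 8-channel configurations.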
Xcompact3d Incompact3d

Xcompact3d Incompact3d is a Fortran-MPI based, finite difference high-performance code for solving the incompressible Navier-Stokes equation and as many scalar transport equations as needed. Learn more via the OpenBenchmarking.org test page.

Xcompact3d Incompact3d 2021-03-11 - Input: X3D-benchmarking input.i3d (Seconds, Fewer Is Better)
12c DDR5-6000: 203.94 (SE +/- 2.76, N = 3)
8c DDR5-6000: 292.58 (SE +/- 3.79, N = 3)

Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 193 Cells Per Direction (Seconds, Fewer Is Better)
12c DDR5-6000: 7.08492692 (SE +/- 0.06792326, N = 3)
8c DDR5-6000: 9.01416429 (SE +/- 0.05824182, N = 15)

1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
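For "Fewer Is Better" timings such as these, the speedup is the 8-channel time divided by the 12-channel time. The sketch below computes it for both Xcompact3d inputs and relates it to the 12/8 = 1.5 channel ratio. Treating 1.5x as the ceiling assumes a workload limited purely by memory bandwidth, which is an illustrative assumption rather than anything measured here; `speedup` is a hypothetical helper name.

```python
CHANNEL_RATIO = 12 / 8  # populated memory channels: 12c vs 8c configuration


def speedup(time_8c: float, time_12c: float) -> float:
    """Speedup of the 12-channel run over the 8-channel run for a time-based result."""
    return time_8c / time_12c


# Run times in seconds, copied from the Xcompact3d Incompact3d charts.
xcompact3d = {
    "X3D-benchmarking input.i3d": {"12c": 203.94, "8c": 292.58},
    "input.i3d 193 Cells Per Direction": {"12c": 7.08492692, "8c": 9.01416429},
}

if __name__ == "__main__":
    for name, t in xcompact3d.items():
        s = speedup(t["8c"], t["12c"])
        # Share of the extra 0.5x that would be expected if bandwidth were the only limit.
        realized = (s - 1) / (CHANNEL_RATIO - 1)
        print(f"{name}: {s:.2f}x faster with 12 channels "
              f"({realized:.0%} of the assumed bandwidth-bound gain)")
```

Under that assumption, the larger X3D-benchmarking input captures most of the available gain while the smaller 193-cell input captures noticeably less.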
XNNPACK

XNNPACK b7b048 - Model: FP32MobileNetV1 (us, Fewer Is Better)
12c DDR5-6000: 4629 (SE +/- 57.47, N = 3)
8c DDR5-6000: 4596 (SE +/- 41.29, N = 3)

XNNPACK b7b048 - Model: FP32MobileNetV2 (us, Fewer Is Better)
12c DDR5-6000: 9242 (SE +/- 73.61, N = 3)
8c DDR5-6000: 9236 (SE +/- 46.71, N = 3)

XNNPACK b7b048 - Model: FP32MobileNetV3Large (us, Fewer Is Better)
12c DDR5-6000: 13643 (SE +/- 63.76, N = 3)
8c DDR5-6000: 14189 (SE +/- 140.55, N = 3)

XNNPACK b7b048 - Model: FP32MobileNetV3Small (us, Fewer Is Better)
12c DDR5-6000: 10418 (SE +/- 54.11, N = 3)
8c DDR5-6000: 10534 (SE +/- 8.95, N = 3)

XNNPACK b7b048 - Model: FP16MobileNetV1 (us, Fewer Is Better)
12c DDR5-6000: 4611 (SE +/- 43.59, N = 3)
8c DDR5-6000: 4570 (SE +/- 11.46, N = 3)

XNNPACK b7b048 - Model: FP16MobileNetV2 (us, Fewer Is Better)
12c DDR5-6000: 8982 (SE +/- 96.09, N = 3)
8c DDR5-6000: 9094 (SE +/- 49.08, N = 3)

XNNPACK b7b048 - Model: FP16MobileNetV3Large (us, Fewer Is Better)
12c DDR5-6000: 13074 (SE +/- 58.09, N = 3)
8c DDR5-6000: 13257 (SE +/- 75.41, N = 3)

XNNPACK b7b048 - Model: FP16MobileNetV3Small (us, Fewer Is Better)
12c DDR5-6000: 10559 (SE +/- 181.63, N = 3)
8c DDR5-6000: 10810 (SE +/- 162.45, N = 3)

XNNPACK b7b048 - Model: QS8MobileNetV2 (us, Fewer Is Better)
12c DDR5-6000: 10095 (SE +/- 60.71, N = 3)
8c DDR5-6000: 9968 (SE +/- 103.51, N = 3)

1. (CXX) g++ options: -O3 -lrt -lm
12c DDR5-6000 Processor: AMD EPYC 9655P 96-Core @ 2.60GHz (96 Cores / 192 Threads), Motherboard: Supermicro Super Server H13SSL-N v1.01 (3.0 BIOS), Chipset: AMD 1Ah, Memory: 12 x 64GB DDR5-6000MT/s Micron MTC40F2046S1RC64BDY QSFF, Disk: 3201GB Micron_7450_MTFDKCB3T2TFS, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 PCIe
Testing initiated at 15 November 2024 11:35 by user phoronix.
8c DDR5-6000 Processor: AMD EPYC 9655P 96-Core @ 2.60GHz (96 Cores / 192 Threads), Motherboard: Supermicro Super Server H13SSL-N v1.01 (3.0 BIOS), Chipset: AMD 1Ah, Memory: 8 x 64GB DDR5-6000MT/s, Disk: 3201GB Micron_7450_MTFDKCB3T2TFS, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 24.10, Kernel: 6.12.0-rc7-phx (x86_64), Desktop: GNOME Shell 47.0, Display Server: X Server, Compiler: GCC 14.2.0, File-System: ext4, Screen Resolution: 1024x768
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xb002116
Java Notes: OpenJDK Runtime Environment (build 21.0.5+11-Ubuntu-1ubuntu124.10)
Python Notes: Python 3.12.7
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 15 November 2024 21:45 by user phoronix.