AMD EPYC Turin AVX-512 Comparison

AMD EPYC 9755 AVX-512 comparison by Michael Larabel for a future article.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2410104-NE-TURINAVX566

Run Management

Result Identifier | Date Run | Test Duration
AVX-512 Off | September 29 | 5 Hours, 55 Minutes
AVX-512 256b DP | September 28 | 6 Hours, 28 Minutes
AVX-512 512b DP | September 30 | 6 Hours, 26 Minutes
Average | | 6 Hours, 16 Minutes



AMD EPYC Turin AVX-512 Comparison (OpenBenchmarking.org, Phoronix Test Suite)

Processor: AMD EPYC 9755 128-Core @ 2.70GHz (128 Cores / 256 Threads)
Motherboard: AMD VOLCANO (RVOT1000D BIOS)
Chipset: AMD Device 153a
Memory: 12 x 64GB DDR5-6000MT/s Samsung M321R8GA0PB1-CCPKC
Disk: 2 x 1920GB KIOXIA KCD8XPUG1T92
Graphics: ASPEED
Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 24.04
Kernel: 6.10.0-phx (x86_64)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 1920x1200

System Logs

- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-OiuXZC/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-OiuXZC/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: acpi-cpufreq performance (Boost: Enabled)
- CPU Microcode: 0xb002110
- Python 3.12.2
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

[Chart: Result Overview (Phoronix Test Suite). Relative results for AVX-512 Off, AVX-512 256b DP, and AVX-512 512b DP on an axis from 100% to 249%. Tests shown: oneDNN, NAMD, miniBUDE, OpenVINO, OSPRay, TensorFlow, simdjson, GROMACS, ONNX Runtime, Y-Cruncher, Mobile Neural Network, Xmrig, OSPRay Studio, OpenVKL, PyTorch, libxsmm, SVT-AV1, Embree, Numpy Benchmark, SMHasher, OpenFOAM.]
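
The overview chart expresses each run relative to a baseline. A minimal sketch of that kind of normalization follows, assuming "AVX-512 Off" is the baseline and that ratios are combined with a geometric mean. The three sample rows are copied from the results on this page; the names `normalize`, `geometric_mean`, and `overview` are illustrative and this is not the Phoronix Test Suite's own code.

```python
import math

# Results copied from this page (all "more is better").
# Column order: AVX-512 Off, AVX-512 256b DP, AVX-512 512b DP.
RUNS = ("AVX-512 Off", "AVX-512 256b DP", "AVX-512 512b DP")
RESULTS = {
    "pytorch: CPU - 256 - ResNet-50": (38.38, 43.55, 43.85),
    "minibude: OpenMP - BM1": (195.312, 328.870, 395.370),
    "simdjson: Kostya": (4.69, 5.66, 5.90),
}


def normalize(values, baseline_index=0):
    """Divide every value by the baseline run's value."""
    base = values[baseline_index]
    return tuple(v / base for v in values)


def geometric_mean(xs):
    """Geometric mean computed via logarithms."""
    xs = list(xs)
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


def overview(results):
    """Geometric mean of baseline-normalized results, per run."""
    rows = [normalize(v) for v in results.values()]
    return {
        run: geometric_mean(row[i] for row in rows)
        for i, run in enumerate(RUNS)
    }


if __name__ == "__main__":
    for run, gm in overview(RESULTS).items():
        print(f"{run}: {gm * 100:.0f}%")
```

A geometric mean is used here because it treats a 2x gain on one test and a 2x loss on another as cancelling out, which an arithmetic mean of ratios does not.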

[Chart: Per Watt Result Overview (Phoronix Test Suite). Relative performance-per-watt for AVX-512 Off, AVX-512 256b DP, and AVX-512 512b DP on an axis from 100% to 215%. Tests shown: miniBUDE, NAMD, TensorFlow, OSPRay, GROMACS, libxsmm, simdjson, PyTorch, OpenVKL, Xmrig, Embree, SVT-AV1, Numpy Benchmark.]

AMD EPYC Turin AVX-512 Comparison (OpenBenchmarking.org)

Rows appear in the order of the original table; some test names repeat because the same test reports more than one metric.

Test | AVX-512 Off | AVX-512 256b DP | AVX-512 512b DP
pytorch: CPU - 256 - ResNet-50 | 38.38 | 43.55 | 43.85
pytorch: CPU - 512 - ResNet-50 | 38.82 | 43.50 | 43.39
pytorch: CPU - 1 - ResNet-50 | 45.05 | 51.63 | 52.33
minibude: OpenMP - BM1 | 195.312 | 328.870 | 395.370
minibude: OpenMP - BM2 | 191.605 | 316.656 | 387.050
openvino: Face Detection FP16-INT8 - CPU | 117.14 | 137.51 | 194.04
openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU | 165052.36 | 175156.04 | 192118.56
openvino: Person Detection FP16 - CPU | 382.43 | 710.98 | 766.99
openvino: Weld Porosity Detection FP16-INT8 - CPU | 11702.44 | 13579.90 | 18690.32
openvino: Vehicle Detection FP16-INT8 - CPU | 6189.69 | 8450.68 | 11239.49
openvino: Person Vehicle Bike Detection FP16 - CPU | 4150.45 | 8024.13 | 8755.75
openvino: Machine Translation EN To DE FP16 - CPU | 398.57 | 866.83 | 1111.69
openvino: Face Detection Retail FP16-INT8 - CPU | 19828.79 | 24619.04 | 31620.58
openvino: Handwritten English Recognition FP16-INT8 - CPU | 1861.72 | 4203.21 | 4780.24
openvino: Road Segmentation ADAS FP16-INT8 - CPU | 2626.83 | 2855.34 | 3297.85
openvino: Person Re-Identification Retail FP16 - CPU | 4727.35 | 10058.84 | 13925.76
openvino: Noise Suppression Poconet-Like FP16 - CPU | 3010.63 | 10112.34 | 10450.98
embree: Pathtracer ISPC - Asian Dragon | 205.9535 | 221.3817 | 223.1431
embree: Pathtracer ISPC - Asian Dragon Obj | 178.4839 | 190.1200 | 191.6390
embree: Pathtracer ISPC - Crown | 167.0048 | 178.1347 | 179.6822
svt-av1: Preset 13 - Bosphorus 4K | 378.434 | 388.542 | 400.898
svt-av1: Preset 8 - Bosphorus 4K | 101.844 | 108.124 | 114.980
svt-av1: Preset 5 - Bosphorus 4K | 46.245 | 48.595 | 50.737
svt-av1: Preset 3 - Bosphorus 4K | 13.038 | 13.762 | 14.278
simdjson: PartialTweets | 7.30 | 8.60 | 9.46
simdjson: LargeRand | 1.40 | 1.57 | 1.59
simdjson: Kostya | 4.69 | 5.66 | 5.90
simdjson: DistinctUserID | 7.64 | 8.99 | 9.92
simdjson: TopTweet | 7.49 | 8.84 | 9.87
minibude: OpenMP - BM1 | 4882.808 | 8221.739 | 9884.254
minibude: OpenMP - BM2 | 4790.132 | 7916.409 | 9676.254
libxsmm: 128 | 3431.7 | 3652.9 | 3764.4
xmrig: GhostRider - 1M | 16826.7 | 17480.4 | 20005.2
tensorflow: CPU - 256 - ResNet-50 | 158.25 | 190.08 | 203.33
tensorflow: CPU - 512 - ResNet-50 | 179.25 | 221.25 | 246.10
onnx: fcn-resnet101-11 - CPU - Standard | 7.89424 | 8.92861 | 10.3299
onnx: super-resolution-10 - CPU - Standard | 161.964 | 162.960 | 206.823
onnx: bertsquad-12 - CPU - Standard | 17.0582 | 17.9209 | 21.5998
onnx: GPT-2 - CPU - Standard | 181.556 | 191.176 | 192.837
onnx: ArcFace ResNet-100 - CPU - Standard | 38.8380 | 44.6129 | 43.8222
onnx: ResNet101_DUC_HDC-12 - CPU - Standard | 5.50491 | 5.67626 | 6.96791
openvkl: vklBenchmarkCPU ISPC | 3099 | 3560 | 3660
ospray: gravity_spheres_volume/dim_512/ao/real_time | 31.7190 | 45.4087 | 46.9721
ospray: gravity_spheres_volume/dim_512/scivis/real_time | 30.6072 | 44.7876 | 46.3269
ospray: gravity_spheres_volume/dim_512/pathtracer/real_time | 42.1316 | 49.3499 | 50.9015
smhasher: FarmHash32 x86_64 AVX | 32099.03 | 34422.41 | 34394.98
gromacs: MPI CPU - water_GMX50_bare | 18.350 | 19.544 | 22.839
namd: ATPase with 327,506 Atoms | 7.00068 | 13.31002 | 14.18293
namd: STMV with 1,066,628 Atoms | 2.28097 | 4.17026 | 4.62565
numpy: | 739.69 | 794.98 | 795.31
smhasher: FarmHash32 x86_64 AVX | 26.512 | 26.885 | 26.491
onnx: fcn-resnet101-11 - CPU - Standard | 127.019 | 112.542 | 96.8050
onnx: super-resolution-10 - CPU - Standard | 6.17395 | 6.13926 | 4.83481
onnx: bertsquad-12 - CPU - Standard | 59.0304 | 55.8217 | 46.3369
onnx: GPT-2 - CPU - Standard | 5.50578 | 5.22862 | 5.18369
onnx: ArcFace ResNet-100 - CPU - Standard | 25.7457 | 22.4363 | 22.8473
onnx: ResNet101_DUC_HDC-12 - CPU - Standard | 181.736 | 177.487 | 143.940
ospray-studio: 1 - 4K - 1 - Path Tracer - CPU | 714 | 671 | 639
ospray-studio: 1 - 4K - 16 - Path Tracer - CPU | 11396 | 10733 | 10169
ospray-studio: 1 - 4K - 32 - Path Tracer - CPU | 22869 | 21539 | 20379
ospray-studio: 2 - 4K - 1 - Path Tracer - CPU | 719 | 675 | 642
ospray-studio: 2 - 4K - 16 - Path Tracer - CPU | 11451 | 10816 | 10231
ospray-studio: 2 - 4K - 32 - Path Tracer - CPU | 22955 | 21646 | 20438
ospray-studio: 3 - 4K - 1 - Path Tracer - CPU | 836 | 791 | 752
ospray-studio: 3 - 4K - 16 - Path Tracer - CPU | 13357 | 12693 | 11963
ospray-studio: 3 - 4K - 32 - Path Tracer - CPU | 30820 | 25393 | 24030
onednn: Convolution Batch Shapes Auto - CPU | 0.373925 | 0.350034 | 0.257632
onednn: Deconvolution Batch shapes_3d - CPU | 1.31214 | 0.700379 | 0.509019
onednn: IP Shapes 3D - CPU | 3.14198 | 0.323221 | 0.322331
onednn: Recurrent Neural Network Training - CPU | 447.798 | 482.881 | 425.220
openvino: Face Detection FP16-INT8 - CPU | 544.38 | 463.88 | 329.05
openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU | 0.57 | 0.55 | 0.48
openvino: Person Detection FP16 - CPU | 167.07 | 89.90 | 83.32
openvino: Weld Porosity Detection FP16-INT8 - CPU | 10.75 | 9.25 | 6.60
openvino: Vehicle Detection FP16-INT8 - CPU | 10.26 | 7.48 | 5.61
openvino: Person Vehicle Bike Detection FP16 - CPU | 15.32 | 7.90 | 7.21
openvino: Machine Translation EN To DE FP16 - CPU | 160.34 | 73.73 | 57.47
openvino: Face Detection Retail FP16-INT8 - CPU | 6.29 | 5.04 | 3.87
openvino: Handwritten English Recognition FP16-INT8 - CPU | 68.68 | 30.35 | 26.63
openvino: Road Segmentation ADAS FP16-INT8 - CPU | 24.30 | 22.36 | 19.29
openvino: Person Re-Identification Retail FP16 - CPU | 13.35 | 6.29 | 4.49
openvino: Noise Suppression Poconet-Like FP16 - CPU | 42.42 | 11.79 | 10.75
mnn: mobilenetV3 | 2.055 | 1.980 | 1.872
mnn: resnet-v2-50 | 10.053 | 8.851 | 7.674
openfoam: drivaerFastback, Small Mesh Size - Execution Time | 20.421965 | 19.976723 | 20.24893
openfoam: drivaerFastback, Medium Mesh Size - Execution Time | 161.95194 | 163.5145 | 160.99742
y-cruncher: 1B | 8.020 | 7.751 | 7.789
y-cruncher: 5B | 31.245 | 24.602 | 24.067
y-cruncher: 10B | 61.256 | 47.271 | 45.855

CPU Temperature Monitor

Phoronix Test Suite System Monitoring (Celsius)
AVX-512 Off: Min: 26.13 / Avg: 49.06 / Max: 63.5
AVX-512 512b DP: Min: 23.88 / Avg: 50.93 / Max: 66
AVX-512 256b DP: Min: 25.13 / Avg: 49.34 / Max: 63.75

CPU Peak Freq (Highest CPU Core Frequency) Monitor

Phoronix Test Suite System Monitoring (Megahertz)
AVX-512 Off: Min: 2294 / Avg: 3647.31 / Max: 4647
AVX-512 512b DP: Min: 1886 / Avg: 3621.72 / Max: 4224
AVX-512 256b DP: Min: 2172 / Avg: 3712.06 / Max: 4195

CPU Power Consumption Monitor

Phoronix Test Suite System Monitoring (Watts)
AVX-512 Off: Min: 22.25 / Avg: 297.71 / Max: 505.2
AVX-512 512b DP: Min: 22.21 / Avg: 292.98 / Max: 502.06
AVX-512 256b DP: Min: 22.32 / Avg: 305.93 / Max: 503.55

PyTorch

This is a benchmark of PyTorch making use of pytorch-benchmark [https://github.com/LukasHedegaard/pytorch-benchmark]. Learn more via the OpenBenchmarking.org test page.

PyTorch 2.2.1, Device: CPU - Batch Size: 256 - Model: ResNet-50 (batches/sec, More Is Better)
AVX-512 Off: 38.38 (SE +/- 0.43, N = 3; MIN: 36.82 / MAX: 39.67)
AVX-512 512b DP: 43.85 (SE +/- 0.55, N = 3; MIN: 41.39 / MAX: 45.93)
AVX-512 256b DP: 43.55 (SE +/- 0.31, N = 3; MIN: 41.56 / MAX: 44.75)

PyTorch 2.2.1, Device: CPU - Batch Size: 512 - Model: ResNet-50 (batches/sec, More Is Better)
AVX-512 Off: 38.82 (SE +/- 0.08, N = 3; MIN: 37.4 / MAX: 39.86)
AVX-512 512b DP: 43.39 (SE +/- 0.52, N = 4; MIN: 40.31 / MAX: 44.91)
AVX-512 256b DP: 43.50 (SE +/- 0.34, N = 3; MIN: 41.09 / MAX: 44.92)

PyTorch 2.2.1, Device: CPU - Batch Size: 1 - Model: ResNet-50 (batches/sec Per Watt, More Is Better)
AVX-512 Off: 0.179
AVX-512 512b DP: 0.217
AVX-512 256b DP: 0.219

PyTorch 2.2.1, Device: CPU - Batch Size: 1 - Model: ResNet-50 (batches/sec, More Is Better)
AVX-512 Off: 45.05 (SE +/- 0.23, N = 3; MIN: 43.23 / MAX: 46.3)
AVX-512 512b DP: 52.33 (SE +/- 0.38, N = 3; MIN: 49.93 / MAX: 54.1)
AVX-512 256b DP: 51.63 (SE +/- 0.12, N = 3; MIN: 48.86 / MAX: 52.93)
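
The per-watt figures can be related to the raw throughput figures. The sketch below assumes performance-per-watt is the result divided by the average CPU power drawn during that test; that relationship is an assumption about how the figure is derived, not something stated on this page. Under it, the implied average power for the batch size 1 runs can be backed out. `implied_power_watts` is a hypothetical helper, and the per-watt inputs are rounded to three decimals, so the outputs are approximate.

```python
# PyTorch 2.2.1, CPU, batch size 1, ResNet-50: figures copied from this page.
THROUGHPUT = {  # batches/sec
    "AVX-512 Off": 45.05,
    "AVX-512 512b DP": 52.33,
    "AVX-512 256b DP": 51.63,
}
PER_WATT = {  # batches/sec per watt
    "AVX-512 Off": 0.179,
    "AVX-512 512b DP": 0.217,
    "AVX-512 256b DP": 0.219,
}


def implied_power_watts(throughput, per_watt):
    """Average power implied by throughput / per-watt (assumed relationship)."""
    return throughput / per_watt


IMPLIED = {
    run: implied_power_watts(THROUGHPUT[run], PER_WATT[run])
    for run in THROUGHPUT
}

if __name__ == "__main__":
    for run, watts in IMPLIED.items():
        print(f"{run}: about {watts:.0f} W")
```

The implied values sit well below the roughly 500 W peaks reported by the CPU Power Consumption Monitor, which is at least consistent with them being averages over a single test rather than peaks.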

miniBUDE

miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM2 (Billion Interactions/s Per Watt, More Is Better)
AVX-512 Off: 0.498
AVX-512 512b DP: 1.138
AVX-512 256b DP: 0.935

miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM1 (Billion Interactions/s, More Is Better)
AVX-512 Off: 195.31 (SE +/- 0.12, N = 8)
AVX-512 512b DP: 395.37 (SE +/- 0.11, N = 11)
AVX-512 256b DP: 328.87 (SE +/- 0.13, N = 10)

miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM2 (Billion Interactions/s, More Is Better)
AVX-512 Off: 191.61 (SE +/- 0.64, N = 3)
AVX-512 512b DP: 387.05 (SE +/- 4.22, N = 4)
AVX-512 256b DP: 316.66 (SE +/- 0.70, N = 4)

1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

OpenVINO

This is a test of Intel OpenVINO, a toolkit for neural networks, using its built-in benchmarking support to analyze throughput and latency for various models. Learn more via the OpenBenchmarking.org test page.

OpenVINO 2024.0, Device: CPU (FPS, More Is Better). Each cell gives the result followed by its standard error; N = 3 for every result.

Model | AVX-512 Off | AVX-512 512b DP | AVX-512 256b DP
Face Detection FP16-INT8 | 117.14 (SE +/- 0.07) | 194.04 (SE +/- 0.46) | 137.51 (SE +/- 0.02)
Age Gender Recognition Retail 0013 FP16-INT8 | 165052.36 (SE +/- 339.72) | 192118.56 (SE +/- 544.06) | 175156.04 (SE +/- 517.64)
Person Detection FP16 | 382.43 (SE +/- 0.04) | 766.99 (SE +/- 0.42) | 710.98 (SE +/- 0.61)
Weld Porosity Detection FP16-INT8 | 11702.44 (SE +/- 1.96) | 18690.32 (SE +/- 12.83) | 13579.90 (SE +/- 5.42)
Vehicle Detection FP16-INT8 | 6189.69 (SE +/- 1.18) | 11239.49 (SE +/- 0.53) | 8450.68 (SE +/- 0.88)
Person Vehicle Bike Detection FP16 | 4150.45 (SE +/- 9.55) | 8755.75 (SE +/- 25.40) | 8024.13 (SE +/- 14.15)
Machine Translation EN To DE FP16 | 398.57 (SE +/- 0.88) | 1111.69 (SE +/- 5.73) | 866.83 (SE +/- 1.32)
Face Detection Retail FP16-INT8 | 19828.79 (SE +/- 6.94) | 31620.58 (SE +/- 23.78) | 24619.04 (SE +/- 4.94)
Handwritten English Recognition FP16-INT8 | 1861.72 (SE +/- 1.11) | 4780.24 (SE +/- 1.88) | 4203.21 (SE +/- 2.89)
Road Segmentation ADAS FP16-INT8 | 2626.83 (SE +/- 0.80) | 3297.85 (SE +/- 5.22) | 2855.34 (SE +/- 3.53)
Person Re-Identification Retail FP16 | 4727.35 (SE +/- 1.32) | 13925.76 (SE +/- 7.20) | 10058.84 (SE +/- 6.22)
Noise Suppression Poconet-Like FP16 | 3010.63 (SE +/- 16.72) | 10450.98 (SE +/- 29.86) | 10112.34 (SE +/- 47.84)

1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

Embree

Intel Embree is a collection of high-performance ray-tracing kernels for execution on CPUs (and GPUs via SYCL) and supporting instruction sets such as SSE, AVX, AVX2, and AVX-512. Embree also supports making use of the Intel SPMD Program Compiler (ISPC). Learn more via the OpenBenchmarking.org test page.

Embree 4.3, Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better)
AVX-512 Off: 205.95 (SE +/- 0.08, N = 8; MIN: 202.63 / MAX: 210.88)
AVX-512 512b DP: 223.14 (SE +/- 0.09, N = 8; MIN: 218.68 / MAX: 229)
AVX-512 256b DP: 221.38 (SE +/- 0.05, N = 8; MIN: 217.86 / MAX: 225.89)

SVT-AV1

SVT-AV1 2.2, Encoder Mode: Preset 13 - Input: Bosphorus 4K (Frames Per Second Per Watt, More Is Better)
AVX-512 Off: 2.125
AVX-512 512b DP: 2.249
AVX-512 256b DP: 2.202

Embree


Embree 4.3, Binary: Pathtracer ISPC - Model: Asian Dragon Obj (Frames Per Second, More Is Better)
AVX-512 Off: 178.48 (SE +/- 0.17, N = 5; MIN: 174.73 / MAX: 183.09)
AVX-512 512b DP: 191.64 (SE +/- 0.08, N = 5; MIN: 188.18 / MAX: 196.08)
AVX-512 256b DP: 190.12 (SE +/- 0.11, N = 5; MIN: 186.67 / MAX: 194.33)

Embree 4.3, Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better)
AVX-512 Off: 167.00 (SE +/- 0.18, N = 8; MIN: 162.73 / MAX: 173.25)
AVX-512 512b DP: 179.68 (SE +/- 0.09, N = 8; MIN: 174.97 / MAX: 186.29)
AVX-512 256b DP: 178.13 (SE +/- 0.12, N = 8; MIN: 173.22 / MAX: 184.71)

SVT-AV1

SVT-AV1 2.2, Encoder Mode: Preset 13 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
AVX-512 Off: 378.43 (SE +/- 0.63, N = 6)
AVX-512 512b DP: 400.90 (SE +/- 6.48, N = 15)
AVX-512 256b DP: 388.54 (SE +/- 6.38, N = 15)

SVT-AV1 2.2, Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second Per Watt, More Is Better)
AVX-512 Off: 0.462
AVX-512 512b DP: 0.526
AVX-512 256b DP: 0.510

SVT-AV1 2.2, Encoder Mode: Preset 5 - Input: Bosphorus 4K (Frames Per Second Per Watt, More Is Better)
AVX-512 Off: 0.202
AVX-512 512b DP: 0.224
AVX-512 256b DP: 0.220

SVT-AV1 2.2, Encoder Mode: Preset 3 - Input: Bosphorus 4K (Frames Per Second Per Watt, More Is Better)
AVX-512 Off: 0.052
AVX-512 512b DP: 0.057
AVX-512 256b DP: 0.056

SVT-AV1 2.2, Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
AVX-512 Off: 101.84 (SE +/- 0.40, N = 3)
AVX-512 512b DP: 114.98 (SE +/- 0.52, N = 3)
AVX-512 256b DP: 108.12 (SE +/- 0.47, N = 3)

SVT-AV1 2.2, Encoder Mode: Preset 5 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
AVX-512 Off: 46.25 (SE +/- 0.13, N = 3)
AVX-512 512b DP: 50.74 (SE +/- 0.02, N = 3)
AVX-512 256b DP: 48.60 (SE +/- 0.20, N = 3)

SVT-AV1 2.2, Encoder Mode: Preset 3 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
AVX-512 Off: 13.04 (SE +/- 0.08, N = 3)
AVX-512 512b DP: 14.28 (SE +/- 0.05, N = 3)
AVX-512 256b DP: 13.76 (SE +/- 0.05, N = 3)

Per-run compiler flags, listed for two of the three runs on each SVT-AV1 chart: -mavx2 -mavx512f -mavx512bw -mavx512dq
1. (CXX) g++ options: -march=native -mno-avx

simdjson

simdjson 3.10, Throughput Test: PartialTweets (GB/s, More Is Better)
AVX-512 Off: 7.30 (SE +/- 0.04, N = 3)
AVX-512 512b DP: 9.46 (SE +/- 0.09, N = 15)
AVX-512 256b DP: 8.60 (SE +/- 0.09, N = 6)

simdjson 3.10, Throughput Test: LargeRandom (GB/s, More Is Better)
AVX-512 Off: 1.40 (SE +/- 0.00, N = 3)
AVX-512 512b DP: 1.59 (SE +/- 0.00, N = 3)
AVX-512 256b DP: 1.57 (SE +/- 0.00, N = 3)

simdjson 3.10, Throughput Test: Kostya (GB/s, More Is Better)
AVX-512 Off: 4.69 (SE +/- 0.01, N = 3)
AVX-512 512b DP: 5.90 (SE +/- 0.01, N = 3)
AVX-512 256b DP: 5.66 (SE +/- 0.01, N = 3)

simdjson 3.10, Throughput Test: DistinctUserID (GB/s, More Is Better)
AVX-512 Off: 7.64 (SE +/- 0.02, N = 3)
AVX-512 512b DP: 9.92 (SE +/- 0.12, N = 4)
AVX-512 256b DP: 8.99 (SE +/- 0.04, N = 3)

simdjson 3.10, Throughput Test: TopTweet (GB/s, More Is Better)
AVX-512 Off: 7.49 (SE +/- 0.01, N = 3)
AVX-512 512b DP: 9.87 (SE +/- 0.01, N = 3)
AVX-512 256b DP: 8.84 (SE +/- 0.02, N = 3)

1. (CXX) g++ options: -O3 -lrt

miniBUDE

MiniBUDE is a mini application for the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking. Learn more via the OpenBenchmarking.org test page.

miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM1 (GFInst/s, More Is Better)
AVX-512 Off: 4882.81 (SE +/- 3.11, N = 8)
AVX-512 512b DP: 9884.25 (SE +/- 2.65, N = 11)
AVX-512 256b DP: 8221.74 (SE +/- 3.23, N = 10)

miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM2 (GFInst/s, More Is Better)
AVX-512 Off: 4790.13 (SE +/- 15.88, N = 3)
AVX-512 512b DP: 9676.25 (SE +/- 105.52, N = 4)
AVX-512 256b DP: 7916.41 (SE +/- 17.43, N = 4)

1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

libxsmm

libxsmm 2-1.17-3645, M N K: 128 (GFLOPS/s Per Watt, More Is Better)
AVX-512 Off: 16.38
AVX-512 512b DP: 20.74
AVX-512 256b DP: 21.01

libxsmm 2-1.17-3645, M N K: 128 (GFLOPS/s, More Is Better)
AVX-512 Off: 3431.7 (SE +/- 12.94, N = 3)
AVX-512 512b DP: 3764.4 (SE +/- 9.17, N = 3)
AVX-512 256b DP: 3652.9 (SE +/- 5.61, N = 3)

Per-run compiler flags as listed on the chart: -pedantic -fopenmp -march=core-avx2; -msse4.2; -msse4.2
1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden

Xmrig

Xmrig 6.21, Variant: GhostRider - Hash Count: 1M (H/s Per Watt, More Is Better)
AVX-512 Off: 40.85
AVX-512 512b DP: 47.07
AVX-512 256b DP: 42.29

Xmrig 6.21, Variant: GhostRider - Hash Count: 1M (H/s, More Is Better)
AVX-512 Off: 16826.7 (SE +/- 973.00, N = 15)
AVX-512 512b DP: 20005.2 (SE +/- 779.44, N = 15)
AVX-512 256b DP: 17480.4 (SE +/- 973.55, N = 15)

1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
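
Xmrig's standard errors are large relative to the gaps between runs. A rough way to check that is sketched below, assuming "SE" is the standard error of the mean over N runs. The two-standard-error overlap rule is a common rule of thumb chosen here for illustration; it is not a test the Phoronix Test Suite is known to apply, and `relative_se` and `intervals_overlap` are hypothetical helper names.

```python
# Xmrig 6.21, GhostRider, 1M hashes: (mean H/s, standard error) from this page.
XMRIG = {
    "AVX-512 Off": (16826.7, 973.00),
    "AVX-512 512b DP": (20005.2, 779.44),
    "AVX-512 256b DP": (17480.4, 973.55),
}


def relative_se(mean, se):
    """Standard error as a fraction of the mean."""
    return se / mean


def intervals_overlap(a, b, k=2.0):
    """True if the mean +/- k*SE intervals of results a and b overlap."""
    (mean_a, se_a), (mean_b, se_b) = a, b
    return abs(mean_a - mean_b) <= k * (se_a + se_b)


if __name__ == "__main__":
    for run, (mean, se) in XMRIG.items():
        print(f"{run}: relative SE {relative_se(mean, se):.1%}")
    off = XMRIG["AVX-512 Off"]
    for run in ("AVX-512 512b DP", "AVX-512 256b DP"):
        print(f"Off vs {run}: overlap = {intervals_overlap(off, XMRIG[run])}")
```

With relative standard errors of several percent, the Xmrig differences deserve more caution than results such as miniBUDE, where the standard error is a small fraction of a percent of the mean.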

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.16.1, Device: CPU - Batch Size: 256 - Model: ResNet-50 (images/sec, More Is Better)
AVX-512 Off: 158.25 (SE +/- 0.47, N = 3)
AVX-512 512b DP: 203.33 (SE +/- 0.98, N = 3)
AVX-512 256b DP: 190.08 (SE +/- 0.40, N = 3)

TensorFlow 2.16.1, Device: CPU - Batch Size: 512 - Model: ResNet-50 (images/sec, More Is Better)
AVX-512 Off: 179.25 (SE +/- 0.20, N = 3)
AVX-512 512b DP: 246.10 (SE +/- 0.45, N = 3)
AVX-512 256b DP: 221.25 (SE +/- 0.23, N = 3)

ONNX Runtime

ONNX Runtime 1.19, Model: fcn-resnet101-11 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better)
AVX-512 Off: 7.89424 (SE +/- 0.11217, N = 15)
AVX-512 512b DP: 10.32990 (SE +/- 0.01681, N = 3)
AVX-512 256b DP: 8.92861 (SE +/- 0.16447, N = 15)

ONNX Runtime 1.19, Model: super-resolution-10 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better)
AVX-512 Off: 161.96 (SE +/- 0.15, N = 3)
AVX-512 512b DP: 206.82 (SE +/- 0.29, N = 3)
AVX-512 256b DP: 162.96 (SE +/- 1.79, N = 5)

ONNX Runtime 1.19, Model: bertsquad-12 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better)
AVX-512 Off: 17.06 (SE +/- 0.38, N = 12)
AVX-512 512b DP: 21.60 (SE +/- 0.17, N = 15)
AVX-512 256b DP: 17.92 (SE +/- 0.21, N = 4)

ONNX Runtime 1.19, Model: GPT-2 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better)
AVX-512 Off: 181.56 (SE +/- 0.16, N = 3)
AVX-512 512b DP: 192.84 (SE +/- 0.74, N = 3)
AVX-512 256b DP: 191.18 (SE +/- 0.18, N = 3)

ONNX Runtime 1.19, Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better)
AVX-512 Off: 38.84 (SE +/- 0.02, N = 3)
AVX-512 512b DP: 43.82 (SE +/- 0.43, N = 15)
AVX-512 256b DP: 44.61 (SE +/- 0.38, N = 15)

ONNX Runtime 1.19, Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better)
AVX-512 Off: 5.50491 (SE +/- 0.06677, N = 4)
AVX-512 512b DP: 6.96791 (SE +/- 0.10140, N = 15)
AVX-512 256b DP: 5.67626 (SE +/- 0.12978, N = 15)

1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
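
ONNX Runtime appears twice in the overview table earlier on this page: once as inferences per second and once as a second set of values that look like inference time in milliseconds. The unit of that second set is not shown in this excerpt, so treating it as milliseconds is an assumption. A minimal sketch of the reciprocal relationship, using the fcn-resnet101-11 figures; `ms_per_inference` and `relative_gap` are hypothetical helper names.

```python
# fcn-resnet101-11, per run: (inferences per second, second overview value).
FCN_RESNET101 = {
    "AVX-512 Off": (7.89424, 127.019),
    "AVX-512 256b DP": (8.92861, 112.542),
    "AVX-512 512b DP": (10.3299, 96.8050),
}


def ms_per_inference(inferences_per_second):
    """Milliseconds per inference implied by a throughput figure."""
    return 1000.0 / inferences_per_second


def relative_gap(inferences_per_second, reported_ms):
    """Relative difference between the implied and the reported latency."""
    implied = ms_per_inference(inferences_per_second)
    return abs(implied - reported_ms) / reported_ms


if __name__ == "__main__":
    for run, (ips, ms) in FCN_RESNET101.items():
        print(f"{run}: implied {ms_per_inference(ips):.3f} ms, "
              f"reported {ms}, gap {relative_gap(ips, ms):.2%}")
```

The two figures agree to within about half a percent rather than exactly. One plausible reason is that averaging per-run latencies is not the same as taking the reciprocal of the averaged throughput.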

OpenVKL

OpenVKL 2.0.0, Benchmark: vklBenchmarkCPU ISPC (Items / Sec Per Watt, More Is Better)
AVX-512 Off: 7.791
AVX-512 512b DP: 9.180
AVX-512 256b DP: 9.055

OpenVKL 2.0.0, Benchmark: vklBenchmarkCPU ISPC (Items / Sec, More Is Better)
AVX-512 Off: 3099 (SE +/- 0.33, N = 3; MIN: 245 / MAX: 36357)
AVX-512 512b DP: 3660 (SE +/- 0.58, N = 3; MIN: 293 / MAX: 42710)
AVX-512 256b DP: 3560 (SE +/- 1.76, N = 3; MIN: 284 / MAX: 41727)

OSPRay

OSPRay 3.2, Benchmark: gravity_spheres_volume/dim_512/ao/real_time (Items Per Second Per Watt, More Is Better)
AVX-512 Off: 0.077
AVX-512 512b DP: 0.108
AVX-512 256b DP: 0.108

OSPRay 3.2, Benchmark: gravity_spheres_volume/dim_512/scivis/real_time (Items Per Second Per Watt, More Is Better)
AVX-512 Off: 0.074
AVX-512 512b DP: 0.107
AVX-512 256b DP: 0.106

OSPRay 3.2, Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time (Items Per Second Per Watt, More Is Better)
AVX-512 Off: 0.101
AVX-512 512b DP: 0.124
AVX-512 256b DP: 0.121

OSPRay 3.2, Benchmark: gravity_spheres_volume/dim_512/ao/real_time (Items Per Second, More Is Better)
AVX-512 Off: 31.72 (SE +/- 0.01, N = 3)
AVX-512 512b DP: 46.97 (SE +/- 0.16, N = 3)
AVX-512 256b DP: 45.41 (SE +/- 0.03, N = 3)

OSPRay 3.2, Benchmark: gravity_spheres_volume/dim_512/scivis/real_time (Items Per Second, More Is Better)
AVX-512 Off: 30.61 (SE +/- 0.02, N = 3)
AVX-512 512b DP: 46.33 (SE +/- 0.11, N = 3)
AVX-512 256b DP: 44.79 (SE +/- 0.13, N = 3)

OSPRay 3.2, Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time (Items Per Second, More Is Better)
AVX-512 Off: 42.13 (SE +/- 0.03, N = 3)
AVX-512 512b DP: 50.90 (SE +/- 0.02, N = 3)
AVX-512 256b DP: 49.35 (SE +/- 0.05, N = 3)

NAMD

NAMD 3.0b6, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better). Two charts, in page order.

AVX-512 Off: Min 2700 / Avg 3689 / Max 4147
AVX-512 512b DP: Min 2700 / Avg 3253 / Max 4149
AVX-512 256b DP: Min 2700 / Avg 3338 / Max 4158

AVX-512 Off: Min 2700 / Avg 3447 / Max 4182
AVX-512 512b DP: Min 2554 / Avg 3362 / Max 4151
AVX-512 256b DP: Min 2700 / Avg 3480 / Max 4149

GROMACS

GROMACS 2024, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better)

AVX-512 Off: Min 2700 / Avg 3373 / Max 4150
AVX-512 512b DP: Min 2700 / Avg 3489 / Max 4155
AVX-512 256b DP: Min 2700 / Avg 3504 / Max 4148

OpenFOAM

OpenFOAM 10, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better). Two charts, in page order.

AVX-512 Off: Min 2700 / Avg 3413 / Max 4156
AVX-512 512b DP: Min 2700 / Avg 3387 / Max 4174
AVX-512 256b DP: Min 2700 / Avg 3390 / Max 4150

AVX-512 Off: Min 2700 / Avg 3576 / Max 4155
AVX-512 512b DP: Min 2700 / Avg 3549 / Max 4168
AVX-512 256b DP: Min 2700 / Avg 3576 / Max 4152

OSPRay Studio

OSPRay Studio 1.0, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better). Nine charts, in page order.

AVX-512 Off: Min 2700 / Avg 3224 / Max 4143
AVX-512 512b DP: Min 2700 / Avg 3224 / Max 4153
AVX-512 256b DP: Min 2700 / Avg 3329 / Max 4145

AVX-512 Off: Min 2700 / Avg 3228 / Max 4144
AVX-512 512b DP: Min 2700 / Avg 3221 / Max 4152
AVX-512 256b DP: Min 2700 / Avg 3334 / Max 4151

AVX-512 Off: Min 2700 / Avg 3166 / Max 4142
AVX-512 512b DP: Min 2700 / Avg 3173 / Max 4139
AVX-512 256b DP: Min 2700 / Avg 3272 / Max 4134

AVX-512 Off: Min 2700 / Avg 3248 / Max 4155
AVX-512 512b DP: Min 2700 / Avg 3254 / Max 4154
AVX-512 256b DP: Min 2700 / Avg 3331 / Max 4145

AVX-512 Off: Min 2700 / Avg 3248 / Max 4143
AVX-512 512b DP: Min 2700 / Avg 3241 / Max 4152
AVX-512 256b DP: Min 2700 / Avg 3342 / Max 4155

AVX-512 Off: Min 2700 / Avg 3192 / Max 4150
AVX-512 512b DP: Min 2700 / Avg 3194 / Max 4144
AVX-512 256b DP: Min 2700 / Avg 3302 / Max 4150

AVX-512 Off: Min 2700 / Avg 3226 / Max 4140
AVX-512 512b DP: Min 2700 / Avg 3223 / Max 4141
AVX-512 256b DP: Min 2700 / Avg 3312 / Max 4144

AVX-512 Off: Min 2700 / Avg 3171 / Max 4148
AVX-512 512b DP: Min 2700 / Avg 3210 / Max 4162
AVX-512 256b DP: Min 2700 / Avg 3319 / Max 4144

AVX-512 Off: Min 2700 / Avg 3307 / Max 4153
AVX-512 512b DP: Min 2700 / Avg 3170 / Max 4151
AVX-512 256b DP: Min 2700 / Avg 3284 / Max 4151

Y-Cruncher

Y-Cruncher 0.8.5 - CPU Peak Freq (Highest CPU Core Frequency) Monitor - Megahertz, More Is Better
  AVX-512 Off: Min 2464 / Avg 3600 / Max 4155
  AVX-512 512b DP: Min 2700 / Avg 3625 / Max 4154
  AVX-512 256b DP: Min 2695 / Avg 3692 / Max 4151

Y-Cruncher 0.8.5 - CPU Peak Freq (Highest CPU Core Frequency) Monitor - Megahertz, More Is Better
  AVX-512 Off: Min 2437 / Avg 3601 / Max 4157
  AVX-512 512b DP: Min 2063 / Avg 3504 / Max 4142
  AVX-512 256b DP: Min 2638 / Avg 3616 / Max 4146

Y-Cruncher 0.8.5 - CPU Peak Freq (Highest CPU Core Frequency) Monitor - Megahertz, More Is Better
  AVX-512 Off: Min 2465 / Avg 3596 / Max 4148
  AVX-512 512b DP: Min 2077 / Avg 3530 / Max 4160
  AVX-512 256b DP: Min 2613 / Avg 3638 / Max 4155

oneDNN

oneDNN 3.4 - CPU Peak Freq (Highest CPU Core Frequency) Monitor - Megahertz, More Is Better
  AVX-512 Off: Min 2700 / Avg 3146 / Max 4042
  AVX-512 512b DP: Min 2700 / Avg 2884 / Max 4153
  AVX-512 256b DP: Min 2700 / Avg 3289 / Max 4145

oneDNN 3.4 - CPU Peak Freq (Highest CPU Core Frequency) Monitor - Megahertz, More Is Better
  AVX-512 Off: Min 2700 / Avg 3160 / Max 4144
  AVX-512 512b DP: Min 2559 / Avg 2730 / Max 4150
  AVX-512 256b DP: Min 2172 / Avg 3049 / Max 4150

oneDNN 3.4 - CPU Peak Freq (Highest CPU Core Frequency) Monitor - Megahertz, More Is Better
  AVX-512 Off: Min 2700 / Avg 3730 / Max 4152
  AVX-512 512b DP: Min 2700 / Avg 3528 / Max 4155
  AVX-512 256b DP: Min 2700 / Avg 3536 / Max 4145

oneDNN 3.4 - CPU Peak Freq (Highest CPU Core Frequency) Monitor - Megahertz, More Is Better
  AVX-512 Off: Min 2487 / Avg 3193 / Max 4143
  AVX-512 512b DP: Min 1886 / Avg 3298 / Max 4162
  AVX-512 256b DP: Min 2700 / Avg 3790 / Max 4156

OpenVINO

OpenVINO 2024.0 - CPU Peak Freq (Highest CPU Core Frequency) Monitor - Megahertz, More Is Better
  AVX-512 Off: Min 2563 / Avg 2767 / Max 4146
  AVX-512 512b DP: Min 2280 / Avg 2475 / Max 4150
  AVX-512 256b DP: Min 2700 / Avg 3182 / Max 4150

OpenVINO 2024.0 - CPU Peak Freq (Highest CPU Core Frequency) Monitor - Megahertz, More Is Better
  AVX-512 Off: Min 2700 / Avg 3192 / Max 4152
  AVX-512 512b DP: Min 2700 / Avg 3170 / Max 4142
  AVX-512 256b DP: Min 2700 / Avg 3442 / Max 4163

OpenVINO 2024.0 - CPU Peak Freq (Highest CPU Core Frequency) Monitor - Megahertz, More Is Better
  AVX-512 Off: Min 2700 / Avg 3031 / Max 4147
  AVX-512 512b DP: Min 2642 / Avg 2768 / Max 4162
  AVX-512 256b DP: Min 2700 / Avg 3226 / Max 4168

OpenVINO 2024.0 - CPU Peak Freq (Highest CPU Core Frequency) Monitor - Megahertz, More Is Better
  AVX-512 Off: Min 2630 / Avg 2813 / Max 4149
  AVX-512 512b DP: Min 2200 / Avg 2414 / Max 4152
  AVX-512 256b DP: Min 2700 / Avg 3101 / Max 4145

OpenVINO 2024.0 - CPU Peak Freq (Highest CPU Core Frequency) Monitor - Megahertz, More Is Better
  AVX-512 Off: Min 2688 / Avg 2823 / Max 4156
  AVX-512 512b DP: Min 2380 / Avg 2573 / Max 4150
  AVX-512 256b DP: Min 2700 / Avg 3147 / Max 4154

OpenVINO 2024.0 - CPU Peak Freq (Highest CPU Core Frequency) Monitor - Megahertz, More Is Better
  AVX-512 Off: Min 2700 / Avg 2875 / Max 4145
  AVX-512 512b DP: Min 2394 / Avg 2522 / Max 4152
  AVX-512 256b DP: Min 2700 / Avg 3146 / Max 4153

OpenVINO 2024.0 - CPU Peak Freq (Highest CPU Core Frequency) Monitor - Megahertz, More Is Better
  AVX-512 Off: Min 2541 / Avg 2723 / Max 4153
  AVX-512 512b DP: Min 2416 / Avg 2733 / Max 4152
  AVX-512 256b DP: Min 2700 / Avg 3272 / Max 4195

OpenVINO 2024.0 - CPU Peak Freq (Highest CPU Core Frequency) Monitor - Megahertz, More Is Better
  AVX-512 Off: Min 2697 / Avg 2891 / Max 4145
  AVX-512 512b DP: Min 2336 / Avg 2601 / Max 4149
  AVX-512 256b DP: Min 2700 / Avg 3097 / Max 4152

OpenVINO 2024.0 - CPU Peak Freq (Highest CPU Core Frequency) Monitor - Megahertz, More Is Better
  AVX-512 Off: Min 2520 / Avg 2614 / Max 4162
  AVX-512 512b DP: Min 2613 / Avg 2767 / Max 4147
  AVX-512 256b DP: Min 2700 / Avg 2965 / Max 4148

OpenVINO 2024.0 - CPU Peak Freq (Highest CPU Core Frequency) Monitor - Megahertz, More Is Better
  AVX-512 Off: Min 2700 / Avg 3021 / Max 4157
  AVX-512 512b DP: Min 2585 / Avg 2723 / Max 4144
  AVX-512 256b DP: Min 2700 / Avg 3187 / Max 4144

OpenVINO 2024.0 - CPU Peak Freq (Highest CPU Core Frequency) Monitor - Megahertz, More Is Better
  AVX-512 Off: Min 2682 / Avg 2832 / Max 4151
  AVX-512 512b DP: Min 2488 / Avg 2700 / Max 4154
  AVX-512 256b DP: Min 2700 / Avg 3231 / Max 4147

OpenVINO 2024.0 - CPU Peak Freq (Highest CPU Core Frequency) Monitor - Megahertz, More Is Better
  AVX-512 Off: Min 2700 / Avg 3241 / Max 4405
  AVX-512 512b DP: Min 2699 / Avg 2869 / Max 4150
  AVX-512 256b DP: Min 2700 / Avg 3087 / Max 4153

Numpy Benchmark

Numpy Benchmark - CPU Peak Freq (Highest CPU Core Frequency) Monitor - Megahertz, More Is Better
  AVX-512 Off: Min 2700 / Avg 4089 / Max 4151
  AVX-512 512b DP: Min 2700 / Avg 4082 / Max 4154
  AVX-512 256b DP: Min 2700 / Avg 4079 / Max 4154

SMHasher

SMHasher is a hash function tester supporting various algorithms and able to make use of AVX and other modern CPU instruction set extensions. Learn more via the OpenBenchmarking.org test page.

SMHasher 2022-08-22 - Hash: FarmHash32 x86_64 AVX - MiB/sec, More Is Better
  AVX-512 Off: 32099.03 (SE +/- 19.92, N = 6)
  AVX-512 512b DP: 34394.98 (SE +/- 23.52, N = 6)
  AVX-512 256b DP: 34422.41 (SE +/- 20.30, N = 6)
  1. (CXX) g++ options: -march=native -O3 -flto=auto -fno-fat-lto-objects

GROMACS

GROMACS 2024 - Implementation: MPI CPU - Input: water_GMX50_bare - Ns Per Day Per Watt, More Is Better
  AVX-512 Off: 0.052
  AVX-512 512b DP: 0.067
  AVX-512 256b DP: 0.057

GROMACS 2024 - Implementation: MPI CPU - Input: water_GMX50_bare - Ns Per Day, More Is Better
  AVX-512 Off: 18.35 (SE +/- 0.01, N = 3)
  AVX-512 512b DP: 22.84 (SE +/- 0.02, N = 3)
  AVX-512 256b DP: 19.54 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -lm
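The two GROMACS charts can be combined to estimate the average power draw behind each run. The sketch below assumes the per-watt figure is simply the Ns Per Day result divided by the average power recorded during the test; that relationship is an assumption here, not something this page states. Because the per-watt values are rounded to three decimals, the derived wattage is only good to roughly one or two percent. The function name `implied_average_watts` is made up for illustration.

```python
# Values copied from the GROMACS 2024 water_GMX50_bare charts on this page.
ns_per_day = {
    "AVX-512 Off": 18.35,
    "AVX-512 512b DP": 22.84,
    "AVX-512 256b DP": 19.54,
}
ns_per_day_per_watt = {
    "AVX-512 Off": 0.052,
    "AVX-512 512b DP": 0.067,
    "AVX-512 256b DP": 0.057,
}


def implied_average_watts(perf, perf_per_watt):
    """Estimate average power as performance / (performance per watt).

    Assumption: the per-watt chart is performance divided by average power.
    """
    return {run: perf[run] / perf_per_watt[run] for run in perf}


implied_watts = implied_average_watts(ns_per_day, ns_per_day_per_watt)

for run, watts in implied_watts.items():
    print(f"{run}: roughly {watts:.0f} W implied average power")
```

Under that assumption all three configurations land in the same neighborhood of power draw, so the efficiency gain for the 512-bit data path comes mostly from the higher Ns Per Day figure rather than from lower power.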

NAMD

NAMD 3.0b6 - Input: ATPase with 327,506 Atoms - ns/day Per Watt, More Is Better
  AVX-512 Off: 0.040
  AVX-512 512b DP: 0.075
  AVX-512 256b DP: 0.067

NAMD 3.0b6 - Input: STMV with 1,066,628 Atoms - ns/day Per Watt, More Is Better
  AVX-512 Off: 0.009
  AVX-512 512b DP: 0.019
  AVX-512 256b DP: 0.016

NAMD 3.0b6 - Input: ATPase with 327,506 Atoms - ns/day, More Is Better
  AVX-512 Off: 7.00068 (SE +/- 0.01866, N = 3)
  AVX-512 512b DP: 14.18293 (SE +/- 0.06451, N = 7)
  AVX-512 256b DP: 13.31002 (SE +/- 0.02496, N = 7)

NAMD 3.0b6 - Input: STMV with 1,066,628 Atoms - ns/day, More Is Better
  AVX-512 Off: 2.28097 (SE +/- 0.00213, N = 3)
  AVX-512 512b DP: 4.62565 (SE +/- 0.00436, N = 4)
  AVX-512 256b DP: 4.17026 (SE +/- 0.00881, N = 4)
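To put the NAMD ns/day figures in relative terms, a minimal sketch like the following divides each configuration by the AVX-512 Off baseline. The numbers are copied from the charts above; the helper name `speedup_vs_baseline` is hypothetical and not part of the Phoronix Test Suite.

```python
# ns/day values copied from the NAMD 3.0b6 charts on this page (more is better).
namd_ns_per_day = {
    "ATPase with 327,506 Atoms": {
        "AVX-512 Off": 7.00068,
        "AVX-512 512b DP": 14.18293,
        "AVX-512 256b DP": 13.31002,
    },
    "STMV with 1,066,628 Atoms": {
        "AVX-512 Off": 2.28097,
        "AVX-512 512b DP": 4.62565,
        "AVX-512 256b DP": 4.17026,
    },
}


def speedup_vs_baseline(results, baseline="AVX-512 Off"):
    """Return each run's throughput as a multiple of the baseline run."""
    base = results[baseline]
    return {run: value / base for run, value in results.items()}


namd_speedups = {
    name: speedup_vs_baseline(results)
    for name, results in namd_ns_per_day.items()
}

for name, speedups in namd_speedups.items():
    for run, factor in speedups.items():
        print(f"{name} | {run}: {factor:.2f}x")
```

For a throughput metric such as ns/day the ratio is candidate over baseline; for the "Fewer Is Better" charts elsewhere on this page the ratio would need to be inverted.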

Numpy Benchmark

This is a test to obtain the general Numpy performance. Learn more via the OpenBenchmarking.org test page.

Numpy Benchmark - Score, More Is Better
  AVX-512 Off: 739.69 (SE +/- 1.08, N = 3)
  AVX-512 512b DP: 795.31 (SE +/- 2.30, N = 3)
  AVX-512 256b DP: 794.98 (SE +/- 2.12, N = 3)

OpenFOAM

OpenFOAM 10 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 36.4 / Avg 53.9 / Max 60.4
  AVX-512 512b DP: Min 35.3 / Avg 52.7 / Max 59.0
  AVX-512 256b DP: Min 37.1 / Avg 53.6 / Max 59.8

OpenFOAM 10 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 36.3 / Avg 59.9 / Max 62.3
  AVX-512 512b DP: Min 35.3 / Avg 59.3 / Max 62.0
  AVX-512 256b DP: Min 35.8 / Avg 59.5 / Max 61.6

OSPRay Studio

OSPRay Studio 1.0 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 34.6 / Avg 53.9 / Max 58.5
  AVX-512 512b DP: Min 36.5 / Avg 55.7 / Max 60.0
  AVX-512 256b DP: Min 34.9 / Avg 54.0 / Max 58.5

OSPRay Studio 1.0 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 37.6 / Avg 53.7 / Max 57.8
  AVX-512 512b DP: Min 39.1 / Avg 55.1 / Max 59.8
  AVX-512 256b DP: Min 37.3 / Avg 53.7 / Max 57.8

OSPRay Studio 1.0 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 37.0 / Avg 54.7 / Max 58.9
  AVX-512 512b DP: Min 38.4 / Avg 56.7 / Max 60.9
  AVX-512 256b DP: Min 36.6 / Avg 54.5 / Max 58.5

OSPRay Studio 1.0 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 37.0 / Avg 54.4 / Max 58.5
  AVX-512 512b DP: Min 39.6 / Avg 56.3 / Max 60.6
  AVX-512 256b DP: Min 37.0 / Avg 53.8 / Max 58.0

OSPRay Studio 1.0 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 36.9 / Avg 53.5 / Max 58.4
  AVX-512 512b DP: Min 39.1 / Avg 56.0 / Max 60.6
  AVX-512 256b DP: Min 37.3 / Avg 53.9 / Max 58.5

OSPRay Studio 1.0 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 37.1 / Avg 55.2 / Max 58.6
  AVX-512 512b DP: Min 39.3 / Avg 57.3 / Max 60.9
  AVX-512 256b DP: Min 37.6 / Avg 55.1 / Max 58.5

OSPRay Studio 1.0 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 37.0 / Avg 54.1 / Max 58.3
  AVX-512 512b DP: Min 39.3 / Avg 56.7 / Max 60.9
  AVX-512 256b DP: Min 37.1 / Avg 54.1 / Max 58.1

OSPRay Studio 1.0 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 36.8 / Avg 55.3 / Max 58.9
  AVX-512 512b DP: Min 39.3 / Avg 56.2 / Max 60.9
  AVX-512 256b DP: Min 36.8 / Avg 54.5 / Max 58.6

OSPRay Studio 1.0 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 37.5 / Avg 51.5 / Max 57.0
  AVX-512 512b DP: Min 39.1 / Avg 57.6 / Max 61.1
  AVX-512 256b DP: Min 37.6 / Avg 55.4 / Max 59.1

Y-Cruncher

Y-Cruncher 0.8.5 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 32.0 / Avg 44.3 / Max 49.6
  AVX-512 512b DP: Min 33.5 / Avg 44.2 / Max 48.5
  AVX-512 256b DP: Min 31.4 / Avg 42.3 / Max 46.8

Y-Cruncher 0.8.5 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 32.8 / Avg 51.8 / Max 60.0
  AVX-512 512b DP: Min 34.1 / Avg 49.6 / Max 55.9
  AVX-512 256b DP: Min 31.8 / Avg 48.2 / Max 54.6

Y-Cruncher 0.8.5 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 34.8 / Avg 54.6 / Max 61.6
  AVX-512 512b DP: Min 35.9 / Avg 51.9 / Max 57.6
  AVX-512 256b DP: Min 34.3 / Avg 50.8 / Max 56.5

oneDNN

oneDNN 3.4 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 37.1 / Avg 45.6 / Max 49.4
  AVX-512 512b DP: Min 40.1 / Avg 45.8 / Max 49.0
  AVX-512 256b DP: Min 37.8 / Avg 45.2 / Max 48.8

oneDNN 3.4 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 31.9 / Avg 44.9 / Max 48.6
  AVX-512 512b DP: Min 33.9 / Avg 40.3 / Max 42.5
  AVX-512 256b DP: Min 32.5 / Avg 39.8 / Max 42.4

oneDNN 3.4 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 31.4 / Avg 38.0 / Max 40.6
  AVX-512 512b DP: Min 33.3 / Avg 40.4 / Max 43.0
  AVX-512 256b DP: Min 31.6 / Avg 38.0 / Max 40.5

oneDNN 3.4 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 30.1 / Avg 47.1 / Max 52.4
  AVX-512 512b DP: Min 33.3 / Avg 48.2 / Max 50.9
  AVX-512 256b DP: Min 30.8 / Avg 48.7 / Max 51.6

OpenVINO

OpenVINO 2024.0 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 33.8 / Avg 53.5 / Max 56.8
  AVX-512 512b DP: Min 37.0 / Avg 54.7 / Max 57.5
  AVX-512 256b DP: Min 34.5 / Avg 54.3 / Max 57.6

OpenVINO 2024.0 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 35.1 / Avg 51.4 / Max 53.3
  AVX-512 512b DP: Min 37.5 / Avg 51.6 / Max 53.5
  AVX-512 256b DP: Min 35.3 / Avg 51.1 / Max 53.0

OpenVINO 2024.0 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 34.3 / Avg 55.9 / Max 58.8
  AVX-512 512b DP: Min 36.6 / Avg 55.7 / Max 57.9
  AVX-512 256b DP: Min 34.1 / Avg 56.1 / Max 58.9

OpenVINO 2024.0 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 35.5 / Avg 53.9 / Max 56.0
  AVX-512 512b DP: Min 37.6 / Avg 54.2 / Max 56.0
  AVX-512 256b DP: Min 35.6 / Avg 54.4 / Max 56.6

OpenVINO 2024.0 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 34.8 / Avg 55.3 / Max 57.8
  AVX-512 512b DP: Min 37.4 / Avg 56.6 / Max 58.9
  AVX-512 256b DP: Min 35.4 / Avg 55.5 / Max 58.0

OpenVINO 2024.0 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 35.8 / Avg 53.4 / Max 55.9
  AVX-512 512b DP: Min 38.6 / Avg 53.0 / Max 54.8
  AVX-512 256b DP: Min 35.5 / Avg 55.3 / Max 57.9

OpenVINO 2024.0 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 34.8 / Avg 55.0 / Max 57.6
  AVX-512 512b DP: Min 36.6 / Avg 59.3 / Max 62.1
  AVX-512 256b DP: Min 35.6 / Avg 57.1 / Max 60.0

OpenVINO 2024.0 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 35.8 / Avg 55.1 / Max 57.4
  AVX-512 512b DP: Min 39.5 / Avg 56.0 / Max 57.9
  AVX-512 256b DP: Min 36.4 / Avg 55.0 / Max 57.3

OpenVINO 2024.0 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 35.1 / Avg 54.3 / Max 56.5
  AVX-512 512b DP: Min 38.0 / Avg 59.1 / Max 61.5
  AVX-512 256b DP: Min 35.1 / Avg 55.8 / Max 58.3

OpenVINO 2024.0 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 35.5 / Avg 56.6 / Max 59.3
  AVX-512 512b DP: Min 39.4 / Avg 54.5 / Max 56.3
  AVX-512 256b DP: Min 36.1 / Avg 55.2 / Max 57.5

OpenVINO 2024.0 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 36.3 / Avg 54.0 / Max 56.4
  AVX-512 512b DP: Min 37.6 / Avg 59.2 / Max 61.8
  AVX-512 256b DP: Min 35.4 / Avg 56.5 / Max 59.1

OpenVINO 2024.0 - CPU Temperature Monitor - Celsius, Fewer Is Better
  AVX-512 Off: Min 35.1 / Avg 48.0 / Max 50.1
  AVX-512 512b DP: Min 39.5 / Avg 54.9 / Max 57.1
  AVX-512 256b DP: Min 35.8 / Avg 53.4 / Max 56.1

OSPRay Studio

Intel OSPRay Studio is an open-source, interactive visualization and ray-tracing software package. OSPRay Studio makes use of Intel OSPRay, a portable ray-tracing engine for high-performance, high-fidelity visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OSPRay Studio 1.0 - Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU - ms, Fewer Is Better
  AVX-512 Off: 714 (SE +/- 0.00, N = 3)
  AVX-512 512b DP: 639 (SE +/- 0.58, N = 3)
  AVX-512 256b DP: 671 (SE +/- 0.33, N = 3)

OSPRay Studio 1.0 - Camera: 1 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU - ms, Fewer Is Better
  AVX-512 Off: 11396 (SE +/- 7.69, N = 3)
  AVX-512 512b DP: 10169 (SE +/- 9.28, N = 3)
  AVX-512 256b DP: 10733 (SE +/- 5.86, N = 3)

OSPRay Studio 1.0 - Camera: 1 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU - ms, Fewer Is Better
  AVX-512 Off: 22869 (SE +/- 34.33, N = 3)
  AVX-512 512b DP: 20379 (SE +/- 25.38, N = 3)
  AVX-512 256b DP: 21539 (SE +/- 6.39, N = 3)

OSPRay Studio 1.0 - Camera: 2 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU - ms, Fewer Is Better
  AVX-512 Off: 719 (SE +/- 0.33, N = 3)
  AVX-512 512b DP: 642 (SE +/- 0.00, N = 3)
  AVX-512 256b DP: 675 (SE +/- 0.33, N = 3)

OSPRay Studio 1.0 - Camera: 2 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU - ms, Fewer Is Better
  AVX-512 Off: 11451 (SE +/- 2.85, N = 3)
  AVX-512 512b DP: 10231 (SE +/- 10.11, N = 3)
  AVX-512 256b DP: 10816 (SE +/- 11.85, N = 3)

OSPRay Studio 1.0 - Camera: 2 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU - ms, Fewer Is Better
  AVX-512 Off: 22955 (SE +/- 25.87, N = 3)
  AVX-512 512b DP: 20438 (SE +/- 8.88, N = 3)
  AVX-512 256b DP: 21646 (SE +/- 8.65, N = 3)

OSPRay Studio 1.0 - Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU - ms, Fewer Is Better
  AVX-512 Off: 836 (SE +/- 0.33, N = 3)
  AVX-512 512b DP: 752 (SE +/- 0.33, N = 3)
  AVX-512 256b DP: 791 (SE +/- 0.58, N = 3)

OSPRay Studio 1.0 - Camera: 3 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU - ms, Fewer Is Better
  AVX-512 Off: 13357 (SE +/- 13.35, N = 3)
  AVX-512 512b DP: 11963 (SE +/- 7.06, N = 3)
  AVX-512 256b DP: 12693 (SE +/- 7.80, N = 3)

OSPRay Studio 1.0 - Camera: 3 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU - ms, Fewer Is Better
  AVX-512 Off: 30820 (SE +/- 109.14, N = 3)
  AVX-512 512b DP: 24030 (SE +/- 20.00, N = 3)
  AVX-512 256b DP: 25393 (SE +/- 2.00, N = 3)

oneDNN

This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.

oneDNN 3.4 - Harness: Convolution Batch Shapes Auto - Engine: CPU - ms, Fewer Is Better
  AVX-512 Off: 0.373925 (SE +/- 0.000549, N = 7; MIN: 0.35)
  AVX-512 512b DP: 0.257632 (SE +/- 0.000532, N = 7; MIN: 0.25)
  AVX-512 256b DP: 0.350034 (SE +/- 0.000363, N = 7; MIN: 0.33)

oneDNN 3.4 - Harness: Deconvolution Batch shapes_3d - Engine: CPU - ms, Fewer Is Better
  AVX-512 Off: 1.312140 (SE +/- 0.001041, N = 9; MIN: 1.28)
  AVX-512 512b DP: 0.509019 (SE +/- 0.000468, N = 9; MIN: 0.48)
  AVX-512 256b DP: 0.700379 (SE +/- 0.000342, N = 9; MIN: 0.68)

oneDNN 3.4 - Harness: IP Shapes 3D - Engine: CPU - ms, Fewer Is Better
  AVX-512 Off: 3.141980 (SE +/- 0.003218, N = 5; MIN: 3.07)
  AVX-512 512b DP: 0.322331 (SE +/- 0.000544, N = 5)
  AVX-512 256b DP: 0.323221 (SE +/- 0.000924, N = 5)

oneDNN 3.4 - Harness: Recurrent Neural Network Training - Engine: CPU - ms, Fewer Is Better
  AVX-512 Off: 447.80 (SE +/- 0.35, N = 3; MIN: 443.23)
  AVX-512 512b DP: 425.22 (SE +/- 0.60, N = 3; MIN: 418.82)
  AVX-512 256b DP: 482.88 (SE +/- 0.27, N = 3; MIN: 478.35)

  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenVINO

This is a test of the Intel OpenVINO, a toolkit around neural networks, using its built-in benchmarking support and analyzing the throughput and latency for various models. Learn more via the OpenBenchmarking.org test page.

OpenVINO 2024.0 - Model: Face Detection FP16-INT8 - Device: CPU - ms, Fewer Is Better
  AVX-512 Off: 544.38 (SE +/- 0.31, N = 3; MIN: 261.15 / MAX: 569.93)
  AVX-512 512b DP: 329.05 (SE +/- 0.76, N = 3; MIN: 146.73 / MAX: 360.28)
  AVX-512 256b DP: 463.88 (SE +/- 0.07, N = 3; MIN: 405.2 / MAX: 482.04)

OpenVINO 2024.0 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU - ms, Fewer Is Better
  AVX-512 Off: 0.57 (SE +/- 0.00, N = 3; MIN: 0.2 / MAX: 26.73)
  AVX-512 512b DP: 0.48 (SE +/- 0.00, N = 3; MIN: 0.13 / MAX: 25.07)
  AVX-512 256b DP: 0.55 (SE +/- 0.00, N = 3; MIN: 0.18 / MAX: 22.43)

OpenVINO 2024.0 - Model: Person Detection FP16 - Device: CPU - ms, Fewer Is Better
  AVX-512 Off: 167.07 (SE +/- 0.02, N = 3; MIN: 75.83 / MAX: 256.32)
  AVX-512 512b DP: 83.32 (SE +/- 0.04, N = 3; MIN: 35.6 / MAX: 146.57)
  AVX-512 256b DP: 89.90 (SE +/- 0.08, N = 3; MIN: 40.69 / MAX: 160.4)

OpenVINO 2024.0 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU - ms, Fewer Is Better
  AVX-512 Off: 10.75 (SE +/- 0.01, N = 3; MIN: 4.66 / MAX: 32.11)
  AVX-512 512b DP: 6.60 (SE +/- 0.01, N = 3; MIN: 2.24 / MAX: 26.72)
  AVX-512 256b DP: 9.25 (SE +/- 0.01, N = 3; MIN: 4.08 / MAX: 21.6)

OpenVINO 2024.0 - Model: Vehicle Detection FP16-INT8 - Device: CPU - ms, Fewer Is Better
  AVX-512 Off: 10.26 (SE +/- 0.00, N = 3; MIN: 6.21 / MAX: 38.75)
  AVX-512 512b DP: 5.61 (SE +/- 0.00, N = 3; MIN: 2.24 / MAX: 30.88)
  AVX-512 256b DP: 7.48 (SE +/- 0.00, N = 3; MIN: 4.14 / MAX: 23.69)

OpenVINO 2024.0 - Model: Person Vehicle Bike Detection FP16 - Device: CPU - ms, Fewer Is Better
  AVX-512 Off: 15.32 (SE +/- 0.03, N = 3; MIN: 7.91 / MAX: 45.71)
  AVX-512 512b DP: 7.21 (SE +/- 0.02, N = 3; MIN: 4.13 / MAX: 25.4)
  AVX-512 256b DP: 7.90 (SE +/- 0.01, N = 3; MIN: 5.02 / MAX: 29.4)

OpenVINO 2024.0 - Model: Machine Translation EN To DE FP16 - Device: CPU - ms, Fewer Is Better
  AVX-512 Off: 160.34 (SE +/- 0.34, N = 3; MIN: 88.77 / MAX: 245.28)
  AVX-512 512b DP: 57.47 (SE +/- 0.30, N = 3; MIN: 26.92 / MAX: 106.12)
  AVX-512 256b DP: 73.73 (SE +/- 0.11, N = 3; MIN: 35.26 / MAX: 113)

OpenVINO 2024.0 - Model: Face Detection Retail FP16-INT8 - Device: CPU - ms, Fewer Is Better
  AVX-512 Off: 6.29 (SE +/- 0.01, N = 3; MIN: 3.06 / MAX: 24.12)
  AVX-512 512b DP: 3.87 (SE +/- 0.00, N = 3; MIN: 1.69 / MAX: 21.99)
  AVX-512 256b DP: 5.04 (SE +/- 0.00, N = 3; MIN: 2.51 / MAX: 19.14)

OpenVINO 2024.0 - Model: Handwritten English Recognition FP16-INT8 - Device: CPU - ms, Fewer Is Better
  AVX-512 Off: 68.68 (SE +/- 0.04, N = 3; MIN: 40.62 / MAX: 87.32)
  AVX-512 512b DP: 26.63 (SE +/- 0.01, N = 3; MIN: 15.61 / MAX: 47.58)
  AVX-512 256b DP: 30.35 (SE +/- 0.02, N = 3; MIN: 17.71 / MAX: 42.78)

OpenVINO 2024.0 - Model: Road Segmentation ADAS FP16-INT8 - Device: CPU - ms, Fewer Is Better
  AVX-512 Off: 24.30 (SE +/- 0.01, N = 3; MIN: 13.82 / MAX: 53.77)
  AVX-512 512b DP: 19.29 (SE +/- 0.03, N = 3; MIN: 9.84 / MAX: 44.85)
  AVX-512 256b DP: 22.36 (SE +/- 0.03, N = 3; MIN: 11.44 / MAX: 40.97)

OpenVINO 2024.0 - Model: Person Re-Identification Retail FP16 - Device: CPU - ms, Fewer Is Better
  AVX-512 Off: 13.35 (SE +/- 0.00, N = 3; MIN: 6.22 / MAX: 36.5)
  AVX-512 512b DP: 4.49 (SE +/- 0.00, N = 3; MIN: 1.96 / MAX: 22.35)
  AVX-512 256b DP: 6.29 (SE +/- 0.00, N = 3; MIN: 3.16 / MAX: 20.63)

OpenVINO 2024.0 - Model: Noise Suppression Poconet-Like FP16 - Device: CPU - ms, Fewer Is Better
  AVX-512 Off: 42.42 (SE +/- 0.24, N = 3; MIN: 15.51 / MAX: 60.87)
  AVX-512 512b DP: 10.75 (SE +/- 0.07, N = 3; MIN: 6.13 / MAX: 31.81)
  AVX-512 256b DP: 11.79 (SE +/- 0.06, N = 3; MIN: 6.96 / MAX: 33.85)

  1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
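A single summary number for the twelve OpenVINO latency results can be sketched with a geometric mean of per-model speedups, in the spirit of the page's "Show Overall Geometric Mean" option. This is only an illustration over the average latencies listed above, not the exact calculation OpenBenchmarking.org performs; the helper name `latency_speedup_geomean` is hypothetical. Because these are "Fewer Is Better" latencies, the speedup is baseline divided by candidate.

```python
from statistics import geometric_mean

# Average latency in ms per model, copied from the OpenVINO 2024.0 charts above.
# Tuple order: (AVX-512 Off, AVX-512 512b DP, AVX-512 256b DP)
openvino_latency_ms = {
    "Face Detection FP16-INT8": (544.38, 329.05, 463.88),
    "Age Gender Recognition Retail 0013 FP16-INT8": (0.57, 0.48, 0.55),
    "Person Detection FP16": (167.07, 83.32, 89.90),
    "Weld Porosity Detection FP16-INT8": (10.75, 6.60, 9.25),
    "Vehicle Detection FP16-INT8": (10.26, 5.61, 7.48),
    "Person Vehicle Bike Detection FP16": (15.32, 7.21, 7.90),
    "Machine Translation EN To DE FP16": (160.34, 57.47, 73.73),
    "Face Detection Retail FP16-INT8": (6.29, 3.87, 5.04),
    "Handwritten English Recognition FP16-INT8": (68.68, 26.63, 30.35),
    "Road Segmentation ADAS FP16-INT8": (24.30, 19.29, 22.36),
    "Person Re-Identification Retail FP16": (13.35, 4.49, 6.29),
    "Noise Suppression Poconet-Like FP16": (42.42, 10.75, 11.79),
}


def latency_speedup_geomean(latencies, candidate_index, baseline_index=0):
    """Geometric mean of baseline/candidate latency ratios across all models."""
    ratios = [row[baseline_index] / row[candidate_index] for row in latencies.values()]
    return geometric_mean(ratios)


geomean_512b = latency_speedup_geomean(openvino_latency_ms, candidate_index=1)
geomean_256b = latency_speedup_geomean(openvino_latency_ms, candidate_index=2)

print(f"AVX-512 512b DP vs Off: {geomean_512b:.2f}x lower latency (geometric mean)")
print(f"AVX-512 256b DP vs Off: {geomean_256b:.2f}x lower latency (geometric mean)")
```

A geometric mean is used rather than an arithmetic mean so that a single model with a very large ratio, such as Noise Suppression Poconet-Like FP16, does not dominate the summary.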

Mobile Neural Network

Mobile Neural Network 2.9.b11b7037d - Model: mobilenetV3 - ms, Fewer Is Better
  AVX-512 Off: 2.055 (SE +/- 0.012, N = 3; MIN: 1.91 / MAX: 2.32)
  AVX-512 512b DP: 1.872 (SE +/- 0.008, N = 3; MIN: 1.73 / MAX: 2.43)
  AVX-512 256b DP: 1.980 (SE +/- 0.033, N = 3; MIN: 1.83 / MAX: 2.34)

Mobile Neural Network 2.9.b11b7037d - Model: resnet-v2-50 - ms, Fewer Is Better
  AVX-512 Off: 10.053 (SE +/- 0.040, N = 3; MIN: 9.74 / MAX: 12.33)
  AVX-512 512b DP: 7.674 (SE +/- 0.113, N = 3; MIN: 7.38 / MAX: 8.78)
  AVX-512 256b DP: 8.851 (SE +/- 0.037, N = 3; MIN: 8.53 / MAX: 10.37)

  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

OpenFOAM

OpenFOAM is the leading free, open-source software for computational fluid dynamics (CFD). This test profile currently uses the drivaerFastback test case for analyzing automotive aerodynamics or alternatively the older motorBike input. Learn more via the OpenBenchmarking.org test page.

OpenFOAM 10 - Input: drivaerFastback, Small Mesh Size - Execution Time - Seconds, Fewer Is Better
  AVX-512 Off: 20.42
  AVX-512 512b DP: 20.25
  AVX-512 256b DP: 19.98

OpenFOAM 10 - Input: drivaerFastback, Medium Mesh Size - Execution Time - Seconds, Fewer Is Better
  AVX-512 Off: 161.95
  AVX-512 512b DP: 161.00
  AVX-512 256b DP: 163.51

  1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

Y-Cruncher

Y-Cruncher 0.8.5 - Pi Digits To Calculate: 1B - Seconds, Fewer Is Better
  AVX-512 Off: 8.020 (SE +/- 0.014, N = 5)
  AVX-512 512b DP: 7.789 (SE +/- 0.012, N = 5)
  AVX-512 256b DP: 7.751 (SE +/- 0.005, N = 5)

Y-Cruncher 0.8.5 - Pi Digits To Calculate: 5B - Seconds, Fewer Is Better
  AVX-512 Off: 31.25 (SE +/- 0.04, N = 3)
  AVX-512 512b DP: 24.07 (SE +/- 0.15, N = 3)
  AVX-512 256b DP: 24.60 (SE +/- 0.03, N = 3)

Y-Cruncher 0.8.5 - Pi Digits To Calculate: 10B - Seconds, Fewer Is Better
  AVX-512 Off: 61.26 (SE +/- 0.04, N = 3)
  AVX-512 512b DP: 45.86 (SE +/- 0.02, N = 3)
  AVX-512 256b DP: 47.27 (SE +/- 0.04, N = 3)
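For a time-based result like Y-Cruncher, the difference between configurations is often easier to read as a percentage of run time saved. The following minimal sketch uses the seconds copied from the three charts above; `percent_time_saved` is a hypothetical helper name.

```python
# Run times in seconds, copied from the Y-Cruncher 0.8.5 charts above (fewer is better).
ycruncher_seconds = {
    "1B": {"AVX-512 Off": 8.020, "AVX-512 512b DP": 7.789, "AVX-512 256b DP": 7.751},
    "5B": {"AVX-512 Off": 31.25, "AVX-512 512b DP": 24.07, "AVX-512 256b DP": 24.60},
    "10B": {"AVX-512 Off": 61.26, "AVX-512 512b DP": 45.86, "AVX-512 256b DP": 47.27},
}


def percent_time_saved(baseline_seconds, candidate_seconds):
    """Percentage of the baseline run time that the candidate run avoids."""
    return (baseline_seconds - candidate_seconds) / baseline_seconds * 100.0


time_saved = {
    digits: {
        run: percent_time_saved(runs["AVX-512 Off"], seconds)
        for run, seconds in runs.items()
        if run != "AVX-512 Off"
    }
    for digits, runs in ycruncher_seconds.items()
}

for digits, runs in time_saved.items():
    for run, saved in runs.items():
        print(f"{digits} digits | {run}: {saved:.1f}% less time than AVX-512 Off")
```

In these figures the saving grows with problem size: it is small at 1B digits and much larger at 5B and 10B.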

164 Results Shown

CPU Temperature Monitor:
  Phoronix Test Suite System Monitoring:
    Celsius
    Megahertz
    Watts
PyTorch:
  CPU - 256 - ResNet-50
  CPU - 512 - ResNet-50
PyTorch
PyTorch
miniBUDE
miniBUDE:
  OpenMP - BM1
  OpenMP - BM2
OpenVINO:
  Face Detection FP16-INT8 - CPU
  Age Gender Recognition Retail 0013 FP16-INT8 - CPU
  Person Detection FP16 - CPU
  Weld Porosity Detection FP16-INT8 - CPU
  Vehicle Detection FP16-INT8 - CPU
  Person Vehicle Bike Detection FP16 - CPU
  Machine Translation EN To DE FP16 - CPU
  Face Detection Retail FP16-INT8 - CPU
  Handwritten English Recognition FP16-INT8 - CPU
  Road Segmentation ADAS FP16-INT8 - CPU
  Person Re-Identification Retail FP16 - CPU
  Noise Suppression Poconet-Like FP16 - CPU
Embree
SVT-AV1
Embree:
  Pathtracer ISPC - Asian Dragon Obj
  Pathtracer ISPC - Crown
SVT-AV1
SVT-AV1:
  Preset 8 - Bosphorus 4K
  Preset 5 - Bosphorus 4K
  Preset 3 - Bosphorus 4K
SVT-AV1:
  Preset 8 - Bosphorus 4K
  Preset 5 - Bosphorus 4K
  Preset 3 - Bosphorus 4K
simdjson:
  PartialTweets
  LargeRand
  Kostya
  DistinctUserID
  TopTweet
miniBUDE:
  OpenMP - BM1
  OpenMP - BM2
libxsmm
libxsmm
Xmrig
Xmrig
TensorFlow:
  CPU - 256 - ResNet-50
  CPU - 512 - ResNet-50
ONNX Runtime:
  fcn-resnet101-11 - CPU - Standard
  super-resolution-10 - CPU - Standard
  bertsquad-12 - CPU - Standard
  GPT-2 - CPU - Standard
  ArcFace ResNet-100 - CPU - Standard
  ResNet101_DUC_HDC-12 - CPU - Standard
OpenVKL
OpenVKL
OSPRay:
  gravity_spheres_volume/dim_512/ao/real_time
  gravity_spheres_volume/dim_512/scivis/real_time
  gravity_spheres_volume/dim_512/pathtracer/real_time
OSPRay:
  gravity_spheres_volume/dim_512/ao/real_time
  gravity_spheres_volume/dim_512/scivis/real_time
  gravity_spheres_volume/dim_512/pathtracer/real_time
NAMD:
  CPU Peak Freq (Highest CPU Core Frequency) Monitor:
    Megahertz
SMHasher
GROMACS
GROMACS
NAMD:
  ATPase with 327,506 Atoms
  STMV with 1,066,628 Atoms
NAMD:
  ATPase with 327,506 Atoms
  STMV with 1,066,628 Atoms
Numpy Benchmark
OpenFOAM:
  CPU Temp Monitor:
    Celsius
OSPRay Studio:
  1 - 4K - 1 - Path Tracer - CPU
  1 - 4K - 16 - Path Tracer - CPU
  1 - 4K - 32 - Path Tracer - CPU
  2 - 4K - 1 - Path Tracer - CPU
  2 - 4K - 16 - Path Tracer - CPU
  2 - 4K - 32 - Path Tracer - CPU
  3 - 4K - 1 - Path Tracer - CPU
  3 - 4K - 16 - Path Tracer - CPU
  3 - 4K - 32 - Path Tracer - CPU
oneDNN:
  Convolution Batch Shapes Auto - CPU
  Deconvolution Batch shapes_3d - CPU
  IP Shapes 3D - CPU
  Recurrent Neural Network Training - CPU
OpenVINO:
  Face Detection FP16-INT8 - CPU
  Age Gender Recognition Retail 0013 FP16-INT8 - CPU
  Person Detection FP16 - CPU
  Weld Porosity Detection FP16-INT8 - CPU
  Vehicle Detection FP16-INT8 - CPU
  Person Vehicle Bike Detection FP16 - CPU
  Machine Translation EN To DE FP16 - CPU
  Face Detection Retail FP16-INT8 - CPU
  Handwritten English Recognition FP16-INT8 - CPU
  Road Segmentation ADAS FP16-INT8 - CPU
  Person Re-Identification Retail FP16 - CPU
  Noise Suppression Poconet-Like FP16 - CPU
Mobile Neural Network:
  mobilenetV3
  resnet-v2-50
OpenFOAM:
  drivaerFastback, Small Mesh Size - Execution Time
  drivaerFastback, Medium Mesh Size - Execution Time
Y-Cruncher:
  1B
  5B
  10B