AMD EPYC Turin 2025 New AVX-512 Benchmarks

AMD EPYC 9655P AVX-512 on/off benchmarks by Michael Larabel for a future article.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2501295-NE-AMDEPYCTU09

Run Management

EPYC Turin: AVX-512 Enabled - Date: January 20 - Test Duration: 5 Hours, 37 Minutes
EPYC Turin: AVX-512 Disabled - Date: January 20 - Test Duration: 6 Hours, 58 Minutes
Average Test Duration: 6 Hours, 18 Minutes


AMD EPYC Turin 2025 New AVX-512 Benchmarks (OpenBenchmarking.org, Phoronix Test Suite)

Processor: AMD EPYC 9655P 96-Core @ 2.60GHz (96 Cores / 192 Threads)
Motherboard: Supermicro Super Server H13SSL-N v1.01 (3.0 BIOS)
Chipset: AMD 1Ah
Memory: 12 x 64GB DDR5-6000MT/s Micron MTC40F2046S1RC64BDY QSFF
Disk: 3201GB Micron_7450_MTFDKCB3T2TFS
Graphics: ASPEED
Network: 2 x Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 24.10
Kernel: 6.13.0-rc4-phx-stock (x86_64)
Desktop: GNOME Shell 47.0
Display Server: X Server
Compiler: GCC 14.2.0
File-System: ext4
Screen Resolution: 1024x768

System Logs
- Transparent Huge Pages: madvise
- CXXFLAGS="-O3 -march=znver5 -mprefer-vector-width=512 -flto" CFLAGS="-O3 -march=znver5 -mprefer-vector-width=512 -flto"
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: acpi-cpufreq performance (Boost: Enabled)
- CPU Microcode: 0xb002116
- Python 3.12.7
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

EPYC Turin results summary

Test | AVX-512 Enabled | AVX-512 Disabled
pytorch: CPU - 256 - ResNet-50 | 50.52 | 43.33
pytorch: CPU - 512 - ResNet-50 | 50.84 | 43.44
minibude: OpenMP - BM1 (Billion Interactions/s) | 254.750 | 136.657
minibude: OpenMP - BM2 (Billion Interactions/s) | 286.491 | 146.644
openvino: Face Detection FP16-INT8 - CPU | 147.06 | 85.54
openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU | 162706.14 | 129762.91
openvino: Person Detection FP16 - CPU | 710.04 | 241.77
openvino: Weld Porosity Detection FP16 - CPU | 7611.17 | 2189.06
openvino: Person Vehicle Bike Detection FP16 - CPU | 6775.46 | 3132.50
openvino: Machine Translation EN To DE FP16 - CPU | 857.16 | 295.88
openvino: Face Detection Retail FP16 - CPU | 16580.18 | 6688.98
openvino: Handwritten English Recognition FP16 - CPU | 3488.23 | 1200.37
openvino: Road Segmentation ADAS FP16 - CPU | 2102.03 | 713.96
openvino: Person Re-Identification Retail FP16 - CPU | 10524.04 | 3605.75
openvino: Noise Suppression Poconet-Like FP16 - CPU | 6825.09 | 2516.89
embree: Pathtracer ISPC - Asian Dragon | 172.9627 | 159.2583
embree: Pathtracer ISPC - Asian Dragon Obj | 148.3243 | 138.2788
embree: Pathtracer ISPC - Crown | 136.4155 | 126.6317
svt-av1: Preset 13 - Bosphorus 4K | 416.846 | 412.131
svt-av1: Preset 8 - Bosphorus 4K | 184.791 | 171.289
svt-av1: Preset 5 - Bosphorus 4K | 56.477 | 51.735
svt-av1: Preset 3 - Bosphorus 4K | 15.649 | 14.521
minibude: OpenMP - BM1 (GFInst/s) | 6368.737 | 3416.427
minibude: OpenMP - BM2 (GFInst/s) | 7162.269 | 3666.094
mt-dgemm: Sustained Floating-Point Rate | 4225.432897 | 70.732517
libxsmm: 128 | 3422.0 | 3004.6
tensorflow: CPU - 256 - ResNet-50 | 212.82 | 152.47
tensorflow: CPU - 512 - ResNet-50 | 241.22 | 161.27
onnx: fcn-resnet101-11 - CPU - Standard | 9.56433 | 6.58603
onnx: super-resolution-10 - CPU - Standard | 215.162 | 170.949
onnx: bertsquad-12 - CPU - Standard | 22.4331 | 17.7590
onnx: GPT-2 - CPU - Standard | 196.348 | 182.810
onnx: ArcFace ResNet-100 - CPU - Standard | 48.7817 | 40.5548
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard | 64.0661 | 37.1520
onnx: T5 Encoder - CPU - Standard | 244.989 | 238.640
onnx: ResNet101_DUC_HDC-12 - CPU - Standard | 6.50795 | 4.30914
openvkl: vklBenchmarkCPU ISPC | 2817 | 2353
ospray: gravity_spheres_volume/dim_512/ao/real_time | 35.2849 | 24.0280
ospray: gravity_spheres_volume/dim_512/scivis/real_time | 34.6891 | 23.1678
ospray: gravity_spheres_volume/dim_512/pathtracer/real_time | 38.1975 | 31.5810
cpuminer-opt: scrypt | 2937.26 | 846.44
cpuminer-opt: Skeincoin | 1286043 | 747190
cpuminer-opt: LBC, LBRY Credits | 750993 | 245007
laghos: Sedov Blast Wave, ube_922_hex.mesh | 603.92 | 590.60
srsran: PDSCH Processor Benchmark, Throughput Total | 124852.3 | 110897.9
smhasher: FarmHash32 x86_64 AVX | 37638.62 | 36139.95
gromacs: MPI CPU - water_GMX50_bare | 17.416 | 14.125
numpy | 882.00 | 828.67
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512 | 107.84 | 95.03
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024 | 106.71 | 93.45
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048 | 107.34 | 95.47
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512 | 356.76 | 347.10
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512 | 109.25 | 96.15
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024 | 107.32 | 94.90
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048 | 108.74 | 95.27
openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU | 66.57 | 91.40
smhasher: FarmHash32 x86_64 AVX | 22.808 | 22.658
onnx: fcn-resnet101-11 - CPU - Standard | 104.8273 | 151.849
onnx: super-resolution-10 - CPU - Standard | 4.64750 | 5.85096
onnx: bertsquad-12 - CPU - Standard | 44.5813 | 56.3086
onnx: GPT-2 - CPU - Standard | 5.09108 | 5.46820
onnx: ArcFace ResNet-100 - CPU - Standard | 20.5030 | 24.6694
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard | 15.6086 | 26.9708
onnx: T5 Encoder - CPU - Standard | 4.08114 | 4.18972
onnx: ResNet101_DUC_HDC-12 - CPU - Standard | 153.707 | 235.705
ospray-studio: 1 - 4K - 1 - Path Tracer - CPU | 868 | 966
ospray-studio: 1 - 4K - 16 - Path Tracer - CPU | 13823 | 15382
ospray-studio: 1 - 4K - 32 - Path Tracer - CPU | 30958 | 34668
ospray-studio: 2 - 4K - 1 - Path Tracer - CPU | 873 | 974
ospray-studio: 2 - 4K - 16 - Path Tracer - CPU | 13901 | 15503
ospray-studio: 2 - 4K - 32 - Path Tracer - CPU | 31077 | 34480
ospray-studio: 3 - 4K - 1 - Path Tracer - CPU | 1024 | 1135
ospray-studio: 3 - 4K - 16 - Path Tracer - CPU | 16297 | 18098
ospray-studio: 3 - 4K - 32 - Path Tracer - CPU | 35955 | 39785
onednn: Deconvolution Batch shapes_1d - CPU | 6.72466 | 12.8601
onednn: Deconvolution Batch shapes_3d - CPU | 0.716979 | 1.35439
onednn: IP Shapes 1D - CPU | 0.533414 | 0.646211
onednn: IP Shapes 3D - CPU | 0.265927 | 3.07554
onednn: Recurrent Neural Network Training - CPU | 428.500 | 491.687
onednn: Recurrent Neural Network Inference - CPU | 278.310 | 307.050
openvino: Face Detection FP16-INT8 - CPU | 325.74 | 558.74
openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU | 0.31 | 0.53
openvino: Person Detection FP16 - CPU | 67.47 | 198.17
openvino: Weld Porosity Detection FP16 - CPU | 12.53 | 43.73
openvino: Person Vehicle Bike Detection FP16 - CPU | 7.03 | 15.26
openvino: Machine Translation EN To DE FP16 - CPU | 55.95 | 162.02
openvino: Face Detection Retail FP16 - CPU | 2.82 | 7.14
openvino: Handwritten English Recognition FP16 - CPU | 27.50 | 79.90
openvino: Road Segmentation ADAS FP16 - CPU | 22.78 | 67.12
openvino: Person Re-Identification Retail FP16 - CPU | 4.52 | 13.30
openvino: Noise Suppression Poconet-Like FP16 - CPU | 13.47 | 38.05
openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU - Time To First Token | 18.59 | 27.01
openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU - Time Per Output Token | 15.03 | 10.97
mnn: mobilenetV3 | 1.736 | 1.803
mnn: resnet-v2-50 | 7.409 | 9.950
mnn: SqueezeNetV1.0 | 3.121 | 3.327
y-cruncher: 5B | 21.847 | 33.085
y-cruncher: 10B | 44.053 | 68.129
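
To compare the two runs at a glance, the per-test ratio of the AVX-512 Enabled result to the AVX-512 Disabled result can be computed and then combined with a geometric mean. The following is a minimal Python sketch of that arithmetic, not the Phoronix Test Suite's own implementation. The handful of "More Is Better" results are copied from this result file, and the names `results`, `speedup`, and `geometric_mean` are illustrative.

```python
import math

# (AVX-512 Enabled, AVX-512 Disabled) pairs copied from this result file.
# Only "More Is Better" throughput results are used, so a ratio above 1.0
# means the AVX-512 Enabled run was faster.
results = {
    "pytorch: CPU - 256 - ResNet-50": (50.52, 43.33),
    "embree: Pathtracer ISPC - Crown": (136.4155, 126.6317),
    "svt-av1: Preset 8 - Bosphorus 4K": (184.791, 171.289),
    "tensorflow: CPU - 256 - ResNet-50": (212.82, 152.47),
    "openvkl: vklBenchmarkCPU ISPC": (2817, 2353),
}


def speedup(enabled, disabled):
    """Ratio of the Enabled result to the Disabled result."""
    return enabled / disabled


def geometric_mean(values):
    """Geometric mean via the mean of logarithms (all values must be > 0)."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


speedups = {name: speedup(*pair) for name, pair in results.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.3f}x")
print(f"Geometric mean of selected tests: {geometric_mean(speedups.values()):.3f}x")
```

For "Fewer Is Better" results such as latencies or run times, the ratio would need to be inverted before averaging.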

PyTorch

This is a benchmark of PyTorch making use of pytorch-benchmark [https://github.com/LukasHedegaard/pytorch-benchmark]. Learn more via the OpenBenchmarking.org test page.

PyTorch 2.2.1 - Device: CPU - Batch Size: 256 - Model: ResNet-50 (batches/sec, More Is Better)
AVX-512 Disabled: 43.33 (SE +/- 0.21, N = 3; MIN: 38.8 / MAX: 44.27)
AVX-512 Enabled: 50.52 (SE +/- 0.31, N = 3; MIN: 43.73 / MAX: 51.63)

PyTorch 2.2.1 - Device: CPU - Batch Size: 512 - Model: ResNet-50 (batches/sec, More Is Better)
AVX-512 Disabled: 43.44 (SE +/- 0.14, N = 3; MIN: 38.59 / MAX: 44.4)
AVX-512 Enabled: 50.84 (SE +/- 0.28, N = 3; MIN: 44.51 / MAX: 52.23)

miniBUDE

MiniBUDE is a mini application for the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking. Learn more via the OpenBenchmarking.org test page.

miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM1 (Billion Interactions/s, More Is Better)
AVX-512 Disabled: 136.66 (SE +/- 2.22, N = 15)
AVX-512 Enabled: 254.75 (SE +/- 5.97, N = 15)

miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM2 (Billion Interactions/s, More Is Better)
AVX-512 Disabled: 146.64 (SE +/- 0.09, N = 3)
AVX-512 Enabled: 286.49 (SE +/- 2.17, N = 3)

1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

OpenVINO

OpenVINO 2024.5 - Model: Face Detection FP16-INT8 - Device: CPU (FPS, More Is Better)
AVX-512 Disabled: 85.54 (SE +/- 0.04, N = 3)
AVX-512 Enabled: 147.06 (SE +/- 0.11, N = 3)

OpenVINO 2024.5 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (FPS, More Is Better)
AVX-512 Disabled: 129762.91 (SE +/- 170.89, N = 3)
AVX-512 Enabled: 162706.14 (SE +/- 180.45, N = 3)

OpenVINO 2024.5 - Model: Person Detection FP16 - Device: CPU (FPS, More Is Better)
AVX-512 Disabled: 241.77 (SE +/- 0.82, N = 3)
AVX-512 Enabled: 710.04 (SE +/- 0.91, N = 3)

OpenVINO 2024.5 - Model: Weld Porosity Detection FP16 - Device: CPU (FPS, More Is Better)
AVX-512 Disabled: 2189.06 (SE +/- 1.32, N = 3)
AVX-512 Enabled: 7611.17 (SE +/- 5.62, N = 3)

OpenVINO 2024.5 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (FPS, More Is Better)
AVX-512 Disabled: 3132.50 (SE +/- 2.95, N = 3)
AVX-512 Enabled: 6775.46 (SE +/- 17.52, N = 3)

OpenVINO 2024.5 - Model: Machine Translation EN To DE FP16 - Device: CPU (FPS, More Is Better)
AVX-512 Disabled: 295.88 (SE +/- 0.09, N = 3)
AVX-512 Enabled: 857.16 (SE +/- 0.97, N = 3)

OpenVINO 2024.5 - Model: Face Detection Retail FP16 - Device: CPU (FPS, More Is Better)
AVX-512 Disabled: 6688.98 (SE +/- 8.72, N = 3)
AVX-512 Enabled: 16580.18 (SE +/- 12.77, N = 3)

OpenVINO 2024.5 - Model: Handwritten English Recognition FP16 - Device: CPU (FPS, More Is Better)
AVX-512 Disabled: 1200.37 (SE +/- 0.28, N = 3)
AVX-512 Enabled: 3488.23 (SE +/- 4.77, N = 3)

OpenVINO 2024.5 - Model: Road Segmentation ADAS FP16 - Device: CPU (FPS, More Is Better)
AVX-512 Disabled: 713.96 (SE +/- 1.87, N = 3)
AVX-512 Enabled: 2102.03 (SE +/- 6.46, N = 3)

OpenVINO 2024.5 - Model: Person Re-Identification Retail FP16 - Device: CPU (FPS, More Is Better)
AVX-512 Disabled: 3605.75 (SE +/- 2.23, N = 3)
AVX-512 Enabled: 10524.04 (SE +/- 6.44, N = 3)

OpenVINO 2024.5 - Model: Noise Suppression Poconet-Like FP16 - Device: CPU (FPS, More Is Better)
AVX-512 Disabled: 2516.89 (SE +/- 6.13, N = 3)
AVX-512 Enabled: 6825.09 (SE +/- 13.37, N = 3)

Chart annotation on each OpenVINO result above: -march=znver5
1. (CXX) g++ options: -fPIC -O3 -flto -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

Embree

Intel Embree is a collection of high-performance ray-tracing kernels for execution on CPUs (and GPUs via SYCL) and supporting instruction sets such as SSE, AVX, AVX2, and AVX-512. Embree also supports making use of the Intel SPMD Program Compiler (ISPC). Learn more via the OpenBenchmarking.org test page.

Embree 4.3 - Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better)
AVX-512 Disabled: 159.26 (SE +/- 0.10, N = 7; MIN: 154.75 / MAX: 162.47)
AVX-512 Enabled: 172.96 (SE +/- 0.06, N = 8; MIN: 169.32 / MAX: 176.68)

Embree 4.3 - Binary: Pathtracer ISPC - Model: Asian Dragon Obj (Frames Per Second, More Is Better)
AVX-512 Disabled: 138.28 (SE +/- 0.18, N = 4; MIN: 134.48 / MAX: 141.24)
AVX-512 Enabled: 148.32 (SE +/- 0.11, N = 5; MIN: 144.93 / MAX: 151.79)

Embree 4.3 - Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better)
AVX-512 Disabled: 126.63 (SE +/- 0.04, N = 7; MIN: 123.57 / MAX: 130.86)
AVX-512 Enabled: 136.42 (SE +/- 0.09, N = 7; MIN: 132.5 / MAX: 141.56)

SVT-AV1

SVT-AV1 2.3 - Encoder Mode: Preset 13 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
AVX-512 Disabled: 412.13 (SE +/- 1.74, N = 6)
AVX-512 Enabled: 416.85 (SE +/- 2.58, N = 6)

SVT-AV1 2.3 - Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
AVX-512 Disabled: 171.29 (SE +/- 1.07, N = 4)
AVX-512 Enabled: 184.79 (SE +/- 1.02, N = 4)

SVT-AV1 2.3 - Encoder Mode: Preset 5 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
AVX-512 Disabled: 51.74 (SE +/- 0.29, N = 3)
AVX-512 Enabled: 56.48 (SE +/- 0.23, N = 3)

SVT-AV1 2.3 - Encoder Mode: Preset 3 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
AVX-512 Disabled: 14.52 (SE +/- 0.05, N = 3)
AVX-512 Enabled: 15.65 (SE +/- 0.03, N = 3)

1. (CXX) g++ options: -O3 -march=znver5 -flto -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
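
Because each result carries a standard error (SE), it is possible to sanity-check whether a small gap between the two runs is larger than the run-to-run noise. The Python sketch below applies a common rule of thumb: the difference should exceed `k` times the combined standard error, with `k = 2.0` as an assumed threshold. This is an illustration only and not a check the Phoronix Test Suite is known to perform. The means and SE values are the SVT-AV1 figures from this result file, and the function name `difference_exceeds_noise` is hypothetical.

```python
import math


def difference_exceeds_noise(mean_a, se_a, mean_b, se_b, k=2.0):
    """Return True when |mean_a - mean_b| is larger than k combined standard errors.

    The combined standard error of a difference of two independent means
    is sqrt(se_a**2 + se_b**2). The threshold k=2.0 is an assumed rule of thumb.
    """
    combined_se = math.hypot(se_a, se_b)
    return abs(mean_a - mean_b) > k * combined_se


# preset -> ((Enabled mean, Enabled SE), (Disabled mean, Disabled SE)), in frames per second
svt_av1 = {
    "Preset 13": ((416.85, 2.58), (412.13, 1.74)),
    "Preset 8": ((184.79, 1.02), (171.29, 1.07)),
    "Preset 5": ((56.48, 0.23), (51.74, 0.29)),
    "Preset 3": ((15.65, 0.03), (14.52, 0.05)),
}

verdicts = {
    preset: difference_exceeds_noise(en_mean, en_se, dis_mean, dis_se)
    for preset, ((en_mean, en_se), (dis_mean, dis_se)) in svt_av1.items()
}

for preset, clear in verdicts.items():
    print(f"{preset}: {'clear difference' if clear else 'within noise'}")
```

Under this assumed threshold, the Preset 13 gap falls within the noise of the two runs, while the gaps for the slower presets do not.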

miniBUDE

MiniBUDE is a mini application for the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking. Learn more via the OpenBenchmarking.org test page.

miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM1 (GFInst/s, More Is Better)
AVX-512 Disabled: 3416.43 (SE +/- 55.43, N = 15)
AVX-512 Enabled: 6368.74 (SE +/- 149.28, N = 15)

miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM2 (GFInst/s, More Is Better)
AVX-512 Disabled: 3666.09 (SE +/- 2.24, N = 3)
AVX-512 Enabled: 7162.27 (SE +/- 54.17, N = 3)

1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

ACES DGEMM

ACES DGEMM 1.0 - Sustained Floating-Point Rate (GFLOP/s, More Is Better)
AVX-512 Disabled: 70.73 (SE +/- 0.03, N = 3)
AVX-512 Enabled: 4225.43 (SE +/- 8.70, N = 4)

1. (CC) gcc options: -ffast-math -O3 -march=znver5 -flto -mavx2 -fopenmp -lopenblas

libxsmm

Libxsmm is an open-source library for specialized dense and sparse matrix operations and deep learning primitives. Libxsmm supports making use of Intel AMX, AVX-512, and other modern CPU instruction set capabilities. Learn more via the OpenBenchmarking.org test page.

libxsmm 2-1.17-3645 - M N K: 128 (GFLOPS/s, More Is Better)
AVX-512 Disabled: 3004.6 (SE +/- 10.56, N = 3)
AVX-512 Enabled: 3422.0 (SE +/- 5.19, N = 3)

1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note that the Phoronix Test Suite also provides pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if complementary metrics are desired. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.16.1 - Device: CPU - Batch Size: 256 - Model: ResNet-50 (images/sec, More Is Better)
AVX-512 Disabled: 152.47 (SE +/- 0.30, N = 3)
AVX-512 Enabled: 212.82 (SE +/- 0.23, N = 3)

TensorFlow 2.16.1 - Device: CPU - Batch Size: 512 - Model: ResNet-50 (images/sec, More Is Better)
AVX-512 Disabled: 161.27 (SE +/- 0.06, N = 3)
AVX-512 Enabled: 241.22 (SE +/- 0.01, N = 3)

ONNX Runtime

ONNX Runtime 1.19 - Model: fcn-resnet101-11 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better)
AVX-512 Disabled: 6.58603 (SE +/- 0.04636, N = 3)
AVX-512 Enabled: 9.56433 (SE +/- 0.13137, N = 15)

ONNX Runtime 1.19 - Model: super-resolution-10 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better)
AVX-512 Disabled: 170.95 (SE +/- 1.92, N = 3)
AVX-512 Enabled: 215.16 (SE +/- 0.59, N = 3)

ONNX Runtime 1.19 - Model: bertsquad-12 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better)
AVX-512 Disabled: 17.76 (SE +/- 0.06, N = 3)
AVX-512 Enabled: 22.43 (SE +/- 0.19, N = 3)

ONNX Runtime 1.19 - Model: GPT-2 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better)
AVX-512 Disabled: 182.81 (SE +/- 0.35, N = 3)
AVX-512 Enabled: 196.35 (SE +/- 0.28, N = 3)

ONNX Runtime 1.19 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better)
AVX-512 Disabled: 40.55 (SE +/- 0.36, N = 8)
AVX-512 Enabled: 48.78 (SE +/- 0.53, N = 3)

ONNX Runtime 1.19 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better)
AVX-512 Disabled: 37.15 (SE +/- 0.45, N = 15)
AVX-512 Enabled: 64.07 (SE +/- 0.50, N = 3)

ONNX Runtime 1.19 - Model: T5 Encoder - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better)
AVX-512 Disabled: 238.64 (SE +/- 0.75, N = 3)
AVX-512 Enabled: 244.99 (SE +/- 1.84, N = 3)

ONNX Runtime 1.19 - Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard (Inferences Per Second, More Is Better)
AVX-512 Disabled: 4.30914 (SE +/- 0.15591, N = 12)
AVX-512 Enabled: 6.50795 (SE +/- 0.06862, N = 4)

1. (CXX) g++ options: -O3 -march=native -march=znver5 -flto -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

OpenVKL

OpenVKL is the Intel Open Volume Kernel Library, which offers high-performance volume computation kernels and is part of the Intel oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OpenVKL 2.0.0 - Benchmark: vklBenchmarkCPU ISPC (Items / Sec, More Is Better)
AVX-512 Disabled: 2353 (SE +/- 0.58, N = 3; MIN: 179 / MAX: 30230)
AVX-512 Enabled: 2817 (SE +/- 0.58, N = 3; MIN: 217 / MAX: 36244)

OSPRay

OSPRay 3.2 - Benchmark: gravity_spheres_volume/dim_512/ao/real_time (Items Per Second, More Is Better)
AVX-512 Disabled: 24.03 (SE +/- 0.01, N = 3)
AVX-512 Enabled: 35.28 (SE +/- 0.02, N = 3)

OSPRay 3.2 - Benchmark: gravity_spheres_volume/dim_512/scivis/real_time (Items Per Second, More Is Better)
AVX-512 Disabled: 23.17 (SE +/- 0.01, N = 3)
AVX-512 Enabled: 34.69 (SE +/- 0.01, N = 3)

OSPRay 3.2 - Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time (Items Per Second, More Is Better)
AVX-512 Disabled: 31.58 (SE +/- 0.04, N = 3)
AVX-512 Enabled: 38.20 (SE +/- 0.02, N = 3)

Cpuminer-Opt

Cpuminer-Opt 24.3 - Algorithm: scrypt (kH/s, More Is Better)
AVX-512 Disabled: 846.44 (SE +/- 0.72, N = 3)
AVX-512 Enabled: 2937.26 (SE +/- 3.40, N = 3)

Cpuminer-Opt 24.3 - Algorithm: Skeincoin (kH/s, More Is Better)
AVX-512 Disabled: 747190 (SE +/- 2345.30, N = 3)
AVX-512 Enabled: 1286043 (SE +/- 6599.64, N = 3)

Cpuminer-Opt 24.3 - Algorithm: LBC, LBRY Credits (kH/s, More Is Better)
AVX-512 Disabled: 245007 (SE +/- 2841.66, N = 3)
AVX-512 Enabled: 750993 (SE +/- 622.53, N = 3)

Chart annotation on each Cpuminer-Opt result above: -mno-avx512f
1. (CXX) g++ options: -O3 -march=znver5 -flto -lcurl -lz -lpthread -lgmp

Laghos

Laghos 3.1 - Test: Sedov Blast Wave, ube_922_hex.mesh (Major Kernels Total Rate, More Is Better)
AVX-512 Disabled: 590.60 (SE +/- 3.13, N = 3)
AVX-512 Enabled: 603.92 (SE +/- 2.72, N = 3)

Chart annotation: -mno-avx512f
1. (CXX) g++ options: -O3 -march=znver5 -flto -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi

srsRAN Project

srsRAN Project 24.10 - Test: PDSCH Processor Benchmark, Throughput Total (Mbps, More Is Better)
AVX-512 Disabled: 110897.9 (SE +/- 1008.70, N = 3)
AVX-512 Enabled: 124852.3 (SE +/- 1542.84, N = 3)

1. (CXX) g++ options: -O3 -march=native -mtune=generic -fno-trapping-math -fno-math-errno -ldl

GROMACS

GROMACS 2024 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better)
AVX-512 Disabled: Min: 2600 / Avg: 3308.87 / Max: 4572
AVX-512 Enabled: Min: 2600 / Avg: 3374.13 / Max: 4568

OSPRay Studio

OSPRay Studio 1.0 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better), one pair of lines per monitor chart:

AVX-512 Enabled: Min: 2600 / Avg: 3207.7 / Max: 4551
AVX-512 Disabled: Min: 2600 / Avg: 3210.72 / Max: 4566

AVX-512 Enabled: Min: 2600 / Avg: 3129.96 / Max: 4569
AVX-512 Disabled: Min: 2600 / Avg: 3221.06 / Max: 4547

AVX-512 Disabled: Min: 2600 / Avg: 3292.85 / Max: 4565
AVX-512 Enabled: Min: 2600 / Avg: 3314.8 / Max: 4569

AVX-512 Enabled: Min: 2600 / Avg: 3233.67 / Max: 4563
AVX-512 Disabled: Min: 2600 / Avg: 3247.36 / Max: 4575

AVX-512 Enabled: Min: 2600 / Avg: 3138.26 / Max: 4559
AVX-512 Disabled: Min: 2600 / Avg: 3230.91 / Max: 4572

AVX-512 Disabled: Min: 2600 / Avg: 3313.46 / Max: 4563
AVX-512 Enabled: Min: 2600 / Avg: 3343.68 / Max: 4571

AVX-512 Enabled: Min: 2600 / Avg: 3218.88 / Max: 4545
AVX-512 Disabled: Min: 2600 / Avg: 3232.49 / Max: 4565

AVX-512 Disabled: Min: 2600 / Avg: 3195.19 / Max: 4571
AVX-512 Enabled: Min: 2600 / Avg: 3212.44 / Max: 4562

AVX-512 Disabled: Min: 2600 / Avg: 3282.21 / Max: 4552
AVX-512 Enabled: Min: 2600 / Avg: 3291.96 / Max: 4561

Y-Cruncher

Y-Cruncher 0.8.5 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better), one pair of lines per monitor chart:

AVX-512 Enabled: Min: 2075 / Avg: 3308.94 / Max: 4567
AVX-512 Disabled: Min: 2387 / Avg: 3380.84 / Max: 4566

AVX-512 Enabled: Min: 2056 / Avg: 3198.93 / Max: 4561
AVX-512 Disabled: Min: 2378 / Avg: 3385.41 / Max: 4566

SMHasher

SMHasher 2022-08-22 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better)
AVX-512 Disabled: Min: 2600 / Avg: 3893.07 / Max: 4550
AVX-512 Enabled: Min: 2600 / Avg: 3897.49 / Max: 4573

oneDNN

oneDNN 3.6 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better), one pair of lines per monitor chart:

AVX-512 Enabled: Min: 2583 / Avg: 3215.51 / Max: 4567
AVX-512 Disabled: Min: 2600 / Avg: 3408.08 / Max: 4620

AVX-512 Enabled: Min: 2448 / Avg: 2613.84 / Max: 4541
AVX-512 Disabled: Min: 2600 / Avg: 3065.11 / Max: 4539

AVX-512 Enabled: Min: 2374 / Avg: 3087.78 / Max: 4541
AVX-512 Disabled: Min: 2557 / Avg: 3100.03 / Max: 4539

AVX-512 Enabled: Min: 2539 / Avg: 3522.54 / Max: 4553
AVX-512 Disabled: Min: 2600 / Avg: 3798.58 / Max: 4549

AVX-512 Disabled: Min: 2299 / Avg: 3036.42 / Max: 4549
AVX-512 Enabled: Min: 1783 / Avg: 3106.58 / Max: 4569

AVX-512 Enabled: Min: 1575 / Avg: 3008.31 / Max: 4572
AVX-512 Disabled: Min: 1753 / Avg: 3103.6 / Max: 4545

OpenVINO

OpenVINO 2024.5 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better), one pair of lines per monitor chart:

AVX-512 Enabled: Min: 2239 / Avg: 2473.08 / Max: 4567
AVX-512 Disabled: Min: 2574 / Avg: 2722.94 / Max: 4564

AVX-512 Disabled: Min: 2600 / Avg: 2860.53 / Max: 4567
AVX-512 Enabled: Min: 2600 / Avg: 2876.42 / Max: 4546

AVX-512 Enabled: Min: 2394 / Avg: 2594.63 / Max: 4554
AVX-512 Disabled: Min: 2595 / Avg: 2827.98 / Max: 4555

AVX-512 Enabled: Min: 2270 / Avg: 2390.53 / Max: 4551
AVX-512 Disabled: Min: 2470 / Avg: 2551.48 / Max: 4563

AVX-512 Enabled: Min: 2216 / Avg: 2363.03 / Max: 4569
AVX-512 Disabled: Min: 2600 / Avg: 2695.03 / Max: 4545

AVX-512 Enabled: Min: 2411 / Avg: 2576.28 / Max: 4563
AVX-512 Disabled: Min: 2480 / Avg: 2633.44 / Max: 4568

AVX-512 Enabled: Min: 2269 / Avg: 2338.6 / Max: 4541
AVX-512 Disabled: Min: 2469 / Avg: 2591.17 / Max: 4530

AVX-512 Enabled: Min: 2434 / Avg: 2573.7 / Max: 4562
AVX-512 Disabled: Min: 2447 / Avg: 2590.5 / Max: 4548

AVX-512 Enabled: Min: 2323 / Avg: 2457.52 / Max: 4559
AVX-512 Disabled: Min: 2595 / Avg: 2730.19 / Max: 4576

AVX-512 Enabled: Min: 2362 / Avg: 2549.18 / Max: 4548
AVX-512 Disabled: Min: 2522 / Avg: 2627.11 / Max: 4561

AVX-512 Enabled: Min: 2528 / Avg: 2682.43 / Max: 4556
AVX-512 Disabled: Min: 2600 / Avg: 3003.36 / Max: 4566

OpenVINO GenAI

OpenVINO GenAI 2024.5 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better)
AVX-512 Enabled: Min: 2600 / Avg: 3645.71 / Max: 4548
AVX-512 Disabled: Min: 2600 / Avg: 3933.96 / Max: 4565

Llama.cpp

Llama.cpp b4397 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better), one pair of lines per chart:

AVX-512 Enabled: Min: 2600 / Avg: 3527.61 / Max: 4565
AVX-512 Disabled: Min: 2600 / Avg: 3577.05 / Max: 4540

AVX-512 Enabled: Min: 2600 / Avg: 3527.95 / Max: 4580
AVX-512 Disabled: Min: 2600 / Avg: 3623.36 / Max: 4547

AVX-512 Enabled: Min: 2600 / Avg: 3527.95 / Max: 4566
AVX-512 Disabled: Min: 2600 / Avg: 3613.53 / Max: 4521

AVX-512 Enabled: Min: 2600 / Avg: 3430.29 / Max: 4544
AVX-512 Disabled: Min: 2600 / Avg: 3454.73 / Max: 4569

AVX-512 Enabled: Min: 2600 / Avg: 3512.12 / Max: 4558
AVX-512 Disabled: Min: 2600 / Avg: 3594.14 / Max: 4556

AVX-512 Enabled: Min: 2600 / Avg: 3535.35 / Max: 4545
AVX-512 Disabled: Min: 2600 / Avg: 3628.57 / Max: 4544

AVX-512 Enabled: Min: 2600 / Avg: 3524.34 / Max: 4584
AVX-512 Disabled: Min: 2600 / Avg: 3605.26 / Max: 4568

Numpy Benchmark

Numpy Benchmark - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better)
AVX-512 Disabled: Min: 2600 / Avg: 4473.89 / Max: 4563
AVX-512 Enabled: Min: 2600 / Avg: 4481.19 / Max: 4567

SMHasher

SMHasher is a hash function tester supporting various algorithms and able to make use of AVX and other modern CPU instruction set extensions. Learn more via the OpenBenchmarking.org test page.

SMHasher 2022-08-22 - Hash: FarmHash32 x86_64 AVX (MiB/sec, More Is Better)
AVX-512 Disabled: 36139.95 (SE +/- 21.63, N = 6)
AVX-512 Enabled: 37638.62 (SE +/- 10.36, N = 6)
1. (CXX) g++ options: -O3 -march=znver5 -flto -march=native -flto=auto -fno-fat-lto-objects

GROMACS

GROMACS (GROningen MAchine for Chemical Simulations) is a molecular dynamics package, tested here with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

GROMACS 2024 - Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better)
AVX-512 Disabled: 14.13 (SE +/- 0.04, N = 3) -mno-avx512f
AVX-512 Enabled: 17.42 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -O3 -march=znver5 -flto -lm

Numpy Benchmark

This test measures general NumPy performance. Learn more via the OpenBenchmarking.org test page.

Numpy Benchmark (Score, More Is Better)
AVX-512 Disabled: 828.67 (SE +/- 0.83, N = 3)
AVX-512 Enabled: 882.00 (SE +/- 1.81, N = 3)
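As a rough way to judge whether a gap like the Numpy Benchmark one exceeds run-to-run noise, the two reported standard errors can be combined in quadrature and compared against the difference in scores. The sketch below assumes the two runs are independent. The helper name `gap_in_standard_errors` is hypothetical, and this check is an illustration only, not something the Phoronix Test Suite reports.

```python
import math


def gap_in_standard_errors(mean_a, se_a, mean_b, se_b):
    """Return the gap between two means in units of their combined standard error.

    Assumes the two runs are independent, so the errors add in quadrature.
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_b - mean_a) / combined_se


# Numpy Benchmark scores and standard errors as reported above (N = 3 each).
disabled_score, disabled_se = 828.67, 0.83
enabled_score, enabled_se = 882.00, 1.81

gap = gap_in_standard_errors(disabled_score, disabled_se, enabled_score, enabled_se)
print(f"Gap is {gap:.1f} combined standard errors")
```

A gap of many combined standard errors suggests the difference is unlikely to be noise, although with N = 3 the standard errors themselves are only coarse estimates.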

Llama.cpp

Llama.cpp b4397 - Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512 (Tokens Per Second, More Is Better)
AVX-512 Disabled: 95.03 (SE +/- 0.94, N = 3) -mno-avx512f
AVX-512 Enabled: 107.84 (SE +/- 0.76, N = 3)

Llama.cpp b4397 - Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024 (Tokens Per Second, More Is Better)
AVX-512 Disabled: 93.45 (SE +/- 0.60, N = 3) -mno-avx512f
AVX-512 Enabled: 106.71 (SE +/- 0.46, N = 3)

Llama.cpp b4397 - Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048 (Tokens Per Second, More Is Better)
AVX-512 Disabled: 95.47 (SE +/- 0.54, N = 3) -mno-avx512f
AVX-512 Enabled: 107.34 (SE +/- 0.56, N = 3)

Llama.cpp b4397 - Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512 (Tokens Per Second, More Is Better)
AVX-512 Disabled: 347.10 (SE +/- 2.08, N = 5) -mno-avx512f
AVX-512 Enabled: 356.76 (SE +/- 2.40, N = 5)

Llama.cpp b4397 - Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512 (Tokens Per Second, More Is Better)
AVX-512 Disabled: 96.15 (SE +/- 1.01, N = 4) -mno-avx512f
AVX-512 Enabled: 109.25 (SE +/- 1.17, N = 5)

Llama.cpp b4397 - Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024 (Tokens Per Second, More Is Better)
AVX-512 Disabled: 94.90 (SE +/- 0.77, N = 3) -mno-avx512f
AVX-512 Enabled: 107.32 (SE +/- 1.15, N = 3)

Llama.cpp b4397 - Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048 (Tokens Per Second, More Is Better)
AVX-512 Disabled: 95.27 (SE +/- 0.65, N = 3) -mno-avx512f
AVX-512 Enabled: 108.74 (SE +/- 0.94, N = 3)

1. (CXX) g++ options: -O3 -march=znver5 -flto
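The Llama.cpp prompt-processing figures above can be condensed into per-test percentage uplifts and a single geometric mean, in the spirit of the result viewer's "Show Overall Geometric Mean" option. This is a minimal sketch using only the seven result pairs listed above. The dictionary labels and the `uplift_percent` helper are hypothetical names chosen for illustration.

```python
from statistics import geometric_mean

# (AVX-512 disabled, AVX-512 enabled) tokens per second, copied from the results above.
llama_cpp_prompt_processing = {
    "Llama-3.1-Tulu-3-8B-Q8_0 / PP 512": (95.03, 107.84),
    "Llama-3.1-Tulu-3-8B-Q8_0 / PP 1024": (93.45, 106.71),
    "Llama-3.1-Tulu-3-8B-Q8_0 / PP 2048": (95.47, 107.34),
    "granite-3.0-3b-a800m-instruct-Q8_0 / PP 512": (347.10, 356.76),
    "Mistral-7B-Instruct-v0.3-Q8_0 / PP 512": (96.15, 109.25),
    "Mistral-7B-Instruct-v0.3-Q8_0 / PP 1024": (94.90, 107.32),
    "Mistral-7B-Instruct-v0.3-Q8_0 / PP 2048": (95.27, 108.74),
}


def uplift_percent(disabled, enabled):
    """Percent gain of the enabled run over the disabled run (more is better)."""
    return (enabled / disabled - 1.0) * 100.0


for name, (off, on) in llama_cpp_prompt_processing.items():
    print(f"{name}: {uplift_percent(off, on):+.1f}%")

# Geometric mean of the enabled/disabled ratios, expressed as a percentage uplift.
ratios = [on / off for off, on in llama_cpp_prompt_processing.values()]
geomean_uplift = (geometric_mean(ratios) - 1.0) * 100.0
print(f"Geometric mean uplift: {geomean_uplift:+.1f}%")
```

The geometric mean is used rather than the arithmetic mean because the inputs are ratios, so one unusually large or small result does not dominate the summary.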

OpenVINO GenAI

OpenVINO GenAI 2024.5 - Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU (tokens/s, More Is Better)
AVX-512 Enabled: 66.57 (SE +/- 0.94, N = 3)
AVX-512 Disabled: 91.40 (SE +/- 1.17, N = 15)

OSPRay Studio

OSPRay Studio 1.0 - CPU Temperature Monitor (Celsius, Fewer Is Better), one pair of lines per chart:

AVX-512 Enabled: Min: 42.75 / Avg: 56.59 / Max: 66.88
AVX-512 Disabled: Min: 39.13 / Avg: 53.19 / Max: 62.75

AVX-512 Enabled: Min: 44.25 / Avg: 57.72 / Max: 69.13
AVX-512 Disabled: Min: 41.13 / Avg: 53.68 / Max: 64

AVX-512 Enabled: Min: 45.25 / Avg: 57.63 / Max: 68.25
AVX-512 Disabled: Min: 40.75 / Avg: 52.93 / Max: 60

AVX-512 Enabled: Min: 43.38 / Avg: 56.47 / Max: 66.13
AVX-512 Disabled: Min: 40.25 / Avg: 53.39 / Max: 62.63

AVX-512 Enabled: Min: 44.13 / Avg: 57.23 / Max: 68.38
AVX-512 Disabled: Min: 41.25 / Avg: 53.71 / Max: 62.75

AVX-512 Enabled: Min: 45 / Avg: 57.53 / Max: 68
AVX-512 Disabled: Min: 41.38 / Avg: 53.67 / Max: 62.63

AVX-512 Enabled: Min: 43 / Avg: 56.47 / Max: 65.88
AVX-512 Disabled: Min: 40.38 / Avg: 53.32 / Max: 62.25

AVX-512 Enabled: Min: 44 / Avg: 56.51 / Max: 66.25
AVX-512 Disabled: Min: 41.38 / Avg: 53.9 / Max: 63.38

AVX-512 Enabled: Min: 43.63 / Avg: 56.16 / Max: 64.88
AVX-512 Disabled: Min: 41.5 / Avg: 53.34 / Max: 61.5

Y-Cruncher

Y-Cruncher 0.8.5 - CPU Temperature Monitor (Celsius, Fewer Is Better), one pair of lines per chart:

AVX-512 Disabled: Min: 40.88 / Avg: 52.98 / Max: 57.13
AVX-512 Enabled: Min: 42.63 / Avg: 51.7 / Max: 57.13

AVX-512 Disabled: Min: 40.38 / Avg: 54.9 / Max: 61.38
AVX-512 Enabled: Min: 41.75 / Avg: 53.32 / Max: 61.5

oneDNN

oneDNN 3.6 - CPU Temperature Monitor (Celsius, Fewer Is Better), one pair of lines per chart:

AVX-512 Disabled: Min: 40.13 / Avg: 43.41 / Max: 46
AVX-512 Enabled: Min: 33.38 / Avg: 40.12 / Max: 42.88

AVX-512 Disabled: Min: 35.13 / Avg: 44.03 / Max: 47.38
AVX-512 Enabled: Min: 34.13 / Avg: 38.62 / Max: 40.38

AVX-512 Disabled: Min: 34 / Avg: 39.16 / Max: 42.25
AVX-512 Enabled: Min: 32.5 / Avg: 37.99 / Max: 40.25

AVX-512 Disabled: Min: 32.38 / Avg: 43.48 / Max: 49.88
AVX-512 Enabled: Min: 32.88 / Avg: 40.62 / Max: 44.5

AVX-512 Enabled: Min: 34.25 / Avg: 45.28 / Max: 47.5
AVX-512 Disabled: Min: 33.38 / Avg: 43.76 / Max: 46.88

AVX-512 Enabled: Min: 36.5 / Avg: 44.46 / Max: 46.38
AVX-512 Disabled: Min: 36.5 / Avg: 43.53 / Max: 46.38

OpenVINO

OpenVINO 2024.5 - CPU Temperature Monitor (Celsius, Fewer Is Better), one pair of lines per chart:

AVX-512 Enabled: Min: 42.5 / Avg: 52.78 / Max: 54.38
AVX-512 Disabled: Min: 40.38 / Avg: 52.66 / Max: 55.88

AVX-512 Disabled: Min: 41.75 / Avg: 49.3 / Max: 51
AVX-512 Enabled: Min: 42.75 / Avg: 48.39 / Max: 50.5

AVX-512 Enabled: Min: 40.5 / Avg: 54.34 / Max: 57.13
AVX-512 Disabled: Min: 40 / Avg: 53.07 / Max: 55.25

AVX-512 Enabled: Min: 43.88 / Avg: 55.64 / Max: 57.25
AVX-512 Disabled: Min: 42.25 / Avg: 52.62 / Max: 53.88

AVX-512 Disabled: Min: 41.75 / Avg: 51.96 / Max: 53.5
AVX-512 Enabled: Min: 44.38 / Avg: 50.87 / Max: 51.63

AVX-512 Enabled: Min: 41.38 / Avg: 55.24 / Max: 57.63
AVX-512 Disabled: Min: 41.75 / Avg: 53.37 / Max: 55.38

AVX-512 Enabled: Min: 44.38 / Avg: 54.39 / Max: 55.75
AVX-512 Disabled: Min: 42.38 / Avg: 53.28 / Max: 54.88

AVX-512 Enabled: Min: 43.75 / Avg: 56.13 / Max: 58.25
AVX-512 Disabled: Min: 42.5 / Avg: 53.6 / Max: 55.5

AVX-512 Enabled: Min: 45.25 / Avg: 54.03 / Max: 55.38
AVX-512 Disabled: Min: 42.75 / Avg: 49.82 / Max: 50.88

AVX-512 Enabled: Min: 44.25 / Avg: 56.07 / Max: 57.63
AVX-512 Disabled: Min: 40.63 / Avg: 53.04 / Max: 54.88

AVX-512 Enabled: Min: 44.63 / Avg: 51.63 / Max: 52.75
AVX-512 Disabled: Min: 42.5 / Avg: 48.86 / Max: 49.75

OSPRay Studio

Intel OSPRay Studio is an open-source, interactive visualization and ray-tracing software package. OSPRay Studio makes use of Intel OSPRay, a portable ray-tracing engine for high-performance, high-fidelity visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OSPRay Studio 1.0 - Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 966 (SE +/- 0.33, N = 3)
AVX-512 Enabled: 868 (SE +/- 0.33, N = 3)

OSPRay Studio 1.0 - Camera: 1 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 15382 (SE +/- 6.36, N = 3)
AVX-512 Enabled: 13823 (SE +/- 15.50, N = 3)

OSPRay Studio 1.0 - Camera: 1 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 34668 (SE +/- 88.93, N = 3)
AVX-512 Enabled: 30958 (SE +/- 80.19, N = 3)

OSPRay Studio 1.0 - Camera: 2 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 974 (SE +/- 0.67, N = 3)
AVX-512 Enabled: 873 (SE +/- 0.67, N = 3)

OSPRay Studio 1.0 - Camera: 2 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 15503 (SE +/- 17.35, N = 3)
AVX-512 Enabled: 13901 (SE +/- 18.02, N = 3)

OSPRay Studio 1.0 - Camera: 2 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 34480 (SE +/- 52.20, N = 3)
AVX-512 Enabled: 31077 (SE +/- 30.12, N = 3)

OSPRay Studio 1.0 - Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - Acceleration: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 1135 (SE +/- 0.33, N = 3)
AVX-512 Enabled: 1024 (SE +/- 1.15, N = 3)

OSPRay Studio 1.0 - Camera: 3 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer - Acceleration: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 18098 (SE +/- 34.35, N = 3)
AVX-512 Enabled: 16297 (SE +/- 1.45, N = 3)

OSPRay Studio 1.0 - Camera: 3 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer - Acceleration: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 39785 (SE +/- 59.00, N = 3)
AVX-512 Enabled: 35955 (SE +/- 50.45, N = 3)

oneDNN

oneDNN 3.6 - Harness: Deconvolution Batch shapes_1d - Engine: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 12.86010 (SE +/- 0.00410, N = 3) -mno-avx512f - MIN: 9.8
AVX-512 Enabled: 6.72466 (SE +/- 0.01312, N = 3) MIN: 4.2

oneDNN 3.6 - Harness: Deconvolution Batch shapes_3d - Engine: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 1.354390 (SE +/- 0.001116, N = 9) -mno-avx512f - MIN: 1.31
AVX-512 Enabled: 0.716979 (SE +/- 0.001216, N = 9) MIN: 0.62

oneDNN 3.6 - Harness: IP Shapes 1D - Engine: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 0.646211 (SE +/- 0.000257, N = 4) -mno-avx512f - MIN: 0.59
AVX-512 Enabled: 0.533414 (SE +/- 0.001169, N = 4) MIN: 0.49

oneDNN 3.6 - Harness: IP Shapes 3D - Engine: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 3.075540 (SE +/- 0.002544, N = 5) -mno-avx512f - MIN: 3.02
AVX-512 Enabled: 0.265927 (SE +/- 0.000384, N = 5)

oneDNN 3.6 - Harness: Recurrent Neural Network Training - Engine: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 491.69 (SE +/- 0.56, N = 3) -mno-avx512f - MIN: 486.04
AVX-512 Enabled: 428.50 (SE +/- 0.35, N = 3) MIN: 421.37

oneDNN 3.6 - Harness: Recurrent Neural Network Inference - Engine: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 307.05 (SE +/- 0.47, N = 3) -mno-avx512f - MIN: 301.26
AVX-512 Enabled: 278.31 (SE +/- 0.34, N = 3) MIN: 271.12

1. (CXX) g++ options: -O3 -march=native -march=znver5 -flto -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

OpenVINO

OpenVINO 2024.5 - Model: Face Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 558.74 (SE +/- 0.19, N = 3) MIN: 273.34 / MAX: 587.28
AVX-512 Enabled: 325.74 (SE +/- 0.27, N = 3) -march=znver5 - MIN: 255.71 / MAX: 363.13

OpenVINO 2024.5 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 0.53 (SE +/- 0.00, N = 3) MIN: 0.18 / MAX: 13.44
AVX-512 Enabled: 0.31 (SE +/- 0.00, N = 3) -march=znver5 - MIN: 0.12 / MAX: 25.73

OpenVINO 2024.5 - Model: Person Detection FP16 - Device: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 198.17 (SE +/- 0.68, N = 3) MIN: 94.52 / MAX: 339.84
AVX-512 Enabled: 67.47 (SE +/- 0.09, N = 3) -march=znver5 - MIN: 29.95 / MAX: 185.48

OpenVINO 2024.5 - Model: Weld Porosity Detection FP16 - Device: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 43.73 (SE +/- 0.03, N = 3) MIN: 23.02 / MAX: 62.81
AVX-512 Enabled: 12.53 (SE +/- 0.01, N = 3) -march=znver5 - MIN: 5.9 / MAX: 31.1

OpenVINO 2024.5 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 15.26 (SE +/- 0.01, N = 3) MIN: 7.72 / MAX: 48.83
AVX-512 Enabled: 7.03 (SE +/- 0.02, N = 3) -march=znver5 - MIN: 3.43 / MAX: 24.51

OpenVINO 2024.5 - Model: Machine Translation EN To DE FP16 - Device: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 162.02 (SE +/- 0.05, N = 3) MIN: 75.62 / MAX: 264.14
AVX-512 Enabled: 55.95 (SE +/- 0.07, N = 3) -march=znver5 - MIN: 26.76 / MAX: 125.18

OpenVINO 2024.5 - Model: Face Detection Retail FP16 - Device: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 7.14 (SE +/- 0.01, N = 3) MIN: 2.91 / MAX: 32.04
AVX-512 Enabled: 2.82 (SE +/- 0.00, N = 3) -march=znver5 - MIN: 1.14 / MAX: 18.12

OpenVINO 2024.5 - Model: Handwritten English Recognition FP16 - Device: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 79.90 (SE +/- 0.02, N = 3) MIN: 35.69 / MAX: 151.02
AVX-512 Enabled: 27.50 (SE +/- 0.04, N = 3) -march=znver5 - MIN: 13.39 / MAX: 61.16

OpenVINO 2024.5 - Model: Road Segmentation ADAS FP16 - Device: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 67.12 (SE +/- 0.18, N = 3) MIN: 17.27 / MAX: 133.16
AVX-512 Enabled: 22.78 (SE +/- 0.07, N = 3) -march=znver5 - MIN: 11.57 / MAX: 67.03

OpenVINO 2024.5 - Model: Person Re-Identification Retail FP16 - Device: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 13.30 (SE +/- 0.01, N = 3) MIN: 7.56 / MAX: 31.8
AVX-512 Enabled: 4.52 (SE +/- 0.00, N = 3) -march=znver5 - MIN: 2.34 / MAX: 21.71

OpenVINO 2024.5 - Model: Noise Suppression Poconet-Like FP16 - Device: CPU (ms, Fewer Is Better)
AVX-512 Disabled: 38.05 (SE +/- 0.09, N = 3) MIN: 16.64 / MAX: 61.98
AVX-512 Enabled: 13.47 (SE +/- 0.03, N = 3) -march=znver5 - MIN: 7.72 / MAX: 51.08

1. (CXX) g++ options: -fPIC -O3 -flto -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

Mobile Neural Network

Mobile Neural Network 3.0 - Model: mobilenetV3 (ms, Fewer Is Better)
AVX-512 Disabled: 1.803 (SE +/- 0.010, N = 3) MIN: 1.68 / MAX: 2.88
AVX-512 Enabled: 1.736 (SE +/- 0.010, N = 3) -march=znver5 - MIN: 1.59 / MAX: 2.21

Mobile Neural Network 3.0 - Model: resnet-v2-50 (ms, Fewer Is Better)
AVX-512 Disabled: 9.950 (SE +/- 0.086, N = 3) MIN: 9.4 / MAX: 12.2
AVX-512 Enabled: 7.409 (SE +/- 0.117, N = 3) -march=znver5 - MIN: 7.08 / MAX: 9.54

Mobile Neural Network 3.0 - Model: SqueezeNetV1.0 (ms, Fewer Is Better)
AVX-512 Disabled: 3.327 (SE +/- 0.030, N = 3) MIN: 3.21 / MAX: 5.6
AVX-512 Enabled: 3.121 (SE +/- 0.011, N = 3) -march=znver5 - MIN: 3.05 / MAX: 5.8

1. (CXX) g++ options: -O3 -flto -std=c++11 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

Y-Cruncher

Y-Cruncher 0.8.5 - Pi Digits To Calculate: 5B (Seconds, Fewer Is Better)
AVX-512 Disabled: 33.09 (SE +/- 0.00, N = 3)
AVX-512 Enabled: 21.85 (SE +/- 0.04, N = 3)

Y-Cruncher 0.8.5 - Pi Digits To Calculate: 10B (Seconds, Fewer Is Better)
AVX-512 Disabled: 68.13 (SE +/- 0.19, N = 3)
AVX-512 Enabled: 44.05 (SE +/- 0.01, N = 3)
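For "Fewer Is Better" results such as the Y-Cruncher run times, the ratio is inverted compared with throughput tests: the speedup is the disabled time divided by the enabled time. The sketch below applies that to the two Y-Cruncher results above. The helper names `speedup` and `time_saved_percent` are hypothetical and exist only for this illustration.

```python
def speedup(disabled_seconds, enabled_seconds):
    """Speedup factor for a lower-is-better metric (values above 1.0 mean enabled is faster)."""
    return disabled_seconds / enabled_seconds


def time_saved_percent(disabled_seconds, enabled_seconds):
    """Percentage of run time saved by the enabled configuration."""
    return (1.0 - enabled_seconds / disabled_seconds) * 100.0


# (AVX-512 disabled, AVX-512 enabled) run times in seconds, copied from the results above.
y_cruncher = {
    "5B": (33.09, 21.85),
    "10B": (68.13, 44.05),
}

for digits, (off, on) in y_cruncher.items():
    print(
        f"Pi {digits}: {speedup(off, on):.2f}x faster, "
        f"{time_saved_percent(off, on):.1f}% less time with AVX-512"
    )
```

The same two helpers apply unchanged to the other millisecond-based results in this file, such as oneDNN and OpenVINO latency.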

154 Results Shown

PyTorch:
  CPU - 256 - ResNet-50
  CPU - 512 - ResNet-50
miniBUDE:
  OpenMP - BM1
  OpenMP - BM2
OpenVINO:
  Face Detection FP16-INT8 - CPU
  Age Gender Recognition Retail 0013 FP16-INT8 - CPU
  Person Detection FP16 - CPU
  Weld Porosity Detection FP16 - CPU
  Person Vehicle Bike Detection FP16 - CPU
  Machine Translation EN To DE FP16 - CPU
  Face Detection Retail FP16 - CPU
  Handwritten English Recognition FP16 - CPU
  Road Segmentation ADAS FP16 - CPU
  Person Re-Identification Retail FP16 - CPU
  Noise Suppression Poconet-Like FP16 - CPU
Embree:
  Pathtracer ISPC - Asian Dragon
  Pathtracer ISPC - Asian Dragon Obj
  Pathtracer ISPC - Crown
SVT-AV1:
  Preset 13 - Bosphorus 4K
  Preset 8 - Bosphorus 4K
  Preset 5 - Bosphorus 4K
  Preset 3 - Bosphorus 4K
miniBUDE:
  OpenMP - BM1
  OpenMP - BM2
ACES DGEMM
libxsmm
TensorFlow:
  CPU - 256 - ResNet-50
  CPU - 512 - ResNet-50
ONNX Runtime:
  fcn-resnet101-11 - CPU - Standard
  super-resolution-10 - CPU - Standard
  bertsquad-12 - CPU - Standard
  GPT-2 - CPU - Standard
  ArcFace ResNet-100 - CPU - Standard
  Faster R-CNN R-50-FPN-int8 - CPU - Standard
  T5 Encoder - CPU - Standard
  ResNet101_DUC_HDC-12 - CPU - Standard
OpenVKL
OSPRay:
  gravity_spheres_volume/dim_512/ao/real_time
  gravity_spheres_volume/dim_512/scivis/real_time
  gravity_spheres_volume/dim_512/pathtracer/real_time
Cpuminer-Opt:
  scrypt
  Skeincoin
  LBC, LBRY Credits
Laghos
srsRAN Project
GROMACS:
  CPU Peak Freq (Highest CPU Core Frequency) Monitor:
    Megahertz (39 entries)
SMHasher
GROMACS
Numpy Benchmark
Llama.cpp:
  CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512
  CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024
  CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048
  CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512
  CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512
  CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024
  CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048
OpenVINO GenAI
OSPRay Studio:
  CPU Temp Monitor:
    Celsius (28 entries)
OSPRay Studio:
  1 - 4K - 1 - Path Tracer - CPU
  1 - 4K - 16 - Path Tracer - CPU
  1 - 4K - 32 - Path Tracer - CPU
  2 - 4K - 1 - Path Tracer - CPU
  2 - 4K - 16 - Path Tracer - CPU
  2 - 4K - 32 - Path Tracer - CPU
  3 - 4K - 1 - Path Tracer - CPU
  3 - 4K - 16 - Path Tracer - CPU
  3 - 4K - 32 - Path Tracer - CPU
oneDNN:
  Deconvolution Batch shapes_1d - CPU
  Deconvolution Batch shapes_3d - CPU
  IP Shapes 1D - CPU
  IP Shapes 3D - CPU
  Recurrent Neural Network Training - CPU
  Recurrent Neural Network Inference - CPU
OpenVINO:
  Face Detection FP16-INT8 - CPU
  Age Gender Recognition Retail 0013 FP16-INT8 - CPU
  Person Detection FP16 - CPU
  Weld Porosity Detection FP16 - CPU
  Person Vehicle Bike Detection FP16 - CPU
  Machine Translation EN To DE FP16 - CPU
  Face Detection Retail FP16 - CPU
  Handwritten English Recognition FP16 - CPU
  Road Segmentation ADAS FP16 - CPU
  Person Re-Identification Retail FP16 - CPU
  Noise Suppression Poconet-Like FP16 - CPU
Mobile Neural Network:
  mobilenetV3
  resnet-v2-50
  SqueezeNetV1.0
Y-Cruncher:
  5B
  10B