AVX-512 Analysis

AMD Ryzen 7 7840U testing with a PHX Ray_PEU (V1.04 BIOS) and AMD Phoenix1 512MB on Ubuntu 23.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2307079-NE-AVX512ANA28
Jump To Table - Results

Statistics

Remove Outliers Before Calculating Averages

Graph Settings

Prefer Vertical Bar Graphs
No Box Plots
On Line Graphs With Missing Data, Connect The Line Gaps

Multi-Way Comparison

Condense Multi-Option Tests Into Single Result Graphs
Condense Test Profiles With Multiple Version Results Into Single Result Graphs

Table

Show Detailed System Result Table

Run Management

Result
Identifier
Performance Per
Dollar
Date
Run
  Test
  Duration
AVX512 On
July 06 2023
  8 Hours, 42 Minutes
Only show results matching title/arguments (delimit multiple options with a comma):
Do not show results matching title/arguments (delimit multiple options with a comma):


AVX-512 AnalysisOpenBenchmarking.orgPhoronix Test SuiteAMD Ryzen 7 7840U @ 3.30GHz (8 Cores / 16 Threads)PHX Ray_PEU (V1.04 BIOS)AMD Device 14e816GB1024GB Micron_3400_MTFDKBA1T0TFHAMD Phoenix1 512MB (2700/400MHz)AMD Rembrandt Radeon HD AudioMEDIATEK MT7922 802.11ax PCIUbuntu 23.046.2.0-24-generic (x86_64)KDE Plasma 5.27.4X Server 1.21.1.74.6 Mesa 23.0.2 (LLVM 15.0.7 DRM 3.49)GCC 12.2.0ext43200x2000ProcessorMotherboardChipsetMemoryDiskGraphicsAudioNetworkOSKernelDesktopDisplay ServerOpenGLCompilerFile-SystemScreen ResolutionAVX-512 Analysis PerformanceSystem Logs- Transparent Huge Pages: madvise- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-Pa930Z/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-Pa930Z/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v - Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0xa704101 - Python 3.11.2- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

AVX-512 Analysistensorflow: CPU - 64 - GoogLeNettensorflow: CPU - 64 - AlexNettensorflow: CPU - 64 - ResNet-50tensorflow: CPU - 16 - GoogLeNettensorflow: CPU - 16 - AlexNettensorflow: CPU - 16 - ResNet-50libxsmm: 64smhasher: FarmHash32 x86_64 AVXminibude: OpenMP - BM1minibude: OpenMP - BM1openvino: Machine Translation EN To DE FP16 - CPUopenvino: Machine Translation EN To DE FP16 - CPUopenvino: Person Vehicle Bike Detection FP16 - CPUopenvino: Person Vehicle Bike Detection FP16 - CPUopenvino: Vehicle Detection FP16 - CPUopenvino: Vehicle Detection FP16 - CPUopenvino: Vehicle Detection FP16-INT8 - CPUopenvino: Vehicle Detection FP16-INT8 - CPUopenvino: Weld Porosity Detection FP16 - CPUopenvino: Weld Porosity Detection FP16 - CPUopenvino: Weld Porosity Detection FP16-INT8 - CPUopenvino: Weld Porosity Detection FP16-INT8 - CPUopenvino: Person Detection FP32 - CPUopenvino: Person Detection FP32 - CPUopenvino: Person Detection FP16 - CPUopenvino: Person Detection FP16 - CPUopenvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPUopenvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPUopenvino: Age Gender Recognition Retail 0013 FP16 - CPUopenvino: Age Gender Recognition Retail 0013 FP16 - CPUopenvino: Face Detection FP16-INT8 - CPUopenvino: Face Detection FP16-INT8 - CPUcpuminer-opt: LBC, LBRY Creditscpuminer-opt: Skeincoincpuminer-opt: Garlicoincpuminer-opt: x25xcpuminer-opt: Blake-2 Scpuminer-opt: Magicpuminer-opt: Myriad-Groestlcpuminer-opt: Quad SHA-256, Pyritecpuminer-opt: scryptonednn: Recurrent Neural Network Inference - u8s8f32 - CPUonednn: Recurrent Neural Network Training - u8s8f32 - CPUonednn: IP Shapes 1D - u8s8f32 - CPUonednn: Deconvolution Batch shapes_3d - u8s8f32 - CPUonednn: Deconvolution Batch shapes_1d - u8s8f32 - CPUospray-studio: 3 - 4K - 32 - Path Tracerospray-studio: 3 - 4K - 1 - Path Tracerospray-studio: 3 - 1080p - 32 - Path Tracerospray-studio: 3 - 1080p - 1 - Path Tracerospray-studio: 1 - 4K - 32 - Path Tracerospray-studio: 1 - 4K - 1 - Path Tracerospray-studio: 1 - 1080p - 32 - Path Tracerospray-studio: 1 - 1080p - 1 - Path Tracerospray: particle_volume/pathtracer/real_timeospray: gravity_spheres_volume/dim_512/pathtracer/real_timeospray: gravity_spheres_volume/dim_512/scivis/real_timeospray: gravity_spheres_volume/dim_512/ao/real_timeoidn: RTLightmap.hdr.4096x4096 - CPU-Onlyoidn: RT.ldr_alb_nrm.3840x2160 - CPU-Onlyoidn: RT.hdr_alb_nrm.3840x2160 - CPU-Onlyopenvkl: vklBenchmark ISPCsimdjson: TopTweetsimdjson: DistinctUserIDsimdjson: Kostyasimdjson: LargeRandsimdjson: PartialTweetsembree: Pathtracer ISPC - Crownembree: Pathtracer ISPC - Asian Dragon Objembree: Pathtracer ISPC - Asian Dragonlczero: Eigensmhasher: FarmHash32 x86_64 AVXopenvino: Face Detection FP16 - CPUopenvino: Face Detection FP16 - CPUAVX512 On48.14114.8116.7555.1280.0117.93168.440035.1516.322408.057104.3538.3211.12359.1221.52185.8310.07396.5611.30353.2511.76678.092108.541.892097.521.901.335958.160.819779.57599.116.6744517818592955.63396.50564430369.972823396877175.302234.124281.310.9372571.586661.40812614869187031545094801511935153241292963869105.3552.636442.110432.150930.130.270.271128.678.785.061.577.518.05238.608110.003356726.3911160.983.46OpenBenchmarking.org

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 64 - Model: GoogLeNetAVX512 On1122334455SE +/- 0.48, N = 648.14

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 64 - Model: AlexNetAVX512 On306090120150SE +/- 0.24, N = 3114.81

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 64 - Model: ResNet-50AVX512 On48121620SE +/- 0.18, N = 916.75

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 16 - Model: GoogLeNetAVX512 On1224364860SE +/- 0.09, N = 355.12

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 16 - Model: AlexNetAVX512 On20406080100SE +/- 0.35, N = 380.01

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 16 - Model: ResNet-50AVX512 On48121620SE +/- 0.05, N = 317.93

libxsmm

Libxsmm is an open-source library for specialized dense and sparse matrix operations and deep learning primitives. Libxsmm supports making use of Intel AMX, AVX-512, and other modern CPU instruction set capabilities. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgGFLOPS/s, More Is Betterlibxsmm 2-1.17-3645M N K: 64AVX512 On4080120160200SE +/- 0.44, N = 3168.41. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -pedantic -O2 -fopenmp -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -march=core-avx2

SMHasher

SMHasher is a hash function tester supporting various algorithms and able to make use of AVX and other modern CPU instruction set extensions. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgMiB/sec, More Is BetterSMHasher 2022-08-22Hash: FarmHash32 x86_64 AVXAVX512 On9K18K27K36K45KSE +/- 43.29, N = 640035.151. (CXX) g++ options: -march=native -O3 -flto=auto -fno-fat-lto-objects

miniBUDE

MiniBUDE is a mini application for the the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgBillion Interactions/s, More Is BetterminiBUDE 20210901Implementation: OpenMP - Input Deck: BM1AVX512 On48121620SE +/- 0.08, N = 316.321. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

OpenBenchmarking.orgGFInst/s, More Is BetterminiBUDE 20210901Implementation: OpenMP - Input Deck: BM1AVX512 On90180270360450SE +/- 2.08, N = 3408.061. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

OpenVINO

This is a test of the Intel OpenVINO, a toolkit around neural networks, using its built-in benchmarking support and analyzing the throughput and latency for various models. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Machine Translation EN To DE FP16 - Device: CPUAVX512 On20406080100SE +/- 0.32, N = 3104.35MIN: 80.05 / MAX: 143.061. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Machine Translation EN To DE FP16 - Device: CPUAVX512 On918273645SE +/- 0.12, N = 338.321. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Person Vehicle Bike Detection FP16 - Device: CPUAVX512 On3691215SE +/- 0.10, N = 311.12MIN: 7.56 / MAX: 31.911. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Person Vehicle Bike Detection FP16 - Device: CPUAVX512 On80160240320400SE +/- 3.04, N = 3359.121. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Vehicle Detection FP16 - Device: CPUAVX512 On510152025SE +/- 0.14, N = 1321.52MIN: 12.55 / MAX: 53.71. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Vehicle Detection FP16 - Device: CPUAVX512 On4080120160200SE +/- 1.28, N = 13185.831. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Vehicle Detection FP16-INT8 - Device: CPUAVX512 On3691215SE +/- 0.11, N = 310.07MIN: 7.2 / MAX: 16.91. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Vehicle Detection FP16-INT8 - Device: CPUAVX512 On90180270360450SE +/- 4.54, N = 3396.561. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Weld Porosity Detection FP16 - Device: CPUAVX512 On3691215SE +/- 0.09, N = 311.30MIN: 8.17 / MAX: 35.221. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Weld Porosity Detection FP16 - Device: CPUAVX512 On80160240320400SE +/- 2.89, N = 3353.251. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Weld Porosity Detection FP16-INT8 - Device: CPUAVX512 On3691215SE +/- 0.05, N = 311.76MIN: 8.35 / MAX: 28.931. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Weld Porosity Detection FP16-INT8 - Device: CPUAVX512 On150300450600750SE +/- 3.04, N = 3678.091. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Person Detection FP32 - Device: CPUAVX512 On5001000150020002500SE +/- 12.82, N = 32108.54MIN: 1830.34 / MAX: 2271.241. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Person Detection FP32 - Device: CPUAVX512 On0.42530.85061.27591.70122.1265SE +/- 0.01, N = 31.891. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Person Detection FP16 - Device: CPUAVX512 On5001000150020002500SE +/- 24.95, N = 32097.52MIN: 1871.5 / MAX: 2241.541. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Person Detection FP16 - Device: CPUAVX512 On0.42750.8551.28251.712.1375SE +/- 0.02, N = 31.901. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPUAVX512 On0.29930.59860.89791.19721.4965SE +/- 0.01, N = 31.33MIN: 0.76 / MAX: 18.191. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPUAVX512 On13002600390052006500SE +/- 34.47, N = 35958.161. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Age Gender Recognition Retail 0013 FP16 - Device: CPUAVX512 On0.18230.36460.54690.72920.9115SE +/- 0.01, N = 30.81MIN: 0.43 / MAX: 30.361. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Age Gender Recognition Retail 0013 FP16 - Device: CPUAVX512 On2K4K6K8K10KSE +/- 60.96, N = 39779.571. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Face Detection FP16-INT8 - Device: CPUAVX512 On130260390520650SE +/- 4.29, N = 3599.11MIN: 573.64 / MAX: 654.621. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Face Detection FP16-INT8 - Device: CPUAVX512 On246810SE +/- 0.05, N = 36.671. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

Cpuminer-Opt

Cpuminer-Opt is a fork of cpuminer-multi that carries a wide range of CPU performance optimizations for measuring the potential cryptocurrency mining performance of the CPU/processor with a wide variety of cryptocurrencies. The benchmark reports the hash speed for the CPU mining performance for the selected cryptocurrency. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgkH/s, More Is BetterCpuminer-Opt 3.20.3Algorithm: LBC, LBRY CreditsAVX512 On10K20K30K40K50KSE +/- 469.41, N = 3445171. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp

OpenBenchmarking.orgkH/s, More Is BetterCpuminer-Opt 3.20.3Algorithm: SkeincoinAVX512 On20K40K60K80K100KSE +/- 607.98, N = 15818591. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp

OpenBenchmarking.orgkH/s, More Is BetterCpuminer-Opt 3.20.3Algorithm: GarlicoinAVX512 On6001200180024003000SE +/- 22.33, N = 32955.631. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp

OpenBenchmarking.orgkH/s, More Is BetterCpuminer-Opt 3.20.3Algorithm: x25xAVX512 On90180270360450SE +/- 1.29, N = 3396.501. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp

OpenBenchmarking.orgkH/s, More Is BetterCpuminer-Opt 3.20.3Algorithm: Blake-2 SAVX512 On120K240K360K480K600KSE +/- 3930.04, N = 35644301. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp

OpenBenchmarking.orgkH/s, More Is BetterCpuminer-Opt 3.20.3Algorithm: MagiAVX512 On80160240320400SE +/- 1.51, N = 3369.971. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp

OpenBenchmarking.orgkH/s, More Is BetterCpuminer-Opt 3.20.3Algorithm: Myriad-GroestlAVX512 On6K12K18K24K30KSE +/- 164.15, N = 3282331. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp

OpenBenchmarking.orgkH/s, More Is BetterCpuminer-Opt 3.20.3Algorithm: Quad SHA-256, PyriteAVX512 On20K40K60K80K100KSE +/- 254.06, N = 3968771. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp

OpenBenchmarking.orgkH/s, More Is BetterCpuminer-Opt 3.20.3Algorithm: scryptAVX512 On4080120160200SE +/- 0.52, N = 3175.301. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp

oneDNN

This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.1Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPUAVX512 On5001000150020002500SE +/- 10.13, N = 32234.12MIN: 2159.11. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.1Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPUAVX512 On9001800270036004500SE +/- 57.14, N = 154281.31MIN: 3574.281. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.1Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPUAVX512 On0.21090.42180.63270.84361.0545SE +/- 0.001919, N = 40.937257MIN: 0.791. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.1Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPUAVX512 On0.3570.7141.0711.4281.785SE +/- 0.01241, N = 91.58666MIN: 1.191. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.1Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPUAVX512 On0.31680.63360.95041.26721.584SE +/- 0.00754, N = 31.40812MIN: 1.111. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

OSPRay Studio

Intel OSPRay Studio is an open-source, interactive visualization and ray-tracing software package. OSPRay Studio makes use of Intel OSPRay, a portable ray-tracing engine for high-performance, high-fidelity visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgms, Fewer Is BetterOSPRay Studio 0.11Camera: 3 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path TracerAVX512 On130K260K390K520K650KSE +/- 6283.42, N = 36148691. (CXX) g++ options: -O3 -lm -ldl

OpenBenchmarking.orgms, Fewer Is BetterOSPRay Studio 0.11Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path TracerAVX512 On4K8K12K16K20KSE +/- 98.15, N = 3187031. (CXX) g++ options: -O3 -lm -ldl

OpenBenchmarking.orgms, Fewer Is BetterOSPRay Studio 0.11Camera: 3 - Resolution: 1080p - Samples Per Pixel: 32 - Renderer: Path TracerAVX512 On30K60K90K120K150KSE +/- 1116.11, N = 31545091. (CXX) g++ options: -O3 -lm -ldl

OpenBenchmarking.orgms, Fewer Is BetterOSPRay Studio 0.11Camera: 3 - Resolution: 1080p - Samples Per Pixel: 1 - Renderer: Path TracerAVX512 On10002000300040005000SE +/- 36.61, N = 348011. (CXX) g++ options: -O3 -lm -ldl

OpenBenchmarking.orgms, Fewer Is BetterOSPRay Studio 0.11Camera: 1 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path TracerAVX512 On110K220K330K440K550KSE +/- 3918.21, N = 35119351. (CXX) g++ options: -O3 -lm -ldl

OpenBenchmarking.orgms, Fewer Is BetterOSPRay Studio 0.11Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path TracerAVX512 On3K6K9K12K15KSE +/- 116.41, N = 3153241. (CXX) g++ options: -O3 -lm -ldl

OpenBenchmarking.orgms, Fewer Is BetterOSPRay Studio 0.11Camera: 1 - Resolution: 1080p - Samples Per Pixel: 32 - Renderer: Path TracerAVX512 On30K60K90K120K150KSE +/- 680.99, N = 31292961. (CXX) g++ options: -O3 -lm -ldl

OpenBenchmarking.orgms, Fewer Is BetterOSPRay Studio 0.11Camera: 1 - Resolution: 1080p - Samples Per Pixel: 1 - Renderer: Path TracerAVX512 On8001600240032004000SE +/- 31.26, N = 938691. (CXX) g++ options: -O3 -lm -ldl

OSPRay

Intel OSPRay is a portable ray-tracing engine for high-performance, high-fidelity scientific visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgItems Per Second, More Is BetterOSPRay 2.12Benchmark: particle_volume/pathtracer/real_timeAVX512 On20406080100SE +/- 0.64, N = 3105.36

OpenBenchmarking.orgItems Per Second, More Is BetterOSPRay 2.12Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_timeAVX512 On0.59321.18641.77962.37282.966SE +/- 0.01558, N = 32.63644

OpenBenchmarking.orgItems Per Second, More Is BetterOSPRay 2.12Benchmark: gravity_spheres_volume/dim_512/scivis/real_timeAVX512 On0.47480.94961.42441.89922.374SE +/- 0.01970, N = 32.11043

OpenBenchmarking.orgItems Per Second, More Is BetterOSPRay 2.12Benchmark: gravity_spheres_volume/dim_512/ao/real_timeAVX512 On0.4840.9681.4521.9362.42SE +/- 0.01201, N = 32.15093

Intel Open Image Denoise

Open Image Denoise is a denoising library for ray-tracing and part of the Intel oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgImages / Sec, More Is BetterIntel Open Image Denoise 2.0Run: RTLightmap.hdr.4096x4096 - Device: CPU-OnlyAVX512 On0.02930.05860.08790.11720.1465SE +/- 0.00, N = 30.13

OpenBenchmarking.orgImages / Sec, More Is BetterIntel Open Image Denoise 2.0Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-OnlyAVX512 On0.06080.12160.18240.24320.304SE +/- 0.00, N = 30.27

OpenBenchmarking.orgImages / Sec, More Is BetterIntel Open Image Denoise 2.0Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-OnlyAVX512 On0.06080.12160.18240.24320.304SE +/- 0.00, N = 30.27

OpenVKL

OpenVKL is the Intel Open Volume Kernel Library that offers high-performance volume computation kernels and part of the Intel oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgItems / Sec, More Is BetterOpenVKL 1.3.1Benchmark: vklBenchmark ISPCAVX512 On306090120150SE +/- 1.47, N = 9112MIN: 10 / MAX: 1869

simdjson

This is a benchmark of SIMDJSON, a high performance JSON parser. SIMDJSON aims to be the fastest JSON parser and is used by projects like Microsoft FishStore, Yandex ClickHouse, Shopify, and others. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgGB/s, More Is Bettersimdjson 2.0Throughput Test: TopTweetAVX512 On246810SE +/- 0.05, N = 38.671. (CXX) g++ options: -O3

OpenBenchmarking.orgGB/s, More Is Bettersimdjson 2.0Throughput Test: DistinctUserIDAVX512 On246810SE +/- 0.05, N = 38.781. (CXX) g++ options: -O3

OpenBenchmarking.orgGB/s, More Is Bettersimdjson 2.0Throughput Test: KostyaAVX512 On1.13852.2773.41554.5545.6925SE +/- 0.01, N = 35.061. (CXX) g++ options: -O3

OpenBenchmarking.orgGB/s, More Is Bettersimdjson 2.0Throughput Test: LargeRandomAVX512 On0.35330.70661.05991.41321.7665SE +/- 0.00, N = 31.571. (CXX) g++ options: -O3

OpenBenchmarking.orgGB/s, More Is Bettersimdjson 2.0Throughput Test: PartialTweetsAVX512 On246810SE +/- 0.02, N = 37.511. (CXX) g++ options: -O3

Embree

Intel Embree is a collection of high-performance ray-tracing kernels for execution on CPUs (and GPUs via SYCL) and supporting instruction sets such as SSE, AVX, AVX2, and AVX-512. Embree also supports making use of the Intel SPMD Program Compiler (ISPC). Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.1Binary: Pathtracer ISPC - Model: CrownAVX512 On246810SE +/- 0.0666, N = 38.0523MIN: 7.85 / MAX: 8.34

OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.1Binary: Pathtracer ISPC - Model: Asian Dragon ObjAVX512 On246810SE +/- 0.0092, N = 38.6081MIN: 8.45 / MAX: 8.89

OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.1Binary: Pathtracer ISPC - Model: Asian DragonAVX512 On3691215SE +/- 0.03, N = 310.00MIN: 9.86 / MAX: 10.33

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine automated vian neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgNodes Per Second, More Is BetterLeelaChessZero 0.28Backend: EigenAVX512 On120240360480600SE +/- 5.89, N = 55671. (CXX) g++ options: -flto -pthread

CPU Temperature Monitor

OpenBenchmarking.orgCelsiusCPU Temperature MonitorPhoronix Test Suite System MonitoringAVX512 On20406080100Min: 40.75 / Avg: 69.02 / Max: 92.13

CPU Power Consumption Monitor

OpenBenchmarking.orgWattsCPU Power Consumption MonitorPhoronix Test Suite System MonitoringAVX512 On714212835Min: 1.17 / Avg: 16.21 / Max: 30.84

CPU Peak Freq (Highest CPU Core Frequency) Monitor

OpenBenchmarking.orgMegahertzCPU Peak Freq (Highest CPU Core Frequency) MonitorPhoronix Test Suite System MonitoringAVX512 On9001800270036004500Min: 1283 / Avg: 2907.75 / Max: 5117

OpenVINO

MinAvgMaxAVX512 On55.169.484.8OpenBenchmarking.orgCelsius, Fewer Is BetterOpenVINO 2022.3CPU Temperature Monitor20406080100

MinAvgMaxAVX512 On2.517.227.2OpenBenchmarking.orgWatts, Fewer Is BetterOpenVINO 2022.3CPU Power Consumption Monitor816243240

MinAvgMaxAVX512 On139724935114OpenBenchmarking.orgMegahertz, More Is BetterOpenVINO 2022.3CPU Peak Freq (Highest CPU Core Frequency) Monitor13002600390052006500

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2022.3Model: Face Detection FP16 - Device: CPUAVX512 On2004006008001000SE +/- 24.60, N = 121160.98MIN: 969.12 / MAX: 1321.31. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2022.3Model: Face Detection FP16 - Device: CPUAVX512 On0.77851.5572.33553.1143.8925SE +/- 0.08, N = 123.461. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared

79 Results Shown

TensorFlow:
  CPU - 64 - GoogLeNet
  CPU - 64 - AlexNet
  CPU - 64 - ResNet-50
  CPU - 16 - GoogLeNet
  CPU - 16 - AlexNet
  CPU - 16 - ResNet-50
libxsmm
SMHasher
miniBUDE:
  OpenMP - BM1:
    Billion Interactions/s
    GFInst/s
OpenVINO:
  Machine Translation EN To DE FP16 - CPU:
    ms
    FPS
  Person Vehicle Bike Detection FP16 - CPU:
    ms
    FPS
  Vehicle Detection FP16 - CPU:
    ms
    FPS
  Vehicle Detection FP16-INT8 - CPU:
    ms
    FPS
  Weld Porosity Detection FP16 - CPU:
    ms
    FPS
  Weld Porosity Detection FP16-INT8 - CPU:
    ms
    FPS
  Person Detection FP32 - CPU:
    ms
    FPS
  Person Detection FP16 - CPU:
    ms
    FPS
  Age Gender Recognition Retail 0013 FP16-INT8 - CPU:
    ms
    FPS
  Age Gender Recognition Retail 0013 FP16 - CPU:
    ms
    FPS
  Face Detection FP16-INT8 - CPU:
    ms
    FPS
Cpuminer-Opt:
  LBC, LBRY Credits
  Skeincoin
  Garlicoin
  x25x
  Blake-2 S
  Magi
  Myriad-Groestl
  Quad SHA-256, Pyrite
  scrypt
oneDNN:
  Recurrent Neural Network Inference - u8s8f32 - CPU
  Recurrent Neural Network Training - u8s8f32 - CPU
  IP Shapes 1D - u8s8f32 - CPU
  Deconvolution Batch shapes_3d - u8s8f32 - CPU
  Deconvolution Batch shapes_1d - u8s8f32 - CPU
OSPRay Studio:
  3 - 4K - 32 - Path Tracer
  3 - 4K - 1 - Path Tracer
  3 - 1080p - 32 - Path Tracer
  3 - 1080p - 1 - Path Tracer
  1 - 4K - 32 - Path Tracer
  1 - 4K - 1 - Path Tracer
  1 - 1080p - 32 - Path Tracer
  1 - 1080p - 1 - Path Tracer
OSPRay:
  particle_volume/pathtracer/real_time
  gravity_spheres_volume/dim_512/pathtracer/real_time
  gravity_spheres_volume/dim_512/scivis/real_time
  gravity_spheres_volume/dim_512/ao/real_time
Intel Open Image Denoise:
  RTLightmap.hdr.4096x4096 - CPU-Only
  RT.ldr_alb_nrm.3840x2160 - CPU-Only
  RT.hdr_alb_nrm.3840x2160 - CPU-Only
OpenVKL
simdjson:
  TopTweet
  DistinctUserID
  Kostya
  LargeRand
  PartialTweets
Embree:
  Pathtracer ISPC - Crown
  Pathtracer ISPC - Asian Dragon Obj
  Pathtracer ISPC - Asian Dragon
LeelaChessZero
CPU Temperature Monitor:
  Phoronix Test Suite System Monitoring:
    Celsius
    Watts
    Megahertz
  CPU Temp Monitor:
    Celsius
  CPU Power Consumption Monitor:
    Watts
  CPU Peak Freq (Highest CPU Core Frequency) Monitor:
    Megahertz
OpenVINO:
  Face Detection FP16 - CPU:
    ms
    FPS