AMD EPYC 9654 Genoa AVX-512 benchmark comparison by Michael Larabel for a future article.
AVX-512 On Kernel Notes: Transparent Huge Pages: madviseEnvironment Notes: CXXFLAGS="-O3 -march=native -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -mprefer-vector-width=512" CFLAGS="-O3 -march=native -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -mprefer-vector-width=512"Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -vProcessor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10110dPython Notes: Python 3.10.7Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
AVX-512 Off Processor: 2 x AMD EPYC 9654 96-Core @ 2.40GHz (192 Cores / 384 Threads), Motherboard: AMD Titanite_4G (RTI1002E BIOS), Chipset: AMD Device 14a4, Memory: 1520GB, Disk: 800GB INTEL SSDPF21Q800GB, Graphics: ASPEED, Monitor: VGA HDMI, Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 22.10, Kernel: 6.1.0-phx (x86_64), Desktop: GNOME Shell 43.0, Display Server: X Server 1.21.1.4, Vulkan: 1.3.224, Compiler: GCC 12.2.0 + Clang 15.0.2-1, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madviseEnvironment Notes: CXXFLAGS="-O3 -march=native -mno-avx512f" CFLAGS="-O3 -march=native -mno-avx512f"Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -vProcessor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10110dPython Notes: Python 3.10.7Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
AMD EPYC 4th Gen AVX-512 Comparison OpenBenchmarking.org Phoronix Test Suite 2 x AMD EPYC 9654 96-Core @ 2.40GHz (192 Cores / 384 Threads) AMD Titanite_4G (RTI1002E BIOS) AMD Device 14a4 1520GB 800GB INTEL SSDPF21Q800GB ASPEED VGA HDMI Broadcom NetXtreme BCM5720 PCIe Ubuntu 22.10 6.1.0-phx (x86_64) GNOME Shell 43.0 X Server 1.21.1.4 1.3.224 GCC 12.2.0 + Clang 15.0.2-1 ext4 1920x1080 Processor Motherboard Chipset Memory Disk Graphics Monitor Network OS Kernel Desktop Display Server Vulkan Compiler File-System Screen Resolution AMD EPYC 4th Gen AVX-512 Comparison Benchmarks System Logs - Transparent Huge Pages: madvise - AVX-512 On: CXXFLAGS="-O3 -march=native -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -mprefer-vector-width=512" CFLAGS="-O3 -march=native -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -mprefer-vector-width=512" - AVX-512 Off: CXXFLAGS="-O3 -march=native -mno-avx512f" CFLAGS="-O3 -march=native -mno-avx512f" - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v - Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10110d - Python 3.10.7 - itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
AVX-512 On vs. AVX-512 Off Comparison Phoronix Test Suite Baseline +46.4% +46.4% +92.8% +92.8% +139.2% +139.2% 50.5% 37.6% 3.6% CPU - 16 - AlexNet 185.4% R.N.N.T - bf16bf16bf16 - CPU 155.1% D.T.S 153.4% R.N.N.T - f32 - CPU 152.5% W.P.D.F - CPU 143.4% W.P.D.F - CPU 143% F.D.F - CPU 132.2% F.D.F - CPU 131.9% A.G.R.R.0.F - CPU 120% LBC, LBRY Credits 117% Device AI Score 115.1% M.T.E.T.D.F - CPU 114.8% M.T.E.T.D.F - CPU 114.7% W.P.D.F.I - CPU 104.7% W.P.D.F.I - CPU 104.5% F.D.F.I - CPU 104.2% F.D.F.I - CPU 103.9% V.D.F - CPU 97.1% V.D.F - CPU 97% CPU - 16 - GoogLeNet 93.3% D.I.S 92.8% V.D.F.I - CPU 84.7% V.D.F.I - CPU 84.6% D.B.s - f32 - CPU 78.2% P.V.B.D.F - CPU 76.5% P.V.B.D.F - CPU 76.4% CPU - 16 - ResNet-50 73.7% gravity_spheres_volume/dim_512/scivis/real_time 71.4% Q.S.2.P 70.9% P.D.F - CPU 70% P.D.F - CPU 69.5% P.D.F - CPU 68.4% P.D.F - CPU 67.7% gravity_spheres_volume/dim_512/ao/real_time 67.4% scrypt 62.9% A.G.R.R.0.F.I - CPU 58.3% resnet-v2-50 56.1% N.Q.A.B.b.u.S.1.P - A.M.S 55.2% N.Q.A.B.b.u.S.1.P - A.M.S 55.2% inception-v3 Garlicoin 46.5% OpenMP - BM1 44.1% OpenMP - BM1 44.1% N.Q.A.B.b.u.S.1.P - S.S.S 41.5% N.Q.A.B.b.u.S.1.P - S.S.S 41.5% C.C.R.5.I - A.M.S 38.5% C.C.R.5.I - A.M.S 38.4% R.N.N.I - bf16bf16bf16 - CPU A.G.R.R.0.F - CPU 37.4% Skeincoin 36.7% OpenMP - BM2 35.4% OpenMP - BM2 35.4% gravity_spheres_volume/dim_512/pathtracer/real_time 33.8% x25x 33.1% DistinctUserID 30.7% PartialTweets 30.4% C.D.Y.C - A.M.S 24.1% C.D.Y.C - A.M.S 24% Kostya 23.3% 2 - 1080p - 32 - Path Tracer 23% 2 - 4K - 16 - Path Tracer 22.8% 2 - 1080p - 16 - Path Tracer 22.7% 1 - 1080p - 32 - Path Tracer 22.4% 1 - 1080p - 1 - Path Tracer 22.3% 2 - 4K - 32 - Path Tracer 22.2% 1 - 4K - 1 - Path Tracer 22.2% 1 - 4K - 32 - Path Tracer 22.1% 1 - 4K - 16 - Path Tracer 22.1% 1 - 1080p - 16 - Path Tracer 22% 3 - 1080p - 16 - Path Tracer 21.7% 3 - 1080p - 32 - Path Tracer 21.6% TopTweet 21.2% 3 - 4K - 1 - Path Tracer 21.2% LargeRand 21.2% N.T.C.B.b.u.S - A.M.S 21% 3 - 4K - 16 - Path Tracer 21% N.T.C.B.b.u.S - A.M.S 20.9% 3 - 1080p - 1 - Path Tracer 20.9% 3 - 4K - 32 - Path Tracer 20.7% Pathtracer ISPC - Asian Dragon 20.4% Pathtracer ISPC - Asian Dragon Obj 20.2% C.D.Y.C - S.S.S 20.1% C.D.Y.C - S.S.S 20.1% Pathtracer ISPC - Crown 19.8% 2 - 4K - 1 - Path Tracer 19.5% 2 - 1080p - 1 - Path Tracer 19.5% N.T.C.D.m - A.M.S 19% d.M.M.S - Execution Time 19% N.T.C.D.m - A.M.S 19% OpenMP - Points2Image 18.7% super-resolution-10 - CPU - Standard 17.7% N.T.C.B.b.u.S - S.S.S 15.6% N.T.C.B.b.u.S - S.S.S 15.6% vklBenchmark ISPC 15.3% CPU - vision_transformer 15% ArcFace ResNet-100 - CPU - Standard 14.5% A.G.R.R.0.F.I - CPU 12.8% N.T.C.B.b.u.c - S.S.S 12.8% N.T.C.B.b.u.c - S.S.S 12.8% N.D.C.o.b.u.o.I - S.S.S 12.2% N.D.C.o.b.u.o.I - S.S.S 12.2% CPU - blazeface 12.1% Eigen 11.4% CPU - regnety_400m 9.2% F.x.A 9.1% CPU - efficientnet-b0 8.9% fcn-resnet101-11 - CPU - Standard 8.4% BLAS 7.8% Fayalite-FIST 6.7% SqueezeNetV1.0 6.6% d.M.M.S - Mesh Time 6.3% N.T.C.D.m - S.S.S 5.4% N.T.C.D.m - S.S.S 5.4% Preset 12 - Bosphorus 4K 5% 4.9% JPEG - 90 4.6% CPU - googlenet 4.6% PNG - 90 4.5% CPU - mnasnet 4.5% B.C 4.3% C.B.S.A - f32 - CPU 4.2% RTLightmap.hdr.4096x4096 bertsquad-12 - CPU - Standard 3.4% Windowed Gaussian 3.3% CPU - FastestDet 3% Preset 13 - Bosphorus 4K 2.8% JPEG - 100 2.8% OpenMP - NDT Mapping 2.7% TensorFlow oneDNN AI Benchmark Alpha oneDNN OpenVINO OpenVINO OpenVINO OpenVINO OpenVINO Cpuminer-Opt AI Benchmark Alpha OpenVINO OpenVINO OpenVINO OpenVINO OpenVINO OpenVINO OpenVINO OpenVINO TensorFlow AI Benchmark Alpha OpenVINO OpenVINO oneDNN OpenVINO OpenVINO TensorFlow OSPRay Cpuminer-Opt OpenVINO OpenVINO OpenVINO OpenVINO OSPRay Cpuminer-Opt OpenVINO Mobile Neural Network Neural Magic DeepSparse Neural Magic DeepSparse Mobile Neural Network Cpuminer-Opt miniBUDE miniBUDE Neural Magic DeepSparse Neural Magic DeepSparse Neural Magic DeepSparse Neural Magic DeepSparse oneDNN OpenVINO Cpuminer-Opt miniBUDE miniBUDE OSPRay Cpuminer-Opt simdjson simdjson Neural Magic DeepSparse Neural Magic DeepSparse simdjson OSPRay Studio OSPRay Studio OSPRay Studio OSPRay Studio OSPRay Studio OSPRay Studio OSPRay Studio OSPRay Studio OSPRay Studio OSPRay Studio OSPRay Studio OSPRay Studio simdjson OSPRay Studio simdjson Neural Magic DeepSparse OSPRay Studio Neural Magic DeepSparse OSPRay Studio OSPRay Studio Embree Embree Neural Magic DeepSparse Neural Magic DeepSparse Embree OSPRay Studio OSPRay Studio Neural Magic DeepSparse OpenFOAM Neural Magic DeepSparse Darmstadt Automotive Parallel Heterogeneous Suite ONNX Runtime Neural Magic DeepSparse Neural Magic DeepSparse OpenVKL NCNN ONNX Runtime OpenVINO Neural Magic DeepSparse Neural Magic DeepSparse Neural Magic DeepSparse Neural Magic DeepSparse NCNN LeelaChessZero NCNN SMHasher NCNN ONNX Runtime LeelaChessZero CP2K Molecular Dynamics Mobile Neural Network OpenFOAM Neural Magic DeepSparse Neural Magic DeepSparse SVT-AV1 Numpy Benchmark JPEG XL libjxl NCNN JPEG XL libjxl NCNN Numenta Anomaly Benchmark oneDNN Intel Open Image Denoise ONNX Runtime Numenta Anomaly Benchmark NCNN SVT-AV1 JPEG XL libjxl Darmstadt Automotive Parallel Heterogeneous Suite AVX-512 On AVX-512 Off
AMD EPYC 4th Gen AVX-512 Comparison ai-benchmark: Device Inference Score ai-benchmark: Device Training Score ai-benchmark: Device AI Score deepsparse: NLP Text Classification, BERT base uncased SST2 - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Synchronous Single-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Synchronous Single-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream deepsparse: CV Detection,YOLOv5s COCO - Synchronous Single-Stream deepsparse: CV Detection,YOLOv5s COCO - Synchronous Single-Stream deepsparse: CV Detection,YOLOv5s COCO - Asynchronous Multi-Stream deepsparse: CV Detection,YOLOv5s COCO - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream tensorflow: CPU - 16 - ResNet-50 tensorflow: CPU - 16 - AlexNet tensorflow: CPU - 16 - GoogLeNet lczero: BLAS lczero: Eigen embree: Pathtracer ISPC - Asian Dragon embree: Pathtracer ISPC - Asian Dragon Obj embree: Pathtracer ISPC - Crown openvkl: vklBenchmark ISPC ospray: gravity_spheres_volume/dim_512/ao/real_time ospray: gravity_spheres_volume/dim_512/scivis/real_time ospray: gravity_spheres_volume/dim_512/pathtracer/real_time ospray-studio: 1 - 1080p - 1 - Path Tracer ospray-studio: 1 - 1080p - 16 - Path Tracer ospray-studio: 1 - 1080p - 32 - Path Tracer ospray-studio: 1 - 4K - 1 - Path Tracer ospray-studio: 1 - 4K - 16 - Path Tracer ospray-studio: 1 - 4K - 32 - Path Tracer ospray-studio: 2 - 1080p - 1 - Path Tracer ospray-studio: 2 - 1080p - 16 - Path Tracer ospray-studio: 2 - 1080p - 32 - Path Tracer ospray-studio: 2 - 4K - 1 - Path Tracer ospray-studio: 2 - 4K - 16 - Path Tracer ospray-studio: 2 - 4K - 32 - Path Tracer ospray-studio: 3 - 1080p - 1 - Path Tracer ospray-studio: 3 - 1080p - 16 - Path Tracer ospray-studio: 3 - 1080p - 32 - Path Tracer ospray-studio: 3 - 4K - 1 - Path Tracer ospray-studio: 3 - 4K - 16 - Path Tracer ospray-studio: 3 - 4K - 32 - Path Tracer onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: Deconvolution Batch shapes_1d - f32 - CPU onednn: Deconvolution Batch shapes_3d - f32 - CPU onednn: Recurrent Neural Network Training - f32 - CPU onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU mnn: resnet-v2-50 mnn: SqueezeNetV1.0 mnn: inception-v3 cpuminer-opt: scrypt cpuminer-opt: Quad SHA-256, Pyrite cpuminer-opt: x25x cpuminer-opt: Garlicoin cpuminer-opt: Skeincoin cpuminer-opt: LBC, LBRY Credits ncnn: CPU - mnasnet ncnn: CPU - efficientnet-b0 ncnn: CPU - blazeface ncnn: CPU - googlenet ncnn: CPU - resnet50 ncnn: CPU - regnety_400m ncnn: CPU - vision_transformer ncnn: CPU - FastestDet openvino: Face Detection FP16 - CPU openvino: Face Detection FP16 - CPU openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Person Detection FP16 - CPU openvino: Person Detection FP16 - CPU openvino: Person Detection FP32 - CPU openvino: Person Detection FP32 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Machine Translation EN To DE FP16 - CPU openvino: Machine Translation EN To DE FP16 - CPU gromacs: MPI CPU - water_GMX50_bare onnx: fcn-resnet101-11 - CPU - Standard onnx: super-resolution-10 - CPU - Standard onnx: bertsquad-12 - CPU - Standard onnx: ArcFace ResNet-100 - CPU - Standard numpy: minibude: OpenMP - BM1 minibude: OpenMP - BM1 minibude: OpenMP - BM2 minibude: OpenMP - BM2 svt-av1: Preset 13 - Bosphorus 4K svt-av1: Preset 12 - Bosphorus 4K numenta-nab: Bayesian Changepoint numenta-nab: Windowed Gaussian numenta-nab: Relative Entropy oidn: RT.hdr_alb_nrm.3840x2160 oidn: RT.ldr_alb_nrm.3840x2160 oidn: RTLightmap.hdr.4096x4096 daphne: OpenMP - NDT Mapping daphne: OpenMP - Points2Image simdjson: PartialTweets simdjson: LargeRand simdjson: Kostya simdjson: DistinctUserID simdjson: TopTweet openfoam: drivaerFastback, Medium Mesh Size - Mesh Time openfoam: drivaerFastback, Medium Mesh Size - Execution Time cp2k: Fayalite-FIST smhasher: FarmHash32 x86_64 AVX smhasher: FarmHash32 x86_64 AVX jpegxl: JPEG - 90 jpegxl: JPEG - 100 jpegxl: PNG - 90 AVX-512 On AVX-512 Off 3510 2701 6211 84.9988 11.7586 616.6804 155.3462 187.0837 5.3413 1195.9367 80.0826 1953.0255 49.0328 34.5206 28.9609 100.5458 9.9408 761.2241 125.7860 189.5611 5.2712 857.2265 111.7282 34.2518 29.1892 22.15 157.29 60.17 9077 9096 212.9364 184.4701 180.9315 1332 44.1566 43.0711 54.2187 148 2353 4698 581 9271 18546 154 2384 4764 604 9385 18808 177 2804 5608 694 11070 22184 0.516167 22.5875 0.989591 1962.06 1955.39 2361.27 15.441 8.579 45.812 4800.61 2260680 7696.06 71553 1951330 1066950 42.30 57.71 25.88 72.80 66.34 247.32 74.93 58.95 102.04 469.29 193.93 246.98 148967.98 0.55 170652.71 0.36 43.34 1100.25 43.34 1101.00 19800.20 9.63 9988.44 4.79 11202.62 4.28 7452.96 6.43 9065.34 5.29 956.88 50.11 18.764 271 7401 516 1051 575.66 7299.545 291.982 8652.005 346.081 246.572 245.555 16.677 4.727 9.898 3.51 3.51 1.66 1371.18 13677.764494843 6.70 1.26 4.18 6.86 6.52 135.77418 113.63733 1122.725 39794.44 26.387 9.55 0.74 9.95 1821 1066 2887 73.5380 13.5907 509.4462 187.8540 177.5387 5.6282 1005.1615 95.3347 1410.8051 67.9118 30.6062 32.6647 71.0437 14.0655 490.3658 195.1656 157.819 6.3313 690.8722 138.5861 30.5267 32.7505 12.75 55.11 31.12 8423 8162 176.8831 153.4476 151.0558 1155 26.3794 25.1354 40.5117 181 2871 5749 710 11321 22647 184 2924 5861 722 11524 22985 214 3413 6820 841 13398 26785 0.537682 22.8354 1.76336 4953.98 4989.13 1715.76 24.098 9.146 30.436 2946.08 1322960 5784.25 48827 1427543 491740 44.19 62.86 29.00 76.13 67.48 270.12 86.16 60.73 43.94 1088.50 94.97 503.50 108449.49 1.21 151239.60 0.57 25.50 1864.93 25.74 1845.95 9672.94 19.69 4110.49 11.66 6065.76 7.90 3782.08 12.67 5135.56 9.33 445.59 107.62 18.467 250 6288 499 918 548.83 5065.096 202.604 6391.519 255.661 239.881 233.944 17.389 4.882 10.035 3.50 3.51 1.72 1335.63 11521.38 5.14 1.04 3.39 5.25 5.38 144.3478 135.21648 1198.408 36483.70 26.412 9.13 0.72 9.52 OpenBenchmarking.org
Neural Magic DeepSparse OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.1 Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream AVX-512 Off AVX-512 On 20 40 60 80 100 SE +/- 0.27, N = 3 SE +/- 0.45, N = 3 73.54 85.00
OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.1 Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream AVX-512 Off AVX-512 On 130 260 390 520 650 SE +/- 0.33, N = 3 SE +/- 1.92, N = 3 509.45 616.68
OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.1 Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream AVX-512 Off AVX-512 On 40 80 120 160 200 SE +/- 0.29, N = 3 SE +/- 0.11, N = 3 177.54 187.08
OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.1 Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream AVX-512 Off AVX-512 On 300 600 900 1200 1500 SE +/- 0.78, N = 3 SE +/- 2.39, N = 3 1005.16 1195.94
OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.1 Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream AVX-512 Off AVX-512 On 400 800 1200 1600 2000 SE +/- 0.17, N = 3 SE +/- 4.32, N = 3 1410.81 1953.03
OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.1 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream AVX-512 Off AVX-512 On 8 16 24 32 40 SE +/- 0.03, N = 3 SE +/- 0.03, N = 3 30.61 34.52
OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.1 Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Synchronous Single-Stream AVX-512 Off AVX-512 On 20 40 60 80 100 SE +/- 0.17, N = 3 SE +/- 0.17, N = 3 71.04 100.55
OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.1 Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream AVX-512 Off AVX-512 On 160 320 480 640 800 SE +/- 0.61, N = 3 SE +/- 0.79, N = 3 490.37 761.22
OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.1 Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream AVX-512 Off AVX-512 On 8 16 24 32 40 SE +/- 0.10, N = 3 SE +/- 0.08, N = 3 30.53 34.25
TensorFlow This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries too. Learn more via the OpenBenchmarking.org test page.
Embree Intel Embree is a collection of high-performance ray-tracing kernels for execution on CPUs and supporting instruction sets such as SSE, AVX, AVX2, and AVX-512. Embree also supports making use of the Intel SPMD Program Compiler (ISPC). Learn more via the OpenBenchmarking.org test page.
OSPRay Intel OSPRay is a portable ray-tracing engine for high-performance, high-fidelity scientific visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.
OSPRay Studio Intel OSPRay Studio is an open-source, interactive visualization and ray-tracing software package. OSPRay Studio makes use of Intel OSPRay, a portable ray-tracing engine for high-performance, high-fidelity visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.
Mobile Neural Network MNN is the Mobile Neural Network as a highly efficient, lightweight deep learning framework developed by Alibaba. This MNN test profile is building the OpenMP / CPU threaded version for processor benchmarking and not any GPU-accelerated test. MNN does allow making use of AVX-512 extensions. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 2.1 Model: resnet-v2-50 AVX-512 Off AVX-512 On 6 12 18 24 30 SE +/- 0.08, N = 8 SE +/- 0.08, N = 9 24.10 15.44 -mno-avx512f - MIN: 23.44 / MAX: 71.32 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 14.79 / MAX: 54.05 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 2.1 Model: SqueezeNetV1.0 AVX-512 Off AVX-512 On 3 6 9 12 15 SE +/- 0.091, N = 8 SE +/- 0.147, N = 9 9.146 8.579 -mno-avx512f - MIN: 7.72 / MAX: 19.03 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 6.67 / MAX: 21.5 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Cpuminer-Opt Cpuminer-Opt is a fork of cpuminer-multi that carries a wide range of CPU performance optimizations for measuring the potential cryptocurrency mining performance of the CPU/processor with a wide variety of cryptocurrencies. The benchmark reports the hash speed for the CPU mining performance for the selected cryptocurrency. Learn more via the OpenBenchmarking.org test page.
NCNN NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ms, Fewer Is Better NCNN 20220729 Target: CPU - Model: mnasnet AVX-512 Off AVX-512 On 10 20 30 40 50 SE +/- 0.49, N = 3 SE +/- 0.26, N = 8 44.19 42.30 -mno-avx512f - MIN: 41.56 / MAX: 148.98 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 39.64 / MAX: 571.78 1. (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20220729 Target: CPU - Model: efficientnet-b0 AVX-512 Off AVX-512 On 14 28 42 56 70 SE +/- 0.31, N = 3 SE +/- 0.35, N = 8 62.86 57.71 -mno-avx512f - MIN: 59.46 / MAX: 154.82 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 54.52 / MAX: 522.74 1. (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20220729 Target: CPU - Model: blazeface AVX-512 Off AVX-512 On 7 14 21 28 35 SE +/- 0.54, N = 3 SE +/- 0.18, N = 8 29.00 25.88 -mno-avx512f - MIN: 26.03 / MAX: 144.12 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 24.45 / MAX: 112.37 1. (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20220729 Target: CPU - Model: googlenet AVX-512 Off AVX-512 On 20 40 60 80 100 SE +/- 1.35, N = 3 SE +/- 0.76, N = 8 76.13 72.80 -mno-avx512f - MIN: 70.13 / MAX: 155.61 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 67.13 / MAX: 388.52 1. (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20220729 Target: CPU - Model: resnet50 AVX-512 Off AVX-512 On 15 30 45 60 75 SE +/- 0.89, N = 3 SE +/- 0.50, N = 8 67.48 66.34 -mno-avx512f - MIN: 63.4 / MAX: 170.14 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 62.92 / MAX: 194.74 1. (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20220729 Target: CPU - Model: regnety_400m AVX-512 Off AVX-512 On 60 120 180 240 300 SE +/- 3.75, N = 3 SE +/- 2.25, N = 8 270.12 247.32 -mno-avx512f - MIN: 245.47 / MAX: 498.8 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 232.61 / MAX: 506.55 1. (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread
OpenBenchmarking.org ms, Fewer Is Better NCNN 20220729 Target: CPU - Model: vision_transformer AVX-512 Off AVX-512 On 20 40 60 80 100 SE +/- 5.82, N = 3 SE +/- 1.59, N = 8 86.16 74.93 -mno-avx512f - MIN: 73.72 / MAX: 1760.74 -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 65.04 / MAX: 2154.62 1. (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread
OpenVINO This is a test of the Intel OpenVINO, a toolkit around neural networks, using its built-in benchmarking support and analyzing the throughput and latency for various models. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org FPS, More Is Better OpenVINO 2022.2.dev Model: Face Detection FP16 - Device: CPU AVX-512 Off AVX-512 On 20 40 60 80 100 SE +/- 0.01, N = 3 SE +/- 0.07, N = 3 43.94 102.04 -mno-avx512f -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared
OpenBenchmarking.org FPS, More Is Better OpenVINO 2022.2.dev Model: Face Detection FP16-INT8 - Device: CPU AVX-512 Off AVX-512 On 40 80 120 160 200 SE +/- 0.02, N = 3 SE +/- 0.05, N = 3 94.97 193.93 -mno-avx512f -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared
OpenBenchmarking.org FPS, More Is Better OpenVINO 2022.2.dev Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU AVX-512 Off AVX-512 On 30K 60K 90K 120K 150K SE +/- 1130.85, N = 4 SE +/- 1328.20, N = 3 108449.49 148967.98 -mno-avx512f -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared
OpenBenchmarking.org FPS, More Is Better OpenVINO 2022.2.dev Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU AVX-512 Off AVX-512 On 40K 80K 120K 160K 200K SE +/- 1455.37, N = 3 SE +/- 127.59, N = 3 151239.60 170652.71 -mno-avx512f -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared
OpenBenchmarking.org FPS, More Is Better OpenVINO 2022.2.dev Model: Person Detection FP16 - Device: CPU AVX-512 Off AVX-512 On 10 20 30 40 50 SE +/- 0.22, N = 3 SE +/- 0.20, N = 3 25.50 43.34 -mno-avx512f -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared
OpenBenchmarking.org FPS, More Is Better OpenVINO 2022.2.dev Model: Person Detection FP32 - Device: CPU AVX-512 Off AVX-512 On 10 20 30 40 50 SE +/- 0.04, N = 3 SE +/- 0.18, N = 3 25.74 43.34 -mno-avx512f -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared
OpenBenchmarking.org FPS, More Is Better OpenVINO 2022.2.dev Model: Weld Porosity Detection FP16-INT8 - Device: CPU AVX-512 Off AVX-512 On 4K 8K 12K 16K 20K SE +/- 1.81, N = 3 SE +/- 12.72, N = 3 9672.94 19800.20 -mno-avx512f -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared
OpenBenchmarking.org FPS, More Is Better OpenVINO 2022.2.dev Model: Weld Porosity Detection FP16 - Device: CPU AVX-512 Off AVX-512 On 2K 4K 6K 8K 10K SE +/- 4.29, N = 3 SE +/- 6.96, N = 3 4110.49 9988.44 -mno-avx512f -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared
OpenBenchmarking.org FPS, More Is Better OpenVINO 2022.2.dev Model: Vehicle Detection FP16-INT8 - Device: CPU AVX-512 Off AVX-512 On 2K 4K 6K 8K 10K SE +/- 1.39, N = 3 SE +/- 2.53, N = 3 6065.76 11202.62 -mno-avx512f -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared
OpenBenchmarking.org FPS, More Is Better OpenVINO 2022.2.dev Model: Vehicle Detection FP16 - Device: CPU AVX-512 Off AVX-512 On 1600 3200 4800 6400 8000 SE +/- 10.17, N = 3 SE +/- 3.52, N = 3 3782.08 7452.96 -mno-avx512f -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared
OpenBenchmarking.org FPS, More Is Better OpenVINO 2022.2.dev Model: Person Vehicle Bike Detection FP16 - Device: CPU AVX-512 Off AVX-512 On 2K 4K 6K 8K 10K SE +/- 2.17, N = 3 SE +/- 4.01, N = 3 5135.56 9065.34 -mno-avx512f -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared
OpenBenchmarking.org FPS, More Is Better OpenVINO 2022.2.dev Model: Machine Translation EN To DE FP16 - Device: CPU AVX-512 Off AVX-512 On 200 400 600 800 1000 SE +/- 3.04, N = 3 SE +/- 4.00, N = 3 445.59 956.88 -mno-avx512f -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi 1. (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared
GROMACS The GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package testing with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.
ONNX Runtime ONNX Runtime is developed by Microsoft and partners as a open-source, cross-platform, high performance machine learning inferencing and training accelerator. This test profile runs the ONNX Runtime with various models available from the ONNX Zoo. Learn more via the OpenBenchmarking.org test page.
miniBUDE MiniBUDE is a mini application for the the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org GFInst/s, More Is Better miniBUDE 20210901 Implementation: OpenMP - Input Deck: BM1 AVX-512 Off AVX-512 On 1600 3200 4800 6400 8000 SE +/- 26.42, N = 8 SE +/- 5.79, N = 10 5065.10 7299.55 1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm
OpenBenchmarking.org GFInst/s, More Is Better miniBUDE 20210901 Implementation: OpenMP - Input Deck: BM2 AVX-512 Off AVX-512 On 2K 4K 6K 8K 10K SE +/- 31.65, N = 3 SE +/- 11.62, N = 4 6391.52 8652.01 1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm
SVT-AV1 This is a benchmark of the SVT-AV1 open-source video encoder/decoder. SVT-AV1 was originally developed by Intel as part of their Open Visual Cloud / Scalable Video Technology (SVT). Development of SVT-AV1 has since moved to the Alliance for Open Media as part of upstream AV1 development. SVT-AV1 is a CPU-based multi-threaded video encoder for the AV1 video format with a sample YUV video file. Learn more via the OpenBenchmarking.org test page.
Numenta Anomaly Benchmark Numenta Anomaly Benchmark (NAB) is a benchmark for evaluating algorithms for anomaly detection in streaming, real-time applications. It is comprised of over 50 labeled real-world and artificial time-series data files plus a novel scoring mechanism designed for real-time applications. This test profile currently measures the time to run various detectors. Learn more via the OpenBenchmarking.org test page.
Darmstadt Automotive Parallel Heterogeneous Suite DAPHNE is the Darmstadt Automotive Parallel HeterogeNEous Benchmark Suite with OpenCL / CUDA / OpenMP test cases for these automotive benchmarks for evaluating programming models in context to vehicle autonomous driving capabilities. Learn more via the OpenBenchmarking.org test page.
simdjson This is a benchmark of SIMDJSON, a high performance JSON parser. SIMDJSON aims to be the fastest JSON parser and is used by projects like Microsoft FishStore, Yandex ClickHouse, Shopify, and others. Learn more via the OpenBenchmarking.org test page.
OpenFOAM OpenFOAM is the leading free, open-source software for computational fluid dynamics (CFD). This test profile currently uses the drivaerFastback test case for analyzing automotive aerodynamics or alternatively the older motorBike input. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better OpenFOAM 10 Input: drivaerFastback, Medium Mesh Size - Mesh Time AVX-512 Off AVX-512 On 30 60 90 120 150 144.35 135.77 1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm
CP2K Molecular Dynamics CP2K is an open-source molecular dynamics software package focused on quantum chemistry and solid-state physics. This test profile currently uses the SSMP (OpenMP) version of cp2k. Learn more via the OpenBenchmarking.org test page.
JPEG XL libjxl The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities with JPEG XL offering better image quality and compression over legacy JPEG. This test profile is currently focused on the multi-threaded JPEG XL image encode performance using the reference libjxl library. Learn more via the OpenBenchmarking.org test page.
CPU Power Consumption Monitor OpenBenchmarking.org Watts CPU Power Consumption Monitor Phoronix Test Suite System Monitoring AVX-512 Off AVX-512 On 130 260 390 520 650 Min: 106.95 / Avg: 449.58 / Max: 735.32 Min: 26.37 / Avg: 434.8 / Max: 766.01
CPU Temperature Monitor OpenBenchmarking.org Celsius CPU Temperature Monitor Phoronix Test Suite System Monitoring AVX-512 Off AVX-512 On 14 28 42 56 70 Min: 35.5 / Avg: 51.26 / Max: 73.75 Min: 30.13 / Avg: 49.97 / Max: 73.38
AVX-512 On Kernel Notes: Transparent Huge Pages: madviseEnvironment Notes: CXXFLAGS="-O3 -march=native -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -mprefer-vector-width=512" CFLAGS="-O3 -march=native -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -mprefer-vector-width=512"Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -vProcessor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10110dPython Notes: Python 3.10.7Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 18 December 2022 08:13 by user phoronix.
AVX-512 Off Processor: 2 x AMD EPYC 9654 96-Core @ 2.40GHz (192 Cores / 384 Threads), Motherboard: AMD Titanite_4G (RTI1002E BIOS), Chipset: AMD Device 14a4, Memory: 1520GB, Disk: 800GB INTEL SSDPF21Q800GB, Graphics: ASPEED, Monitor: VGA HDMI, Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 22.10, Kernel: 6.1.0-phx (x86_64), Desktop: GNOME Shell 43.0, Display Server: X Server 1.21.1.4, Vulkan: 1.3.224, Compiler: GCC 12.2.0 + Clang 15.0.2-1, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madviseEnvironment Notes: CXXFLAGS="-O3 -march=native -mno-avx512f" CFLAGS="-O3 -march=native -mno-avx512f"Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -vProcessor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10110dPython Notes: Python 3.10.7Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 18 December 2022 08:13 by user phoronix.