ddf Tests for a future article. AMD EPYC 8534PN 64-Core testing with a AMD Cinnabar (RCB1009C BIOS) and ASPEED on Ubuntu 23.10 via the Phoronix Test Suite.
Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2401089-NE-DDF54911740 a Kernel Notes: Transparent Huge Pages: madviseCompiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-FTCNCZ/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-FTCNCZ/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -vProcessor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xaa00212Python Notes: Python 3.11.5Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
b Processor: AMD EPYC 8534PN 64-Core @ 2.00GHz (64 Cores / 128 Threads), Motherboard: AMD Cinnabar (RCB1009C BIOS), Chipset: AMD Device 14a4, Memory: 192GB, Disk: 3201GB Micron_7450_MTFDKCB3T2TFS, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 23.10, Kernel: 6.5.0-5-generic (x86_64), Desktop: GNOME Shell, Display Server: X Server 1.21.1.7, Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 640x480
ddf OpenBenchmarking.org Phoronix Test Suite AMD EPYC 8534PN 64-Core @ 2.00GHz (64 Cores / 128 Threads) AMD Cinnabar (RCB1009C BIOS) AMD Device 14a4 192GB 3201GB Micron_7450_MTFDKCB3T2TFS ASPEED 2 x Broadcom NetXtreme BCM5720 PCIe Ubuntu 23.10 6.5.0-5-generic (x86_64) GNOME Shell X Server 1.21.1.7 GCC 13.2.0 ext4 640x480 Processor Motherboard Chipset Memory Disk Graphics Network OS Kernel Desktop Display Server Compiler File-System Screen Resolution Ddf Benchmarks System Logs - Transparent Huge Pages: madvise - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-FTCNCZ/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-FTCNCZ/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v - Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xaa00212 - Python 3.11.5 - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
a vs. b Comparison Phoronix Test Suite Baseline +3.1% +3.1% +6.2% +6.2% +9.3% +9.3% +12.4% +12.4% 12.4% 4% 3.7% 3% 2.3% 2.1% BLAS Q.7.C.E.7 5.9% Time To Compile 5.4% Preset 13 - Bosphorus 4K Q.9.C.E.7 3.8% Multi-Threaded 3.8% Eigen CPU - 16 - ResNet-152 3.7% Update Rand CryptoNight-Heavy - 1M 3% Preset 4 - Bosphorus 1080p N.T.C.B.b.u.S.S.I - S.S.S 2.2% N.T.C.B.b.u.S.S.I - S.S.S 2.1% Rand Fill Sync Preset 8 - Bosphorus 1080p 2.1% LeelaChessZero WebP2 Image Encode Timed Gem5 Compilation SVT-AV1 WebP2 Image Encode QuantLib LeelaChessZero PyTorch Speedb Xmrig SVT-AV1 Neural Magic DeepSparse Neural Magic DeepSparse Speedb SVT-AV1 a b
ddf blender: BMW27 - CPU-Only blender: Classroom - CPU-Only blender: Fishy Cat - CPU-Only blender: Barbershop - CPU-Only blender: Pabellon Barcelona - CPU-Only cloverleaf: clover_bm cloverleaf: clover_bm64_short easywave: e2Asean Grid + BengkuluSept2007 Source - 240 easywave: e2Asean Grid + BengkuluSept2007 Source - 1200 easywave: e2Asean Grid + BengkuluSept2007 Source - 2400 embree: Pathtracer - Crown embree: Pathtracer ISPC - Crown embree: Pathtracer - Asian Dragon embree: Pathtracer - Asian Dragon Obj embree: Pathtracer ISPC - Asian Dragon embree: Pathtracer ISPC - Asian Dragon Obj ffmpeg: libx265 - Live ffmpeg: libx265 - Upload ffmpeg: libx265 - Platform ffmpeg: libx265 - Video On Demand lczero: BLAS lczero: Eigen deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream deepsparse: ResNet-50, Baseline - Synchronous Single-Stream deepsparse: ResNet-50, Baseline - Synchronous Single-Stream deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream deepsparse: ResNet-50, Sparse INT8 - Synchronous Single-Stream deepsparse: ResNet-50, Sparse INT8 - Synchronous Single-Stream deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Stream deepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Stream deepsparse: BERT-Large, NLP Question Answering - Asynchronous Multi-Stream deepsparse: BERT-Large, NLP Question Answering - Asynchronous Multi-Stream deepsparse: BERT-Large, NLP Question Answering - Synchronous Single-Stream deepsparse: BERT-Large, NLP Question Answering - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream openradioss: Bumper Beam openradioss: Chrysler Neon 1M openradioss: Cell Phone Drop Test openradioss: Bird Strike on Windshield openradioss: Rubber O-Ring Seal Installation openradioss: INIVOL and Fluid Structure Interaction Drop Container pytorch: CPU - 1 - ResNet-50 pytorch: CPU - 1 - ResNet-152 pytorch: CPU - 16 - ResNet-50 pytorch: CPU - 32 - ResNet-50 pytorch: CPU - 64 - ResNet-50 pytorch: CPU - 16 - ResNet-152 pytorch: CPU - 32 - ResNet-152 pytorch: CPU - 64 - ResNet-152 pytorch: CPU - 1 - Efficientnet_v2_l pytorch: CPU - 16 - Efficientnet_v2_l pytorch: CPU - 32 - Efficientnet_v2_l pytorch: CPU - 64 - Efficientnet_v2_l quantlib: Multi-Threaded quantlib: Single-Threaded quicksilver: CTS2 quicksilver: CORAL2 P1 quicksilver: CORAL2 P2 rav1e: 1 rav1e: 5 rav1e: 6 rav1e: 10 speedb: Rand Fill speedb: Rand Read speedb: Update Rand speedb: Seq Fill speedb: Rand Fill Sync speedb: Read While Writing speedb: Read Rand Write Rand svt-av1: Preset 4 - Bosphorus 4K svt-av1: Preset 8 - Bosphorus 4K svt-av1: Preset 12 - Bosphorus 4K svt-av1: Preset 13 - Bosphorus 4K svt-av1: Preset 4 - Bosphorus 1080p svt-av1: Preset 8 - Bosphorus 1080p svt-av1: Preset 12 - Bosphorus 1080p svt-av1: Preset 13 - Bosphorus 1080p tensorflow: CPU - 1 - VGG-16 tensorflow: CPU - 1 - AlexNet tensorflow: CPU - 16 - VGG-16 tensorflow: CPU - 16 - AlexNet tensorflow: CPU - 1 - GoogLeNet tensorflow: CPU - 1 - ResNet-50 tensorflow: CPU - 16 - GoogLeNet tensorflow: CPU - 16 - ResNet-50 build-ffmpeg: Time To Compile build-gem5: Time To Compile webp2: Default webp2: Quality 75, Compression Effort 7 webp2: Quality 95, Compression Effort 7 webp2: Quality 100, Compression Effort 5 webp2: Quality 100, Lossless Compression xmrig: KawPow - 1M xmrig: Monero - 1M xmrig: Wownero - 1M xmrig: GhostRider - 1M xmrig: CryptoNight-Heavy - 1M xmrig: CryptoNight-Femto UPX2 - 1M y-cruncher: 1B y-cruncher: 5B y-cruncher: 10B y-cruncher: 500M a b 26.71 67.06 35.54 239.83 86.14 13.67 57.21 1.947 39.643 111.038 67.9313 69.2031 77.2581 69.0867 83.8771 71.4939 114.74 23.20 47.15 47.03 315 272 36.8118 852.1762 27.5732 36.2588 1458.049 21.9243 195.6 5.1074 486.6043 65.6643 185.3007 5.3901 3813.6379 8.3709 800.5916 1.2456 220.0706 145.021 145.6232 6.8554 46.9635 673.3343 30.3213 32.9699 485.9989 65.6917 185.9864 5.3704 222.2492 143.5315 147.2472 6.7843 323.0662 98.8186 130.9206 7.6305 64.0666 496.3373 41.0087 24.3612 698.3574 45.7574 63.2873 15.7849 37.088 854.1221 27.5048 36.3487 88.11 297.08 31.94 143.85 76.78 164.67 45.19 16.54 36.12 36.66 36.87 14.70 14.37 14.89 9.53 6.00 6.06 6.03 176928.1 2634.4 16240000 21350000 16170000 0.85 3.584 4.851 12.32 369023 303866048 350612 371857 239469 15411684 2539184 6.628 67.98 190.949 187.077 17.075 131.9 511.853 597.058 9.87 30.46 35.21 299.15 17 5.97 155.77 51.06 18.145 211.562 7.44 0.54 0.27 11.63 0.06 20682.1 20714.7 40330.7 4436.2 20666.7 20698.4 10.245 53.068 112.543 5.11 26.75 67.51 35.33 240.12 86.16 13.93 57.09 1.958 39.484 111.072 67.1138 68.5827 77.103 68.5948 83.5831 71.4428 115.73 23.29 47.07 47.05 354 282 37.0225 853.3085 27.5404 36.3016 1454.9899 21.9623 191.4824 5.217 485.7323 65.7823 186.2351 5.3631 3800.6772 8.399 799.9581 1.2465 220.1545 144.9123 145.1485 6.8775 46.8626 674.9861 30.1617 33.1445 485.8785 65.7405 184.5078 5.4136 221.7174 143.8442 147.1485 6.7891 322.9189 98.8586 131.4576 7.5994 63.9058 497.6216 40.9888 24.3739 695.3481 45.9477 63.3431 15.77 37.16 853.694 27.5295 36.316 88.16 295.72 31.84 142.38 76.1 163.46 45.42 16.58 36.35 36.38 36.29 14.18 14.13 14.61 9.61 5.98 6.08 6.03 170381.3 2633.8 16310000 21280000 16190000 0.85 3.596 4.891 12.36 367684 304143127 361192 369178 244535 15231425 2539687 6.75 68.523 194.49 194.587 17.467 129.181 503.618 601.573 9.87 30.45 35.14 297.84 17 5.97 155.78 50.8 18.042 223.062 7.49 0.51 0.26 11.68 0.06 20732.7 20732.3 40146.1 4442.9 20071.9 20683 10.202 53.146 112.813 5.106 OpenBenchmarking.org
Blender Blender is an open-source 3D creation and modeling software project. This test is of Blender's Cycles performance with various sample files. GPU computing via NVIDIA OptiX and NVIDIA CUDA is currently supported as well as HIP for AMD Radeon GPUs and Intel oneAPI for Intel Graphics. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better Blender 4.0 Blend File: BMW27 - Compute: CPU-Only a b 6 12 18 24 30 26.71 26.75
OpenBenchmarking.org Seconds, Fewer Is Better CloverLeaf 1.3 Input: clover_bm64_short a b 13 26 39 52 65 57.21 57.09 1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
easyWave The easyWave software allows simulating tsunami generation and propagation in the context of early warning systems. EasyWave supports making use of OpenMP for CPU multi-threading and there are also GPU ports available but not currently incorporated as part of this test profile. The easyWave tsunami generation software is run with one of the example/reference input files for measuring the CPU execution time. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better easyWave r34 Input: e2Asean Grid + BengkuluSept2007 Source - Time: 240 a b 0.4406 0.8812 1.3218 1.7624 2.203 1.947 1.958 1. (CXX) g++ options: -O3 -fopenmp
OpenBenchmarking.org Seconds, Fewer Is Better easyWave r34 Input: e2Asean Grid + BengkuluSept2007 Source - Time: 1200 a b 9 18 27 36 45 39.64 39.48 1. (CXX) g++ options: -O3 -fopenmp
OpenBenchmarking.org Seconds, Fewer Is Better easyWave r34 Input: e2Asean Grid + BengkuluSept2007 Source - Time: 2400 a b 20 40 60 80 100 111.04 111.07 1. (CXX) g++ options: -O3 -fopenmp
Embree Intel Embree is a collection of high-performance ray-tracing kernels for execution on CPUs (and GPUs via SYCL) and supporting instruction sets such as SSE, AVX, AVX2, and AVX-512. Embree also supports making use of the Intel SPMD Program Compiler (ISPC). Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.3 Binary: Pathtracer - Model: Crown a b 15 30 45 60 75 67.93 67.11 MIN: 66.43 / MAX: 69.87 MIN: 65.7 / MAX: 68.8
OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.3 Binary: Pathtracer ISPC - Model: Crown a b 15 30 45 60 75 69.20 68.58 MIN: 67.75 / MAX: 70.94 MIN: 67.24 / MAX: 70.49
OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.3 Binary: Pathtracer - Model: Asian Dragon a b 20 40 60 80 100 77.26 77.10 MIN: 76.76 / MAX: 78.27 MIN: 76.54 / MAX: 78.04
OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.3 Binary: Pathtracer - Model: Asian Dragon Obj a b 15 30 45 60 75 69.09 68.59 MIN: 68.49 / MAX: 70.1 MIN: 67.96 / MAX: 69.51
OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.3 Binary: Pathtracer ISPC - Model: Asian Dragon a b 20 40 60 80 100 83.88 83.58 MIN: 83.23 / MAX: 85.03 MIN: 82.9 / MAX: 84.61
OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.3 Binary: Pathtracer ISPC - Model: Asian Dragon Obj a b 16 32 48 64 80 71.49 71.44 MIN: 70.94 / MAX: 72.42 MIN: 70.79 / MAX: 72.51
FFmpeg This is a benchmark of the FFmpeg multimedia framework. The FFmpeg test profile is making use of a modified version of vbench from Columbia University's Architecture and Design Lab (ARCADE) [http://arcade.cs.columbia.edu/vbench/] that is a benchmark for video-as-a-service workloads. The test profile offers the options of a range of vbench scenarios based on freely distributable video content and offers the options of using the x264 or x265 video encoders for transcoding. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org FPS, More Is Better FFmpeg 6.1 Encoder: libx265 - Scenario: Live a b 30 60 90 120 150 114.74 115.73 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
OpenBenchmarking.org FPS, More Is Better FFmpeg 6.1 Encoder: libx265 - Scenario: Upload a b 6 12 18 24 30 23.20 23.29 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
OpenBenchmarking.org FPS, More Is Better FFmpeg 6.1 Encoder: libx265 - Scenario: Platform a b 11 22 33 44 55 47.15 47.07 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
OpenBenchmarking.org FPS, More Is Better FFmpeg 6.1 Encoder: libx265 - Scenario: Video On Demand a b 11 22 33 44 55 47.03 47.05 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
OpenRadioss OpenRadioss is an open-source AGPL-licensed finite element solver for dynamic event analysis OpenRadioss is based on Altair Radioss and open-sourced in 2022. This open-source finite element solver is benchmarked with various example models available from https://www.openradioss.org/models/ and https://github.com/OpenRadioss/ModelExchange/tree/main/Examples. This test is currently using a reference OpenRadioss binary build offered via GitHub. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Seconds, Fewer Is Better OpenRadioss 2023.09.15 Model: Bumper Beam a b 20 40 60 80 100 88.11 88.16
PyTorch This is a benchmark of PyTorch making use of pytorch-benchmark [https://github.com/LukasHedegaard/pytorch-benchmark]. Currently this test profile is catered to CPU-based testing. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org batches/sec, More Is Better PyTorch 2.1 Device: CPU - Batch Size: 1 - Model: ResNet-50 a b 10 20 30 40 50 45.19 45.42 MIN: 43.72 / MAX: 45.91 MIN: 44.29 / MAX: 45.97
OpenBenchmarking.org batches/sec, More Is Better PyTorch 2.1 Device: CPU - Batch Size: 1 - Model: ResNet-152 a b 4 8 12 16 20 16.54 16.58 MIN: 16.37 / MAX: 16.68 MIN: 16.43 / MAX: 16.69
OpenBenchmarking.org batches/sec, More Is Better PyTorch 2.1 Device: CPU - Batch Size: 16 - Model: ResNet-50 a b 8 16 24 32 40 36.12 36.35 MIN: 35.2 / MAX: 36.51 MIN: 35.07 / MAX: 36.8
OpenBenchmarking.org batches/sec, More Is Better PyTorch 2.1 Device: CPU - Batch Size: 32 - Model: ResNet-50 a b 8 16 24 32 40 36.66 36.38 MIN: 29.64 / MAX: 37.04 MIN: 30.16 / MAX: 36.75
OpenBenchmarking.org batches/sec, More Is Better PyTorch 2.1 Device: CPU - Batch Size: 64 - Model: ResNet-50 a b 8 16 24 32 40 36.87 36.29 MIN: 35.7 / MAX: 37.3 MIN: 35.06 / MAX: 36.7
OpenBenchmarking.org batches/sec, More Is Better PyTorch 2.1 Device: CPU - Batch Size: 16 - Model: ResNet-152 a b 4 8 12 16 20 14.70 14.18 MIN: 14.51 / MAX: 14.85 MIN: 14.06 / MAX: 14.27
OpenBenchmarking.org batches/sec, More Is Better PyTorch 2.1 Device: CPU - Batch Size: 32 - Model: ResNet-152 a b 4 8 12 16 20 14.37 14.13 MIN: 13.3 / MAX: 14.46 MIN: 12.97 / MAX: 14.21
OpenBenchmarking.org batches/sec, More Is Better PyTorch 2.1 Device: CPU - Batch Size: 64 - Model: ResNet-152 a b 4 8 12 16 20 14.89 14.61 MIN: 13.43 / MAX: 14.98 MIN: 13.45 / MAX: 14.71
OpenBenchmarking.org batches/sec, More Is Better PyTorch 2.1 Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l a b 3 6 9 12 15 9.53 9.61 MIN: 9.35 / MAX: 9.64 MIN: 9.52 / MAX: 9.7
OpenBenchmarking.org batches/sec, More Is Better PyTorch 2.1 Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_l a b 2 4 6 8 10 6.00 5.98 MIN: 5.47 / MAX: 6.12 MIN: 5.52 / MAX: 6.11
OpenBenchmarking.org batches/sec, More Is Better PyTorch 2.1 Device: CPU - Batch Size: 32 - Model: Efficientnet_v2_l a b 2 4 6 8 10 6.06 6.08 MIN: 5.71 / MAX: 6.17 MIN: 5.69 / MAX: 6.19
OpenBenchmarking.org batches/sec, More Is Better PyTorch 2.1 Device: CPU - Batch Size: 64 - Model: Efficientnet_v2_l a b 2 4 6 8 10 6.03 6.03 MIN: 5.52 / MAX: 6.16 MIN: 5.51 / MAX: 6.13
QuantLib QuantLib is an open-source library/framework around quantitative finance for modeling, trading and risk management scenarios. QuantLib is written in C++ with Boost and its built-in benchmark used reports the QuantLib Benchmark Index benchmark score. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org MFLOPS, More Is Better QuantLib 1.32 Configuration: Multi-Threaded a b 40K 80K 120K 160K 200K 176928.1 170381.3 1. (CXX) g++ options: -O3 -march=native -fPIE -pie
OpenBenchmarking.org MFLOPS, More Is Better QuantLib 1.32 Configuration: Single-Threaded a b 600 1200 1800 2400 3000 2634.4 2633.8 1. (CXX) g++ options: -O3 -march=native -fPIE -pie
Quicksilver Quicksilver is a proxy application that represents some elements of the Mercury workload by solving a simplified dynamic Monte Carlo particle transport problem. Quicksilver is developed by Lawrence Livermore National Laboratory (LLNL) and this test profile currently makes use of the OpenMP CPU threaded code path. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Figure Of Merit, More Is Better Quicksilver 20230818 Input: CTS2 a b 3M 6M 9M 12M 15M 16240000 16310000 1. (CXX) g++ options: -fopenmp -O3 -march=native
OpenBenchmarking.org Op/s, More Is Better Speedb 2.7 Test: Random Read a b 70M 140M 210M 280M 350M 303866048 304143127 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
OpenBenchmarking.org Op/s, More Is Better Speedb 2.7 Test: Update Random a b 80K 160K 240K 320K 400K 350612 361192 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
OpenBenchmarking.org Op/s, More Is Better Speedb 2.7 Test: Sequential Fill a b 80K 160K 240K 320K 400K 371857 369178 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
OpenBenchmarking.org Op/s, More Is Better Speedb 2.7 Test: Random Fill Sync a b 50K 100K 150K 200K 250K 239469 244535 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
OpenBenchmarking.org Op/s, More Is Better Speedb 2.7 Test: Read While Writing a b 3M 6M 9M 12M 15M 15411684 15231425 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
OpenBenchmarking.org Op/s, More Is Better Speedb 2.7 Test: Read Random Write Random a b 500K 1000K 1500K 2000K 2500K 2539184 2539687 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
SVT-AV1 This is a benchmark of the SVT-AV1 open-source video encoder/decoder. SVT-AV1 was originally developed by Intel as part of their Open Visual Cloud / Scalable Video Technology (SVT). Development of SVT-AV1 has since moved to the Alliance for Open Media as part of upstream AV1 development. SVT-AV1 is a CPU-based multi-threaded video encoder for the AV1 video format with a sample YUV video file. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.8 Encoder Mode: Preset 4 - Input: Bosphorus 4K a b 2 4 6 8 10 6.628 6.750 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.8 Encoder Mode: Preset 8 - Input: Bosphorus 4K a b 15 30 45 60 75 67.98 68.52 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.8 Encoder Mode: Preset 12 - Input: Bosphorus 4K a b 40 80 120 160 200 190.95 194.49 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.8 Encoder Mode: Preset 13 - Input: Bosphorus 4K a b 40 80 120 160 200 187.08 194.59 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.8 Encoder Mode: Preset 4 - Input: Bosphorus 1080p a b 4 8 12 16 20 17.08 17.47 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.8 Encoder Mode: Preset 8 - Input: Bosphorus 1080p a b 30 60 90 120 150 131.90 129.18 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.8 Encoder Mode: Preset 12 - Input: Bosphorus 1080p a b 110 220 330 440 550 511.85 503.62 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.8 Encoder Mode: Preset 13 - Input: Bosphorus 1080p a b 130 260 390 520 650 597.06 601.57 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
TensorFlow This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 1 - Model: VGG-16 a b 3 6 9 12 15 9.87 9.87
WebP2 Image Encode This is a test of Google's libwebp2 library with the WebP2 image encode utility and using a sample 6000x4000 pixel JPEG image as the input, similar to the WebP/libwebp test profile. WebP2 is currently experimental and under heavy development as ultimately the successor to WebP. WebP2 supports 10-bit HDR, more efficienct lossy compression, improved lossless compression, animation support, and full multi-threading support compared to WebP. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org MP/s, More Is Better WebP2 Image Encode 20220823 Encode Settings: Default a b 2 4 6 8 10 7.44 7.49 1. (CXX) g++ options: -msse4.2 -fno-rtti -O3 -ldl
Xmrig Xmrig is an open-source cross-platform CPU/GPU miner for RandomX, KawPow, CryptoNight and AstroBWT. This test profile is setup to measure the Xmrig CPU mining performance. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org H/s, More Is Better Xmrig 6.21 Variant: KawPow - Hash Count: 1M a b 4K 8K 12K 16K 20K 20682.1 20732.7 1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
OpenBenchmarking.org H/s, More Is Better Xmrig 6.21 Variant: Monero - Hash Count: 1M a b 4K 8K 12K 16K 20K 20714.7 20732.3 1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
OpenBenchmarking.org H/s, More Is Better Xmrig 6.21 Variant: Wownero - Hash Count: 1M a b 9K 18K 27K 36K 45K 40330.7 40146.1 1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
OpenBenchmarking.org H/s, More Is Better Xmrig 6.21 Variant: GhostRider - Hash Count: 1M a b 1000 2000 3000 4000 5000 4436.2 4442.9 1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
OpenBenchmarking.org H/s, More Is Better Xmrig 6.21 Variant: CryptoNight-Heavy - Hash Count: 1M a b 4K 8K 12K 16K 20K 20666.7 20071.9 1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
OpenBenchmarking.org H/s, More Is Better Xmrig 6.21 Variant: CryptoNight-Femto UPX2 - Hash Count: 1M a b 4K 8K 12K 16K 20K 20698.4 20683.0 1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
a Kernel Notes: Transparent Huge Pages: madviseCompiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-FTCNCZ/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-FTCNCZ/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -vProcessor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xaa00212Python Notes: Python 3.11.5Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 7 January 2024 21:18 by user phoronix.
b Processor: AMD EPYC 8534PN 64-Core @ 2.00GHz (64 Cores / 128 Threads), Motherboard: AMD Cinnabar (RCB1009C BIOS), Chipset: AMD Device 14a4, Memory: 192GB, Disk: 3201GB Micron_7450_MTFDKCB3T2TFS, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 23.10, Kernel: 6.5.0-5-generic (x86_64), Desktop: GNOME Shell, Display Server: X Server 1.21.1.7, Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 640x480
Kernel Notes: Transparent Huge Pages: madviseCompiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-FTCNCZ/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-FTCNCZ/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -vProcessor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xaa00212Python Notes: Python 3.11.5Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 8 January 2024 00:13 by user phoronix.