ddf
Tests for a future article. AMD EPYC 8534PN 64-Core testing with an AMD Cinnabar (RCB1009C BIOS) and ASPEED on Ubuntu 23.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2401089-NE-DDF54911740&grs&sor
ddf: system details for runs a and b

Processor: AMD EPYC 8534PN 64-Core @ 2.00GHz (64 Cores / 128 Threads)
Motherboard: AMD Cinnabar (RCB1009C BIOS)
Chipset: AMD Device 14a4
Memory: 192GB
Disk: 3201GB Micron_7450_MTFDKCB3T2TFS
Graphics: ASPEED
Network: 2 x Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 23.10
Kernel: 6.5.0-5-generic (x86_64)
Desktop: GNOME Shell
Display Server: X Server 1.21.1.7
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 640x480

Kernel Details: Transparent Huge Pages: madvise
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-FTCNCZ/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-FTCNCZ/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: acpi-cpufreq performance (Boost: Enabled); CPU Microcode: 0xaa00212
Python Details: Python 3.11.5
Security Details:
gather_data_sampling: Not affected
itlb_multihit: Not affected
l1tf: Not affected
mds: Not affected
meltdown: Not affected
mmio_stale_data: Not affected
retbleed: Not affected
spec_rstack_overflow: Mitigation of safe RET
spec_store_bypass: Mitigation of SSB disabled via prctl
spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
srbds: Not affected
tsx_async_abort: Not affected
ddf: results summary

| Test | a | b |
|---|---|---|
| lczero: BLAS | 315 | 354 |
| webp2: Quality 75, Compression Effort 7 | 0.54 | 0.51 |
| build-gem5: Time To Compile | 211.562 | 223.062 |
| svt-av1: Preset 13 - Bosphorus 4K | 187.077 | 194.587 |
| webp2: Quality 95, Compression Effort 7 | 0.27 | 0.26 |
| quantlib: Multi-Threaded | 176928.1 | 170381.3 |
| lczero: Eigen | 272 | 282 |
| pytorch: CPU - 16 - ResNet-152 | 14.70 | 14.18 |
| speedb: Update Rand | 350612 | 361192 |
| xmrig: CryptoNight-Heavy - 1M | 20666.7 | 20071.9 |
| svt-av1: Preset 4 - Bosphorus 1080p | 17.075 | 17.467 |
| deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream | 195.6 | 191.4824 |
| deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream | 5.1074 | 5.217 |
| speedb: Rand Fill Sync | 239469 | 244535 |
| svt-av1: Preset 8 - Bosphorus 1080p | 131.9 | 129.181 |
| pytorch: CPU - 64 - ResNet-152 | 14.89 | 14.61 |
| cloverleaf: clover_bm | 13.67 | 13.93 |
| svt-av1: Preset 12 - Bosphorus 4K | 190.949 | 194.49 |
| svt-av1: Preset 4 - Bosphorus 4K | 6.628 | 6.75 |
| pytorch: CPU - 32 - ResNet-152 | 14.37 | 14.13 |
| svt-av1: Preset 12 - Bosphorus 1080p | 511.853 | 503.618 |
| pytorch: CPU - 64 - ResNet-50 | 36.87 | 36.29 |
| embree: Pathtracer - Crown | 67.9313 | 67.1138 |
| speedb: Read While Writing | 15411684 | 15231425 |
| openradioss: Bird Strike on Windshield | 143.85 | 142.38 |
| embree: Pathtracer ISPC - Crown | 69.2031 | 68.5827 |
| openradioss: Rubber O-Ring Seal Installation | 76.78 | 76.1 |
| ffmpeg: libx265 - Live | 114.74 | 115.73 |
| pytorch: CPU - 1 - Efficientnet_v2_l | 9.53 | 9.61 |
| rav1e: 6 | 4.851 | 4.891 |
| deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream | 5.3704 | 5.4136 |
| deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream | 185.9864 | 184.5078 |
| svt-av1: Preset 8 - Bosphorus 4K | 67.98 | 68.523 |
| pytorch: CPU - 32 - ResNet-50 | 36.66 | 36.38 |
| svt-av1: Preset 13 - Bosphorus 1080p | 597.058 | 601.573 |
| openradioss: INIVOL and Fluid Structure Interaction Drop Container | 164.67 | 163.46 |
| speedb: Seq Fill | 371857 | 369178 |
| embree: Pathtracer - Asian Dragon Obj | 69.0867 | 68.5948 |
| webp2: Default | 7.44 | 7.49 |
| blender: Classroom - CPU-Only | 67.06 | 67.51 |
| pytorch: CPU - 16 - ResNet-50 | 36.12 | 36.35 |
| blender: Fishy Cat - CPU-Only | 35.54 | 35.33 |
| deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream | 36.8118 | 37.0225 |
| build-ffmpeg: Time To Compile | 18.145 | 18.042 |
| easywave: e2Asean Grid + BengkuluSept2007 Source - 240 | 1.947 | 1.958 |
| deepsparse: BERT-Large, NLP Question Answering - Synchronous Single-Stream | 32.9699 | 33.1445 |
| deepsparse: BERT-Large, NLP Question Answering - Synchronous Single-Stream | 30.3213 | 30.1617 |
| tensorflow: CPU - 16 - ResNet-50 | 51.06 | 50.8 |
| pytorch: CPU - 1 - ResNet-50 | 45.19 | 45.42 |
| deepsparse: ResNet-50, Baseline - Synchronous Single-Stream | 185.3007 | 186.2351 |
| deepsparse: ResNet-50, Baseline - Synchronous Single-Stream | 5.3901 | 5.3631 |
| openradioss: Chrysler Neon 1M | 297.08 | 295.72 |
| xmrig: Wownero - 1M | 40330.7 | 40146.1 |
| tensorflow: CPU - 16 - AlexNet | 299.15 | 297.84 |
| deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream | 698.3574 | 695.3481 |
| quicksilver: CTS2 | 16240000 | 16310000 |
| webp2: Quality 100, Compression Effort 5 | 11.63 | 11.68 |
| y-cruncher: 1B | 10.245 | 10.202 |
| deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream | 45.7574 | 45.9477 |
| deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream | 130.9206 | 131.4576 |
| deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream | 7.6305 | 7.5994 |
| easywave: e2Asean Grid + BengkuluSept2007 Source - 1200 | 39.643 | 39.484 |
| ffmpeg: libx265 - Upload | 23.20 | 23.29 |
| speedb: Rand Fill | 369023 | 367684 |
| embree: Pathtracer ISPC - Asian Dragon | 83.8771 | 83.5831 |
| deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream | 3813.6379 | 3800.6772 |
| deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream | 8.3709 | 8.399 |
| rav1e: 5 | 3.584 | 3.596 |
| pytorch: CPU - 16 - Efficientnet_v2_l | 6.00 | 5.98 |
| pytorch: CPU - 32 - Efficientnet_v2_l | 6.06 | 6.08 |
| quicksilver: CORAL2 P1 | 21350000 | 21280000 |
| deepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Stream | 145.6232 | 145.1485 |
| rav1e: 10 | 12.32 | 12.36 |
| deepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Stream | 6.8554 | 6.8775 |
| openradioss: Cell Phone Drop Test | 31.94 | 31.84 |
| deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream | 496.3373 | 497.6216 |
| deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream | 64.0666 | 63.9058 |
| deepsparse: BERT-Large, NLP Question Answering - Asynchronous Multi-Stream | 673.3343 | 674.9861 |
| xmrig: KawPow - 1M | 20682.1 | 20732.7 |
| pytorch: CPU - 1 - ResNet-152 | 16.54 | 16.58 |
| y-cruncher: 10B | 112.543 | 112.813 |
| deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream | 222.2492 | 221.7174 |
| deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream | 143.5315 | 143.8442 |
| deepsparse: BERT-Large, NLP Question Answering - Asynchronous Multi-Stream | 46.9635 | 46.8626 |
| deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream | 1458.049 | 1454.9899 |
| cloverleaf: clover_bm64_short | 57.21 | 57.09 |
| embree: Pathtracer - Asian Dragon | 77.2581 | 77.103 |
| tensorflow: CPU - 16 - VGG-16 | 35.21 | 35.14 |
| deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream | 37.088 | 37.16 |
| deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream | 65.6643 | 65.7823 |
| deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream | 486.6043 | 485.7323 |
| deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream | 21.9243 | 21.9623 |
| ffmpeg: libx265 - Platform | 47.15 | 47.07 |
| xmrig: GhostRider - 1M | 4436.2 | 4442.9 |
| blender: BMW27 - CPU-Only | 26.71 | 26.75 |
| y-cruncher: 5B | 53.068 | 53.146 |
| deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream | 852.1762 | 853.3085 |
| quicksilver: CORAL2 P2 | 16170000 | 16190000 |
| blender: Barbershop - CPU-Only | 239.83 | 240.12 |
| deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream | 27.5732 | 27.5404 |
| deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream | 36.2588 | 36.3016 |
| deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream | 15.7849 | 15.77 |
| speedb: Rand Read | 303866048 | 304143127 |
| deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream | 36.3487 | 36.316 |
| deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream | 27.5048 | 27.5295 |
| deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream | 63.2873 | 63.3431 |
| xmrig: Monero - 1M | 20714.7 | 20732.3 |
| deepsparse: ResNet-50, Sparse INT8 - Synchronous Single-Stream | 800.5916 | 799.9581 |
| y-cruncher: 500M | 5.11 | 5.106 |
| deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream | 145.021 | 144.9123 |
| xmrig: CryptoNight-Femto UPX2 - 1M | 20698.4 | 20683 |
| deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream | 65.6917 | 65.7405 |
| deepsparse: ResNet-50, Sparse INT8 - Synchronous Single-Stream | 1.2456 | 1.2465 |
| embree: Pathtracer ISPC - Asian Dragon Obj | 71.4939 | 71.4428 |
| deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream | 6.7843 | 6.7891 |
| deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream | 147.2472 | 147.1485 |
| openradioss: Bumper Beam | 88.11 | 88.16 |
| deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream | 24.3612 | 24.3739 |
| deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream | 854.1221 | 853.694 |
| deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream | 41.0087 | 40.9888 |
| deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream | 323.0662 | 322.9189 |
| ffmpeg: libx265 - Video On Demand | 47.03 | 47.05 |
| deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream | 98.8186 | 98.8586 |
| deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream | 220.0706 | 220.1545 |
| tensorflow: CPU - 1 - AlexNet | 30.46 | 30.45 |
| easywave: e2Asean Grid + BengkuluSept2007 Source - 2400 | 111.038 | 111.072 |
| deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream | 485.9989 | 485.8785 |
| blender: Pabellon Barcelona - CPU-Only | 86.14 | 86.16 |
| quantlib: Single-Threaded | 2634.4 | 2633.8 |
| speedb: Read Rand Write Rand | 2539184 | 2539687 |
| tensorflow: CPU - 16 - GoogLeNet | 155.77 | 155.78 |
| tensorflow: CPU - 1 - ResNet-50 | 5.97 | 5.97 |
| tensorflow: CPU - 1 - GoogLeNet | 17 | 17 |
| tensorflow: CPU - 1 - VGG-16 | 9.87 | 9.87 |
| pytorch: CPU - 64 - Efficientnet_v2_l | 6.03 | 6.03 |
| rav1e: 1 | 0.85 | 0.85 |
| webp2: Quality 100, Lossless Compression | 0.06 | 0.06 |
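Because some tests are "More Is Better" and others "Fewer Is Better", the raw a and b figures are not directly comparable across rows. The following is a minimal sketch of one way to turn them into a signed percentage, assuming the reader wants "how much better or worse is run b than run a". The helper names `percent_change` and `summarize` are hypothetical and not part of the Phoronix Test Suite; the three sample rows and their directions are copied from the results in this document.

```python
# Hypothetical helpers for reading the a/b results; not Phoronix Test Suite code.

def percent_change(a: float, b: float, higher_is_better: bool) -> float:
    """Return how much better (+) or worse (-) run b is than run a, in percent."""
    if higher_is_better:
        return (b / a - 1.0) * 100.0
    # For time-like metrics a smaller value is better, so the ratio is inverted.
    return (a / b - 1.0) * 100.0


# (test, a, b, higher_is_better) taken from the results summary.
SAMPLES = [
    ("lczero: BLAS", 315.0, 354.0, True),
    ("build-gem5: Time To Compile", 211.562, 223.062, False),
    ("svt-av1: Preset 13 - Bosphorus 4K", 187.077, 194.587, True),
]


def summarize(samples):
    """Return (test name, rounded percent change of b versus a) pairs."""
    return [
        (name, round(percent_change(a, b, hib), 2))
        for name, a, b, hib in samples
    ]


if __name__ == "__main__":
    for name, delta in summarize(SAMPLES):
        print(f"{name}: b vs a {delta:+.2f}%")
```

A positive value means run b came out ahead on that test, whichever direction the metric runs. With only one reported figure per run in the summary, small differences may well be run-to-run noise.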
LeelaChessZero 0.30. Backend: BLAS. Nodes Per Second, More Is Better. a: 315; b: 354. (CXX) g++ options: -flto -pthread
WebP2 Image Encode 20220823. Encode Settings: Quality 75, Compression Effort 7. MP/s, More Is Better. a: 0.54; b: 0.51. (CXX) g++ options: -msse4.2 -fno-rtti -O3 -ldl
Timed Gem5 Compilation 23.0.1. Time To Compile. Seconds, Fewer Is Better. a: 211.56; b: 223.06.
SVT-AV1 1.8. Encoder Mode: Preset 13 - Input: Bosphorus 4K. Frames Per Second, More Is Better. a: 187.08; b: 194.59. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
WebP2 Image Encode 20220823. Encode Settings: Quality 95, Compression Effort 7. MP/s, More Is Better. a: 0.27; b: 0.26. (CXX) g++ options: -msse4.2 -fno-rtti -O3 -ldl
QuantLib 1.32. Configuration: Multi-Threaded. MFLOPS, More Is Better. a: 176928.1; b: 170381.3. (CXX) g++ options: -O3 -march=native -fPIE -pie
LeelaChessZero 0.30. Backend: Eigen. Nodes Per Second, More Is Better. a: 272; b: 282. (CXX) g++ options: -flto -pthread
PyTorch 2.1. Device: CPU - Batch Size: 16 - Model: ResNet-152. batches/sec, More Is Better. a: 14.70 (MIN: 14.51 / MAX: 14.85); b: 14.18 (MIN: 14.06 / MAX: 14.27).
Speedb 2.7. Test: Update Random. Op/s, More Is Better. a: 350612; b: 361192. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
Xmrig 6.21. Variant: CryptoNight-Heavy - Hash Count: 1M. H/s, More Is Better. a: 20666.7; b: 20071.9. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
SVT-AV1 1.8. Encoder Mode: Preset 4 - Input: Bosphorus 1080p. Frames Per Second, More Is Better. a: 17.08; b: 17.47. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Neural Magic DeepSparse 1.6. Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream. items/sec, More Is Better. a: 195.60; b: 191.48.
Neural Magic DeepSparse 1.6. Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream. ms/batch, Fewer Is Better. a: 5.1074; b: 5.2170.
Speedb 2.7. Test: Random Fill Sync. Op/s, More Is Better. a: 239469; b: 244535. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
SVT-AV1 1.8. Encoder Mode: Preset 8 - Input: Bosphorus 1080p. Frames Per Second, More Is Better. a: 131.90; b: 129.18. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
PyTorch 2.1. Device: CPU - Batch Size: 64 - Model: ResNet-152. batches/sec, More Is Better. a: 14.89 (MIN: 13.43 / MAX: 14.98); b: 14.61 (MIN: 13.45 / MAX: 14.71).
CloverLeaf 1.3. Input: clover_bm. Seconds, Fewer Is Better. a: 13.67; b: 13.93. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
SVT-AV1 1.8. Encoder Mode: Preset 12 - Input: Bosphorus 4K. Frames Per Second, More Is Better. a: 190.95; b: 194.49. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 1.8. Encoder Mode: Preset 4 - Input: Bosphorus 4K. Frames Per Second, More Is Better. a: 6.628; b: 6.750. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
PyTorch 2.1. Device: CPU - Batch Size: 32 - Model: ResNet-152. batches/sec, More Is Better. a: 14.37 (MIN: 13.3 / MAX: 14.46); b: 14.13 (MIN: 12.97 / MAX: 14.21).
SVT-AV1 1.8. Encoder Mode: Preset 12 - Input: Bosphorus 1080p. Frames Per Second, More Is Better. a: 511.85; b: 503.62. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
PyTorch 2.1. Device: CPU - Batch Size: 64 - Model: ResNet-50. batches/sec, More Is Better. a: 36.87 (MIN: 35.7 / MAX: 37.3); b: 36.29 (MIN: 35.06 / MAX: 36.7).
Embree 4.3. Binary: Pathtracer - Model: Crown. Frames Per Second, More Is Better. a: 67.93 (MIN: 66.43 / MAX: 69.87); b: 67.11 (MIN: 65.7 / MAX: 68.8).
Speedb 2.7. Test: Read While Writing. Op/s, More Is Better. a: 15411684; b: 15231425. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
OpenRadioss 2023.09.15. Model: Bird Strike on Windshield. Seconds, Fewer Is Better. a: 143.85; b: 142.38.
Embree 4.3. Binary: Pathtracer ISPC - Model: Crown. Frames Per Second, More Is Better. a: 69.20 (MIN: 67.75 / MAX: 70.94); b: 68.58 (MIN: 67.24 / MAX: 70.49).
OpenRadioss 2023.09.15. Model: Rubber O-Ring Seal Installation. Seconds, Fewer Is Better. a: 76.78; b: 76.10.
FFmpeg 6.1. Encoder: libx265 - Scenario: Live. FPS, More Is Better. a: 114.74; b: 115.73. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
PyTorch 2.1. Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l. batches/sec, More Is Better. a: 9.53 (MIN: 9.35 / MAX: 9.64); b: 9.61 (MIN: 9.52 / MAX: 9.7).
rav1e 0.7. Speed: 6. Frames Per Second, More Is Better. a: 4.851; b: 4.891.
Neural Magic DeepSparse 1.6. Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream. ms/batch, Fewer Is Better. a: 5.3704; b: 5.4136.
Neural Magic DeepSparse 1.6. Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream. items/sec, More Is Better. a: 185.99; b: 184.51.
SVT-AV1 1.8. Encoder Mode: Preset 8 - Input: Bosphorus 4K. Frames Per Second, More Is Better. a: 67.98; b: 68.52. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
PyTorch 2.1. Device: CPU - Batch Size: 32 - Model: ResNet-50. batches/sec, More Is Better. a: 36.66 (MIN: 29.64 / MAX: 37.04); b: 36.38 (MIN: 30.16 / MAX: 36.75).
SVT-AV1 1.8. Encoder Mode: Preset 13 - Input: Bosphorus 1080p. Frames Per Second, More Is Better. a: 597.06; b: 601.57. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OpenRadioss 2023.09.15. Model: INIVOL and Fluid Structure Interaction Drop Container. Seconds, Fewer Is Better. a: 164.67; b: 163.46.
Speedb 2.7. Test: Sequential Fill. Op/s, More Is Better. a: 371857; b: 369178. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
Embree 4.3. Binary: Pathtracer - Model: Asian Dragon Obj. Frames Per Second, More Is Better. a: 69.09 (MIN: 68.49 / MAX: 70.1); b: 68.59 (MIN: 67.96 / MAX: 69.51).
WebP2 Image Encode 20220823. Encode Settings: Default. MP/s, More Is Better. a: 7.44; b: 7.49. (CXX) g++ options: -msse4.2 -fno-rtti -O3 -ldl
Blender 4.0. Blend File: Classroom - Compute: CPU-Only. Seconds, Fewer Is Better. a: 67.06; b: 67.51.
PyTorch 2.1. Device: CPU - Batch Size: 16 - Model: ResNet-50. batches/sec, More Is Better. a: 36.12 (MIN: 35.2 / MAX: 36.51); b: 36.35 (MIN: 35.07 / MAX: 36.8).
Blender 4.0. Blend File: Fishy Cat - Compute: CPU-Only. Seconds, Fewer Is Better. a: 35.54; b: 35.33.
Neural Magic DeepSparse 1.6. Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream. items/sec, More Is Better. a: 36.81; b: 37.02.
Timed FFmpeg Compilation 6.1. Time To Compile. Seconds, Fewer Is Better. a: 18.15; b: 18.04.
easyWave r34. Input: e2Asean Grid + BengkuluSept2007 Source - Time: 240. Seconds, Fewer Is Better. a: 1.947; b: 1.958. (CXX) g++ options: -O3 -fopenmp
Neural Magic DeepSparse 1.6. Model: BERT-Large, NLP Question Answering - Scenario: Synchronous Single-Stream. ms/batch, Fewer Is Better. a: 32.97; b: 33.14.
Neural Magic DeepSparse 1.6. Model: BERT-Large, NLP Question Answering - Scenario: Synchronous Single-Stream. items/sec, More Is Better. a: 30.32; b: 30.16.
TensorFlow 2.12. Device: CPU - Batch Size: 16 - Model: ResNet-50. images/sec, More Is Better. a: 51.06; b: 50.80.
PyTorch 2.1. Device: CPU - Batch Size: 1 - Model: ResNet-50. batches/sec, More Is Better. a: 45.19 (MIN: 43.72 / MAX: 45.91); b: 45.42 (MIN: 44.29 / MAX: 45.97).
Neural Magic DeepSparse 1.6. Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream. items/sec, More Is Better. a: 185.30; b: 186.24.
Neural Magic DeepSparse 1.6. Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream. ms/batch, Fewer Is Better. a: 5.3901; b: 5.3631.
OpenRadioss 2023.09.15. Model: Chrysler Neon 1M. Seconds, Fewer Is Better. a: 297.08; b: 295.72.
Xmrig 6.21. Variant: Wownero - Hash Count: 1M. H/s, More Is Better. a: 40330.7; b: 40146.1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
TensorFlow 2.12. Device: CPU - Batch Size: 16 - Model: AlexNet. images/sec, More Is Better. a: 299.15; b: 297.84.
Neural Magic DeepSparse 1.6. Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream. items/sec, More Is Better. a: 698.36; b: 695.35.
Quicksilver 20230818. Input: CTS2. Figure Of Merit, More Is Better. a: 16240000; b: 16310000. (CXX) g++ options: -fopenmp -O3 -march=native
WebP2 Image Encode 20220823. Encode Settings: Quality 100, Compression Effort 5. MP/s, More Is Better. a: 11.63; b: 11.68. (CXX) g++ options: -msse4.2 -fno-rtti -O3 -ldl
Y-Cruncher 0.8.3. Pi Digits To Calculate: 1B. Seconds, Fewer Is Better. a: 10.25; b: 10.20.
Neural Magic DeepSparse 1.6. Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream. ms/batch, Fewer Is Better. a: 45.76; b: 45.95.
Neural Magic DeepSparse 1.6. Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream. items/sec, More Is Better. a: 130.92; b: 131.46.
Neural Magic DeepSparse 1.6. Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream. ms/batch, Fewer Is Better. a: 7.6305; b: 7.5994.
easyWave r34. Input: e2Asean Grid + BengkuluSept2007 Source - Time: 1200. Seconds, Fewer Is Better. a: 39.64; b: 39.48. (CXX) g++ options: -O3 -fopenmp
FFmpeg 6.1. Encoder: libx265 - Scenario: Upload. FPS, More Is Better. a: 23.20; b: 23.29. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Speedb 2.7. Test: Random Fill. Op/s, More Is Better. a: 369023; b: 367684. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
Embree 4.3. Binary: Pathtracer ISPC - Model: Asian Dragon. Frames Per Second, More Is Better. a: 83.88 (MIN: 83.23 / MAX: 85.03); b: 83.58 (MIN: 82.9 / MAX: 84.61).
Neural Magic DeepSparse 1.6. Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream. items/sec, More Is Better. a: 3813.64; b: 3800.68.
Neural Magic DeepSparse 1.6. Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream. ms/batch, Fewer Is Better. a: 8.3709; b: 8.3990.
rav1e 0.7. Speed: 5. Frames Per Second, More Is Better. a: 3.584; b: 3.596.
PyTorch 2.1. Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_l. batches/sec, More Is Better. a: 6.00 (MIN: 5.47 / MAX: 6.12); b: 5.98 (MIN: 5.52 / MAX: 6.11).
PyTorch 2.1. Device: CPU - Batch Size: 32 - Model: Efficientnet_v2_l. batches/sec, More Is Better. a: 6.06 (MIN: 5.71 / MAX: 6.17); b: 6.08 (MIN: 5.69 / MAX: 6.19).
Quicksilver 20230818. Input: CORAL2 P1. Figure Of Merit, More Is Better. a: 21350000; b: 21280000. (CXX) g++ options: -fopenmp -O3 -march=native
Neural Magic DeepSparse Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.6 Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream a b 30 60 90 120 150 145.62 145.15
rav1e Speed: 10 OpenBenchmarking.org Frames Per Second, More Is Better rav1e 0.7 Speed: 10 b a 3 6 9 12 15 12.36 12.32
Neural Magic DeepSparse Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.6 Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream a b 2 4 6 8 10 6.8554 6.8775
OpenRadioss Model: Cell Phone Drop Test OpenBenchmarking.org Seconds, Fewer Is Better OpenRadioss 2023.09.15 Model: Cell Phone Drop Test b a 7 14 21 28 35 31.84 31.94
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.6 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream a b 110 220 330 440 550 496.34 497.62
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.6 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream a b 14 28 42 56 70 64.07 63.91
Neural Magic DeepSparse Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.6 Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream a b 150 300 450 600 750 673.33 674.99
Xmrig Variant: KawPow - Hash Count: 1M OpenBenchmarking.org H/s, More Is Better Xmrig 6.21 Variant: KawPow - Hash Count: 1M b a 4K 8K 12K 16K 20K 20732.7 20682.1 1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
PyTorch Device: CPU - Batch Size: 1 - Model: ResNet-152 OpenBenchmarking.org batches/sec, More Is Better PyTorch 2.1 Device: CPU - Batch Size: 1 - Model: ResNet-152 b a 4 8 12 16 20 16.58 16.54 MIN: 16.43 / MAX: 16.69 MIN: 16.37 / MAX: 16.68
Y-Cruncher Pi Digits To Calculate: 10B OpenBenchmarking.org Seconds, Fewer Is Better Y-Cruncher 0.8.3 Pi Digits To Calculate: 10B a b 30 60 90 120 150 112.54 112.81
Neural Magic DeepSparse Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.6 Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream a b 50 100 150 200 250 222.25 221.72
Neural Magic DeepSparse Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.6 Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream a b 30 60 90 120 150 143.53 143.84
Neural Magic DeepSparse Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.6 Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream a b 11 22 33 44 55 46.96 46.86
Neural Magic DeepSparse Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.6 Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream a b 300 600 900 1200 1500 1458.05 1454.99
CloverLeaf 1.3 - Input: clover_bm64_short (Seconds, Fewer Is Better) - b: 57.09; a: 57.21 [(F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp]
Embree 4.3 - Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, More Is Better) - a: 77.26 (MIN: 76.76 / MAX: 78.27); b: 77.10 (MIN: 76.54 / MAX: 78.04)
TensorFlow 2.12 - Device: CPU - Batch Size: 16 - Model: VGG-16 (images/sec, More Is Better) - a: 35.21; b: 35.14
Neural Magic DeepSparse 1.6 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better) - b: 37.16; a: 37.09
Neural Magic DeepSparse 1.6 - Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better) - a: 65.66; b: 65.78
Neural Magic DeepSparse 1.6 - Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better) - a: 486.60; b: 485.73
Neural Magic DeepSparse 1.6 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better) - a: 21.92; b: 21.96
FFmpeg 6.1 - Encoder: libx265 - Scenario: Platform (FPS, More Is Better) - a: 47.15; b: 47.07 [(CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma]
Xmrig 6.21 - Variant: GhostRider - Hash Count: 1M (H/s, More Is Better) - b: 4442.9; a: 4436.2 [(CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc]
Blender 4.0 - Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better) - a: 26.71; b: 26.75
Y-Cruncher 0.8.3 - Pi Digits To Calculate: 5B (Seconds, Fewer Is Better) - a: 53.07; b: 53.15
Neural Magic DeepSparse 1.6 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better) - a: 852.18; b: 853.31
Quicksilver 20230818 - Input: CORAL2 P2 (Figure Of Merit, More Is Better) - b: 16190000; a: 16170000 [(CXX) g++ options: -fopenmp -O3 -march=native]
Blender 4.0 - Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better) - a: 239.83; b: 240.12
Neural Magic DeepSparse 1.6 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream (items/sec, More Is Better) - a: 27.57; b: 27.54
Neural Magic DeepSparse 1.6 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better) - a: 36.26; b: 36.30
Neural Magic DeepSparse 1.6 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better) - b: 15.77; a: 15.78
Speedb 2.7 - Test: Random Read (Op/s, More Is Better) - b: 304143127; a: 303866048 [(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread]
Neural Magic DeepSparse 1.6 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better) - b: 36.32; a: 36.35
Neural Magic DeepSparse 1.6 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream (items/sec, More Is Better) - b: 27.53; a: 27.50
Neural Magic DeepSparse 1.6 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream (items/sec, More Is Better) - b: 63.34; a: 63.29
Xmrig 6.21 - Variant: Monero - Hash Count: 1M (H/s, More Is Better) - b: 20732.3; a: 20714.7 [(CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc]
Neural Magic DeepSparse 1.6 - Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream (items/sec, More Is Better) - a: 800.59; b: 799.96
Y-Cruncher 0.8.3 - Pi Digits To Calculate: 500M (Seconds, Fewer Is Better) - b: 5.106; a: 5.110
Neural Magic DeepSparse 1.6 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better) - b: 144.91; a: 145.02
Xmrig 6.21 - Variant: CryptoNight-Femto UPX2 - Hash Count: 1M (H/s, More Is Better) - a: 20698.4; b: 20683.0 [(CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc]
Neural Magic DeepSparse 1.6 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better) - a: 65.69; b: 65.74
Neural Magic DeepSparse 1.6 - Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better) - a: 1.2456; b: 1.2465
Embree 4.3 - Binary: Pathtracer ISPC - Model: Asian Dragon Obj (Frames Per Second, More Is Better) - a: 71.49 (MIN: 70.94 / MAX: 72.42); b: 71.44 (MIN: 70.79 / MAX: 72.51)
Neural Magic DeepSparse 1.6 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better) - a: 6.7843; b: 6.7891
Neural Magic DeepSparse 1.6 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream (items/sec, More Is Better) - a: 147.25; b: 147.15
OpenRadioss 2023.09.15 - Model: Bumper Beam (Seconds, Fewer Is Better) - a: 88.11; b: 88.16
Neural Magic DeepSparse 1.6 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better) - a: 24.36; b: 24.37
Neural Magic DeepSparse 1.6 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better) - b: 853.69; a: 854.12
Neural Magic DeepSparse 1.6 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream (items/sec, More Is Better) - a: 41.01; b: 40.99
Neural Magic DeepSparse 1.6 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better) - a: 323.07; b: 322.92
FFmpeg 6.1 - Encoder: libx265 - Scenario: Video On Demand (FPS, More Is Better) - b: 47.05; a: 47.03 [(CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma]
Neural Magic DeepSparse 1.6 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better) - a: 98.82; b: 98.86
Neural Magic DeepSparse 1.6 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better) - b: 220.15; a: 220.07
TensorFlow 2.12 - Device: CPU - Batch Size: 1 - Model: AlexNet (images/sec, More Is Better) - a: 30.46; b: 30.45
easyWave r34 - Input: e2Asean Grid + BengkuluSept2007 Source - Time: 2400 (Seconds, Fewer Is Better) - a: 111.04; b: 111.07 [(CXX) g++ options: -O3 -fopenmp]
Neural Magic DeepSparse 1.6 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better) - a: 486.00; b: 485.88
Blender 4.0 - Blend File: Pabellon Barcelona - Compute: CPU-Only (Seconds, Fewer Is Better) - a: 86.14; b: 86.16
QuantLib 1.32 - Configuration: Single-Threaded (MFLOPS, More Is Better) - a: 2634.4; b: 2633.8 [(CXX) g++ options: -O3 -march=native -fPIE -pie]
Speedb 2.7 - Test: Read Random Write Random (Op/s, More Is Better) - b: 2539687; a: 2539184 [(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread]
TensorFlow 2.12 - Device: CPU - Batch Size: 16 - Model: GoogLeNet (images/sec, More Is Better) - b: 155.78; a: 155.77
TensorFlow 2.12 - Device: CPU - Batch Size: 1 - Model: ResNet-50 (images/sec, More Is Better) - b: 5.97; a: 5.97
TensorFlow 2.12 - Device: CPU - Batch Size: 1 - Model: GoogLeNet (images/sec, More Is Better) - b: 17; a: 17
TensorFlow 2.12 - Device: CPU - Batch Size: 1 - Model: VGG-16 (images/sec, More Is Better) - b: 9.87; a: 9.87
PyTorch 2.1 - Device: CPU - Batch Size: 64 - Model: Efficientnet_v2_l (batches/sec, More Is Better) - b: 6.03 (MIN: 5.51 / MAX: 6.13); a: 6.03 (MIN: 5.52 / MAX: 6.16)
rav1e 0.7 - Speed: 1 (Frames Per Second, More Is Better) - b: 0.85; a: 0.85
WebP2 Image Encode 20220823 - Encode Settings: Quality 100, Lossless Compression (MP/s, More Is Better) - b: 0.06; a: 0.06 [(CXX) g++ options: -msse4.2 -fno-rtti -O3 -ldl]
Phoronix Test Suite v10.8.5