dsd: tests for a future article on Phoronix. AMD Ryzen 5 5500U testing with a NB01 NL5xNU motherboard (1.07.11RTR1 BIOS) and AMD Lucienne 512MB graphics on Tuxedo 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2306284-NE-DSD34976978&sro&grs .
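For reference, the Phoronix Test Suite accepts an OpenBenchmarking.org result identifier directly, which is the usual way to re-run this exact test selection locally and compare against the numbers below. A minimal sketch, assuming a working phoronix-test-suite installation and using the result ID taken from the URL above:

phoronix-test-suite benchmark 2306284-NE-DSD34976978

This should fetch the referenced result file and offer to run the same tests, recording the new run alongside a, b, and c.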
Runs a, b, and c used the identical system configuration:
Processor: AMD Ryzen 5 5500U @ 4.06GHz (6 Cores / 12 Threads)
Motherboard: NB01 NL5xNU (1.07.11RTR1 BIOS)
Chipset: AMD Renoir/Cezanne
Memory: 16GB
Disk: Samsung SSD 970 EVO Plus 500GB
Graphics: AMD Lucienne 512MB (1800/400MHz)
Audio: AMD Renoir Radeon HD Audio
Network: Realtek RTL8111/8168/8411 + Intel Wi-Fi 6 AX200
OS: Tuxedo 22.04
Kernel: 6.0.0-1010-oem (x86_64)
Desktop: KDE Plasma 5.26.5
Display Server: X Server 1.21.1.3
OpenGL: 4.6 Mesa 22.3.7 (LLVM 14.0.0 DRM 3.48)
Vulkan: 1.3.230
Compiler: GCC 11.3.0
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Disk Details - NONE / noatime,rw / Block Size: 4096
Processor Details - Scaling Governor: amd-pstate ondemand (Boost: Enabled) - CPU Microcode: 0x8608103
Python Details - Python 3.10.6
Security Details - itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
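The processor and kernel settings recorded above (amd-pstate ondemand scaling governor, Transparent Huge Pages set to madvise) are read from standard Linux sysfs files, so a comparable machine can be checked before attempting to reproduce these numbers. A minimal sketch, assuming a Linux system that exposes the usual cpufreq and THP sysfs nodes:

cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver    # expect: amd-pstate, as noted in this result
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor  # expect: ondemand
cat /sys/kernel/mm/transparent_hugepage/enabled            # expect: [madvise] marked as active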
Results overview for runs a, b, and c: the full set of tests and their values are charted individually in the per-test sections below.
srsRAN Project Test: Downlink Processor Benchmark OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.5 Test: Downlink Processor Benchmark a b c 160 320 480 640 800 605.4 727.6 727.8 1. (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno
Liquid-DSP Threads: 4 - Buffer Length: 256 - Filter Length: 57 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 4 - Buffer Length: 256 - Filter Length: 57 a b 40M 80M 120M 160M 200M 181520000 193320000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Stress-NG Test: Semaphores OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Semaphores a b 3M 6M 9M 12M 15M 11539831.67 12288957.01 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG Test: Matrix 3D Math OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Matrix 3D Math a b 150 300 450 600 750 655.53 696.06 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Liquid-DSP Threads: 12 - Buffer Length: 256 - Filter Length: 32 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 12 - Buffer Length: 256 - Filter Length: 32 a b 80M 160M 240M 320M 400M 352320000 370010000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Stress-NG Test: CPU Cache OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: CPU Cache a b 300K 600K 900K 1200K 1500K 1602736.87 1540822.35 1. (CXX) g++ options: -O2 -std=gnu99 -lc
QMCPACK Input: Li2_STO_ae OpenBenchmarking.org Total Execution Time - Seconds, Fewer Is Better QMCPACK 3.16 Input: Li2_STO_ae a b c 130 260 390 520 650 561.43 582.88 563.76 1. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl
LevelDB Benchmark: Hot Read OpenBenchmarking.org Microseconds Per Op, Fewer Is Better LevelDB 1.23 Benchmark: Hot Read a b c 1.2593 2.5186 3.7779 5.0372 6.2965 5.597 5.405 5.508 1. (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy
Stress-NG Test: Forking OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Forking a b 5K 10K 15K 20K 25K 24635.01 25320.92 1. (CXX) g++ options: -O2 -std=gnu99 -lc
LevelDB Benchmark: Random Read OpenBenchmarking.org Microseconds Per Op, Fewer Is Better LevelDB 1.23 Benchmark: Random Read a b c 1.2463 2.4926 3.7389 4.9852 6.2315 5.395 5.435 5.539 1. (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy
Stress-NG Test: Floating Point OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Floating Point a b 400 800 1200 1600 2000 1902.32 1952.65 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Neural Magic DeepSparse Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream a b 30 60 90 120 150 124.94 128.21
Neural Magic DeepSparse Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream a b 160 320 480 640 800 709.67 727.89
Neural Magic DeepSparse Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream a b 6 12 18 24 30 23.97 23.38
Neural Magic DeepSparse Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream a b 0.9481 1.8962 2.8443 3.7924 4.7405 4.2139 4.1108
QMCPACK Input: FeCO6_b3lyp_gms OpenBenchmarking.org Total Execution Time - Seconds, Fewer Is Better QMCPACK 3.16 Input: FeCO6_b3lyp_gms a b c 50 100 150 200 250 224.86 227.81 222.29 1. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl
LevelDB Benchmark: Fill Sync OpenBenchmarking.org Microseconds Per Op, Fewer Is Better LevelDB 1.23 Benchmark: Fill Sync a b c 13 26 39 52 65 55.65 54.32 55.15 1. (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy
LevelDB Benchmark: Sequential Fill OpenBenchmarking.org Microseconds Per Op, Fewer Is Better LevelDB 1.23 Benchmark: Sequential Fill a b c 13 26 39 52 65 57.92 59.30 58.66 1. (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy
LevelDB Benchmark: Fill Sync OpenBenchmarking.org MB/s, More Is Better LevelDB 1.23 Benchmark: Fill Sync a b c 5 10 15 20 25 22.0 22.5 22.1 1. (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy
LevelDB Benchmark: Sequential Fill OpenBenchmarking.org MB/s, More Is Better LevelDB 1.23 Benchmark: Sequential Fill a b c 5 10 15 20 25 22.9 22.4 22.6 1. (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy
srsRAN Project Test: PUSCH Processor Benchmark, Throughput Thread OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.5 Test: PUSCH Processor Benchmark, Throughput Thread a b c 50 100 150 200 250 205.1 209.6 209.3 1. (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno
SVT-AV1 Encoder Mode: Preset 13 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.6 Encoder Mode: Preset 13 - Input: Bosphorus 1080p a b c 60 120 180 240 300 263.16 263.71 258.64 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
QMCPACK Input: FeCO6_b3lyp_gms OpenBenchmarking.org Total Execution Time - Seconds, Fewer Is Better QMCPACK 3.16 Input: FeCO6_b3lyp_gms a b c 50 100 150 200 250 215.91 211.82 212.71 1. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl
SVT-AV1 Encoder Mode: Preset 12 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.6 Encoder Mode: Preset 12 - Input: Bosphorus 1080p a b c 40 80 120 160 200 197.35 198.72 195.10 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Stress-NG Test: IO_uring OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: IO_uring a b 30K 60K 90K 120K 150K 149358.92 146785.45 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Neural Magic DeepSparse Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream a b 3 6 9 12 15 12.51 12.72
Neural Magic DeepSparse Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream a b 50 100 150 200 250 239.31 235.46
CP2K Molecular Dynamics Input: H2O-64 OpenBenchmarking.org Seconds, Fewer Is Better CP2K Molecular Dynamics 2023.1 Input: H2O-64 a b c 40 80 120 160 200 202.51 199.29 199.87 1. (F9X) gfortran options: -fopenmp -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kdbm -lcp2kgrid -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -lhdf5 -lhdf5_hl -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -lopenblas -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
nekRS Input: Kershaw OpenBenchmarking.org flops/rank, More Is Better nekRS 23.0 Input: Kershaw a b c 700M 1400M 2100M 2800M 3500M 3262230000 3311070000 3294910000 1. (CXX) g++ options: -fopenmp -O2 -march=native -mtune=native -ftree-vectorize -rdynamic -lmpi_cxx -lmpi
Stress-NG Test: Glibc C String Functions OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Glibc C String Functions a b 1.1M 2.2M 3.3M 4.4M 5.5M 5286881.52 5361145.86 1. (CXX) g++ options: -O2 -std=gnu99 -lc
High Performance Conjugate Gradient X Y Z: 104 104 104 - RT: 60 OpenBenchmarking.org GFLOP/s, More Is Better High Performance Conjugate Gradient 3.1 X Y Z: 104 104 104 - RT: 60 a b c 1.1151 2.2302 3.3453 4.4604 5.5755 4.89062 4.95055 4.95616 1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi
Stress-NG Test: Pthread OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Pthread a b 20K 40K 60K 80K 100K 88422.54 89529.40 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Embree Binary: Pathtracer ISPC - Model: Crown OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.1 Binary: Pathtracer ISPC - Model: Crown a b c 1.2775 2.555 3.8325 5.11 6.3875 5.6779 5.6125 5.6451 MIN: 5.64 / MAX: 5.77 MIN: 5.57 / MAX: 5.69 MIN: 5.61 / MAX: 5.72
Neural Magic DeepSparse Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream a b 5 10 15 20 25 22.42 22.16
Neural Magic DeepSparse Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream a b 10 20 30 40 50 44.60 45.10
Neural Magic DeepSparse Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Synchronous Single-Stream a b 7 14 21 28 35 27.63 27.94
Neural Magic DeepSparse Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Synchronous Single-Stream a b 8 16 24 32 40 36.18 35.78
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream a b 130 260 390 520 650 606.79 613.56
Stress-NG Test: MMAP OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: MMAP a b 20 40 60 80 100 105.65 106.78 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream a b 1.1105 2.221 3.3315 4.442 5.5525 4.9355 4.8843
LevelDB Benchmark: Random Delete OpenBenchmarking.org Microseconds Per Op, Fewer Is Better LevelDB 1.23 Benchmark: Random Delete a b c 13 26 39 52 65 58.12 58.27 57.70 1. (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy
LevelDB Benchmark: Random Fill OpenBenchmarking.org Microseconds Per Op, Fewer Is Better LevelDB 1.23 Benchmark: Random Fill a b c 14 28 42 56 70 61.60 61.92 61.33 1. (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy
Blender Blend File: Classroom - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 3.6 Blend File: Classroom - Compute: CPU-Only a b 160 320 480 640 800 718.97 725.85
LevelDB Benchmark: Overwrite OpenBenchmarking.org MB/s, More Is Better LevelDB 1.23 Benchmark: Overwrite a b c 5 10 15 20 25 21.3 21.5 21.5 1. (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy
LevelDB Benchmark: Random Fill OpenBenchmarking.org MB/s, More Is Better LevelDB 1.23 Benchmark: Random Fill a b c 5 10 15 20 25 21.5 21.4 21.6 1. (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy
LevelDB Benchmark: Overwrite OpenBenchmarking.org Microseconds Per Op, Fewer Is Better LevelDB 1.23 Benchmark: Overwrite a b c 14 28 42 56 70 62.26 61.80 61.71 1. (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy
srsRAN Project Test: PUSCH Processor Benchmark, Throughput Total OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.5 Test: PUSCH Processor Benchmark, Throughput Total a b c 200 400 600 800 1000 884.3 883.0 890.9 1. (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno
libxsmm M N K: 256 OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 256 a b c 30 60 90 120 150 153.3 154.6 154.2 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
Stress-NG Test: MEMFD OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: MEMFD a b 60 120 180 240 300 252.81 250.89 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG Test: Pipe OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Pipe a b 900K 1800K 2700K 3600K 4500K 4100875.46 4131488.24 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Monte Carlo Simulations of Ionised Nebulae Input: Gas HII40 OpenBenchmarking.org Seconds, Fewer Is Better Monte Carlo Simulations of Ionised Nebulae 2.02.73.3 Input: Gas HII40 a b c 7 14 21 28 35 28.32 28.47 28.27 1. (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O2 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lz
Stress-NG Test: Glibc Qsort Data Sorting OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Glibc Qsort Data Sorting a b 30 60 90 120 150 155.16 154.03 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG Test: CPU Stress OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: CPU Stress a b 3K 6K 9K 12K 15K 13540.22 13639.31 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Neural Magic DeepSparse Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream a b 4 8 12 16 20 17.12 17.24
Neural Magic DeepSparse Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream a b 13 26 39 52 65 58.41 58.00
Blender Blend File: Pabellon Barcelona - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 3.6 Blend File: Pabellon Barcelona - Compute: CPU-Only a b 200 400 600 800 1000 917.42 923.92
Neural Magic DeepSparse Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream a b 10 20 30 40 50 44.51 44.82
Neural Magic DeepSparse Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream a b 15 30 45 60 75 67.38 66.91
libxsmm M N K: 128 OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 128 a b c 40 80 120 160 200 166.7 167.8 167.2 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
GPAW Input: Carbon Nanotube OpenBenchmarking.org Seconds, Fewer Is Better GPAW 23.6 Input: Carbon Nanotube a b 120 240 360 480 600 558.43 554.85 1. (CC) gcc options: -shared -fwrapv -O2 -lxc -lblas -lmpi
Blender Blend File: BMW27 - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 3.6 Blend File: BMW27 - Compute: CPU-Only a b 60 120 180 240 300 273.44 271.69
QMCPACK Input: simple-H2O OpenBenchmarking.org Total Execution Time - Seconds, Fewer Is Better QMCPACK 3.16 Input: simple-H2O a b c 8 16 24 32 40 33.42 33.57 33.36 1. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl
Neural Magic DeepSparse Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream a b 7 14 21 28 35 27.92 27.76
Neural Magic DeepSparse Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream a b 8 16 24 32 40 35.81 36.02
nekRS Input: TurboPipe Periodic OpenBenchmarking.org flops/rank, More Is Better nekRS 23.0 Input: TurboPipe Periodic a b c 900M 1800M 2700M 3600M 4500M 4048400000 4071760000 4063470000 1. (CXX) g++ options: -fopenmp -O2 -march=native -mtune=native -ftree-vectorize -rdynamic -lmpi_cxx -lmpi
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream a b 0.9485 1.897 2.8455 3.794 4.7425 4.2154 4.1914
OSPRay Benchmark: gravity_spheres_volume/dim_512/ao/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 2.12 Benchmark: gravity_spheres_volume/dim_512/ao/real_time a b 0.1845 0.369 0.5535 0.738 0.9225 0.819999 0.815374
Stress-NG Test: AVL Tree OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: AVL Tree a b 8 16 24 32 40 36.3 36.1 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG Test: Vector Floating Point OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Vector Floating Point a b 3K 6K 9K 12K 15K 16236.58 16147.30 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Embree Binary: Pathtracer - Model: Asian Dragon OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.1 Binary: Pathtracer - Model: Asian Dragon a b c 2 4 6 8 10 7.5769 7.5774 7.6181 MIN: 7.53 / MAX: 7.74 MIN: 7.53 / MAX: 7.73 MIN: 7.57 / MAX: 7.76
SVT-AV1 Encoder Mode: Preset 8 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.6 Encoder Mode: Preset 8 - Input: Bosphorus 4K a b c 5 10 15 20 25 19.99 19.96 19.89 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
CP2K Molecular Dynamics Input: Fayalite-FIST OpenBenchmarking.org Seconds, Fewer Is Better CP2K Molecular Dynamics 2023.1 Input: Fayalite-FIST a b c 60 120 180 240 300 269.08 268.67 270.04 1. (F9X) gfortran options: -fopenmp -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kdbm -lcp2kgrid -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -lhdf5 -lhdf5_hl -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -lopenblas -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
SVT-AV1 Encoder Mode: Preset 4 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.6 Encoder Mode: Preset 4 - Input: Bosphorus 1080p a b c 2 4 6 8 10 7.530 7.514 7.493 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream a b 0.9129 1.8258 2.7387 3.6516 4.5645 4.0575 4.0383
LevelDB Benchmark: Seek Random OpenBenchmarking.org Microseconds Per Op, Fewer Is Better LevelDB 1.23 Benchmark: Seek Random a b c 2 4 6 8 10 8.655 8.643 8.684 1. (CXX) g++ options: -fno-exceptions -fno-rtti -O3 -lgmock -lgtest -lsnappy
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream a b 50 100 150 200 250 246.45 247.62
Stress-NG Test: Vector Shuffle OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Vector Shuffle a b 900 1800 2700 3600 4500 4014.79 3995.92 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Embree Binary: Pathtracer ISPC - Model: Asian Dragon Obj OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.1 Binary: Pathtracer ISPC - Model: Asian Dragon Obj a b c 2 4 6 8 10 6.0927 6.1096 6.0823 MIN: 6.06 / MAX: 6.24 MIN: 6.07 / MAX: 6.23 MIN: 6.05 / MAX: 6.22
Neural Magic DeepSparse Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Synchronous Single-Stream a b 3 6 9 12 15 12.09 12.04
Neural Magic DeepSparse Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Synchronous Single-Stream a b 20 40 60 80 100 82.68 83.03
OSPRay Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 2.12 Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time a b 0.2859 0.5718 0.8577 1.1436 1.4295 1.26523 1.27059
Embree Binary: Pathtracer - Model: Crown OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.1 Binary: Pathtracer - Model: Crown a b c 2 4 6 8 10 6.1914 6.1852 6.1671 MIN: 6.15 / MAX: 6.26 MIN: 6.15 / MAX: 6.3 MIN: 6.13 / MAX: 6.26
Liquid-DSP Threads: 4 - Buffer Length: 256 - Filter Length: 32 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 4 - Buffer Length: 256 - Filter Length: 32 a b 40M 80M 120M 160M 200M 166600000 167240000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Stress-NG Test: SENDFILE OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: SENDFILE a b 20K 40K 60K 80K 100K 97433.05 97789.15 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Neural Magic DeepSparse Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream a b 12 24 36 48 60 52.13 52.32
Neural Magic DeepSparse Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream a b 13 26 39 52 65 57.52 57.32
SVT-AV1 Encoder Mode: Preset 12 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.6 Encoder Mode: Preset 12 - Input: Bosphorus 4K a b c 12 24 36 48 60 55.45 55.40 55.60 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OSPRay Benchmark: gravity_spheres_volume/dim_512/scivis/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 2.12 Benchmark: gravity_spheres_volume/dim_512/scivis/real_time a b 0.1737 0.3474 0.5211 0.6948 0.8685 0.772072 0.769400
Stress-NG Test: Zlib OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Zlib a b 200 400 600 800 1000 778.8 781.5 1. (CXX) g++ options: -O2 -std=gnu99 -lc
SVT-AV1 Encoder Mode: Preset 8 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.6 Encoder Mode: Preset 8 - Input: Bosphorus 1080p a b c 12 24 36 48 60 54.54 54.36 54.54 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Blender Blend File: Barbershop - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 3.6 Blend File: Barbershop - Compute: CPU-Only a b 600 1200 1800 2400 3000 2831.50 2841.21
dav1d Video Input: Summer Nature 1080p OpenBenchmarking.org FPS, More Is Better dav1d 1.2.1 Video Input: Summer Nature 1080p a b c 110 220 330 440 550 486.15 484.56 484.62 1. (CC) gcc options: -pthread -lm
Embree Binary: Pathtracer ISPC - Model: Asian Dragon OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.1 Binary: Pathtracer ISPC - Model: Asian Dragon a b c 2 4 6 8 10 7.1124 7.1082 7.1306 MIN: 7.07 / MAX: 7.24 MIN: 7.07 / MAX: 7.23 MIN: 7.09 / MAX: 7.24
Stress-NG Test: System V Message Passing OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: System V Message Passing a b 1.5M 3M 4.5M 6M 7.5M 6932771.01 6911550.22 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Embree Binary: Pathtracer - Model: Asian Dragon Obj OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.1 Binary: Pathtracer - Model: Asian Dragon Obj a b c 2 4 6 8 10 6.8126 6.8102 6.7924 MIN: 6.76 / MAX: 6.97 MIN: 6.77 / MAX: 6.97 MIN: 6.75 / MAX: 6.96
libxsmm M N K: 64 OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 64 a b c 20 40 60 80 100 103.8 104.1 104.0 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
Stress-NG Test: Cloning OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Cloning a b 200 400 600 800 1000 789.77 787.51 1. (CXX) g++ options: -O2 -std=gnu99 -lc
dav1d Video Input: Summer Nature 4K OpenBenchmarking.org FPS, More Is Better dav1d 1.2.1 Video Input: Summer Nature 4K a b c 30 60 90 120 150 117.00 116.67 116.72 1. (CC) gcc options: -pthread -lm
Neural Magic DeepSparse Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream a b 5 10 15 20 25 20.96 21.02
Neural Magic DeepSparse Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream a b 11 22 33 44 55 47.69 47.56
Liquid-DSP Threads: 8 - Buffer Length: 256 - Filter Length: 57 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 8 - Buffer Length: 256 - Filter Length: 57 a b 60M 120M 180M 240M 300M 274560000 273880000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Neural Magic DeepSparse Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream a b 50 100 150 200 250 245.60 246.21
Neural Magic DeepSparse Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream a b 0.9161 1.8322 2.7483 3.6644 4.5805 4.0714 4.0615
Liquid-DSP Threads: 4 - Buffer Length: 256 - Filter Length: 512 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 4 - Buffer Length: 256 - Filter Length: 512 a b 8M 16M 24M 32M 40M 39178000 39083000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream a b 150 300 450 600 750 707.00 708.69
Stress-NG Test: Matrix Math OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Matrix Math a b 7K 14K 21K 28K 35K 31926.73 31854.25 1. (CXX) g++ options: -O2 -std=gnu99 -lc
dav1d Video Input: Chimera 1080p 10-bit OpenBenchmarking.org FPS, More Is Better dav1d 1.2.1 Video Input: Chimera 1080p 10-bit a b c 60 120 180 240 300 288.59 289.22 288.96 1. (CC) gcc options: -pthread -lm
Stress-NG Test: Wide Vector Math OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Wide Vector Math a b 50K 100K 150K 200K 250K 239285.79 238777.83 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Liquid-DSP Threads: 8 - Buffer Length: 256 - Filter Length: 512 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 8 - Buffer Length: 256 - Filter Length: 512 a b 14M 28M 42M 56M 70M 64708000 64842000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Stress-NG Test: Crypto OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Crypto a b 3K 6K 9K 12K 15K 13387.42 13360.65 1. (CXX) g++ options: -O2 -std=gnu99 -lc
SVT-AV1 Encoder Mode: Preset 13 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.6 Encoder Mode: Preset 13 - Input: Bosphorus 4K a b c 13 26 39 52 65 57.52 57.63 57.52 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Stress-NG Test: Poll OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Poll a b 150K 300K 450K 600K 750K 699321.18 697928.53 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream a b 1.1118 2.2236 3.3354 4.4472 5.559 4.9412 4.9316
SVT-AV1 Encoder Mode: Preset 4 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.6 Encoder Mode: Preset 4 - Input: Bosphorus 4K a b c 0.4642 0.9284 1.3926 1.8568 2.321 2.063 2.062 2.059 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream a b 40 80 120 160 200 202.37 202.76
Liquid-DSP Threads: 12 - Buffer Length: 256 - Filter Length: 57 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 12 - Buffer Length: 256 - Filter Length: 57 a b 60M 120M 180M 240M 300M 300830000 300260000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Neural Magic DeepSparse Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream a b 9 18 27 36 45 39.50 39.57
libxsmm M N K: 32 OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 32 a b c 13 26 39 52 65 57.2 57.3 57.2 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
dav1d Video Input: Chimera 1080p OpenBenchmarking.org FPS, More Is Better dav1d 1.2.1 Video Input: Chimera 1080p a b c 80 160 240 320 400 384.50 384.49 383.85 1. (CC) gcc options: -pthread -lm
Monte Carlo Simulations of Ionised Nebulae Input: Dust 2D tau100.0 OpenBenchmarking.org Seconds, Fewer Is Better Monte Carlo Simulations of Ionised Nebulae 2.02.73.3 Input: Dust 2D tau100.0 a b c 60 120 180 240 300 278.10 278.01 277.63 1. (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O2 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lz
Blender Blend File: Fishy Cat - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 3.6 Blend File: Fishy Cat - Compute: CPU-Only a b 80 160 240 320 400 349.67 350.26
Stress-NG Test: Vector Math OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Vector Math a b 8K 16K 24K 32K 40K 37545.21 37487.44 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Liquid-DSP Threads: 12 - Buffer Length: 256 - Filter Length: 512 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 12 - Buffer Length: 256 - Filter Length: 512 a b 20M 40M 60M 80M 100M 81954000 81833000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Neural Magic DeepSparse Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream a b 40 80 120 160 200 158.58 158.36
Stress-NG Test: Futex OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Futex a b 500K 1000K 1500K 2000K 2500K 2157458.17 2154685.66 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Neural Magic DeepSparse Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.5 Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream a b 20 40 60 80 100 75.85 75.75
Liquid-DSP Threads: 2 - Buffer Length: 256 - Filter Length: 57 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 2 - Buffer Length: 256 - Filter Length: 57 a b 20M 40M 60M 80M 100M 97318000 97437000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Stress-NG Test: Fused Multiply-Add OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Fused Multiply-Add a b 1.2M 2.4M 3.6M 4.8M 6M 5552017.37 5545681.56 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Liquid-DSP Threads: 2 - Buffer Length: 256 - Filter Length: 512 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 2 - Buffer Length: 256 - Filter Length: 512 a b 4M 8M 12M 16M 20M 19638000 19617000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP Threads: 1 - Buffer Length: 256 - Filter Length: 57 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 1 - Buffer Length: 256 - Filter Length: 57 a b 10M 20M 30M 40M 50M 48644000 48683000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Neural Magic DeepSparse Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.5 Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream a b 5 10 15 20 25 18.90 18.92
Stress-NG Test: Socket Activity OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Socket Activity a b 900 1800 2700 3600 4500 4064.79 4061.68 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Liquid-DSP Threads: 2 - Buffer Length: 256 - Filter Length: 32 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 2 - Buffer Length: 256 - Filter Length: 32 a b 20M 40M 60M 80M 100M 83918000 83864000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
OSPRay Benchmark: particle_volume/scivis/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 2.12 Benchmark: particle_volume/scivis/real_time a b 0.3786 0.7572 1.1358 1.5144 1.893 1.68158 1.68263
Stress-NG Test: Atomic OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Atomic a b 100 200 300 400 500 458.83 459.09 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG Test: Function Call OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Function Call a b 900 1800 2700 3600 4500 4327.17 4325.36 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG Test: NUMA OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: NUMA a b 20 40 60 80 100 111.32 111.36 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG Test: Hash OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Hash a b 300K 600K 900K 1200K 1500K 1345330.53 1344929.58 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Liquid-DSP Threads: 1 - Buffer Length: 256 - Filter Length: 32 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 1 - Buffer Length: 256 - Filter Length: 32 a b 9M 18M 27M 36M 45M 41901000 41889000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Stress-NG Test: Malloc OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Malloc a b 600K 1200K 1800K 2400K 3000K 2829929.26 2829179.78 1. (CXX) g++ options: -O2 -std=gnu99 -lc
OSPRay Benchmark: particle_volume/pathtracer/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 2.12 Benchmark: particle_volume/pathtracer/real_time a b 11 22 33 44 55 48.96 48.95
Stress-NG Test: Memory Copying OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Memory Copying a b 400 800 1200 1600 2000 1990.22 1989.81 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Liquid-DSP Threads: 8 - Buffer Length: 256 - Filter Length: 32 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 8 - Buffer Length: 256 - Filter Length: 32 a b 60M 120M 180M 240M 300M 289130000 289080000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Stress-NG Test: Mutex OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Mutex a b 600K 1200K 1800K 2400K 3000K 2838893.85 2838432.45 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Liquid-DSP Threads: 1 - Buffer Length: 256 - Filter Length: 512 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 1 - Buffer Length: 256 - Filter Length: 512 a b 2M 4M 6M 8M 10M 9799300 9800700 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
OSPRay Benchmark: particle_volume/ao/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 2.12 Benchmark: particle_volume/ao/real_time a b 0.3815 0.763 1.1445 1.526 1.9075 1.69541 1.69519
Stress-NG Test: Context Switching OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Context Switching a b 400K 800K 1200K 1600K 2000K 2040963.94 2041049.74 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Intel Open Image Denoise Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only OpenBenchmarking.org Images / Sec, More Is Better Intel Open Image Denoise 2.0 Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only a b c 0.0248 0.0496 0.0744 0.0992 0.124 0.11 0.11 0.11
Intel Open Image Denoise Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only OpenBenchmarking.org Images / Sec, More Is Better Intel Open Image Denoise 2.0 Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only a b c 0.0518 0.1036 0.1554 0.2072 0.259 0.23 0.23 0.23
Intel Open Image Denoise Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only OpenBenchmarking.org Images / Sec, More Is Better Intel Open Image Denoise 2.0 Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only a b c 0.0518 0.1036 0.1554 0.2072 0.259 0.23 0.23 0.23
Phoronix Test Suite v10.8.5