AMD EPYC 9684X 3D V-Cache

AMD EPYC 9684X 96-Core testing with a AMD Titanite_4G (RTI1007B BIOS) and ASPEED on Ubuntu 22.04 via the Phoronix Test Suite.

Default:

  Processor: AMD EPYC 9684X 96-Core @ 2.55GHz (96 Cores / 192 Threads), Motherboard: AMD Titanite_4G (RTI1007B BIOS), Chipset: AMD Device 14a4, Memory: 768GB, Disk: 2 x 1920GB SAMSUNG MZWLJ1T9HBJR-00007, Graphics: ASPEED, Network: Broadcom NetXtreme BCM5720 PCIe

  OS: Ubuntu 22.04, Kernel: 5.19.0-41-generic (x86_64), Desktop: GNOME Shell 42.5, Display Server: X Server 1.21.1.4, Vulkan: 1.3.224, Compiler: GCC 11.3.0, File-System: ext4, Screen Resolution: 1024x768

High Performance Conjugate Gradient 3.1
X Y Z: 104 104 104 - RT: 60
GFLOP/s > Higher Is Better
Default . 26.83

High Performance Conjugate Gradient 3.1
X Y Z: 144 144 144 - RT: 60
GFLOP/s > Higher Is Better
Default . 24.61

High Performance Conjugate Gradient 3.1
X Y Z: 160 160 160 - RT: 60
GFLOP/s > Higher Is Better
Default . 23.84

High Performance Conjugate Gradient 3.1
X Y Z: 192 192 192 - RT: 60
GFLOP/s > Higher Is Better
Default . 22.83

NAS Parallel Benchmarks 3.4
Test / Class: BT.C
Total Mop/s > Higher Is Better
Default . 314777.90

NAS Parallel Benchmarks 3.4
Test / Class: CG.C
Total Mop/s > Higher Is Better
Default . 59737.94

NAS Parallel Benchmarks 3.4
Test / Class: EP.D
Total Mop/s > Higher Is Better
Default . 10697.90

NAS Parallel Benchmarks 3.4
Test / Class: FT.C
Total Mop/s > Higher Is Better
Default . 118831.89

NAS Parallel Benchmarks 3.4
Test / Class: IS.D
Total Mop/s > Higher Is Better
Default . 5696.51

NAS Parallel Benchmarks 3.4
Test / Class: LU.C
Total Mop/s > Higher Is Better
Default . 337910.64

NAS Parallel Benchmarks 3.4
Test / Class: MG.C
Total Mop/s > Higher Is Better
Default . 137308.27

NAS Parallel Benchmarks 3.4
Test / Class: SP.C
Total Mop/s > Higher Is Better
Default . 207614.70

LeelaChessZero 0.28
Backend: BLAS
Nodes Per Second > Higher Is Better
Default . 9760

LeelaChessZero 0.28
Backend: Eigen
Nodes Per Second > Higher Is Better
Default . 11884

miniFE 2.2
Problem Size: Small
CG Mflops > Higher Is Better
Default . 54408.2

CloverLeaf
Lagrangian-Eulerian Hydrodynamics
Seconds < Lower Is Better
Default . 10.29

NAMD 2.14
ATPase Simulation - 327,506 Atoms
days/ns < Lower Is Better
Default . 0.24733

Algebraic Multi-Grid Benchmark 1.2
Figure Of Merit > Higher Is Better
Default . 2414283000

libxsmm 2-1.17-3645
M N K: 128
GFLOPS/s > Higher Is Better
Default . 2913.4

libxsmm 2-1.17-3645
M N K: 256
GFLOPS/s > Higher Is Better
Default . 3064.4

libxsmm 2-1.17-3645
M N K: 32
GFLOPS/s > Higher Is Better
Default . 1311.3

libxsmm 2-1.17-3645
M N K: 64
GFLOPS/s > Higher Is Better
Default . 2455.4

Laghos 3.1
Test: Triple Point Problem
Major Kernels Total Rate > Higher Is Better
Default . 238.89

Laghos 3.1
Test: Sedov Blast Wave, ube_922_hex.mesh
Major Kernels Total Rate > Higher Is Better
Default . 446.93

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128
GFLOP/s > Higher Is Better
Default . 126.03

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256
GFLOP/s > Higher Is Better
Default . 178.44

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512
GFLOP/s > Higher Is Better
Default . 153.84

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128
GFLOP/s > Higher Is Better
Default . 185.39

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256
GFLOP/s > Higher Is Better
Default . 315.81

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512
GFLOP/s > Higher Is Better
Default . 332.74

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128
GFLOP/s > Higher Is Better
Default . 79.92

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256
GFLOP/s > Higher Is Better
Default . 87.04

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512
GFLOP/s > Higher Is Better
Default . 67.85

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128
GFLOP/s > Higher Is Better
Default . 129.88

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256
GFLOP/s > Higher Is Better
Default . 190.13

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512
GFLOP/s > Higher Is Better
Default . 134.82

Palabos 2.3
Grid Size: 100
Mega Site Updates Per Second > Higher Is Better
Default . 591.53

Palabos 2.3
Grid Size: 400
Mega Site Updates Per Second > Higher Is Better
Default . 317.82

Palabos 2.3
Grid Size: 500
Mega Site Updates Per Second > Higher Is Better
Default . 328.71

Palabos 2.3
Grid Size: 1000
Mega Site Updates Per Second > Higher Is Better
Default . 370.19

Xcompact3d Incompact3d 2021-03-11
Input: input.i3d 129 Cells Per Direction
Seconds < Lower Is Better
Default . 2.17521613

Xcompact3d Incompact3d 2021-03-11
Input: input.i3d 193 Cells Per Direction
Seconds < Lower Is Better
Default . 7.68528681

Monte Carlo Simulations of Ionised Nebulae 2.02.73.3
Input: Gas HII40
Seconds < Lower Is Better
Default . 13.28

Monte Carlo Simulations of Ionised Nebulae 2.02.73.3
Input: Dust 2D tau100.0
Seconds < Lower Is Better
Default . 190.07

OpenFOAM 10
Input: drivaerFastback, Large Mesh Size - Mesh Time
Seconds < Lower Is Better
Default . 585.14

OpenFOAM 10
Input: drivaerFastback, Large Mesh Size - Execution Time
Seconds < Lower Is Better
Default . 8994.47

OpenFOAM 10
Input: drivaerFastback, Small Mesh Size - Mesh Time
Seconds < Lower Is Better
Default . 22.97

OpenFOAM 10
Input: drivaerFastback, Small Mesh Size - Execution Time
Seconds < Lower Is Better
Default . 29.88

OpenFOAM 10
Input: drivaerFastback, Medium Mesh Size - Mesh Time
Seconds < Lower Is Better
Default . 108.37

OpenFOAM 10
Input: drivaerFastback, Medium Mesh Size - Execution Time
Seconds < Lower Is Better
Default . 181.02

Remhos 1.0
Test: Sample Remap Example
Seconds < Lower Is Better
Default . 10.29

SPECFEM3D 4.0
Model: Mount St. Helens
Seconds < Lower Is Better
Default . 8.898145240

SPECFEM3D 4.0
Model: Layered Halfspace
Seconds < Lower Is Better
Default . 21.35

SPECFEM3D 4.0
Model: Tomographic Model
Seconds < Lower Is Better
Default . 8.738903585

SPECFEM3D 4.0
Model: Homogeneous Halfspace
Seconds < Lower Is Better
Default . 11.32

SPECFEM3D 4.0
Model: Water-layered Halfspace
Seconds < Lower Is Better
Default . 21.12

LAMMPS Molecular Dynamics Simulator 23Jun2022
Model: 20k Atoms
ns/day > Higher Is Better
Default . 39.89

LULESH 2.0.3
z/s > Higher Is Better
Default . 30715.33

Xmrig 6.18.1
Variant: Monero - Hash Count: 1M
H/s > Higher Is Better
Default . 69684.3

Xmrig 6.18.1
Variant: Wownero - Hash Count: 1M
H/s > Higher Is Better
Default . 74009.7

Zstd Compression 1.5.4
Compression Level: 3 - Compression Speed
MB/s > Higher Is Better
Default . 3989.7

Zstd Compression 1.5.4
Compression Level: 3 - Decompression Speed
MB/s > Higher Is Better
Default . 1498.8

Zstd Compression 1.5.4
Compression Level: 8 - Compression Speed
MB/s > Higher Is Better
Default . 1226.7

Zstd Compression 1.5.4
Compression Level: 8 - Decompression Speed
MB/s > Higher Is Better
Default . 1639.9

Zstd Compression 1.5.4
Compression Level: 12 - Compression Speed
MB/s > Higher Is Better
Default . 329.3

Zstd Compression 1.5.4
Compression Level: 12 - Decompression Speed
MB/s > Higher Is Better
Default . 1653.3

Zstd Compression 1.5.4
Compression Level: 19 - Compression Speed
MB/s > Higher Is Better
Default . 17.3

Zstd Compression 1.5.4
Compression Level: 19 - Decompression Speed
MB/s > Higher Is Better
Default . 1433.4

Zstd Compression 1.5.4
Compression Level: 3, Long Mode - Compression Speed
MB/s > Higher Is Better
Default . 889.9

Zstd Compression 1.5.4
Compression Level: 3, Long Mode - Decompression Speed
MB/s > Higher Is Better
Default . 1534.5

Zstd Compression 1.5.4
Compression Level: 8, Long Mode - Compression Speed
MB/s > Higher Is Better
Default . 876.9

Zstd Compression 1.5.4
Compression Level: 8, Long Mode - Decompression Speed
MB/s > Higher Is Better
Default . 1648.1

Zstd Compression 1.5.4
Compression Level: 19, Long Mode - Compression Speed
MB/s > Higher Is Better
Default . 8.52

Zstd Compression 1.5.4
Compression Level: 19, Long Mode - Decompression Speed
MB/s > Higher Is Better
Default . 1357.8

srsRAN Project 23.5
Test: Downlink Processor Benchmark
Mbps > Higher Is Better
Default . 721.3

srsRAN Project 23.5
Test: PUSCH Processor Benchmark, Throughput Total
Mbps > Higher Is Better
Default . 18408.1

Embree 4.1
Binary: Pathtracer ISPC - Model: Crown
Frames Per Second > Higher Is Better
Default . 117.83

Embree 4.1
Binary: Pathtracer ISPC - Model: Asian Dragon
Frames Per Second > Higher Is Better
Default . 144.00

Embree 4.1
Binary: Pathtracer ISPC - Model: Asian Dragon Obj
Frames Per Second > Higher Is Better
Default . 123.51

ACES DGEMM 1.0
Sustained Floating-Point Rate
GFLOP/s > Higher Is Better
Default . 40.55

OSPRay 2.12
Benchmark: particle_volume/ao/real_time
Items Per Second > Higher Is Better
Default . 25.16

OSPRay 2.12
Benchmark: particle_volume/scivis/real_time
Items Per Second > Higher Is Better
Default . 25.11

OSPRay 2.12
Benchmark: particle_volume/pathtracer/real_time
Items Per Second > Higher Is Better
Default . 199.70

OSPRay 2.12
Benchmark: gravity_spheres_volume/dim_512/ao/real_time
Items Per Second > Higher Is Better
Default . 26.79

OSPRay 2.12
Benchmark: gravity_spheres_volume/dim_512/scivis/real_time
Items Per Second > Higher Is Better
Default . 25.93

OSPRay 2.12
Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time
Items Per Second > Higher Is Better
Default . 26.51

7-Zip Compression 22.01
Test: Compression Rating
MIPS > Higher Is Better
Default . 647063

7-Zip Compression 22.01
Test: Decompression Rating
MIPS > Higher Is Better
Default . 626472

Stockfish 15
Total Time
Nodes Per Second > Higher Is Better
Default . 297289892

asmFish 2018-07-23
1024 Hash Memory, 26 Depth
Nodes/second > Higher Is Better
Default . 234040870

Timed Gem5 Compilation 21.2
Time To Compile
Seconds < Lower Is Better
Default . 137.69

Timed Godot Game Engine Compilation 4.0
Time To Compile
Seconds < Lower Is Better
Default . 88.38

Timed Linux Kernel Compilation 6.1
Build: defconfig
Seconds < Lower Is Better
Default . 22.92

Timed Linux Kernel Compilation 6.1
Build: allmodconfig
Seconds < Lower Is Better
Default . 202.56

Timed LLVM Compilation 16.0
Build System: Ninja
Seconds < Lower Is Better
Default . 112.89

Timed LLVM Compilation 16.0
Build System: Unix Makefiles
Seconds < Lower Is Better
Default . 184.05

Timed Node.js Compilation 19.8.1
Time To Compile
Seconds < Lower Is Better
Default . 105.23

Timed PHP Compilation 8.1.9
Time To Compile
Seconds < Lower Is Better
Default . 33.45

OSPRay Studio 0.11
Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer
ms < Lower Is Better
Default . 1059

OSPRay Studio 0.11
Camera: 2 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer
ms < Lower Is Better
Default . 1065

OSPRay Studio 0.11
Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer
ms < Lower Is Better
Default . 1261

OSPRay Studio 0.11
Camera: 1 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer
ms < Lower Is Better
Default . 16953

OSPRay Studio 0.11
Camera: 1 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer
ms < Lower Is Better
Default . 34007

OSPRay Studio 0.11
Camera: 2 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer
ms < Lower Is Better
Default . 17078

OSPRay Studio 0.11
Camera: 2 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer
ms < Lower Is Better
Default . 34341

OSPRay Studio 0.11
Camera: 3 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer
ms < Lower Is Better
Default . 20223

OSPRay Studio 0.11
Camera: 3 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer
ms < Lower Is Better
Default . 40392

Numpy Benchmark
Score > Higher Is Better
Default . 586.50

Ngspice 34
Circuit: C2670
Seconds < Lower Is Better
Default . 118.39

Ngspice 34
Circuit: C7552
Seconds < Lower Is Better
Default . 100.85

Liquid-DSP 1.6
Threads: 64 - Buffer Length: 256 - Filter Length: 32
samples/s > Higher Is Better
Default . 2181266667

Liquid-DSP 1.6
Threads: 64 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
Default . 2609466667

Liquid-DSP 1.6
Threads: 128 - Buffer Length: 256 - Filter Length: 32
samples/s > Higher Is Better
Default . 3768733333

Liquid-DSP 1.6
Threads: 128 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
Default . 3876733333

Liquid-DSP 1.6
Threads: 192 - Buffer Length: 256 - Filter Length: 32
samples/s > Higher Is Better
Default . 4958533333

Liquid-DSP 1.6
Threads: 192 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
Default . 4634633333

Liquid-DSP 1.6
Threads: 64 - Buffer Length: 256 - Filter Length: 512
samples/s > Higher Is Better
Default . 727523333

Liquid-DSP 1.6
Threads: 128 - Buffer Length: 256 - Filter Length: 512
samples/s > Higher Is Better
Default . 1081100000

Liquid-DSP 1.6
Threads: 192 - Buffer Length: 256 - Filter Length: 512
samples/s > Higher Is Better
Default . 1287866667

ASKAP 1.0
Test: tConvolve MT - Gridding
Million Grid Points Per Second > Higher Is Better
Default . 13582.8

ASKAP 1.0
Test: tConvolve MT - Degridding
Million Grid Points Per Second > Higher Is Better
Default . 15603.2

ASKAP 1.0
Test: tConvolve MPI - Degridding
Mpix/sec > Higher Is Better
Default . 59791.1

ASKAP 1.0
Test: tConvolve MPI - Gridding
Mpix/sec > Higher Is Better
Default . 73226.7

ASKAP 1.0
Test: tConvolve OpenMP - Gridding
Million Grid Points Per Second > Higher Is Better
Default . 26625.6

ASKAP 1.0
Test: tConvolve OpenMP - Degridding
Million Grid Points Per Second > Higher Is Better
Default . 55153.0

ASKAP 1.0
Test: Hogbom Clean OpenMP
Iterations Per Second > Higher Is Better
Default . 1212.17

ASTC Encoder 4.0
Preset: Medium
MT/s > Higher Is Better
Default . 419.43

ASTC Encoder 4.0
Preset: Thorough
MT/s > Higher Is Better
Default . 56.78

ASTC Encoder 4.0
Preset: Exhaustive
MT/s > Higher Is Better
Default . 6.1411

GROMACS 2023
Implementation: MPI CPU - Input: water_GMX50_bare
Ns Per Day > Higher Is Better
Default . 11.79

TensorFlow 2.12
Device: CPU - Batch Size: 16 - Model: AlexNet
images/sec > Higher Is Better
Default . 303.80

TensorFlow 2.12
Device: CPU - Batch Size: 32 - Model: AlexNet
images/sec > Higher Is Better
Default . 518.15

TensorFlow 2.12
Device: CPU - Batch Size: 64 - Model: AlexNet
images/sec > Higher Is Better
Default . 786.60

TensorFlow 2.12
Device: CPU - Batch Size: 256 - Model: AlexNet
images/sec > Higher Is Better
Default . 1196.16

TensorFlow 2.12
Device: CPU - Batch Size: 512 - Model: AlexNet
images/sec > Higher Is Better
Default . 1287.81

TensorFlow 2.12
Device: CPU - Batch Size: 16 - Model: GoogLeNet
images/sec > Higher Is Better
Default . 156.94

TensorFlow 2.12
Device: CPU - Batch Size: 16 - Model: ResNet-50
images/sec > Higher Is Better
Default . 57.21

TensorFlow 2.12
Device: CPU - Batch Size: 32 - Model: GoogLeNet
images/sec > Higher Is Better
Default . 238.18

TensorFlow 2.12
Device: CPU - Batch Size: 32 - Model: ResNet-50
images/sec > Higher Is Better
Default . 78.11

TensorFlow 2.12
Device: CPU - Batch Size: 64 - Model: GoogLeNet
images/sec > Higher Is Better
Default . 296.19

TensorFlow 2.12
Device: CPU - Batch Size: 64 - Model: ResNet-50
images/sec > Higher Is Better
Default . 94.50

TensorFlow 2.12
Device: CPU - Batch Size: 256 - Model: GoogLeNet
images/sec > Higher Is Better
Default . 387.68

TensorFlow 2.12
Device: CPU - Batch Size: 256 - Model: ResNet-50
images/sec > Higher Is Better
Default . 115.51

TensorFlow 2.12
Device: CPU - Batch Size: 512 - Model: GoogLeNet
images/sec > Higher Is Better
Default . 409.02

TensorFlow 2.12
Device: CPU - Batch Size: 512 - Model: ResNet-50
images/sec > Higher Is Better
Default . 116.15

Neural Magic DeepSparse 1.5
Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream
items/sec > Higher Is Better
Default . 60.63

Neural Magic DeepSparse 1.5
Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream
ms/batch < Lower Is Better
Default . 783.58

Neural Magic DeepSparse 1.5
Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream
items/sec > Higher Is Better
Default . 1219.50

Neural Magic DeepSparse 1.5
Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream
ms/batch < Lower Is Better
Default . 39.31

Neural Magic DeepSparse 1.5
Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream
items/sec > Higher Is Better
Default . 329.98

Neural Magic DeepSparse 1.5
Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream
ms/batch < Lower Is Better
Default . 145.05

Neural Magic DeepSparse 1.5
Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream
items/sec > Higher Is Better
Default . 344.41

Neural Magic DeepSparse 1.5
Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream
ms/batch < Lower Is Better
Default . 139.02

Neural Magic DeepSparse 1.5
Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream
items/sec > Higher Is Better
Default . 797.95

Neural Magic DeepSparse 1.5
Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream
ms/batch < Lower Is Better
Default . 60.06

Neural Magic DeepSparse 1.5
Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream
items/sec > Higher Is Better
Default . 513.19

Neural Magic DeepSparse 1.5
Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream
ms/batch < Lower Is Better
Default . 93.32

Neural Magic DeepSparse 1.5
Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream
items/sec > Higher Is Better
Default . 114.18

Neural Magic DeepSparse 1.5
Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream
ms/batch < Lower Is Better
Default . 417.86

Neural Magic DeepSparse 1.5
Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream
items/sec > Higher Is Better
Default . 257.57

Neural Magic DeepSparse 1.5
Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream
ms/batch < Lower Is Better
Default . 185.85

Neural Magic DeepSparse 1.5
Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream
items/sec > Higher Is Better
Default . 60.58

Neural Magic DeepSparse 1.5
Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream
ms/batch < Lower Is Better
Default . 785.21

Google Draco 1.5.6
Model: Lion
ms < Lower Is Better
Default . 5011

Google Draco 1.5.6
Model: Church Facade
ms < Lower Is Better
Default . 6043

Stress-NG 0.15.10
Test: Hash
Bogo Ops/s > Higher Is Better
Default . 18746640.55

Stress-NG 0.15.10
Test: MMAP
Bogo Ops/s > Higher Is Better
Default . 1444.92

Stress-NG 0.15.10
Test: NUMA
Bogo Ops/s > Higher Is Better
Default . 2478.54

Stress-NG 0.15.10
Test: Pipe
Bogo Ops/s > Higher Is Better
Default . 60302912.67

Stress-NG 0.15.10
Test: Poll
Bogo Ops/s > Higher Is Better
Default . 13269096.62

Stress-NG 0.15.10
Test: Zlib
Bogo Ops/s > Higher Is Better
Default . 10471.17

Stress-NG 0.15.10
Test: Futex
Bogo Ops/s > Higher Is Better
Default . 3985709.56

Stress-NG 0.15.10
Test: MEMFD
Bogo Ops/s > Higher Is Better
Default . 454.75

Stress-NG 0.15.10
Test: Mutex
Bogo Ops/s > Higher Is Better
Default . 49736118.06

Stress-NG 0.15.10
Test: Atomic
Bogo Ops/s > Higher Is Better
Default . 237.83

Stress-NG 0.15.10
Test: Crypto
Bogo Ops/s > Higher Is Better
Default . 202368.61

Stress-NG 0.15.10
Test: Malloc
Bogo Ops/s > Higher Is Better
Default . 360348999.65

Stress-NG 0.15.10
Test: Cloning
Bogo Ops/s > Higher Is Better
Default . 12478.74

Stress-NG 0.15.10
Test: Forking
Bogo Ops/s > Higher Is Better
Default . 40400.83

Stress-NG 0.15.10
Test: Pthread
Bogo Ops/s > Higher Is Better
Default . 103280.79

Stress-NG 0.15.10
Test: AVL Tree
Bogo Ops/s > Higher Is Better
Default . 1665.57

Stress-NG 0.15.10
Test: IO_uring
Bogo Ops/s > Higher Is Better
Default . 4790428.16

Stress-NG 0.15.10
Test: SENDFILE
Bogo Ops/s > Higher Is Better
Default . 1517063.84

Stress-NG 0.15.10
Test: CPU Cache
Bogo Ops/s > Higher Is Better
Default . 1397574.75

Stress-NG 0.15.10
Test: CPU Stress
Bogo Ops/s > Higher Is Better
Default . 212380.76

Stress-NG 0.15.10
Test: Semaphores
Bogo Ops/s > Higher Is Better
Default . 223213939.86

Stress-NG 0.15.10
Test: Matrix Math
Bogo Ops/s > Higher Is Better
Default . 418033.13

Stress-NG 0.15.10
Test: Vector Math
Bogo Ops/s > Higher Is Better
Default . 545725.73

Stress-NG 0.15.10
Test: Function Call
Bogo Ops/s > Higher Is Better
Default . 67313.39

Stress-NG 0.15.10
Test: x86_64 RdRand

Stress-NG 0.15.10
Test: Floating Point
Bogo Ops/s > Higher Is Better
Default . 29883.93

Stress-NG 0.15.10
Test: Matrix 3D Math
Bogo Ops/s > Higher Is Better
Default . 16595.28

Stress-NG 0.15.10
Test: Memory Copying
Bogo Ops/s > Higher Is Better
Default . 32994.18

Stress-NG 0.15.10
Test: Vector Shuffle
Bogo Ops/s > Higher Is Better
Default . 63761.29

Stress-NG 0.15.10
Test: Socket Activity
Bogo Ops/s > Higher Is Better
Default . 2960.24

Stress-NG 0.15.10
Test: Wide Vector Math
Bogo Ops/s > Higher Is Better
Default . 3485374.33

Stress-NG 0.15.10
Test: Context Switching
Bogo Ops/s > Higher Is Better
Default . 15480601.67

Stress-NG 0.15.10
Test: Fused Multiply-Add
Bogo Ops/s > Higher Is Better
Default . 76566577.12

Stress-NG 0.15.10
Test: Vector Floating Point
Bogo Ops/s > Higher Is Better
Default . 257925.61

Stress-NG 0.15.10
Test: Glibc C String Functions
Bogo Ops/s > Higher Is Better
Default . 81208060.17

Stress-NG 0.15.10
Test: Glibc Qsort Data Sorting
Bogo Ops/s > Higher Is Better
Default . 2108.30

Stress-NG 0.15.10
Test: System V Message Passing
Bogo Ops/s > Higher Is Better
Default . 12085427.22

WRF 4.2.2
Input: conus 2.5km
Seconds < Lower Is Better
Default . 11269.26

GPAW 23.6
Input: Carbon Nanotube
Seconds < Lower Is Better
Default . 34.77

Blender 3.6
Blend File: BMW27 - Compute: CPU-Only
Seconds < Lower Is Better
Default . 16.28

Blender 3.6
Blend File: Classroom - Compute: CPU-Only
Seconds < Lower Is Better
Default . 40.51

Blender 3.6
Blend File: Fishy Cat - Compute: CPU-Only
Seconds < Lower Is Better
Default . 20.62

Blender 3.6
Blend File: Barbershop - Compute: CPU-Only
Seconds < Lower Is Better
Default . 142.03

Blender 3.6
Blend File: Pabellon Barcelona - Compute: CPU-Only
Seconds < Lower Is Better
Default . 49.61

OpenVINO 2022.3
Model: Face Detection FP16 - Device: CPU
FPS > Higher Is Better
Default . 47.62

OpenVINO 2022.3
Model: Face Detection FP16 - Device: CPU
ms < Lower Is Better
Default . 1003.51

OpenVINO 2022.3
Model: Person Detection FP16 - Device: CPU
FPS > Higher Is Better
Default .
27.11 |============================================================== OpenVINO 2022.3 Model: Person Detection FP16 - Device: CPU ms < Lower Is Better Default . 1753.63 |============================================================ OpenVINO 2022.3 Model: Person Detection FP32 - Device: CPU FPS > Higher Is Better Default . 26.89 |============================================================== OpenVINO 2022.3 Model: Person Detection FP32 - Device: CPU ms < Lower Is Better Default . 1766.54 |============================================================ OpenVINO 2022.3 Model: Vehicle Detection FP16 - Device: CPU FPS > Higher Is Better Default . 3860.64 |============================================================ OpenVINO 2022.3 Model: Vehicle Detection FP16 - Device: CPU ms < Lower Is Better Default . 12.42 |============================================================== OpenVINO 2022.3 Model: Face Detection FP16-INT8 - Device: CPU FPS > Higher Is Better Default . 90.36 |============================================================== OpenVINO 2022.3 Model: Face Detection FP16-INT8 - Device: CPU ms < Lower Is Better Default . 529.68 |============================================================= OpenVINO 2022.3 Model: Vehicle Detection FP16-INT8 - Device: CPU FPS > Higher Is Better Default . 5672.06 |============================================================ OpenVINO 2022.3 Model: Vehicle Detection FP16-INT8 - Device: CPU ms < Lower Is Better Default . 8.45 |=============================================================== OpenVINO 2022.3 Model: Weld Porosity Detection FP16 - Device: CPU FPS > Higher Is Better Default . 4663.97 |============================================================ OpenVINO 2022.3 Model: Weld Porosity Detection FP16 - Device: CPU ms < Lower Is Better Default . 10.28 |============================================================== OpenVINO 2022.3 Model: Machine Translation EN To DE FP16 - Device: CPU FPS > Higher Is Better Default . 
535.77 |============================================================= OpenVINO 2022.3 Model: Machine Translation EN To DE FP16 - Device: CPU ms < Lower Is Better Default . 89.50 |============================================================== OpenVINO 2022.3 Model: Weld Porosity Detection FP16-INT8 - Device: CPU FPS > Higher Is Better Default . 8846.67 |============================================================ OpenVINO 2022.3 Model: Weld Porosity Detection FP16-INT8 - Device: CPU ms < Lower Is Better Default . 10.84 |============================================================== OpenVINO 2022.3 Model: Person Vehicle Bike Detection FP16 - Device: CPU FPS > Higher Is Better Default . 5732.75 |============================================================ OpenVINO 2022.3 Model: Person Vehicle Bike Detection FP16 - Device: CPU ms < Lower Is Better Default . 8.36 |=============================================================== OpenVINO 2022.3 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU FPS > Higher Is Better Default . 94505.11 |=========================================================== OpenVINO 2022.3 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU ms < Lower Is Better Default . 0.84 |=============================================================== OpenVINO 2022.3 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU FPS > Higher Is Better Default . 64461.93 |=========================================================== OpenVINO 2022.3 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU ms < Lower Is Better Default . 1.33 |=============================================================== PETSc 3.19 Test: Streams MB/s > Higher Is Better Default . 
272616.80 |========================================================== PyHPC Benchmarks 3.0 Device: CPU - Backend: JAX - Project Size: 4194304 - Benchmark: Equation of State Seconds < Lower Is Better PyHPC Benchmarks 3.0 Device: CPU - Backend: JAX - Project Size: 4194304 - Benchmark: Isoneutral Mixing Seconds < Lower Is Better PyHPC Benchmarks 3.0 Device: CPU - Backend: Numba - Project Size: 4194304 - Benchmark: Equation of State Seconds < Lower Is Better PyHPC Benchmarks 3.0 Device: CPU - Backend: Numba - Project Size: 4194304 - Benchmark: Isoneutral Mixing Seconds < Lower Is Better PyHPC Benchmarks 3.0 Device: CPU - Backend: Numpy - Project Size: 4194304 - Benchmark: Equation of State Seconds < Lower Is Better Default . 0.767 |============================================================== PyHPC Benchmarks 3.0 Device: CPU - Backend: Numpy - Project Size: 4194304 - Benchmark: Isoneutral Mixing Seconds < Lower Is Better Default . 1.578 |============================================================== PyHPC Benchmarks 3.0 Device: CPU - Backend: Aesara - Project Size: 4194304 - Benchmark: Equation of State Seconds < Lower Is Better PyHPC Benchmarks 3.0 Device: CPU - Backend: Aesara - Project Size: 4194304 - Benchmark: Isoneutral Mixing Seconds < Lower Is Better PyHPC Benchmarks 3.0 Device: CPU - Backend: PyTorch - Project Size: 4194304 - Benchmark: Equation of State Seconds < Lower Is Better PyHPC Benchmarks 3.0 Device: CPU - Backend: PyTorch - Project Size: 4194304 - Benchmark: Isoneutral Mixing Seconds < Lower Is Better PyHPC Benchmarks 3.0 Device: CPU - Backend: TensorFlow - Project Size: 4194304 - Benchmark: Equation of State Seconds < Lower Is Better PyHPC Benchmarks 3.0 Device: CPU - Backend: TensorFlow - Project Size: 4194304 - Benchmark: Isoneutral Mixing Seconds < Lower Is Better Whisper.cpp 1.4 Model: ggml-base.en - Input: 2016 State of the Union Seconds < Lower Is Better Default . 
332.02 |============================================================= Whisper.cpp 1.4 Model: ggml-small.en - Input: 2016 State of the Union Seconds < Lower Is Better Default . 769.11 |============================================================= Whisper.cpp 1.4 Model: ggml-medium.en - Input: 2016 State of the Union Seconds < Lower Is Better Default . 1444.14 |============================================================ Kripke 1.2.6 Throughput FoM > Higher Is Better