lll onnx
AMD Ryzen Threadripper 7980X 64-Cores testing with an ASUS Pro WS TRX50-SAGE WIFI (0607 BIOS) and NAVI32 16GB on Pop 22.04 via the Phoronix Test Suite.

a, b, c, d:
  Processor: AMD Ryzen Threadripper 7980X 64-Cores @ 8.21GHz (64 Cores / 128 Threads), Motherboard: ASUS Pro WS TRX50-SAGE WIFI (0607 BIOS), Chipset: AMD Device 14a4, Memory: 4 x 32GB DRAM-6400MT/s F5-6400R3239G32GQ, Disk: 1000GB Western Digital WDS100T1X0E-00AFY0, Graphics: NAVI32 16GB (2124/1218MHz), Audio: Realtek ALC1220, Monitor: DELL U2723QE, Network: Aquantia Device 04c0 + Intel Device 125b + MEDIATEK Device 0616

  OS: Pop 22.04, Kernel: 6.6.6-76060606-generic (x86_64), Desktop: GNOME Shell 42.5, Display Server: X Server, OpenGL: 4.6 Mesa 23.3.2-1pop0~1704238321~22.04~36f1d0e (LLVM 15.0.7 DRM 3.54), Vulkan: 1.3.267, Compiler: GCC 11.4.0, File-System: ext4, Screen Resolution: 3840x2160
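Throughout the ONNX Runtime results below, "Executor: Standard" and "Executor: Parallel" refer to the library's sequential and parallel execution modes. As a point of reference only, this is roughly how the two modes map onto onnxruntime's Python API; the model path and input shape are illustrative placeholders, not the actual test harness:

    import numpy as np
    import onnxruntime as ort

    opts = ort.SessionOptions()
    # "Parallel" executor; use ORT_SEQUENTIAL for the "Standard" executor.
    opts.execution_mode = ort.ExecutionMode.ORT_PARALLEL

    # Placeholder model and input; the PTS test profile supplies its own.
    sess = ort.InferenceSession("gpt2.onnx", sess_options=opts,
                                providers=["CPUExecutionProvider"])
    feed = {sess.get_inputs()[0].name: np.zeros((1, 64), dtype=np.int64)}
    sess.run(None, feed)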
ONNX Runtime 1.17
Model: GPT-2 - Device: CPU - Executor: Parallel
Inferences Per Second > Higher Is Better
a . 178.53 |==================================================================
b . 178.97 |===================================================================
c . 178.80 |===================================================================
d . 179.91 |===================================================================

ONNX Runtime 1.17
Model: GPT-2 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
a . 161.47 |==================================================================
b . 162.17 |===================================================================
c . 163.10 |===================================================================
d . 159.67 |==================================================================

ONNX Runtime 1.17
Model: yolov4 - Device: CPU - Executor: Parallel
Inferences Per Second > Higher Is Better
a . 11.37 |==================================================================
b . 11.38 |==================================================================
c . 11.21 |=================================================================
d . 11.66 |====================================================================

ONNX Runtime 1.17
Model: yolov4 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
a . 11.14 |====================================================================
b . 10.74 |==================================================================
c . 10.85 |==================================================================
d . 11.14 |====================================================================

ONNX Runtime 1.17
Model: T5 Encoder - Device: CPU - Executor: Parallel
Inferences Per Second > Higher Is Better
a . 394.97 |==================================================================
b . 397.55 |===================================================================
c . 394.76 |==================================================================
d . 397.95 |===================================================================

ONNX Runtime 1.17
Model: T5 Encoder - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
a . 251.62 |===================================================================
b . 250.62 |===================================================================
c . 249.14 |==================================================================
d . 249.00 |==================================================================

ONNX Runtime 1.17
Model: bertsquad-12 - Device: CPU - Executor: Parallel
Inferences Per Second > Higher Is Better
a . 13.40 |====================================================================
b . 13.15 |===================================================================
c . 13.29 |===================================================================
d . 12.78 |=================================================================

ONNX Runtime 1.17
Model: bertsquad-12 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
a . 16.82 |====================================================================
b . 15.96 |=================================================================
c . 15.85 |================================================================
d . 15.93 |================================================================

ONNX Runtime 1.17
Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel
Inferences Per Second > Higher Is Better
a . 752.48 |=================================================================
b . 771.51 |===================================================================
c . 746.48 |=================================================================
d . 738.65 |================================================================

ONNX Runtime 1.17
Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
a . 768.49 |=================================================================
b . 702.14 |============================================================
c . 694.31 |===========================================================
d . 788.87 |===================================================================
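The CaffeNet 12-int8 results above, like the ResNet50 v1-12-int8 and Faster R-CNN R-50-FPN-int8 results further down, use 8-bit quantized variants of the FP32 models. As a hedged illustration only (not necessarily how these Model Zoo variants were actually produced), onnxruntime can emit such a model via dynamic quantization; the paths here are placeholders:

    from onnxruntime.quantization import quantize_dynamic, QuantType

    # Placeholder paths; shown only to illustrate what "-int8" denotes.
    quantize_dynamic("resnet50-v1-12.onnx", "resnet50-v1-12-int8.onnx",
                     weight_type=QuantType.QInt8)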
ONNX Runtime 1.17
Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel
Inferences Per Second > Higher Is Better
a . 3.33783 |================================================================
b . 3.42174 |==================================================================
c . 3.44674 |==================================================================
d . 3.43087 |==================================================================

ONNX Runtime 1.17
Model: fcn-resnet101-11 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
a . 5.03178 |=========================================================
b . 5.39325 |=============================================================
c . 5.80848 |==================================================================
d . 4.79144 |======================================================

ONNX Runtime 1.17
Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel
Inferences Per Second > Higher Is Better
a . 35.29 |====================================================================
b . 34.32 |==================================================================
c . 34.20 |==================================================================
d . 33.97 |=================================================================

ONNX Runtime 1.17
Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
a . 42.41 |====================================================================
b . 38.51 |==============================================================
c . 38.91 |==============================================================
d . 38.77 |==============================================================

ONNX Runtime 1.17
Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel
Inferences Per Second > Higher Is Better
a . 263.06 |==================================================================
b . 267.12 |===================================================================
c . 261.19 |==================================================================
d . 252.84 |===============================================================

ONNX Runtime 1.17
Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
a . 302.69 |=================================================================
b . 312.22 |===================================================================
c . 300.16 |================================================================
d . 298.81 |================================================================

ONNX Runtime 1.17
Model: super-resolution-10 - Device: CPU - Executor: Parallel
Inferences Per Second > Higher Is Better
a . 140.32 |===================================================================
b . 140.17 |===================================================================
c . 138.46 |==================================================================
d . 138.37 |==================================================================

ONNX Runtime 1.17
Model: super-resolution-10 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
a . 143.64 |==================================================================
b . 146.30 |===================================================================
c . 144.66 |==================================================================
d . 145.86 |===================================================================
ONNX Runtime 1.17
Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel
Inferences Per Second > Higher Is Better
a . 35.67 |=================================================================
b . 36.60 |===================================================================
c . 37.22 |====================================================================
d . 37.05 |====================================================================

ONNX Runtime 1.17
Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
a . 46.48 |================================================================
b . 49.04 |====================================================================
c . 46.57 |=================================================================
d . 46.65 |=================================================================

LZ4 Compression 1.9.4
Compression Level: 1 - Compression Speed
MB/s > Higher Is Better
a . 994.40 |===================================================================
b . 988.75 |===================================================================
c . 991.61 |===================================================================
d . 992.06 |===================================================================

LZ4 Compression 1.9.4
Compression Level: 1 - Decompression Speed
MB/s > Higher Is Better
a . 6415.8 |===================================================================
b . 6383.7 |===================================================================
c . 6396.8 |===================================================================
d . 6398.3 |===================================================================

LZ4 Compression 1.9.4
Compression Level: 3 - Compression Speed
MB/s > Higher Is Better
a . 151.28 |=================================================================
b . 152.15 |=================================================================
c . 156.43 |===================================================================
d . 149.86 |================================================================

LZ4 Compression 1.9.4
Compression Level: 3 - Decompression Speed
MB/s > Higher Is Better
a . 5718.1 |=================================================================
b . 5737.4 |=================================================================
c . 5884.5 |===================================================================
d . 5633.0 |================================================================

LZ4 Compression 1.9.4
Compression Level: 9 - Compression Speed
MB/s > Higher Is Better
a . 49.70 |===================================================================
b . 49.59 |===================================================================
c . 50.15 |====================================================================
d . 49.93 |====================================================================

LZ4 Compression 1.9.4
Compression Level: 9 - Decompression Speed
MB/s > Higher Is Better
a . 6029.7 |==================================================================
b . 5995.6 |==================================================================
c . 6088.1 |===================================================================
d . 6019.3 |==================================================================
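For context on the three compression levels, here is a rough single-threaded sketch using the python-lz4 bindings. The corpus path is a placeholder, frame-API levels do not map one-to-one onto the lz4 CLI levels, and this is not the harness PTS actually runs:

    import time
    import lz4.frame

    data = open("corpus.bin", "rb").read()  # placeholder corpus

    for level in (1, 3, 9):
        t0 = time.perf_counter()
        comp = lz4.frame.compress(data, compression_level=level)
        dt = time.perf_counter() - t0
        # Compression throughput in MB/s plus the achieved ratio.
        print(f"level {level}: {len(data) / dt / 1e6:.1f} MB/s, "
              f"ratio {len(data) / len(comp):.2f}")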
Llamafile 0.6
Test: llava-v1.5-7b-q4 - Acceleration: CPU
Tokens Per Second > Higher Is Better
a . 27.00 |===================================================================
b . 27.12 |====================================================================
c . 26.93 |===================================================================
d . 27.23 |====================================================================

Llamafile 0.6
Test: mistral-7b-instruct-v0.2.Q8_0 - Acceleration: CPU
Tokens Per Second > Higher Is Better
a . 16.80 |====================================================================
b . 16.76 |====================================================================
c . 16.72 |====================================================================
d . 16.82 |====================================================================

Llamafile 0.6
Test: wizardcoder-python-34b-v1.0.Q6_K - Acceleration: CPU
Tokens Per Second > Higher Is Better
a . 5.98 |=====================================================================
b . 5.97 |=====================================================================
c . 5.97 |=====================================================================
d . 5.97 |=====================================================================
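The remaining ONNX Runtime charts restate the runs above as mean per-inference latency. The two metric families are close to reciprocals of one another:

    latency (ms) ≈ 1000 / throughput (inferences per second)

For example, GPT-2 with the Standard executor on system a delivered 161.47 inferences per second, and 1000 / 161.47 ≈ 6.19 ms matches the 6.19118 ms reported below.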
ONNX Runtime 1.17
Model: GPT-2 - Device: CPU - Executor: Parallel
Inference Time Cost (ms) < Lower Is Better
a . 5.59535 |==================================================================
b . 5.58189 |==================================================================
c . 5.58651 |==================================================================
d . 5.55278 |=================================================================

ONNX Runtime 1.17
Model: GPT-2 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
a . 6.19118 |=================================================================
b . 6.16453 |=================================================================
c . 6.13468 |=================================================================
d . 6.26087 |==================================================================

ONNX Runtime 1.17
Model: yolov4 - Device: CPU - Executor: Parallel
Inference Time Cost (ms) < Lower Is Better
a . 87.92 |===================================================================
b . 87.87 |===================================================================
c . 89.24 |====================================================================
d . 85.78 |=================================================================

ONNX Runtime 1.17
Model: yolov4 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
a . 89.79 |==================================================================
b . 93.15 |====================================================================
c . 92.20 |===================================================================
d . 89.78 |==================================================================

ONNX Runtime 1.17
Model: T5 Encoder - Device: CPU - Executor: Parallel
Inference Time Cost (ms) < Lower Is Better
a . 2.53056 |==================================================================
b . 2.51439 |==================================================================
c . 2.53183 |==================================================================
d . 2.51190 |=================================================================

ONNX Runtime 1.17
Model: T5 Encoder - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
a . 3.97365 |=================================================================
b . 3.98960 |==================================================================
c . 4.01444 |==================================================================
d . 4.01557 |==================================================================

ONNX Runtime 1.17
Model: bertsquad-12 - Device: CPU - Executor: Parallel
Inference Time Cost (ms) < Lower Is Better
a . 74.60 |=================================================================
b . 76.07 |==================================================================
c . 75.22 |=================================================================
d . 78.24 |====================================================================

ONNX Runtime 1.17
Model: bertsquad-12 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
a . 59.44 |================================================================
b . 62.68 |====================================================================
c . 63.10 |====================================================================
d . 62.78 |====================================================================

ONNX Runtime 1.17
Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel
Inference Time Cost (ms) < Lower Is Better
a . 1.32744 |=================================================================
b . 1.29988 |===============================================================
c . 1.33991 |=================================================================
d . 1.35233 |==================================================================

ONNX Runtime 1.17
Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
a . 1.30087 |============================================================
b . 1.42403 |=================================================================
c . 1.44009 |==================================================================
d . 1.26730 |==========================================================

ONNX Runtime 1.17
Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel
Inference Time Cost (ms) < Lower Is Better
a . 299.59 |===================================================================
b . 292.35 |=================================================================
c . 290.22 |=================================================================
d . 291.47 |=================================================================

ONNX Runtime 1.17
Model: fcn-resnet101-11 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
a . 198.74 |================================================================
b . 187.89 |============================================================
c . 174.83 |========================================================
d . 208.70 |===================================================================

ONNX Runtime 1.17
Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel
Inference Time Cost (ms) < Lower Is Better
a . 28.34 |=================================================================
b . 29.14 |===================================================================
c . 29.24 |====================================================================
d . 29.44 |====================================================================

ONNX Runtime 1.17
Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
a . 23.58 |==============================================================
b . 25.96 |====================================================================
c . 25.70 |===================================================================
d . 25.79 |====================================================================
ONNX Runtime 1.17
Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel
Inference Time Cost (ms) < Lower Is Better
a . 3.80037 |===============================================================
b . 3.74911 |===============================================================
c . 3.82755 |================================================================
d . 3.95402 |==================================================================

ONNX Runtime 1.17
Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
a . 3.30338 |=================================================================
b . 3.21958 |================================================================
c . 3.33110 |==================================================================
d . 3.34628 |==================================================================

ONNX Runtime 1.17
Model: super-resolution-10 - Device: CPU - Executor: Parallel
Inference Time Cost (ms) < Lower Is Better
a . 7.12533 |=================================================================
b . 7.13343 |=================================================================
c . 7.22117 |==================================================================
d . 7.22558 |==================================================================

ONNX Runtime 1.17
Model: super-resolution-10 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
a . 6.96159 |==================================================================
b . 6.83507 |=================================================================
c . 6.91241 |==================================================================
d . 6.85547 |=================================================================

ONNX Runtime 1.17
Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel
Inference Time Cost (ms) < Lower Is Better
a . 28.03 |====================================================================
b . 27.33 |==================================================================
c . 26.87 |=================================================================
d . 26.99 |=================================================================

ONNX Runtime 1.17
Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
a . 21.51 |====================================================================
b . 20.60 |=================================================================
c . 21.47 |====================================================================
d . 21.43 |====================================================================
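As a closing note, both reported ONNX Runtime metrics can be derived from a single timed loop. A minimal hedged sketch follows (placeholder model and input shape; this is not the PTS harness itself):

    import time
    import numpy as np
    import onnxruntime as ort

    sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
    feed = {sess.get_inputs()[0].name:
            np.zeros((1, 3, 224, 224), dtype=np.float32)}

    n = 200
    t0 = time.perf_counter()
    for _ in range(n):
        sess.run(None, feed)
    elapsed = time.perf_counter() - t0

    # Throughput and mean latency are two views of the same measurement.
    print(f"Inferences Per Second:    {n / elapsed:.2f}")
    print(f"Inference Time Cost (ms): {1000 * elapsed / n:.5f}")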