POWER9 Blackbird

POWER9 testing with a PowerNV C1P9S01 REV 1.01 and ASPEED on Ubuntu 20.10 via the Phoronix Test Suite.

Run 1, Run 2, Run 3, 4 (identical configuration for all four runs):

  Processor: POWER9 @ 3.80GHz (4 Cores / 16 Threads), Motherboard: PowerNV C1P9S01 REV 1.01, Memory: 128GB, Disk: 1024GB SAMSUNG MZVLB1T0HALR-000L7, Graphics: ASPEED, Network: 3 x Broadcom NetXtreme BCM5719 PCIe

  OS: Ubuntu 20.10, Kernel: 5.8.0-29-generic (ppc64le), Display Server: X Server 1.20.9, Display Driver: modesetting 1.20.9, Compiler: GCC 10.2.0, File-System: ext4, Screen Resolution: 1024x768

CLOMP 1.2
Static OMP Speedup
Speedup > Higher Is Better
Run 1 . 8.9 |==================================================================
Run 2 . 7.8 |==========================================================
Run 3 . 7.9 |===========================================================

High Performance Conjugate Gradient 3.1
GFLOP/s > Higher Is Better
Run 1 . 3.22415 |===========================================================
Run 2 . 3.40852 |==============================================================
Run 3 . 3.37791 |=============================================================
4 ..... 3.27655 |============================================================

Dolfyn 0.527
Computational Fluid Dynamics
Seconds < Lower Is Better
Run 1 . 39.05 |=============================================================
Run 2 . 38.59 |=============================================================
Run 3 . 40.75 |================================================================
4 ..... 38.97 |=============================================================

Monkey Audio Encoding 3.99.6
WAV To APE
Seconds < Lower Is Better
Run 1 . 21.71 |=============================================================
Run 2 . 22.65 |================================================================
Run 3 . 22.68 |================================================================

Timed Apache Compilation 2.4.41
Time To Compile
Seconds < Lower Is Better
Run 1 . 56.71 |================================================================
Run 2 . 54.57 |==============================================================
Run 3 . 54.59 |==============================================================
4 ..... 54.42 |=============================================================

Basis Universal 1.12
Settings: ETC1S
Seconds < Lower Is Better
Run 1 . 107.71 |===============================================================
Run 2 . 104.17 |=============================================================
Run 3 . 103.80 |=============================================================

BRL-CAD 7.30.8
VGR Performance Metric
VGR Performance Metric > Higher Is Better
Run 1 . 25777 |==============================================================
Run 2 . 26544 |================================================================
Run 3 . 26662 |================================================================

C-Ray 1.1
Total Time - 4K, 16 Rays Per Pixel
Seconds < Lower Is Better
Run 1 . 216.93 |===============================================================
Run 2 . 213.18 |==============================================================
Run 3 . 211.38 |=============================================================

GraphicsMagick 1.3.33
Operation: Sharpen
Iterations Per Minute > Higher Is Better
Run 1 . 41 |=================================================================
Run 2 . 42 |===================================================================
Run 3 . 41 |=================================================================
4 ..... 42 |===================================================================

oneDNN 2.0
Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
ms < Lower Is Better
Run 1 . 35678.4 |==============================================================
Run 2 . 34887.4 |=============================================================
Run 3 . 35697.9 |==============================================================
4 ..... 34900.7 |=============================================================

x264 2019-12-17
H.264 Video Encoding
Frames Per Second > Higher Is Better
Run 1 . 13.95 |================================================================
Run 2 . 13.68 |===============================================================
Run 3 . 13.64 |===============================================================
4 ..... 13.72 |===============================================================

Basis Universal 1.12
Settings: UASTC Level 3
Seconds < Lower Is Better
Run 1 . 185.43 |===============================================================
Run 2 . 183.15 |==============================================================
Run 3 . 181.48 |==============================================================

Timed MAFFT Alignment 7.471
Multiple Sequence Alignment - LSU RNA
Seconds < Lower Is Better
Run 1 . 14.17 |===============================================================
Run 2 . 14.09 |===============================================================
Run 3 . 14.40 |================================================================
4 ..... 14.11 |===============================================================

Basis Universal 1.12
Settings: UASTC Level 2
Seconds < Lower Is Better
Run 1 . 91.43 |================================================================
Run 2 . 90.66 |===============================================================
Run 3 . 89.73 |===============================================================

Smallpt 1.0
Global Illumination Renderer; 128 Samples
Seconds < Lower Is Better
Run 1 . 39.98 |================================================================
Run 2 . 39.60 |===============================================================
Run 3 . 39.24 |===============================================================

GraphicsMagick 1.3.33
Operation: Enhanced
Iterations Per Minute > Higher Is Better
Run 1 . 54 |===================================================================
Run 2 . 54 |===================================================================
Run 3 . 53 |==================================================================
4 ..... 54 |===================================================================

GraphicsMagick 1.3.33
Operation: Noise-Gaussian
Iterations Per Minute > Higher Is Better
Run 1 . 57 |==================================================================
Run 2 . 58 |===================================================================
Run 3 . 57 |==================================================================
4 ..... 58 |===================================================================

Basis Universal 1.12
Settings: UASTC Level 2 + RDO Post-Processing
Seconds < Lower Is Better
Run 1 . 1413.84 |==============================================================
Run 2 . 1396.34 |=============================================================
Run 3 . 1389.47 |=============================================================

oneDNN 2.0
Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
ms < Lower Is Better
Run 1 . 34619.5 |=============================================================
Run 2 . 34819.2 |=============================================================
Run 3 . 35222.4 |==============================================================
4 ..... 34863.6 |=============================================================

GraphicsMagick 1.3.33
Operation: Resizing
Iterations Per Minute > Higher Is Better
Run 1 . 309 |=================================================================
Run 2 . 312 |==================================================================
Run 3 . 308 |=================================================================
4 ..... 313 |==================================================================

BYTE Unix Benchmark 3.6
Computational Test: Dhrystone 2
LPS > Higher Is Better
Run 1 . 26948919.6 |===========================================================
Run 2 . 26802144.6 |===========================================================
Run 3 . 26536978.6 |==========================================================
4 ..... 26522860.7 |==========================================================

oneDNN 2.0
Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
ms < Lower Is Better
Run 1 . 410.52 |==============================================================
Run 2 . 414.56 |===============================================================
Run 3 . 413.27 |==============================================================
4 ..... 416.66 |===============================================================

oneDNN 2.0
Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU
ms < Lower Is Better
Run 1 . 47.59 |================================================================
Run 2 . 47.05 |===============================================================
Run 3 . 47.32 |================================================================
4 ..... 46.92 |===============================================================

Coremark 1.0
CoreMark Size 666 - Iterations Per Second
Iterations/Sec > Higher Is Better
Run 1 . 83541.79 |============================================================
Run 2 . 84212.96 |=============================================================
Run 3 . 83708.51 |============================================================
4 ..... 84749.71 |=============================================================

oneDNN 2.0
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
ms < Lower Is Better
Run 1 . 40.31 |================================================================
Run 2 . 40.23 |================================================================
Run 3 . 40.11 |================================================================
4 ..... 39.75 |===============================================================

oneDNN 2.0
Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU
ms < Lower Is Better
Run 1 . 69366.7 |==============================================================
Run 2 . 68947.3 |==============================================================
Run 3 . 69145.5 |==============================================================
4 ..... 68410.4 |=============================================================

oneDNN 2.0
Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
ms < Lower Is Better
Run 1 . 68835.1 |==============================================================
Run 2 . 68454.8 |==============================================================
Run 3 . 68623.4 |==============================================================
4 ..... 67915.1 |=============================================================

oneDNN 2.0
Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
Run 1 . 69245.6 |==============================================================
Run 2 . 68964.1 |==============================================================
Run 3 . 68767.9 |==============================================================
4 ..... 68367.1 |=============================================================

Build2 0.13
Time To Compile
Seconds < Lower Is Better
Run 1 . 474.42 |===============================================================
Run 2 . 471.48 |===============================================================
Run 3 . 468.50 |==============================================================

Timed FFmpeg Compilation 4.2.2
Time To Compile
Seconds < Lower Is Better
Run 1 . 198.73 |===============================================================
Run 2 . 199.15 |===============================================================
Run 3 . 196.89 |==============================================================

GraphicsMagick 1.3.33
Operation: HWB Color Space
Iterations Per Minute > Higher Is Better
Run 1 . 454 |=================================================================
Run 2 . 459 |==================================================================
Run 3 . 459 |==================================================================
4 ..... 458 |==================================================================

oneDNN 2.0
Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
ms < Lower Is Better
Run 1 . 69.61 |================================================================
Run 2 . 69.25 |================================================================
Run 3 . 68.85 |===============================================================
4 ..... 69.57 |================================================================

x265 3.4
Video Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
Run 1 . 5.52 |================================================================
Run 2 . 5.56 |=================================================================
Run 3 . 5.55 |=================================================================
4 ..... 5.58 |=================================================================

oneDNN 2.0
Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
Run 1 . 34997.2 |==============================================================
Run 2 . 34956.6 |==============================================================
Run 3 . 34743.7 |==============================================================
4 ..... 34670.7 |=============================================================

WebP Image Encode 1.1
Encode Settings: Quality 100, Lossless, Highest Compression
Encode Time - Seconds < Lower Is Better
Run 1 . 62.54 |================================================================
Run 2 . 62.11 |===============================================================
Run 3 . 62.68 |================================================================
4 ..... 62.20 |================================================================

RNNoise 2020-06-28
Seconds < Lower Is Better
Run 1 . 38.12 |===============================================================
Run 2 . 38.13 |===============================================================
Run 3 . 38.47 |================================================================

Basis Universal 1.12
Settings: UASTC Level 0
Seconds < Lower Is Better
Run 1 . 14.23 |================================================================
Run 2 . 14.17 |================================================================
Run 3 . 14.10 |===============================================================

libavif avifenc 0.7.3
Encoder Speed: 8
Seconds < Lower Is Better
Run 1 . 30.40 |================================================================
Run 2 . 30.15 |===============================================================
Run 3 . 30.30 |================================================================

libavif avifenc 0.7.3
Encoder Speed: 2
Seconds < Lower Is Better
Run 1 . 815.68 |===============================================================
Run 2 . 809.99 |===============================================================
Run 3 . 809.27 |===============================================================

WebP Image Encode 1.1
Encode Settings: Default
Encode Time - Seconds < Lower Is Better
Run 1 . 7.873 |================================================================
Run 2 . 7.872 |================================================================
Run 3 . 7.933 |================================================================
4 ..... 7.871 |===============================================================

oneDNN 2.0
Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU
ms < Lower Is Better
Run 1 . 87.32 |================================================================
Run 2 . 87.43 |================================================================
Run 3 . 87.10 |================================================================
4 ..... 86.78 |================================================================

oneDNN 2.0
Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
ms < Lower Is Better
Run 1 . 56.92 |================================================================
Run 2 . 57.17 |================================================================
Run 3 . 57.15 |================================================================
4 ..... 57.33 |================================================================

libavif avifenc 0.7.3
Encoder Speed: 10
Seconds < Lower Is Better
Run 1 . 26.89 |================================================================
Run 2 . 26.80 |================================================================
Run 3 . 26.71 |================================================================

Timed Eigen Compilation 3.3.9
Time To Compile
Seconds < Lower Is Better
Run 1 . 128.44 |===============================================================
Run 2 . 129.12 |===============================================================
Run 3 . 128.97 |===============================================================

Compile Bench 0.6
Test: Compile
MB/s > Higher Is Better
Run 1 . 2152.14 |==============================================================
Run 2 . 2140.98 |==============================================================
Run 3 . 2145.28 |==============================================================
4 ..... 2142.99 |==============================================================

GraphicsMagick 1.3.33
Operation: Rotate
Iterations Per Minute > Higher Is Better
Run 1 . 720 |==================================================================
Run 2 . 722 |==================================================================
Run 3 . 723 |==================================================================
4 ..... 721 |==================================================================

SQLite Speedtest 3.30
Timed Time - Size 1,000
Seconds < Lower Is Better
Run 1 . 165.44 |===============================================================
Run 2 . 165.97 |===============================================================
Run 3 . 165.89 |===============================================================
4 ..... 166.11 |===============================================================

C-Blosc 2.0 Beta 5
Compressor: blosclz
MB/s > Higher Is Better
Run 1 . 9147.1 |===============================================================
Run 2 . 9138.6 |===============================================================
Run 3 . 9113.9 |===============================================================
4 ..... 9121.6 |===============================================================

oneDNN 2.0
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
ms < Lower Is Better
Run 1 . 73.74 |================================================================
Run 2 . 73.48 |================================================================
Run 3 . 73.56 |================================================================
4 ..... 73.54 |================================================================

oneDNN 2.0
Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU
ms < Lower Is Better
Run 1 . 113.78 |===============================================================
Run 2 . 113.73 |===============================================================
Run 3 . 113.83 |===============================================================
4 ..... 113.44 |===============================================================

oneDNN 2.0
Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
ms < Lower Is Better
Run 1 . 211.05 |===============================================================
Run 2 . 210.85 |===============================================================
Run 3 . 210.44 |===============================================================
4 ..... 210.73 |===============================================================

oneDNN 2.0
Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU
ms < Lower Is Better
Run 1 . 79.63 |================================================================
Run 2 . 79.48 |================================================================
Run 3 . 79.53 |================================================================
4 ..... 79.42 |================================================================

Node.js V8 Web Tooling Benchmark
runs/s > Higher Is Better
Run 1 . 4.06 |=================================================================
Run 2 . 4.06 |=================================================================
Run 3 . 4.07 |=================================================================

libavif avifenc 0.7.3
Encoder Speed: 0
Seconds < Lower Is Better
Run 1 . 1461.48 |==============================================================
Run 2 . 1458.01 |==============================================================
Run 3 . 1459.94 |==============================================================

Compile Bench 0.6
Test: Read Compiled Tree
MB/s > Higher Is Better
Run 1 . 1537.55 |==============================================================
Run 2 . 1534.17 |==============================================================
Run 3 . 1534.99 |==============================================================
4 ..... 1535.62 |==============================================================

Compile Bench 0.6
Test: Initial Create
MB/s > Higher Is Better
Run 1 . 269.80 |===============================================================
Run 2 . 269.82 |===============================================================
Run 3 . 269.50 |===============================================================
4 ..... 270.06 |===============================================================

oneDNN 2.0
Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
ms < Lower Is Better
Run 1 . 433.18 |===============================================================
Run 2 . 434.07 |===============================================================
Run 3 . 433.87 |===============================================================
4 ..... 434.08 |===============================================================

Crypto++ 8.2
Test: All Algorithms
MiB/second > Higher Is Better
Run 1 . 719.70 |===============================================================
Run 2 . 719.22 |===============================================================
Run 3 . 718.39 |===============================================================
4 ..... 719.51 |===============================================================

WavPack Audio Encoding 5.3
WAV To WavPack
Seconds < Lower Is Better
Run 1 . 129.46 |===============================================================
Run 2 . 129.25 |===============================================================
Run 3 . 129.32 |===============================================================

PHPBench 0.8.1
PHP Benchmark Suite
Score > Higher Is Better
Run 1 . 168649 |===============================================================
Run 2 . 168821 |===============================================================
Run 3 . 168642 |===============================================================

WebP Image Encode 1.1
Encode Settings: Quality 100, Lossless
Encode Time - Seconds < Lower Is Better
Run 1 . 29.98 |================================================================
Run 2 . 30.00 |================================================================
Run 3 . 29.99 |================================================================
4 ..... 29.98 |================================================================

Crypto++ 8.2
Test: Integer + Elliptic Curve Public Key Algorithms
MiB/second > Higher Is Better
Run 1 . 1927.74 |==============================================================
Run 2 . 1928.69 |==============================================================
Run 3 . 1928.66 |==============================================================
4 ..... 1928.85 |==============================================================

WebP Image Encode 1.1
Encode Settings: Quality 100
Encode Time - Seconds < Lower Is Better
Run 1 . 10.96 |================================================================
Run 2 . 10.96 |================================================================
Run 3 . 10.96 |================================================================
4 ..... 10.96 |================================================================

Crypto++ 8.2
Test: Unkeyed Algorithms
MiB/second > Higher Is Better
Run 1 . 206.58 |===============================================================
Run 2 . 206.57 |===============================================================
Run 3 . 206.55 |===============================================================
4 ..... 206.62 |===============================================================

WebP Image Encode 1.1
Encode Settings: Quality 100, Highest Compression
Encode Time - Seconds < Lower Is Better
Run 1 . 17.67 |================================================================
Run 2 . 17.67 |================================================================
Run 3 . 17.66 |================================================================
4 ..... 17.66 |================================================================

Crypto++ 8.2
Test: Keyed Algorithms
MiB/second > Higher Is Better
Run 1 . 297.73 |===============================================================
Run 2 . 297.71 |===============================================================
Run 3 . 297.70 |===============================================================
4 ..... 297.69 |===============================================================

simdjson 0.7.1
Throughput Test: DistinctUserID
GB/s > Higher Is Better
Run 1 . 1.23 |=================================================================
Run 2 . 1.23 |=================================================================
Run 3 . 1.23 |=================================================================
4 ..... 1.23 |=================================================================

simdjson 0.7.1
Throughput Test: PartialTweets
GB/s > Higher Is Better
Run 1 . 1.19 |=================================================================
Run 2 . 1.19 |=================================================================
Run 3 . 1.19 |=================================================================
4 ..... 1.19 |=================================================================

simdjson 0.7.1
Throughput Test: LargeRandom
GB/s > Higher Is Better
Run 1 . 0.48 |=================================================================
Run 2 . 0.48 |=================================================================
Run 3 . 0.48 |=================================================================
4 ..... 0.48 |=================================================================

simdjson 0.7.1
Throughput Test: Kostya
GB/s > Higher Is Better
Run 1 . 1.06 |=================================================================
Run 2 . 1.06 |=================================================================
Run 3 . 1.06 |=================================================================
4 ..... 1.06 |=================================================================

oneDNN 2.0
Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
ms < Lower Is Better
Run 1 . 117.88 |===========================================================
Run 2 . 124.74 |==============================================================
Run 3 . 124.07 |==============================================================
4 ..... 126.07 |===============================================================

GraphicsMagick 1.3.33
Operation: Swirl
Iterations Per Minute > Higher Is Better
Run 1 . 145 |================================================================
Run 2 . 148 |==================================================================
Run 3 . 146 |=================================================================
4 ..... 149 |==================================================================

x265 3.4
Video Input: Bosphorus 4K
Frames Per Second > Higher Is Better
Run 1 . 1.33 |==============================================================
Run 2 . 1.38 |=================================================================
Run 3 . 1.37 |================================================================
4 ..... 1.39 |=================================================================