Xavier Carmel CPU Core Compiler Tests

NVIDIA Jetson Xavier ARMv8 compiler benchmarks on GCC and LLVM Clang for a future article on Phoronix.

GCC 7.3.0:
  Processor: ARMv8 rev 0 @ 2.27GHz (8 Cores), Motherboard: jetson-xavier, Memory: 16384MB, Disk: 31GB HBG4a2, Graphics: NVIDIA TEGRA
  OS: Ubuntu 18.04, Kernel: 4.9.108-tegra (aarch64), Desktop: Unity 7.5.0, Display Server: X Server 1.19.6, Display Driver: NVIDIA 1.0.0, Vulkan: 1.1.76, Compiler: GCC 7.3.0 + CUDA 10.0, File-System: ext4, Screen Resolution: 1920x2160

Clang 6.0:
  Processor: ARMv8 rev 0 @ 2.27GHz (8 Cores), Motherboard: jetson-xavier, Memory: 16384MB, Disk: 31GB HBG4a2, Graphics: NVIDIA TEGRA
  OS: Ubuntu 18.04, Kernel: 4.9.108-tegra (aarch64), Desktop: Unity 7.5.0, Display Server: X Server 1.19.6, Display Driver: NVIDIA 1.0.0, Vulkan: 1.1.76, Compiler: Clang 6.0.0-1ubuntu2 + CUDA 10.0, File-System: ext4, Screen Resolution: 1920x2160

GCC 8.2.0:
  Processor: ARMv8 rev 0 @ 2.27GHz (8 Cores), Motherboard: jetson-xavier, Memory: 16384MB, Disk: 31GB HBG4a2, Graphics: NVIDIA TEGRA
  OS: Ubuntu 18.04, Kernel: 4.9.108-tegra (aarch64), Desktop: Unity 7.5.0, Display Server: X Server 1.19.6, Display Driver: NVIDIA 1.0.0, Vulkan: 1.1.76, Compiler: GCC 8.2.0 + clang (GCC) 8.2.0 + CUDA 10.0, File-System: ext4, Screen Resolution: 1920x2160

GCC 9.0.0:
  Processor: ARMv8 rev 0 @ 2.27GHz (8 Cores), Motherboard: jetson-xavier, Memory: 16384MB, Disk: 31GB HBG4a2, Graphics: NVIDIA TEGRA
  OS: Ubuntu 18.04, Kernel: 4.9.108-tegra (aarch64), Desktop: Unity 7.5.0, Display Server: X Server 1.19.6, Display Driver: NVIDIA 1.0.0, Vulkan: 1.1.76, Compiler: GCC 9.0.0 20181230 + clang (GCC) 9.0.0 20181230 (experimental) + CUDA 10.0, File-System: ext4, Screen Resolution: 1920x2160

Clang 8.0 SVN:
  Processor: ARMv8 rev 0 @ 2.27GHz (8 Cores), Motherboard: jetson-xavier, Memory: 16384MB, Disk: 31GB HBG4a2, Graphics: NVIDIA TEGRA
  OS: Ubuntu 18.04, Kernel: 4.9.108-tegra (aarch64), Desktop: Unity 7.5.0, Display Server: X Server 1.19.6, Display Driver: NVIDIA 1.0.0, Vulkan: 1.1.76, Compiler: Clang 8.0.0 (SVN 350356) + LLVM 8.0.0svn + CUDA 10.0, File-System: ext4, Screen Resolution: 1920x2160

Clang 7.0.1:
  Processor: ARMv8 rev 0 @ 2.27GHz (8 Cores), Motherboard: jetson-xavier, Memory: 16384MB, Disk: 31GB HBG4a2, Graphics: NVIDIA TEGRA
  OS: Ubuntu 18.04, Kernel: 4.9.108-tegra (aarch64), Desktop: Unity 7.5.0, Display Server: X Server 1.19.6, Display Driver: NVIDIA 1.0.0, Vulkan: 1.1.76, Compiler: Clang 7.0.1 + LLVM 7.0.1 + CUDA 10.0, File-System: ext4, Screen Resolution: 1920x2160

LAME MP3 Encoding 3.100
WAV To MP3
Seconds < Lower Is Better
GCC 7.3.0 ..... 31.83 |===========
Clang 6.0 ..... 18.86 |======
GCC 8.2.0 ..... 160.36 |=======================================================
GCC 9.0.0 ..... 139.86 |================================================
Clang 8.0 SVN . 53.26 |==================
Clang 7.0.1 ... 49.86 |=================

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops > Higher Is Better
GCC 7.3.0 ..... 620.00 |=======================================================
Clang 6.0 ..... 546.00 |================================================
GCC 8.2.0 ..... 565.00 |==================================================
GCC 9.0.0 ..... 118.00 |==========
Clang 8.0 SVN . 78.18 |=======
Clang 7.0.1 ... 78.11 |=======

SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops > Higher Is Better
GCC 7.3.0 ..... 962 |==========================================================
Clang 6.0 ..... 840 |==================================================
GCC 8.2.0 ..... 966 |==========================================================
GCC 9.0.0 ..... 151 |=========
Clang 8.0 SVN . 198 |============
Clang 7.0.1 ... 197 |============

SciMark 2.0
Computational Test: Monte Carlo
Mflops > Higher Is Better
GCC 7.3.0 ..... 217.00 |=======================================================
Clang 6.0 ..... 214.00 |======================================================
GCC 8.2.0 ..... 197.00 |==================================================
GCC 9.0.0 ..... 47.45 |============
Clang 8.0 SVN . 34.30 |=========
Clang 7.0.1 ... 34.33 |=========

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops > Higher Is Better
GCC 7.3.0 ..... 879 |=======================================================
Clang 6.0 ..... 935 |==========================================================
GCC 8.2.0 ..... 879 |=======================================================
GCC 9.0.0 ..... 245 |===============
Clang 8.0 SVN . 148 |=========
Clang 7.0.1 ... 148 |=========

SciMark 2.0
Computational Test: Composite
Mflops > Higher Is Better
GCC 7.3.0 ..... 572 |==========================================================
Clang 6.0 ..... 540 |=======================================================
GCC 8.2.0 ..... 554 |========================================================
GCC 9.0.0 ..... 126 |=============
Clang 8.0 SVN . 102 |==========
Clang 7.0.1 ... 102 |==========

SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops > Higher Is Better
GCC 7.3.0 ..... 183.00 |=======================================================
Clang 6.0 ..... 163.00 |=================================================
GCC 8.2.0 ..... 164.00 |=================================================
GCC 9.0.0 ..... 69.79 |=====================
Clang 8.0 SVN . 52.86 |================
Clang 7.0.1 ... 52.89 |================

Redis 4.0.8
Test: LPOP
Requests Per Second > Higher Is Better
GCC 7.3.0 ..... 1105270 |================================================
Clang 6.0 ..... 1254672 |======================================================
GCC 8.2.0 ..... 417642 |==================
GCC 9.0.0 ..... 377319 |================
Clang 8.0 SVN . 372959 |================
Clang 7.0.1 ... 377274 |================

Redis 4.0.8
Test: LPUSH
Requests Per Second > Higher Is Better
GCC 7.3.0 ..... 705232 |====================================================
Clang 6.0 ..... 744500 |=======================================================
GCC 8.2.0 ..... 254896 |===================
GCC 9.0.0 ..... 239389 |==================
Clang 8.0 SVN . 237439 |==================
Clang 7.0.1 ... 234910 |=================

Redis 4.0.8
Test: GET
Requests Per Second > Higher Is Better
GCC 7.3.0 ..... 973500 |==============================================
Clang 6.0 ..... 1138493 |======================================================
GCC 8.2.0 ..... 391000 |===================
GCC 9.0.0 ..... 363700 |=================
Clang 8.0 SVN . 366870 |=================
Clang 7.0.1 ... 366301 |=================

Redis 4.0.8
Test: SET
Requests Per Second > Higher Is Better
GCC 7.3.0 ..... 756550 |===================================================
Clang 6.0 ..... 816377 |=======================================================
GCC 8.2.0 ..... 274012 |==================
GCC 9.0.0 ..... 274302 |==================
Clang 8.0 SVN . 262650 |==================
Clang 7.0.1 ... 268782 |==================

C-Ray 1.1
Total Time - 4K, 16 Rays Per Pixel
Seconds < Lower Is Better
GCC 7.3.0 ..... 258 |===================
Clang 6.0 ..... 408 |==============================
GCC 8.2.0 ..... 469 |==================================
GCC 9.0.0 ..... 451 |=================================
Clang 8.0 SVN . 796 |==========================================================
Clang 7.0.1 ... 796 |==========================================================

Redis 4.0.8
Test: SADD
Requests Per Second > Higher Is Better
GCC 7.3.0 ..... 899066 |======================================================
Clang 6.0 ..... 923930 |=======================================================
GCC 8.2.0 ..... 310886 |===================
GCC 9.0.0 ..... 303100 |==================
Clang 8.0 SVN . 299731 |==================
Clang 7.0.1 ... 302329 |==================

FLAC Audio Encoding 1.3.2
WAV To FLAC
Seconds < Lower Is Better
GCC 7.3.0 ..... 57.27 |===================
Clang 6.0 ..... 54.61 |==================
GCC 8.2.0 ..... 152.90 |==================================================
GCC 9.0.0 ..... 140.12 |==============================================
Clang 8.0 SVN . 166.90 |=======================================================
Clang 7.0.1 ... 166.52 |=======================================================

Bullet Physics Engine 2.81
Test: 1000 Convex
Seconds < Lower Is Better
GCC 7.3.0 . 9.95 |======================
GCC 8.2.0 . 26.96 |============================================================
GCC 9.0.0 . 25.50 |=========================================================

Bullet Physics Engine 2.81
Test: Raytests
Seconds < Lower Is Better
GCC 7.3.0 . 5.53 |========================
GCC 8.2.0 . 13.92 |============================================================
GCC 9.0.0 . 13.02 |========================================================

Bullet Physics Engine 2.81
Test: Convex Trimesh
Seconds < Lower Is Better
GCC 7.3.0 . 2.32 |==========================
GCC 8.2.0 . 5.52 |=============================================================
GCC 9.0.0 . 5.04 |========================================================

Bullet Physics Engine 2.81
Test: 1000 Stack
Seconds < Lower Is Better
GCC 7.3.0 . 12.86 |===============================
GCC 8.2.0 . 25.01 |============================================================
GCC 9.0.0 . 22.16 |=====================================================

Bullet Physics Engine 2.81
Test: Prim Trimesh
Seconds < Lower Is Better
GCC 7.3.0 . 2.12 |==================================
GCC 8.2.0 . 3.85 |=============================================================
GCC 9.0.0 . 3.35 |=====================================================

Apache Benchmark 2.4.29
Static Web Page Serving
Requests Per Second > Higher Is Better
GCC 7.3.0 ..... 10782 |========================================================
Clang 6.0 ..... 10741 |========================================================
GCC 8.2.0 ..... 6218 |================================
GCC 9.0.0 ..... 6009 |===============================
Clang 8.0 SVN . 6061 |===============================
Clang 7.0.1 ... 6040 |===============================

asmFish 2018-07-23
1024 Hash Memory, 26 Depth
Nodes/second > Higher Is Better
GCC 7.3.0 ..... 8295044 |======================================================
Clang 6.0 ..... 8305398 |======================================================
GCC 8.2.0 ..... 7195166 |===============================================
GCC 9.0.0 ..... 4991900 |================================
Clang 8.0 SVN . 5046980 |=================================
Clang 7.0.1 ... 5083834 |=================================

Timed Linux Kernel Compilation 4.18
Time To Compile
Seconds < Lower Is Better
GCC 7.3.0 . 831 |=======================================
GCC 8.2.0 . 1312 |=============================================================
GCC 9.0.0 . 963 |=============================================

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS > Higher Is Better
GCC 7.3.0 ..... 269 |==========================================
Clang 6.0 ..... 265 |==========================================
GCC 8.2.0 ..... 248 |=======================================
GCC 9.0.0 ..... 368 |==========================================================
Clang 8.0 SVN . 270 |===========================================
Clang 7.0.1 ... 281 |============================================

Timed PHP Compilation 7.1.9
Time To Compile
Seconds < Lower Is Better
GCC 7.3.0 ..... 276 |=======================================
Clang 6.0 ..... 319 |=============================================
GCC 8.2.0 ..... 409 |==========================================================
Clang 8.0 SVN . 355 |==================================================
Clang 7.0.1 ... 337 |================================================

Bullet Physics Engine 2.81
Test: 136 Ragdolls
Seconds < Lower Is Better
GCC 7.3.0 . 7.30 |==================================
GCC 8.2.0 . 12.93 |============================================================
GCC 9.0.0 . 11.47 |=====================================================

Bullet Physics Engine 2.81
Test: 3000 Fall
Seconds < Lower Is Better
GCC 7.3.0 . 12.68 |========================================
GCC 8.2.0 . 19.23 |============================================================
GCC 9.0.0 . 17.13 |=====================================================

Primesieve 7.2
1e12 Prime Number Generation
Seconds < Lower Is Better
GCC 7.3.0 ..... 106 |===========================================
Clang 6.0 ..... 106 |===========================================
GCC 8.2.0 ..... 134 |======================================================
Clang 8.0 SVN . 137 |=======================================================
Clang 7.0.1 ... 144 |==========================================================

7-Zip Compression 16.02
Compress Speed Test
MIPS > Higher Is Better
GCC 7.3.0 . 18139 |======================================================
GCC 8.2.0 . 20239 |============================================================
GCC 9.0.0 . 8651 |==========================

TTSIOD 3D Renderer 2.3b
Phong Rendering With Soft-Shadow Mapping
FPS > Higher Is Better
GCC 7.3.0 . 146 |==============================================================
Clang 6.0 . 137 |==========================================================
GCC 8.2.0 . 127 |======================================================