xanmod-kernel-v3 AMD Ryzen 9 3900X 12-Core testing with a MSI MPG X570 GAMING PLUS (MS-7C37) v2.0 (A.61 BIOS) and MSI NVIDIA GeForce GT 1030 on Debian stable-updates via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2008021-NE-XANMODKER88&grr&sor .
xanmod-kernel-v3

Configurations: stock-linux-kernel, xanmod-kernel-stock, xanmod-kernel-stock-v2, xanmod-kernel-optimized, xanmod-kernel-optimized-blender

System details:
  Processor: AMD Ryzen 9 3900X 12-Core (12 Cores / 24 Threads)
  Motherboard: MSI MPG X570 GAMING PLUS (MS-7C37) v2.0 (A.61 BIOS)
  Chipset: AMD Starship/Matisse
  Memory: 4 x 16384 MB DDR4-3200MT/s CMK32GX4M2D3000C16
  Disk: 1000GB Samsung SSD 970 EVO 1TB + 8002GB Western Digital WD80EMAZ-00W
  Graphics: MSI NVIDIA GeForce GT 1030
  Audio: NVIDIA GP108 HD Audio
  Network: Realtek RTL8111/8168/8411
  OS: Debian testing
  Kernel: 5.7.6-050706-lowlatency (x86_64)
  Display Server: X Server 1.20.8
  Display Driver: modesetting 1.20.8
  Compiler: GCC 9.3.0 + Clang 9.0.1-13 + LLVM 9.0.1
  File-System: ext4

Changed in later configurations:
  Kernel: 5.7.10-xanmod1 (x86_64)
  Memory: 4 x 16384 MB DDR4-3000MT/s CMK32GX4M2D3000C16
  OS: Debian stable-updates

Environment Details
  - stock-linux-kernel: RADV_PERFTEST=aco NVM_CD_FLAGS=
  - xanmod-kernel-stock: RADV_PERFTEST=aco NVM_CD_FLAGS=
  - xanmod-kernel-stock-v2: RADV_PERFTEST=aco NVM_CD_FLAGS=
  - xanmod-kernel-optimized: RADV_PERFTEST=aco
  - xanmod-kernel-optimized-blender: RADV_PERFTEST=aco

Compiler Details
  - stock-linux-kernel, xanmod-kernel-stock, xanmod-kernel-stock-v2, xanmod-kernel-optimized: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-mutex --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-0xEOmg/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Disk Details
  - stock-linux-kernel, xanmod-kernel-stock, xanmod-kernel-stock-v2, xanmod-kernel-optimized: NONE / errors=remount-ro,relatime,rw

Processor Details
  - CPU Microcode: 0x8701013

Java Details
  - stock-linux-kernel, xanmod-kernel-stock, xanmod-kernel-stock-v2, xanmod-kernel-optimized: OpenJDK Runtime Environment (build 11.0.7+10-post-Debian-3deb10u1)

Python Details
  - stock-linux-kernel, xanmod-kernel-stock, xanmod-kernel-stock-v2, xanmod-kernel-optimized: Python 2.7.18 + Python 3.8.5

Security Details
  - itlb_multihit: Not affected
  - l1tf: Not affected
  - mds: Not affected
  - meltdown: Not affected
  - spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  - spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  - spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling
  - srbds: Not affected
  - tsx_async_abort: Not affected
xanmod-kernel-v3 results overview ("-" marks a test with no result for that configuration)

Test | stock-linux-kernel | xanmod-kernel-stock | xanmod-kernel-stock-v2 | xanmod-kernel-optimized | xanmod-kernel-optimized-blender
mysqlslap: 1 | 253 | - | 259 | 275 | -
sqlite: 128 | 2449.769 | - | 2456.356 | - | -
blender: Barbershop - CPU-Only | - | - | - | - | 444.76
blender: Pabellon Barcelona - CPU-Only | - | - | - | - | 373.42
sqlite-speedtest: Timed Time - Size 1,000 | 70.115 | - | 67.550 | 67.829 | -
fftw: Float + SSE - 2D FFT Size 4096 | 21291 | - | 20269 | 19876 | -
blender: Classroom - CPU-Only | - | - | - | - | 296.71
keydb: | 119368.92 | - | 123834.22 | 130220.58 | -
core-latency: Average Latency Between CPU Cores | 142.26 | - | 144.04 | 139.73 | -
wireguard: | 230.356 | - | 191.279 | 184.804 | -
hpcg: | 5.68363 | - | 5.69740 | 5.54178 | -
npb: EP.D | 782.03 | - | 787.40 | 812.44 | -
blender: Fishy Cat - CPU-Only | - | - | - | - | 163.16
cachebench: Read / Modify / Write | 61924.050792619 | - | 63966.482233318 | 62834.856835413 | -
cachebench: Write | 32183.805464159 | - | 32004.956297381 | 32511.703295762 | -
cachebench: Read | 3025.46 | - | 3014.52 | 3050.10 | -
blender: BMW27 - CPU-Only | - | - | - | - | 111.00
npb: BT.C | 30353.28 | - | 30132.56 | 29948.81 | -
ramspeed: Triad - Integer | 31334.76 | - | 31765.58 | 31510.40 | -
ramspeed: Add - Integer | 31078.60 | - | 30294.84 | 31234.97 | -
ramspeed: Add - Floating Point | 31350.57 | - | 31894.48 | 30949.80 | -
ramspeed: Triad - Floating Point | 31043.71 | - | 30172.84 | 30019.67 | -
npb: SP.B | 13080.84 | - | 13058.58 | 12922.63 | -
npb: LU.C | 33564.28 | - | 33654.86 | 33335.37 | -
ramspeed: Scale - Integer | 27448.34 | - | 26628.92 | 26780.83 | -
ramspeed: Average - Integer | 28792.55 | - | 28894.30 | 28801.17 | -
ramspeed: Copy - Integer | 26675.04 | - | 27440.96 | 26885.94 | -
ramspeed: Copy - Floating Point | 27445.78 | - | 27024.57 | 26768.43 | -
ramspeed: Scale - Floating Point | 26941.81 | - | 27390.70 | 27120.62 | -
ramspeed: Average - Floating Point | 29306.84 | - | 28902.08 | 29355.41 | -
onednn: IP Batch All - u8s8f32 - CPU | 24.6152 | - | 24.5144 | 23.8092 | -
onednn: IP Batch All - f32 - CPU | 42.5231 | - | 42.4609 | 41.5866 | -
build-linux-kernel: Time To Compile | 47.366 | - | 46.484 | 46.003 | -
java-scimark2: Composite | 3039.19 | - | 3075.13 | 3069.36 | -
deepspeech: CPU | 56.36584 | - | 56.36727 | 56.25400 | -
c-ray: Total Time - 4K, 16 Rays Per Pixel | 42.935 | - | 42.747 | 42.044 | -
gnupg: 2GB File Encryption | 11.190 | - | 11.243 | 11.311 | -
build-php: Time To Compile | 39.101 | - | 38.528 | 38.145 | -
sqlite: 1 | 39.236 | 40.608 | 40.313 | 38.417 | -
onednn: Recurrent Neural Network Training - f32 - CPU | 207.249 | - | 205.187 | 200.186 | -
onednn: Recurrent Neural Network Inference - f32 - CPU | 42.5089 | - | 41.5252 | 40.5261 | -
npb: FT.C | 12485.72 | - | 12492.07 | 12085.26 | -
compress-7zip: Compress Speed Test | 81231 | - | 81679 | 83170 | -
build-ffmpeg: Time To Compile | 32.487 | - | 32.118 | 31.827 | -
openssl: RSA 4096-bit Performance | 3494.0 | - | 3506.5 | 3595.2 | -
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 0.805316 | - | 0.803284 | 0.784978 | -
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 1.93922 | - | 1.94085 | 1.88292 | -
npb: EP.C | 785.50 | - | 788.37 | 823.86 | -
npb: MG.C | 16862.70 | - | 16879.80 | 16422.81 | -
rays1bench: Large Scene | 88.05 | - | 88.31 | 90.07 | -
darktable: Boat - CPU-only | 8.401 | - | 8.402 | 8.545 | -
x265: H.265 1080p Video Encoding | 68.73 | - | 68.90 | 70.97 | -
fftw: Float + SSE - 1D FFT Size 4096 | 57181 | - | 55567 | 56232 | -
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 13.1453 | - | 13.1457 | 13.5271 | -
onednn: Convolution Batch Shapes Auto - f32 - CPU | 12.0726 | - | 12.0672 | 12.5181 | -
darktable: Masskrug - CPU-only | 4.453 | - | 4.480 | 4.441 | -
ffmpeg: H.264 HD To NTSC DV | 5.289 | - | 5.402 | 5.171 | -
darktable: Server Room - CPU-only | 3.035 | - | 3.015 | 2.998 | -
fftw: Float + SSE - 2D FFT Size 32 | 45476 | - | 45621 | 45825 | -
fftw: Float + SSE - 1D FFT Size 32 | 14673 | - | 14968 | 14923 | -
onednn: Deconvolution Batch deconv_3d - u8s8f32 - CPU | 3.61101 | - | 3.60538 | 3.47507 | -
onednn: Deconvolution Batch deconv_3d - f32 - CPU | 4.68846 | - | 4.66477 | 4.51250 | -
darktable: Server Rack - CPU-only | 0.140 | - | 0.138 | 0.143 | -
java-scimark2: Jacobi Successive Over-Relaxation | 1972.21 | - | 1965.14 | 1975.61 | -
java-scimark2: Dense LU Matrix Factorization | 6768.17 | - | 6940.61 | 6892.47 | -
java-scimark2: Sparse Matrix Multiply | 2720.27 | - | 2701.65 | 2700.25 | -
java-scimark2: Fast Fourier Transform | 2030.91 | - | 2067.65 | 2076.80 | -
java-scimark2: Monte Carlo | 1704.38 | - | 1700.61 | 1701.68 | -

OpenBenchmarking.org
MariaDB 10.5.2 - Clients: 1
Queries Per Second, More Is Better
  xanmod-kernel-optimized: 275 (SE +/- 4.27, N = 9)
  xanmod-kernel-stock-v2: 259 (SE +/- 4.19, N = 9)
  stock-linux-kernel: 253 (SE +/- 5.99, N = 9)
1. (CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -llz4 -llzma -lbz2 -laio -lnuma -lpcre2-8 -lcrypt -lz -lm -lssl -lcrypto -ldl
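
A minimal sketch of how a result like the MariaDB one above could be read, using only the reported means and standard errors (253 and 275 queries per second for stock-linux-kernel and xanmod-kernel-optimized). The helper names are hypothetical and not part of the Phoronix Test Suite. Pairing each SE with a configuration assumes the export lists SE values in the same order as the configuration labels, and the combined-SE ratio is a rough heuristic rather than a formal significance test.

```python
import math


def relative_change(baseline, candidate):
    """Percent change of candidate relative to baseline."""
    return (candidate - baseline) / baseline * 100.0


def difference_in_se_units(mean_a, se_a, mean_b, se_b):
    """Gap between two means divided by their combined standard error."""
    return (mean_b - mean_a) / math.sqrt(se_a ** 2 + se_b ** 2)


# MariaDB, Clients: 1 (queries per second, more is better).
# SE-to-configuration pairing is an assumption about the export layout.
stock_qps, stock_se = 253.0, 5.99
optimized_qps, optimized_se = 275.0, 4.27

change = relative_change(stock_qps, optimized_qps)
gap = difference_in_se_units(stock_qps, stock_se, optimized_qps, optimized_se)
print(f"optimized vs stock: {change:+.1f}% ({gap:.2f} combined SEs apart)")
```

For "Fewer Is Better" tests the sign convention flips, so a negative percent change would indicate the faster configuration.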
SQLite 3.30.1 - Threads / Copies: 128
Seconds, Fewer Is Better
  stock-linux-kernel: 2449.77 (SE +/- 2.38, N = 3)
  xanmod-kernel-stock-v2: 2456.36 (SE +/- 10.49, N = 3)
1. (CC) gcc options: -O2 -lz -lm -ldl -lpthread
Blender 2.82 - Blend File: Barbershop - Compute: CPU-Only
Seconds, Fewer Is Better
  xanmod-kernel-optimized-blender: 444.76 (SE +/- 0.49, N = 3)
Blender 2.82 - Blend File: Pabellon Barcelona - Compute: CPU-Only
Seconds, Fewer Is Better
  xanmod-kernel-optimized-blender: 373.42 (SE +/- 0.81, N = 3)
SQLite Speedtest 3.30 - Timed Time - Size 1,000
Seconds, Fewer Is Better
  xanmod-kernel-stock-v2: 67.55 (SE +/- 0.95, N = 15)
  xanmod-kernel-optimized: 67.83 (SE +/- 1.20, N = 15)
  stock-linux-kernel: 70.12 (SE +/- 1.15, N = 15)
1. (CC) gcc options: -O2 -ldl -lz -lpthread
FFTW 3.3.6 - Build: Float + SSE - Size: 2D FFT Size 4096
Mflops, More Is Better
  stock-linux-kernel: 21291 (SE +/- 226.07, N = 3)
  xanmod-kernel-stock-v2: 20269 (SE +/- 158.09, N = 3)
  xanmod-kernel-optimized: 19876 (SE +/- 112.65, N = 3)
1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
Blender 2.82 - Blend File: Classroom - Compute: CPU-Only
Seconds, Fewer Is Better
  xanmod-kernel-optimized-blender: 296.71 (SE +/- 0.13, N = 3)
KeyDB 5.3.1
Ops/sec, More Is Better
  xanmod-kernel-optimized: 130220.58 (SE +/- 1668.86, N = 3)
  xanmod-kernel-stock-v2: 123834.22 (SE +/- 1088.54, N = 15)
  stock-linux-kernel: 119368.92 (SE +/- 1002.40, N = 15)
1. (CXX) g++ options: -O2 -levent -lpthread -lz -lpcre
Core-Latency - Average Latency Between CPU Cores
ns, Fewer Is Better
  xanmod-kernel-optimized: 139.73 (MIN: 42.74 / MAX: 166.14)
  stock-linux-kernel: 142.26 (MIN: 42.6 / MAX: 169.15)
  xanmod-kernel-stock-v2: 144.04 (MIN: 42.79 / MAX: 172.06)
1. (CXX) g++ options: -std=c++11 -pthread -O3
WireGuard + Linux Networking Stack Stress Test
Seconds, Fewer Is Better
  xanmod-kernel-optimized: 184.80 (SE +/- 0.86, N = 3)
  xanmod-kernel-stock-v2: 191.28 (SE +/- 1.10, N = 3)
  stock-linux-kernel: 230.36 (SE +/- 0.93, N = 3)
High Performance Conjugate Gradient 3.1
GFLOP/s, More Is Better
  xanmod-kernel-stock-v2: 5.69740 (SE +/- 0.00163, N = 3)
  stock-linux-kernel: 5.68363 (SE +/- 0.01502, N = 3)
  xanmod-kernel-optimized: 5.54178 (SE +/- 0.00226, N = 3)
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -pthread -lmpi_cxx -lmpi
NAS Parallel Benchmarks 3.4 - Test / Class: EP.D
Total Mop/s, More Is Better
  xanmod-kernel-optimized: 812.44 (SE +/- 0.21, N = 3)
  xanmod-kernel-stock-v2: 787.40 (SE +/- 0.07, N = 3)
  stock-linux-kernel: 782.03 (SE +/- 0.12, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent_core -levent_pthreads -lutil -lm -lrt -lz
2. Open MPI 4.0.4
Blender 2.82 - Blend File: Fishy Cat - Compute: CPU-Only
Seconds, Fewer Is Better
  xanmod-kernel-optimized-blender: 163.16 (SE +/- 0.32, N = 3)
CacheBench - Test: Read / Modify / Write
MB/s, More Is Better
  xanmod-kernel-stock-v2: 63966.48 (SE +/- 755.52, N = 3; MIN: 55607.63 / MAX: 68405.98)
  xanmod-kernel-optimized: 62834.86 (SE +/- 14.57, N = 3; MIN: 55279.71 / MAX: 66668.02)
  stock-linux-kernel: 61924.05 (SE +/- 273.27, N = 3; MIN: 54980.84 / MAX: 66137.05)
1. (CC) gcc options: -lrt
CacheBench - Test: Write
MB/s, More Is Better
  xanmod-kernel-optimized: 32511.70 (SE +/- 384.10, N = 3; MIN: 27980.81 / MAX: 34688.08)
  stock-linux-kernel: 32183.81 (SE +/- 426.99, N = 3; MIN: 27584.78 / MAX: 34495.84)
  xanmod-kernel-stock-v2: 32004.96 (SE +/- 348.21, N = 3; MIN: 27420.12 / MAX: 34639.58)
1. (CC) gcc options: -lrt
CacheBench - Test: Read
MB/s, More Is Better
  xanmod-kernel-optimized: 3050.10 (SE +/- 8.27, N = 3; MIN: 3027.65 / MAX: 3065.82)
  stock-linux-kernel: 3025.46 (SE +/- 8.65, N = 3; MIN: 3006.36 / MAX: 3036.42)
  xanmod-kernel-stock-v2: 3014.52 (SE +/- 7.16, N = 3; MIN: 2973.27 / MAX: 3043.12)
1. (CC) gcc options: -lrt
Blender 2.82 - Blend File: BMW27 - Compute: CPU-Only
Seconds, Fewer Is Better
  xanmod-kernel-optimized-blender: 111.00 (SE +/- 0.33, N = 3)
NAS Parallel Benchmarks 3.4 - Test / Class: BT.C
Total Mop/s, More Is Better
  stock-linux-kernel: 30353.28 (SE +/- 172.89, N = 3)
  xanmod-kernel-stock-v2: 30132.56 (SE +/- 178.89, N = 3)
  xanmod-kernel-optimized: 29948.81 (SE +/- 55.35, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent_core -levent_pthreads -lutil -lm -lrt -lz
2. Open MPI 4.0.4
RAMspeed SMP 3.5.0 - Type: Triad - Benchmark: Integer
MB/s, More Is Better
  xanmod-kernel-stock-v2: 31765.58 (SE +/- 382.19, N = 5)
  xanmod-kernel-optimized: 31510.40 (SE +/- 439.89, N = 4)
  stock-linux-kernel: 31334.76 (SE +/- 460.40, N = 4)
1. (CC) gcc options: -O3 -march=native
RAMspeed SMP 3.5.0 - Type: Add - Benchmark: Integer
MB/s, More Is Better
  xanmod-kernel-optimized: 31234.97 (SE +/- 468.24, N = 3)
  stock-linux-kernel: 31078.60 (SE +/- 414.95, N = 5)
  xanmod-kernel-stock-v2: 30294.84 (SE +/- 21.34, N = 3)
1. (CC) gcc options: -O3 -march=native
RAMspeed SMP 3.5.0 - Type: Add - Benchmark: Floating Point
MB/s, More Is Better
  xanmod-kernel-stock-v2: 31894.48 (SE +/- 64.13, N = 3)
  stock-linux-kernel: 31350.57 (SE +/- 447.67, N = 4)
  xanmod-kernel-optimized: 30949.80 (SE +/- 453.79, N = 4)
1. (CC) gcc options: -O3 -march=native
RAMspeed SMP 3.5.0 - Type: Triad - Benchmark: Floating Point
MB/s, More Is Better
  stock-linux-kernel: 31043.71 (SE +/- 461.90, N = 4)
  xanmod-kernel-stock-v2: 30172.84 (SE +/- 7.15, N = 3)
  xanmod-kernel-optimized: 30019.67 (SE +/- 2.05, N = 3)
1. (CC) gcc options: -O3 -march=native
NAS Parallel Benchmarks 3.4 - Test / Class: SP.B
Total Mop/s, More Is Better
  stock-linux-kernel: 13080.84 (SE +/- 151.16, N = 3)
  xanmod-kernel-stock-v2: 13058.58 (SE +/- 102.39, N = 15)
  xanmod-kernel-optimized: 12922.63 (SE +/- 89.96, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent_core -levent_pthreads -lutil -lm -lrt -lz
2. Open MPI 4.0.4
NAS Parallel Benchmarks 3.4 - Test / Class: LU.C
Total Mop/s, More Is Better
  xanmod-kernel-stock-v2: 33654.86 (SE +/- 19.64, N = 3)
  stock-linux-kernel: 33564.28 (SE +/- 5.41, N = 3)
  xanmod-kernel-optimized: 33335.37 (SE +/- 31.44, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent_core -levent_pthreads -lutil -lm -lrt -lz
2. Open MPI 4.0.4
RAMspeed SMP 3.5.0 - Type: Scale - Benchmark: Integer
MB/s, More Is Better
  stock-linux-kernel: 27448.34 (SE +/- 342.02, N = 3)
  xanmod-kernel-optimized: 26780.83 (SE +/- 182.76, N = 3)
  xanmod-kernel-stock-v2: 26628.92 (SE +/- 3.83, N = 3)
1. (CC) gcc options: -O3 -march=native
RAMspeed SMP 3.5.0 - Type: Average - Benchmark: Integer
MB/s, More Is Better
  xanmod-kernel-stock-v2: 28894.30 (SE +/- 465.21, N = 3)
  xanmod-kernel-optimized: 28801.17 (SE +/- 460.72, N = 3)
  stock-linux-kernel: 28792.55 (SE +/- 426.05, N = 3)
1. (CC) gcc options: -O3 -march=native
RAMspeed SMP 3.5.0 - Type: Copy - Benchmark: Integer
MB/s, More Is Better
  xanmod-kernel-stock-v2: 27440.96 (SE +/- 408.95, N = 3)
  xanmod-kernel-optimized: 26885.94 (SE +/- 337.97, N = 3)
  stock-linux-kernel: 26675.04 (SE +/- 8.98, N = 3)
1. (CC) gcc options: -O3 -march=native
RAMspeed SMP 3.5.0 - Type: Copy - Benchmark: Floating Point
MB/s, More Is Better
  stock-linux-kernel: 27445.78 (SE +/- 392.11, N = 3)
  xanmod-kernel-stock-v2: 27024.57 (SE +/- 405.20, N = 3)
  xanmod-kernel-optimized: 26768.43 (SE +/- 368.45, N = 3)
1. (CC) gcc options: -O3 -march=native
RAMspeed SMP 3.5.0 - Type: Scale - Benchmark: Floating Point
MB/s, More Is Better
  xanmod-kernel-stock-v2: 27390.70 (SE +/- 405.62, N = 3)
  xanmod-kernel-optimized: 27120.62 (SE +/- 351.33, N = 3)
  stock-linux-kernel: 26941.81 (SE +/- 27.11, N = 3)
1. (CC) gcc options: -O3 -march=native
RAMspeed SMP 3.5.0 - Type: Average - Benchmark: Floating Point
MB/s, More Is Better
  xanmod-kernel-optimized: 29355.41 (SE +/- 78.82, N = 3)
  stock-linux-kernel: 29306.84 (SE +/- 477.47, N = 3)
  xanmod-kernel-stock-v2: 28902.08 (SE +/- 493.83, N = 3)
1. (CC) gcc options: -O3 -march=native
oneDNN 1.5 - Harness: IP Batch All - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  xanmod-kernel-optimized: 23.81 (SE +/- 0.02, N = 3; MIN: 23.16)
  xanmod-kernel-stock-v2: 24.51 (SE +/- 0.01, N = 3; MIN: 23.86)
  stock-linux-kernel: 24.62 (SE +/- 0.02, N = 3; MIN: 24.12)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 1.5 - Harness: IP Batch All - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  xanmod-kernel-optimized: 41.59 (SE +/- 0.02, N = 3; MIN: 40.37)
  xanmod-kernel-stock-v2: 42.46 (SE +/- 0.02, N = 3; MIN: 41.45)
  stock-linux-kernel: 42.52 (SE +/- 0.02, N = 3; MIN: 41.57)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Timed Linux Kernel Compilation 5.4 - Time To Compile
Seconds, Fewer Is Better
  xanmod-kernel-optimized: 46.00
  xanmod-kernel-stock-v2: 46.48
  stock-linux-kernel: 47.37
  SE +/- 0.36, N = 3; SE +/- 0.10, N = 3 (listed for only two of the three results)
Java SciMark 2.0 - Computational Test: Composite
Mflops, More Is Better
  xanmod-kernel-stock-v2: 3075.13 (SE +/- 34.71, N = 4)
  xanmod-kernel-optimized: 3069.36 (SE +/- 7.44, N = 4)
  stock-linux-kernel: 3039.19 (SE +/- 29.80, N = 4)
DeepSpeech 0.6 - Acceleration: CPU
Seconds, Fewer Is Better
  xanmod-kernel-optimized: 56.25 (SE +/- 0.16, N = 3)
  stock-linux-kernel: 56.37 (SE +/- 0.04, N = 3)
  xanmod-kernel-stock-v2: 56.37 (SE +/- 0.25, N = 3)
C-Ray 1.1 - Total Time - 4K, 16 Rays Per Pixel
Seconds, Fewer Is Better
  xanmod-kernel-optimized: 42.04 (SE +/- 0.06, N = 3)
  xanmod-kernel-stock-v2: 42.75 (SE +/- 0.05, N = 3)
  stock-linux-kernel: 42.94 (SE +/- 0.06, N = 3)
1. (CC) gcc options: -lm -lpthread -O3
GnuPG 1.4.22 - 2GB File Encryption
Seconds, Fewer Is Better
  stock-linux-kernel: 11.19 (SE +/- 0.16, N = 3)
  xanmod-kernel-stock-v2: 11.24 (SE +/- 0.12, N = 15)
  xanmod-kernel-optimized: 11.31 (SE +/- 0.13, N = 15)
1. (CC) gcc options: -O2 -MT -MD -MP -MF
Timed PHP Compilation 7.4.2 - Time To Compile
Seconds, Fewer Is Better
  xanmod-kernel-optimized: 38.15 (SE +/- 0.11, N = 3)
  xanmod-kernel-stock-v2: 38.53 (SE +/- 0.04, N = 3)
  stock-linux-kernel: 39.10 (SE +/- 0.10, N = 3)
SQLite 3.30.1 - Threads / Copies: 1
Seconds, Fewer Is Better
  xanmod-kernel-optimized: 38.42 (SE +/- 0.27, N = 3)
  stock-linux-kernel: 39.24 (SE +/- 0.23, N = 3)
  xanmod-kernel-stock-v2: 40.31 (SE +/- 0.56, N = 3)
  xanmod-kernel-stock: 40.61 (SE +/- 0.57, N = 3)
1. (CC) gcc options: -O2 -lz -lm -ldl -lpthread
oneDNN 1.5 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  xanmod-kernel-optimized: 200.19 (SE +/- 0.29, N = 3; MIN: 193.59)
  xanmod-kernel-stock-v2: 205.19 (SE +/- 0.44, N = 3; MIN: 198.4)
  stock-linux-kernel: 207.25 (SE +/- 0.29, N = 3; MIN: 201.06)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 1.5 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  xanmod-kernel-optimized: 40.53 (SE +/- 0.30, N = 3; MIN: 38.63)
  xanmod-kernel-stock-v2: 41.53 (SE +/- 0.42, N = 3; MIN: 39.45)
  stock-linux-kernel: 42.51 (SE +/- 0.23, N = 3; MIN: 40.78)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
NAS Parallel Benchmarks 3.4 - Test / Class: FT.C
Total Mop/s, More Is Better
  xanmod-kernel-stock-v2: 12492.07 (SE +/- 24.79, N = 3)
  stock-linux-kernel: 12485.72 (SE +/- 41.50, N = 3)
  xanmod-kernel-optimized: 12085.26 (SE +/- 7.96, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent_core -levent_pthreads -lutil -lm -lrt -lz
2. Open MPI 4.0.4
7-Zip Compression 16.02 - Compress Speed Test
MIPS, More Is Better
  xanmod-kernel-optimized: 83170 (SE +/- 190.50, N = 2)
  xanmod-kernel-stock-v2: 81679 (SE +/- 584.03, N = 3)
  stock-linux-kernel: 81231 (SE +/- 703.40, N = 3)
1. (CXX) g++ options: -pipe -lpthread
Timed FFmpeg Compilation 4.2.2 - Time To Compile
Seconds, Fewer Is Better
  xanmod-kernel-optimized: 31.83 (SE +/- 0.17, N = 3)
  xanmod-kernel-stock-v2: 32.12 (SE +/- 0.12, N = 3)
  stock-linux-kernel: 32.49 (SE +/- 0.13, N = 3)
OpenSSL 1.1.1 - RSA 4096-bit Performance
Signs Per Second, More Is Better
  xanmod-kernel-optimized: 3595.2 (SE +/- 14.89, N = 3)
  xanmod-kernel-stock-v2: 3506.5 (SE +/- 3.23, N = 3)
  stock-linux-kernel: 3494.0 (SE +/- 6.31, N = 3)
1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
oneDNN 1.5 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  xanmod-kernel-optimized: 0.784978 (SE +/- 0.002024, N = 3; MIN: 0.71)
  xanmod-kernel-stock-v2: 0.803284 (SE +/- 0.001743, N = 3; MIN: 0.75)
  stock-linux-kernel: 0.805316 (SE +/- 0.001986, N = 3; MIN: 0.76)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 1.5 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  xanmod-kernel-optimized: 1.88292 (SE +/- 0.00095, N = 3; MIN: 1.79)
  stock-linux-kernel: 1.93922 (SE +/- 0.00229, N = 3; MIN: 1.87)
  xanmod-kernel-stock-v2: 1.94085 (SE +/- 0.00175, N = 3; MIN: 1.87)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
NAS Parallel Benchmarks 3.4 - Test / Class: EP.C
Total Mop/s, More Is Better
  xanmod-kernel-optimized: 823.86 (SE +/- 2.42, N = 3)
  xanmod-kernel-stock-v2: 788.37 (SE +/- 0.09, N = 3)
  stock-linux-kernel: 785.50 (SE +/- 0.25, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent_core -levent_pthreads -lutil -lm -lrt -lz
2. Open MPI 4.0.4
NAS Parallel Benchmarks 3.4 - Test / Class: MG.C
Total Mop/s, More Is Better
  xanmod-kernel-stock-v2: 16879.80 (SE +/- 3.96, N = 3)
  stock-linux-kernel: 16862.70 (SE +/- 3.42, N = 3)
  xanmod-kernel-optimized: 16422.81 (SE +/- 3.88, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent_core -levent_pthreads -lutil -lm -lrt -lz
2. Open MPI 4.0.4
rays1bench 2020-01-09 - Large Scene
mrays/s, More Is Better
  xanmod-kernel-optimized: 90.07 (SE +/- 0.20, N = 3)
  xanmod-kernel-stock-v2: 88.31 (SE +/- 0.04, N = 3)
  stock-linux-kernel: 88.05 (SE +/- 0.05, N = 3)
Darktable 3.0.2 - Test: Boat - Acceleration: CPU-only
Seconds, Fewer Is Better
  stock-linux-kernel: 8.401 (SE +/- 0.006, N = 3)
  xanmod-kernel-stock-v2: 8.402 (SE +/- 0.006, N = 3)
  xanmod-kernel-optimized: 8.545 (SE +/- 0.012, N = 3)
x265 3.1.2 - H.265 1080p Video Encoding
Frames Per Second, More Is Better
  xanmod-kernel-optimized: 70.97 (SE +/- 0.23, N = 3)
  xanmod-kernel-stock-v2: 68.90 (SE +/- 0.08, N = 3)
  stock-linux-kernel: 68.73 (SE +/- 0.14, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFTW 3.3.6 - Build: Float + SSE - Size: 1D FFT Size 4096
Mflops, More Is Better
  stock-linux-kernel: 57181 (SE +/- 677.30, N = 5)
  xanmod-kernel-optimized: 56232 (SE +/- 728.33, N = 3)
  xanmod-kernel-stock-v2: 55567 (SE +/- 917.95, N = 3)
1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
oneDNN 1.5 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  stock-linux-kernel: 13.15 (SE +/- 0.02, N = 3; MIN: 12.78)
  xanmod-kernel-stock-v2: 13.15 (SE +/- 0.01, N = 3; MIN: 12.72)
  xanmod-kernel-optimized: 13.53 (SE +/- 0.04, N = 3; MIN: 13.12)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 1.5 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  xanmod-kernel-stock-v2: 12.07 (SE +/- 0.00, N = 3; MIN: 11.48)
  stock-linux-kernel: 12.07 (SE +/- 0.01, N = 3; MIN: 11.66)
  xanmod-kernel-optimized: 12.52 (SE +/- 0.01, N = 3; MIN: 11.97)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Darktable 3.0.2 - Test: Masskrug - Acceleration: CPU-only
Seconds, Fewer Is Better
  xanmod-kernel-optimized: 4.441 (SE +/- 0.016, N = 3)
  stock-linux-kernel: 4.453 (SE +/- 0.020, N = 3)
  xanmod-kernel-stock-v2: 4.480 (SE +/- 0.017, N = 3)
FFmpeg 4.0.2 - H.264 HD To NTSC DV
Seconds, Fewer Is Better
  xanmod-kernel-optimized: 5.171 (SE +/- 0.001, N = 3)
  stock-linux-kernel: 5.289 (SE +/- 0.042, N = 3)
  xanmod-kernel-stock-v2: 5.402 (SE +/- 0.034, N = 3)
1. (CC) gcc options: -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -lm -lxcb -pthread -lbz2 -llzma -std=c11 -fomit-frame-pointer -fPIC -O3 -fno-math-errno -fno-signed-zeros -fno-tree-vectorize -MMD -MF -MT
Darktable 3.0.2 - Test: Server Room - Acceleration: CPU-only
Seconds, Fewer Is Better
  xanmod-kernel-optimized: 2.998 (SE +/- 0.004, N = 3)
  xanmod-kernel-stock-v2: 3.015 (SE +/- 0.006, N = 3)
  stock-linux-kernel: 3.035 (SE +/- 0.008, N = 3)
FFTW 3.3.6 - Build: Float + SSE - Size: 2D FFT Size 32
Mflops, More Is Better
  xanmod-kernel-optimized: 45825 (SE +/- 91.12, N = 3)
  xanmod-kernel-stock-v2: 45621 (SE +/- 78.67, N = 3)
  stock-linux-kernel: 45476 (SE +/- 77.66, N = 3)
1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6 - Build: Float + SSE - Size: 1D FFT Size 32
Mflops, More Is Better
  xanmod-kernel-stock-v2: 14968 (SE +/- 58.71, N = 3)
  xanmod-kernel-optimized: 14923 (SE +/- 216.56, N = 4)
  stock-linux-kernel: 14673 (SE +/- 150.35, N = 3)
1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
oneDNN 1.5 - Harness: Deconvolution Batch deconv_3d - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  xanmod-kernel-optimized: 3.47507 (SE +/- 0.00233, N = 3; MIN: 3.35)
  xanmod-kernel-stock-v2: 3.60538 (SE +/- 0.00029, N = 3; MIN: 3.52)
  stock-linux-kernel: 3.61101 (SE +/- 0.00224, N = 3; MIN: 3.52)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 1.5 - Harness: Deconvolution Batch deconv_3d - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  xanmod-kernel-optimized: 4.51250 (SE +/- 0.00986, N = 3; MIN: 4.29)
  xanmod-kernel-stock-v2: 4.66477 (SE +/- 0.00522, N = 3; MIN: 4.52)
  stock-linux-kernel: 4.68846 (SE +/- 0.00217, N = 3; MIN: 4.55)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Darktable 3.0.2 - Test: Server Rack - Acceleration: CPU-only
Seconds, Fewer Is Better
  xanmod-kernel-stock-v2: 0.138 (SE +/- 0.000, N = 3)
  stock-linux-kernel: 0.140 (SE +/- 0.002, N = 4)
  xanmod-kernel-optimized: 0.143 (SE +/- 0.002, N = 3)
Java SciMark 2.0 - Computational Test: Jacobi Successive Over-Relaxation
Mflops, More Is Better
  xanmod-kernel-optimized: 1975.61 (SE +/- 2.63, N = 4)
  stock-linux-kernel: 1972.21 (SE +/- 26.34, N = 4)
  xanmod-kernel-stock-v2: 1965.14 (SE +/- 31.69, N = 4)
Java SciMark 2.0 - Computational Test: Dense LU Matrix Factorization
Mflops, More Is Better
  xanmod-kernel-stock-v2: 6940.61 (SE +/- 80.90, N = 4)
  xanmod-kernel-optimized: 6892.47 (SE +/- 15.70, N = 4)
  stock-linux-kernel: 6768.17 (SE +/- 72.44, N = 4)
Java SciMark 2.0 - Computational Test: Sparse Matrix Multiply
Mflops, More Is Better
  stock-linux-kernel: 2720.27 (SE +/- 26.42, N = 4)
  xanmod-kernel-stock-v2: 2701.65 (SE +/- 37.81, N = 4)
  xanmod-kernel-optimized: 2700.25 (SE +/- 12.34, N = 4)
Java SciMark 2.0 - Computational Test: Fast Fourier Transform
Mflops, More Is Better
  xanmod-kernel-optimized: 2076.80 (SE +/- 3.49, N = 4)
  xanmod-kernel-stock-v2: 2067.65 (SE +/- 22.09, N = 4)
  stock-linux-kernel: 2030.91 (SE +/- 6.68, N = 4)
Java SciMark 2.0 - Computational Test: Monte Carlo
Mflops, More Is Better
  stock-linux-kernel: 1704.38 (SE +/- 20.11, N = 4)
  xanmod-kernel-optimized: 1701.68 (SE +/- 3.71, N = 4)
  xanmod-kernel-stock-v2: 1700.61 (SE +/- 19.26, N = 4)
Phoronix Test Suite v10.8.4