xanmod-kernel-v3

AMD Ryzen 9 3900X 12-Core testing with an MSI MPG X570 GAMING PLUS (MS-7C37) v2.0 (A.61 BIOS) and MSI NVIDIA GeForce GT 1030 on Debian stable-updates via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2008021-NE-XANMODKER88&grs .
Configurations: stock-linux-kernel, xanmod-kernel-stock, xanmod-kernel-stock-v2, xanmod-kernel-optimized, xanmod-kernel-optimized-blender

Processor: AMD Ryzen 9 3900X 12-Core (12 Cores / 24 Threads)
Motherboard: MSI MPG X570 GAMING PLUS (MS-7C37) v2.0 (A.61 BIOS)
Chipset: AMD Starship/Matisse
Memory: 4 x 16384 MB DDR4-3200MT/s CMK32GX4M2D3000C16
Disk: 1000GB Samsung SSD 970 EVO 1TB + 8002GB Western Digital WD80EMAZ-00W
Graphics: MSI NVIDIA GeForce GT 1030
Audio: NVIDIA GP108 HD Audio
Network: Realtek RTL8111/8168/8411
OS: Debian testing
Kernel: 5.7.6-050706-lowlatency (x86_64)
Display Server: X Server 1.20.8
Display Driver: modesetting 1.20.8
Compiler: GCC 9.3.0 + Clang 9.0.1-13 + LLVM 9.0.1
File-System: ext4

Later configurations additionally report:
Kernel: 5.7.10-xanmod1 (x86_64)
Memory: 4 x 16384 MB DDR4-3000MT/s CMK32GX4M2D3000C16
OS: Debian stable-updates

Environment Details
- stock-linux-kernel: RADV_PERFTEST=aco NVM_CD_FLAGS=
- xanmod-kernel-stock: RADV_PERFTEST=aco NVM_CD_FLAGS=
- xanmod-kernel-stock-v2: RADV_PERFTEST=aco NVM_CD_FLAGS=
- xanmod-kernel-optimized: RADV_PERFTEST=aco
- xanmod-kernel-optimized-blender: RADV_PERFTEST=aco

Compiler Details
- stock-linux-kernel, xanmod-kernel-stock, xanmod-kernel-stock-v2, xanmod-kernel-optimized: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-mutex --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-0xEOmg/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Disk Details
- stock-linux-kernel, xanmod-kernel-stock, xanmod-kernel-stock-v2, xanmod-kernel-optimized: NONE / errors=remount-ro,relatime,rw

Processor Details
- CPU Microcode: 0x8701013

Java Details
- stock-linux-kernel, xanmod-kernel-stock, xanmod-kernel-stock-v2, xanmod-kernel-optimized: OpenJDK Runtime Environment (build 11.0.7+10-post-Debian-3deb10u1)

Python Details
- stock-linux-kernel, xanmod-kernel-stock, xanmod-kernel-stock-v2, xanmod-kernel-optimized: Python 2.7.18 + Python 3.8.5

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected
Test | stock-linux-kernel | xanmod-kernel-stock-v2 | xanmod-kernel-optimized
wireguard: | 230.356 | 191.279 | 184.804
keydb: | 119368.92 | 123834.22 | 130220.58
fftw: Float + SSE - 2D FFT Size 4096 | 21291 | 20269 | 19876
sqlite: 1 | 39.236 | 40.313 | 38.417
onednn: Recurrent Neural Network Inference - f32 - CPU | 42.5089 | 41.5252 | 40.5261
npb: EP.C | 785.50 | 788.37 | 823.86
ffmpeg: H.264 HD To NTSC DV | 5.289 | 5.402 | 5.171
onednn: Deconvolution Batch deconv_3d - u8s8f32 - CPU | 3.61101 | 3.60538 | 3.47507
onednn: Deconvolution Batch deconv_3d - f32 - CPU | 4.68846 | 4.66477 | 4.51250
npb: EP.D | 782.03 | 787.40 | 812.44
onednn: Convolution Batch Shapes Auto - f32 - CPU | 12.0726 | 12.0672 | 12.5181
darktable: Server Rack - CPU-only | 0.140 | 0.138 | 0.143
onednn: Recurrent Neural Network Training - f32 - CPU | 207.249 | 205.187 | 200.186
ramspeed: Triad - Floating Point | 31043.71 | 30172.84 | 30019.67
onednn: IP Batch All - u8s8f32 - CPU | 24.6152 | 24.5144 | 23.8092
npb: FT.C | 12485.72 | 12492.07 | 12085.26
cachebench: Read / Modify / Write | 61924.050792619 | 63966.482233318 | 62834.856835413
x265: H.265 1080p Video Encoding | 68.73 | 68.90 | 70.97
ramspeed: Add - Integer | 31078.60 | 30294.84 | 31234.97
core-latency: Average Latency Between CPU Cores | 142.26 | 144.04 | 139.73
ramspeed: Scale - Integer | 27448.34 | 26628.92 | 26780.83
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 1.93922 | 1.94085 | 1.88292
ramspeed: Add - Floating Point | 31350.57 | 31894.48 | 30949.80
build-linux-kernel: Time To Compile | 47.366 | 46.484 | 46.003
fftw: Float + SSE - 1D FFT Size 4096 | 57181 | 55567 | 56232
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 13.1453 | 13.1457 | 13.5271
openssl: RSA 4096-bit Performance | 3494.0 | 3506.5 | 3595.2
ramspeed: Copy - Integer | 26675.04 | 27440.96 | 26885.94
hpcg: | 5.68363 | 5.69740 | 5.54178
npb: MG.C | 16862.70 | 16879.80 | 16422.81
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 0.805316 | 0.803284 | 0.784978
java-scimark2: Dense LU Matrix Factorization | 6768.17 | 6940.61 | 6892.47
ramspeed: Copy - Floating Point | 27445.78 | 27024.57 | 26768.43
build-php: Time To Compile | 39.101 | 38.528 | 38.145
compress-7zip: Compress Speed Test | 81231 | 81679 | 83170
rays1bench: Large Scene | 88.05 | 88.31 | 90.07
java-scimark2: Fast Fourier Transform | 2030.91 | 2067.65 | 2076.80
onednn: IP Batch All - f32 - CPU | 42.5231 | 42.4609 | 41.5866
c-ray: Total Time - 4K, 16 Rays Per Pixel | 42.935 | 42.747 | 42.044
build-ffmpeg: Time To Compile | 32.487 | 32.118 | 31.827
fftw: Float + SSE - 1D FFT Size 32 | 14673 | 14968 | 14923
darktable: Boat - CPU-only | 8.401 | 8.402 | 8.545
ramspeed: Scale - Floating Point | 26941.81 | 27390.70 | 27120.62
cachebench: Write | 32183.805464159 | 32004.956297381 | 32511.703295762
ramspeed: Average - Floating Point | 29306.84 | 28902.08 | 29355.41
ramspeed: Triad - Integer | 31334.76 | 31765.58 | 31510.40
npb: BT.C | 30353.28 | 30132.56 | 29948.81
darktable: Server Room - CPU-only | 3.035 | 3.015 | 2.998
npb: SP.B | 13080.84 | 13058.58 | 12922.63
java-scimark2: Composite | 3039.19 | 3075.13 | 3069.36
cachebench: Read | 3025.46 | 3014.52 | 3050.10
gnupg: 2GB File Encryption | 11.190 | 11.243 | 11.311
npb: LU.C | 33564.28 | 33654.86 | 33335.37
darktable: Masskrug - CPU-only | 4.453 | 4.480 | 4.441
fftw: Float + SSE - 2D FFT Size 32 | 45476 | 45621 | 45825
java-scimark2: Sparse Matrix Multiply | 2720.27 | 2701.65 | 2700.25
java-scimark2: Jacobi Successive Over-Relaxation | 1972.21 | 1965.14 | 1975.61
ramspeed: Average - Integer | 28792.55 | 28894.30 | 28801.17
sqlite: 128 | 2449.769 | 2456.356 | (no result)
java-scimark2: Monte Carlo | 1704.38 | 1700.61 | 1701.68
deepspeech: CPU | 56.36584 | 56.36727 | 56.25400
sqlite-speedtest: Timed Time - Size 1,000 | 70.115 | 67.550 | 67.829
mysqlslap: 1 | 253 | 259 | 275

xanmod-kernel-stock (single result)
sqlite: 1 | 40.608

xanmod-kernel-optimized-blender
blender: Pabellon Barcelona - CPU-Only | 373.42
blender: Barbershop - CPU-Only | 444.76
blender: Fishy Cat - CPU-Only | 163.16
blender: Classroom - CPU-Only | 296.71
blender: BMW27 - CPU-Only | 111.00
WireGuard + Linux Networking Stack Stress Test
Seconds, Fewer Is Better
stock-linux-kernel: 230.36 (SE +/- 0.93, N = 3)
xanmod-kernel-stock-v2: 191.28 (SE +/- 1.10, N = 3)
xanmod-kernel-optimized: 184.80 (SE +/- 0.86, N = 3)

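The WireGuard gap is easier to read as a percentage. The following is a minimal sketch using the three values reported above; the helper name `percent_improvement` is hypothetical and is not part of the Phoronix Test Suite.

```python
def percent_improvement(baseline, candidate, lower_is_better):
    """Return how much better `candidate` is than `baseline`, in percent.

    A positive result means the candidate is better; a negative one
    means it regressed relative to the baseline.
    """
    if lower_is_better:
        return (baseline - candidate) / baseline * 100.0
    return (candidate - baseline) / baseline * 100.0


# WireGuard + Linux Networking Stack Stress Test (Seconds, Fewer Is Better)
wireguard_seconds = {
    "stock-linux-kernel": 230.36,
    "xanmod-kernel-stock-v2": 191.28,
    "xanmod-kernel-optimized": 184.80,
}

baseline = wireguard_seconds["stock-linux-kernel"]
for name, seconds in wireguard_seconds.items():
    change = percent_improvement(baseline, seconds, lower_is_better=True)
    print(f"{name}: {change:+.2f}% vs stock-linux-kernel")
```

The same helper works for "More Is Better" results such as KeyDB by passing `lower_is_better=False`.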
KeyDB 5.3.1
Ops/sec, More Is Better
stock-linux-kernel: 119368.92 (SE +/- 1002.40, N = 15)
xanmod-kernel-stock-v2: 123834.22 (SE +/- 1088.54, N = 15)
xanmod-kernel-optimized: 130220.58 (SE +/- 1668.86, N = 3)
1. (CXX) g++ options: -O2 -levent -lpthread -lz -lpcre

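Because each result carries a standard error (SE), a rough way to judge whether a gap exceeds run-to-run noise is to divide the difference of two means by their combined standard error. The sketch below applies this to the KeyDB figures. It assumes the SE values are listed in the same order as the configurations, and it is only a heuristic, not a formal significance test, particularly with N as small as 3. The function name is hypothetical.

```python
import math


def difference_in_standard_errors(mean_a, se_a, mean_b, se_b):
    """Return |mean_a - mean_b| divided by the combined standard error."""
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


# KeyDB 5.3.1 (Ops/sec, More Is Better): (mean, standard error)
keydb_stock = (119368.92, 1002.40)       # stock-linux-kernel, N = 15
keydb_optimized = (130220.58, 1668.86)   # xanmod-kernel-optimized, N = 3

ratio = difference_in_standard_errors(*keydb_stock, *keydb_optimized)
print(f"KeyDB gap is {ratio:.1f} combined standard errors wide")
```

A ratio well above 2 suggests the gap is unlikely to be noise alone; a ratio near or below 1 suggests the two configurations are indistinguishable in that test.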
FFTW 3.3.6
Build: Float + SSE - Size: 2D FFT Size 4096
Mflops, More Is Better
stock-linux-kernel: 21291 (SE +/- 226.07, N = 3)
xanmod-kernel-stock-v2: 20269 (SE +/- 158.09, N = 3)
xanmod-kernel-optimized: 19876 (SE +/- 112.65, N = 3)
1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm

SQLite 3.30.1
Threads / Copies: 1
Seconds, Fewer Is Better
stock-linux-kernel: 39.24 (SE +/- 0.23, N = 3)
xanmod-kernel-stock: 40.61 (SE +/- 0.57, N = 3)
xanmod-kernel-stock-v2: 40.31 (SE +/- 0.56, N = 3)
xanmod-kernel-optimized: 38.42 (SE +/- 0.27, N = 3)
1. (CC) gcc options: -O2 -lz -lm -ldl -lpthread

oneDNN 1.5
Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
stock-linux-kernel: 42.51 (SE +/- 0.23, N = 3; MIN: 40.78)
xanmod-kernel-stock-v2: 41.53 (SE +/- 0.42, N = 3; MIN: 39.45)
xanmod-kernel-optimized: 40.53 (SE +/- 0.30, N = 3; MIN: 38.63)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

NAS Parallel Benchmarks 3.4
Test / Class: EP.C
Total Mop/s, More Is Better
stock-linux-kernel: 785.50 (SE +/- 0.25, N = 3)
xanmod-kernel-stock-v2: 788.37 (SE +/- 0.09, N = 3)
xanmod-kernel-optimized: 823.86 (SE +/- 2.42, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent_core -levent_pthreads -lutil -lm -lrt -lz
2. Open MPI 4.0.4

FFmpeg 4.0.2
H.264 HD To NTSC DV
Seconds, Fewer Is Better
stock-linux-kernel: 5.289 (SE +/- 0.042, N = 3)
xanmod-kernel-stock-v2: 5.402 (SE +/- 0.034, N = 3)
xanmod-kernel-optimized: 5.171 (SE +/- 0.001, N = 3)
1. (CC) gcc options: -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -lm -lxcb -pthread -lbz2 -llzma -std=c11 -fomit-frame-pointer -fPIC -O3 -fno-math-errno -fno-signed-zeros -fno-tree-vectorize -MMD -MF -MT

oneDNN 1.5
Harness: Deconvolution Batch deconv_3d - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
stock-linux-kernel: 3.61101 (SE +/- 0.00224, N = 3; MIN: 3.52)
xanmod-kernel-stock-v2: 3.60538 (SE +/- 0.00029, N = 3; MIN: 3.52)
xanmod-kernel-optimized: 3.47507 (SE +/- 0.00233, N = 3; MIN: 3.35)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 1.5
Harness: Deconvolution Batch deconv_3d - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
stock-linux-kernel: 4.68846 (SE +/- 0.00217, N = 3; MIN: 4.55)
xanmod-kernel-stock-v2: 4.66477 (SE +/- 0.00522, N = 3; MIN: 4.52)
xanmod-kernel-optimized: 4.51250 (SE +/- 0.00986, N = 3; MIN: 4.29)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

NAS Parallel Benchmarks 3.4
Test / Class: EP.D
Total Mop/s, More Is Better
stock-linux-kernel: 782.03 (SE +/- 0.12, N = 3)
xanmod-kernel-stock-v2: 787.40 (SE +/- 0.07, N = 3)
xanmod-kernel-optimized: 812.44 (SE +/- 0.21, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent_core -levent_pthreads -lutil -lm -lrt -lz
2. Open MPI 4.0.4

oneDNN 1.5
Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
stock-linux-kernel: 12.07 (SE +/- 0.01, N = 3; MIN: 11.66)
xanmod-kernel-stock-v2: 12.07 (SE +/- 0.00, N = 3; MIN: 11.48)
xanmod-kernel-optimized: 12.52 (SE +/- 0.01, N = 3; MIN: 11.97)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

Darktable 3.0.2
Test: Server Rack - Acceleration: CPU-only
Seconds, Fewer Is Better
stock-linux-kernel: 0.140 (SE +/- 0.002, N = 4)
xanmod-kernel-stock-v2: 0.138 (SE +/- 0.000, N = 3)
xanmod-kernel-optimized: 0.143 (SE +/- 0.002, N = 3)

oneDNN 1.5
Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
stock-linux-kernel: 207.25 (SE +/- 0.29, N = 3; MIN: 201.06)
xanmod-kernel-stock-v2: 205.19 (SE +/- 0.44, N = 3; MIN: 198.4)
xanmod-kernel-optimized: 200.19 (SE +/- 0.29, N = 3; MIN: 193.59)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

RAMspeed SMP 3.5.0
Type: Triad - Benchmark: Floating Point
MB/s, More Is Better
stock-linux-kernel: 31043.71 (SE +/- 461.90, N = 4)
xanmod-kernel-stock-v2: 30172.84 (SE +/- 7.15, N = 3)
xanmod-kernel-optimized: 30019.67 (SE +/- 2.05, N = 3)
1. (CC) gcc options: -O3 -march=native

oneDNN 1.5
Harness: IP Batch All - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
stock-linux-kernel: 24.62 (SE +/- 0.02, N = 3; MIN: 24.12)
xanmod-kernel-stock-v2: 24.51 (SE +/- 0.01, N = 3; MIN: 23.86)
xanmod-kernel-optimized: 23.81 (SE +/- 0.02, N = 3; MIN: 23.16)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

NAS Parallel Benchmarks 3.4
Test / Class: FT.C
Total Mop/s, More Is Better
stock-linux-kernel: 12485.72 (SE +/- 41.50, N = 3)
xanmod-kernel-stock-v2: 12492.07 (SE +/- 24.79, N = 3)
xanmod-kernel-optimized: 12085.26 (SE +/- 7.96, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent_core -levent_pthreads -lutil -lm -lrt -lz
2. Open MPI 4.0.4

CacheBench
Test: Read / Modify / Write
MB/s, More Is Better
stock-linux-kernel: 61924.05 (SE +/- 273.27, N = 3; MIN: 54980.84 / MAX: 66137.05)
xanmod-kernel-stock-v2: 63966.48 (SE +/- 755.52, N = 3; MIN: 55607.63 / MAX: 68405.98)
xanmod-kernel-optimized: 62834.86 (SE +/- 14.57, N = 3; MIN: 55279.71 / MAX: 66668.02)
1. (CC) gcc options: -lrt

x265 3.1.2
H.265 1080p Video Encoding
Frames Per Second, More Is Better
stock-linux-kernel: 68.73 (SE +/- 0.14, N = 3)
xanmod-kernel-stock-v2: 68.90 (SE +/- 0.08, N = 3)
xanmod-kernel-optimized: 70.97 (SE +/- 0.23, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

RAMspeed SMP 3.5.0
Type: Add - Benchmark: Integer
MB/s, More Is Better
stock-linux-kernel: 31078.60 (SE +/- 414.95, N = 5)
xanmod-kernel-stock-v2: 30294.84 (SE +/- 21.34, N = 3)
xanmod-kernel-optimized: 31234.97 (SE +/- 468.24, N = 3)
1. (CC) gcc options: -O3 -march=native

Core-Latency
Average Latency Between CPU Cores
ns, Fewer Is Better
stock-linux-kernel: 142.26 (MIN: 42.6 / MAX: 169.15)
xanmod-kernel-stock-v2: 144.04 (MIN: 42.79 / MAX: 172.06)
xanmod-kernel-optimized: 139.73 (MIN: 42.74 / MAX: 166.14)
1. (CXX) g++ options: -std=c++11 -pthread -O3

RAMspeed SMP 3.5.0
Type: Scale - Benchmark: Integer
MB/s, More Is Better
stock-linux-kernel: 27448.34 (SE +/- 342.02, N = 3)
xanmod-kernel-stock-v2: 26628.92 (SE +/- 3.83, N = 3)
xanmod-kernel-optimized: 26780.83 (SE +/- 182.76, N = 3)
1. (CC) gcc options: -O3 -march=native

oneDNN 1.5
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
stock-linux-kernel: 1.93922 (SE +/- 0.00229, N = 3; MIN: 1.87)
xanmod-kernel-stock-v2: 1.94085 (SE +/- 0.00175, N = 3; MIN: 1.87)
xanmod-kernel-optimized: 1.88292 (SE +/- 0.00095, N = 3; MIN: 1.79)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

RAMspeed SMP 3.5.0
Type: Add - Benchmark: Floating Point
MB/s, More Is Better
stock-linux-kernel: 31350.57 (SE +/- 447.67, N = 4)
xanmod-kernel-stock-v2: 31894.48 (SE +/- 64.13, N = 3)
xanmod-kernel-optimized: 30949.80 (SE +/- 453.79, N = 4)
1. (CC) gcc options: -O3 -march=native

Timed Linux Kernel Compilation 5.4
Time To Compile
Seconds, Fewer Is Better
stock-linux-kernel: 47.37
xanmod-kernel-stock-v2: 46.48
xanmod-kernel-optimized: 46.00
Standard errors as listed (only two are given for the three results): SE +/- 0.10, N = 3; SE +/- 0.36, N = 3

FFTW 3.3.6
Build: Float + SSE - Size: 1D FFT Size 4096
Mflops, More Is Better
stock-linux-kernel: 57181 (SE +/- 677.30, N = 5)
xanmod-kernel-stock-v2: 55567 (SE +/- 917.95, N = 3)
xanmod-kernel-optimized: 56232 (SE +/- 728.33, N = 3)
1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm

oneDNN 1.5
Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
stock-linux-kernel: 13.15 (SE +/- 0.02, N = 3; MIN: 12.78)
xanmod-kernel-stock-v2: 13.15 (SE +/- 0.01, N = 3; MIN: 12.72)
xanmod-kernel-optimized: 13.53 (SE +/- 0.04, N = 3; MIN: 13.12)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

OpenSSL 1.1.1
RSA 4096-bit Performance
Signs Per Second, More Is Better
stock-linux-kernel: 3494.0 (SE +/- 6.31, N = 3)
xanmod-kernel-stock-v2: 3506.5 (SE +/- 3.23, N = 3)
xanmod-kernel-optimized: 3595.2 (SE +/- 14.89, N = 3)
1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

RAMspeed SMP 3.5.0
Type: Copy - Benchmark: Integer
MB/s, More Is Better
stock-linux-kernel: 26675.04 (SE +/- 8.98, N = 3)
xanmod-kernel-stock-v2: 27440.96 (SE +/- 408.95, N = 3)
xanmod-kernel-optimized: 26885.94 (SE +/- 337.97, N = 3)
1. (CC) gcc options: -O3 -march=native

High Performance Conjugate Gradient 3.1
GFLOP/s, More Is Better
stock-linux-kernel: 5.68363 (SE +/- 0.01502, N = 3)
xanmod-kernel-stock-v2: 5.69740 (SE +/- 0.00163, N = 3)
xanmod-kernel-optimized: 5.54178 (SE +/- 0.00226, N = 3)
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -pthread -lmpi_cxx -lmpi

NAS Parallel Benchmarks 3.4
Test / Class: MG.C
Total Mop/s, More Is Better
stock-linux-kernel: 16862.70 (SE +/- 3.42, N = 3)
xanmod-kernel-stock-v2: 16879.80 (SE +/- 3.96, N = 3)
xanmod-kernel-optimized: 16422.81 (SE +/- 3.88, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent_core -levent_pthreads -lutil -lm -lrt -lz
2. Open MPI 4.0.4

oneDNN 1.5
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
stock-linux-kernel: 0.805316 (SE +/- 0.001986, N = 3; MIN: 0.76)
xanmod-kernel-stock-v2: 0.803284 (SE +/- 0.001743, N = 3; MIN: 0.75)
xanmod-kernel-optimized: 0.784978 (SE +/- 0.002024, N = 3; MIN: 0.71)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

Java SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops, More Is Better
stock-linux-kernel: 6768.17 (SE +/- 72.44, N = 4)
xanmod-kernel-stock-v2: 6940.61 (SE +/- 80.90, N = 4)
xanmod-kernel-optimized: 6892.47 (SE +/- 15.70, N = 4)

RAMspeed SMP 3.5.0
Type: Copy - Benchmark: Floating Point
MB/s, More Is Better
stock-linux-kernel: 27445.78 (SE +/- 392.11, N = 3)
xanmod-kernel-stock-v2: 27024.57 (SE +/- 405.20, N = 3)
xanmod-kernel-optimized: 26768.43 (SE +/- 368.45, N = 3)
1. (CC) gcc options: -O3 -march=native

Timed PHP Compilation 7.4.2
Time To Compile
Seconds, Fewer Is Better
stock-linux-kernel: 39.10 (SE +/- 0.10, N = 3)
xanmod-kernel-stock-v2: 38.53 (SE +/- 0.04, N = 3)
xanmod-kernel-optimized: 38.15 (SE +/- 0.11, N = 3)

7-Zip Compression 16.02
Compress Speed Test
MIPS, More Is Better
stock-linux-kernel: 81231 (SE +/- 703.40, N = 3)
xanmod-kernel-stock-v2: 81679 (SE +/- 584.03, N = 3)
xanmod-kernel-optimized: 83170 (SE +/- 190.50, N = 2)
1. (CXX) g++ options: -pipe -lpthread

rays1bench 2020-01-09
Large Scene
mrays/s, More Is Better
stock-linux-kernel: 88.05 (SE +/- 0.05, N = 3)
xanmod-kernel-stock-v2: 88.31 (SE +/- 0.04, N = 3)
xanmod-kernel-optimized: 90.07 (SE +/- 0.20, N = 3)

Java SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops, More Is Better
stock-linux-kernel: 2030.91 (SE +/- 6.68, N = 4)
xanmod-kernel-stock-v2: 2067.65 (SE +/- 22.09, N = 4)
xanmod-kernel-optimized: 2076.80 (SE +/- 3.49, N = 4)

oneDNN 1.5
Harness: IP Batch All - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
stock-linux-kernel: 42.52 (SE +/- 0.02, N = 3; MIN: 41.57)
xanmod-kernel-stock-v2: 42.46 (SE +/- 0.02, N = 3; MIN: 41.45)
xanmod-kernel-optimized: 41.59 (SE +/- 0.02, N = 3; MIN: 40.37)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

C-Ray 1.1
Total Time - 4K, 16 Rays Per Pixel
Seconds, Fewer Is Better
stock-linux-kernel: 42.94 (SE +/- 0.06, N = 3)
xanmod-kernel-stock-v2: 42.75 (SE +/- 0.05, N = 3)
xanmod-kernel-optimized: 42.04 (SE +/- 0.06, N = 3)
1. (CC) gcc options: -lm -lpthread -O3

Timed FFmpeg Compilation 4.2.2
Time To Compile
Seconds, Fewer Is Better
stock-linux-kernel: 32.49 (SE +/- 0.13, N = 3)
xanmod-kernel-stock-v2: 32.12 (SE +/- 0.12, N = 3)
xanmod-kernel-optimized: 31.83 (SE +/- 0.17, N = 3)

FFTW 3.3.6
Build: Float + SSE - Size: 1D FFT Size 32
Mflops, More Is Better
stock-linux-kernel: 14673 (SE +/- 150.35, N = 3)
xanmod-kernel-stock-v2: 14968 (SE +/- 58.71, N = 3)
xanmod-kernel-optimized: 14923 (SE +/- 216.56, N = 4)
1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm

Darktable 3.0.2
Test: Boat - Acceleration: CPU-only
Seconds, Fewer Is Better
stock-linux-kernel: 8.401 (SE +/- 0.006, N = 3)
xanmod-kernel-stock-v2: 8.402 (SE +/- 0.006, N = 3)
xanmod-kernel-optimized: 8.545 (SE +/- 0.012, N = 3)

RAMspeed SMP 3.5.0
Type: Scale - Benchmark: Floating Point
MB/s, More Is Better
stock-linux-kernel: 26941.81 (SE +/- 27.11, N = 3)
xanmod-kernel-stock-v2: 27390.70 (SE +/- 405.62, N = 3)
xanmod-kernel-optimized: 27120.62 (SE +/- 351.33, N = 3)
1. (CC) gcc options: -O3 -march=native

CacheBench
Test: Write
MB/s, More Is Better
stock-linux-kernel: 32183.81 (SE +/- 426.99, N = 3; MIN: 27584.78 / MAX: 34495.84)
xanmod-kernel-stock-v2: 32004.96 (SE +/- 348.21, N = 3; MIN: 27420.12 / MAX: 34639.58)
xanmod-kernel-optimized: 32511.70 (SE +/- 384.10, N = 3; MIN: 27980.81 / MAX: 34688.08)
1. (CC) gcc options: -lrt

RAMspeed SMP 3.5.0
Type: Average - Benchmark: Floating Point
MB/s, More Is Better
stock-linux-kernel: 29306.84 (SE +/- 477.47, N = 3)
xanmod-kernel-stock-v2: 28902.08 (SE +/- 493.83, N = 3)
xanmod-kernel-optimized: 29355.41 (SE +/- 78.82, N = 3)
1. (CC) gcc options: -O3 -march=native

RAMspeed SMP 3.5.0
Type: Triad - Benchmark: Integer
MB/s, More Is Better
stock-linux-kernel: 31334.76 (SE +/- 460.40, N = 4)
xanmod-kernel-stock-v2: 31765.58 (SE +/- 382.19, N = 5)
xanmod-kernel-optimized: 31510.40 (SE +/- 439.89, N = 4)
1. (CC) gcc options: -O3 -march=native

NAS Parallel Benchmarks 3.4
Test / Class: BT.C
Total Mop/s, More Is Better
stock-linux-kernel: 30353.28 (SE +/- 172.89, N = 3)
xanmod-kernel-stock-v2: 30132.56 (SE +/- 178.89, N = 3)
xanmod-kernel-optimized: 29948.81 (SE +/- 55.35, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent_core -levent_pthreads -lutil -lm -lrt -lz
2. Open MPI 4.0.4

Darktable 3.0.2
Test: Server Room - Acceleration: CPU-only
Seconds, Fewer Is Better
stock-linux-kernel: 3.035 (SE +/- 0.008, N = 3)
xanmod-kernel-stock-v2: 3.015 (SE +/- 0.006, N = 3)
xanmod-kernel-optimized: 2.998 (SE +/- 0.004, N = 3)

NAS Parallel Benchmarks 3.4
Test / Class: SP.B
Total Mop/s, More Is Better
stock-linux-kernel: 13080.84 (SE +/- 151.16, N = 3)
xanmod-kernel-stock-v2: 13058.58 (SE +/- 102.39, N = 15)
xanmod-kernel-optimized: 12922.63 (SE +/- 89.96, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent_core -levent_pthreads -lutil -lm -lrt -lz
2. Open MPI 4.0.4

Java SciMark 2.0
Computational Test: Composite
Mflops, More Is Better
stock-linux-kernel: 3039.19 (SE +/- 29.80, N = 4)
xanmod-kernel-stock-v2: 3075.13 (SE +/- 34.71, N = 4)
xanmod-kernel-optimized: 3069.36 (SE +/- 7.44, N = 4)

CacheBench
Test: Read
MB/s, More Is Better
stock-linux-kernel: 3025.46 (SE +/- 8.65, N = 3; MIN: 3006.36 / MAX: 3036.42)
xanmod-kernel-stock-v2: 3014.52 (SE +/- 7.16, N = 3; MIN: 2973.27 / MAX: 3043.12)
xanmod-kernel-optimized: 3050.10 (SE +/- 8.27, N = 3; MIN: 3027.65 / MAX: 3065.82)
1. (CC) gcc options: -lrt

GnuPG 1.4.22
2GB File Encryption
Seconds, Fewer Is Better
stock-linux-kernel: 11.19 (SE +/- 0.16, N = 3)
xanmod-kernel-stock-v2: 11.24 (SE +/- 0.12, N = 15)
xanmod-kernel-optimized: 11.31 (SE +/- 0.13, N = 15)
1. (CC) gcc options: -O2 -MT -MD -MP -MF

NAS Parallel Benchmarks 3.4
Test / Class: LU.C
Total Mop/s, More Is Better
stock-linux-kernel: 33564.28 (SE +/- 5.41, N = 3)
xanmod-kernel-stock-v2: 33654.86 (SE +/- 19.64, N = 3)
xanmod-kernel-optimized: 33335.37 (SE +/- 31.44, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent_core -levent_pthreads -lutil -lm -lrt -lz
2. Open MPI 4.0.4

Darktable 3.0.2
Test: Masskrug - Acceleration: CPU-only
Seconds, Fewer Is Better
stock-linux-kernel: 4.453 (SE +/- 0.020, N = 3)
xanmod-kernel-stock-v2: 4.480 (SE +/- 0.017, N = 3)
xanmod-kernel-optimized: 4.441 (SE +/- 0.016, N = 3)

FFTW 3.3.6
Build: Float + SSE - Size: 2D FFT Size 32
Mflops, More Is Better
stock-linux-kernel: 45476 (SE +/- 77.66, N = 3)
xanmod-kernel-stock-v2: 45621 (SE +/- 78.67, N = 3)
xanmod-kernel-optimized: 45825 (SE +/- 91.12, N = 3)
1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm

Java SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops, More Is Better
stock-linux-kernel: 2720.27 (SE +/- 26.42, N = 4)
xanmod-kernel-stock-v2: 2701.65 (SE +/- 37.81, N = 4)
xanmod-kernel-optimized: 2700.25 (SE +/- 12.34, N = 4)

Java SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops, More Is Better
stock-linux-kernel: 1972.21 (SE +/- 26.34, N = 4)
xanmod-kernel-stock-v2: 1965.14 (SE +/- 31.69, N = 4)
xanmod-kernel-optimized: 1975.61 (SE +/- 2.63, N = 4)

RAMspeed SMP 3.5.0
Type: Average - Benchmark: Integer
MB/s, More Is Better
stock-linux-kernel: 28792.55 (SE +/- 426.05, N = 3)
xanmod-kernel-stock-v2: 28894.30 (SE +/- 465.21, N = 3)
xanmod-kernel-optimized: 28801.17 (SE +/- 460.72, N = 3)
1. (CC) gcc options: -O3 -march=native

SQLite 3.30.1
Threads / Copies: 128
Seconds, Fewer Is Better
stock-linux-kernel: 2449.77 (SE +/- 2.38, N = 3)
xanmod-kernel-stock-v2: 2456.36 (SE +/- 10.49, N = 3)
1. (CC) gcc options: -O2 -lz -lm -ldl -lpthread

Java SciMark 2.0
Computational Test: Monte Carlo
Mflops, More Is Better
stock-linux-kernel: 1704.38 (SE +/- 20.11, N = 4)
xanmod-kernel-stock-v2: 1700.61 (SE +/- 19.26, N = 4)
xanmod-kernel-optimized: 1701.68 (SE +/- 3.71, N = 4)

DeepSpeech 0.6
Acceleration: CPU
Seconds, Fewer Is Better
stock-linux-kernel: 56.37 (SE +/- 0.04, N = 3)
xanmod-kernel-stock-v2: 56.37 (SE +/- 0.25, N = 3)
xanmod-kernel-optimized: 56.25 (SE +/- 0.16, N = 3)

Blender 2.82
Blend File: Pabellon Barcelona - Compute: CPU-Only
Seconds, Fewer Is Better
xanmod-kernel-optimized-blender: 373.42 (SE +/- 0.81, N = 3)

Blender 2.82
Blend File: Barbershop - Compute: CPU-Only
Seconds, Fewer Is Better
xanmod-kernel-optimized-blender: 444.76 (SE +/- 0.49, N = 3)

Blender 2.82
Blend File: Fishy Cat - Compute: CPU-Only
Seconds, Fewer Is Better
xanmod-kernel-optimized-blender: 163.16 (SE +/- 0.32, N = 3)

Blender 2.82
Blend File: Classroom - Compute: CPU-Only
Seconds, Fewer Is Better
xanmod-kernel-optimized-blender: 296.71 (SE +/- 0.13, N = 3)

Blender 2.82
Blend File: BMW27 - Compute: CPU-Only
Seconds, Fewer Is Better
xanmod-kernel-optimized-blender: 111.00 (SE +/- 0.33, N = 3)

SQLite Speedtest 3.30
Timed Time - Size 1,000
Seconds, Fewer Is Better
stock-linux-kernel: 70.12 (SE +/- 1.15, N = 15)
xanmod-kernel-stock-v2: 67.55 (SE +/- 0.95, N = 15)
xanmod-kernel-optimized: 67.83 (SE +/- 1.20, N = 15)
1. (CC) gcc options: -O2 -ldl -lz -lpthread

MariaDB 10.5.2
Clients: 1
Queries Per Second, More Is Better
stock-linux-kernel: 253 (SE +/- 5.99, N = 9)
xanmod-kernel-stock-v2: 259 (SE +/- 4.19, N = 9)
xanmod-kernel-optimized: 275 (SE +/- 4.27, N = 9)
1. (CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -llz4 -llzma -lbz2 -laio -lnuma -lpcre2-8 -lcrypt -lz -lm -lssl -lcrypto -ldl

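Because the tests use different units and directions, a cross-test summary is usually expressed as a geometric mean of per-test speedup ratios. The sketch below does this for xanmod-kernel-optimized against stock-linux-kernel. The five tests are an arbitrary illustrative subset of the results in this export, not an official OpenBenchmarking.org summary, and the `speedup` helper is hypothetical.

```python
import math

# (stock-linux-kernel, xanmod-kernel-optimized, lower_is_better)
# Values are taken from the per-test results in this export.
selected_results = {
    "WireGuard (Seconds)": (230.36, 184.80, True),
    "KeyDB (Ops/sec)": (119368.92, 130220.58, False),
    "FFTW 2D FFT Size 4096 (Mflops)": (21291, 19876, False),
    "NPB EP.C (Total Mop/s)": (785.50, 823.86, False),
    "x265 (Frames Per Second)": (68.73, 70.97, False),
}


def speedup(stock, optimized, lower_is_better):
    """Return a ratio where values above 1.0 favour the optimized kernel."""
    return stock / optimized if lower_is_better else optimized / stock


ratios = [speedup(*values) for values in selected_results.values()]

# Geometric mean computed through logarithms to avoid overflow on long lists.
geometric_mean = math.exp(sum(math.log(r) for r in ratios) / len(ratios))

for name, ratio in zip(selected_results, ratios):
    print(f"{name}: {ratio:.3f}x")
print(f"Geometric mean over {len(ratios)} tests: {geometric_mean:.3f}x")
```

Extending `selected_results` to every test shared by both configurations would give a fuller picture; the subset here mixes wins and one regression (FFTW) only to show that the ratio handles both directions.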
Phoronix Test Suite v10.8.4