xanmod-kernel-v3
AMD Ryzen 9 3900X 12-Core testing with an MSI MPG X570 GAMING PLUS (MS-7C37) v2.0 (A.61 BIOS) and MSI NVIDIA GeForce GT 1030 on Debian stable-updates via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2008138-NE-XANMODKER25&grr&sor .
xanmod-kernel-v3 system details
Configurations: stock-linux-kernel, xanmod-kernel-stock, xanmod-kernel-stock-v2, xanmod-kernel-optimized, xanmod-kernel-optimized-blender, xanmod-kernel-optimized-5.8.1
  Processor: AMD Ryzen 9 3900X 12-Core (12 Cores / 24 Threads)
  Motherboard: MSI MPG X570 GAMING PLUS (MS-7C37) v2.0 (A.61 BIOS)
  Chipset: AMD Starship/Matisse
  Memory: 4 x 16384 MB DDR4-3200MT/s CMK32GX4M2D3000C16
  Disk: 1000GB Samsung SSD 970 EVO 1TB + 8002GB Western Digital WD80EMAZ-00W
  Graphics: MSI NVIDIA GeForce GT 1030
  Audio: NVIDIA GP108 HD Audio
  Network: Realtek RTL8111/8168/8411
  OS: Debian testing
  Kernel: 5.7.6-050706-lowlatency (x86_64)
  Display Server: X Server 1.20.8
  Display Driver: modesetting 1.20.8
  Compiler: GCC 9.3.0 + Clang 9.0.1-13 + LLVM 9.0.1
  File-System: ext4
Values that differ in some configurations:
  Kernel: 5.7.10-xanmod1 (x86_64); 5.8.1-xanmod1 (x86_64)
  Memory: 4 x 16384 MB DDR4-3000MT/s CMK32GX4M2D3000C16
  OS: Debian stable-updates
Environment Details
  stock-linux-kernel: RADV_PERFTEST=aco NVM_CD_FLAGS=
  xanmod-kernel-stock: RADV_PERFTEST=aco NVM_CD_FLAGS=
  xanmod-kernel-stock-v2: RADV_PERFTEST=aco NVM_CD_FLAGS=
  xanmod-kernel-optimized: RADV_PERFTEST=aco
  xanmod-kernel-optimized-blender: RADV_PERFTEST=aco
  xanmod-kernel-optimized-5.8.1: RADV_PERFTEST=aco
Compiler Details
  stock-linux-kernel, xanmod-kernel-stock, xanmod-kernel-stock-v2, xanmod-kernel-optimized, xanmod-kernel-optimized-5.8.1: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-mutex --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-0xEOmg/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Disk Details
  stock-linux-kernel, xanmod-kernel-stock, xanmod-kernel-stock-v2, xanmod-kernel-optimized, xanmod-kernel-optimized-5.8.1: NONE / errors=remount-ro,relatime,rw
Processor Details
  CPU Microcode: 0x8701013
Java Details
  stock-linux-kernel: OpenJDK Runtime Environment (build 11.0.7+10-post-Debian-3deb10u1)
  xanmod-kernel-stock: OpenJDK Runtime Environment (build 11.0.7+10-post-Debian-3deb10u1)
  xanmod-kernel-stock-v2: OpenJDK Runtime Environment (build 11.0.7+10-post-Debian-3deb10u1)
  xanmod-kernel-optimized: OpenJDK Runtime Environment (build 11.0.7+10-post-Debian-3deb10u1)
  xanmod-kernel-optimized-5.8.1: OpenJDK Runtime Environment (build 11.0.8+10-post-Debian-1)
Python Details
  stock-linux-kernel, xanmod-kernel-stock, xanmod-kernel-stock-v2, xanmod-kernel-optimized, xanmod-kernel-optimized-5.8.1: Python 2.7.18 + Python 3.8.5
Security Details
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling
  srbds: Not affected
  tsx_async_abort: Not affected
xanmod-kernel-v3 results overview
Tests, in column order: sqlite: 128; mysqlslap: 1; blender: Barbershop - CPU-Only; blender: Pabellon Barcelona - CPU-Only; fftw: Float + SSE - 2D FFT Size 4096; blender: Classroom - CPU-Only; sqlite-speedtest: Timed Time - Size 1,000; wireguard; core-latency: Average Latency Between CPU Cores; keydb; hpcg; npb: EP.D; blender: Fishy Cat - CPU-Only; cachebench: Read / Modify / Write; cachebench: Write; cachebench: Read; blender: BMW27 - CPU-Only; npb: BT.C; ramspeed: Triad - Integer; ramspeed: Add - Integer; ramspeed: Add - Floating Point; ramspeed: Triad - Floating Point; npb: LU.C; ramspeed: Average - Integer; ramspeed: Scale - Integer; ramspeed: Copy - Floating Point; ramspeed: Average - Floating Point; ramspeed: Copy - Integer; ramspeed: Scale - Floating Point; onednn: IP Batch All - u8s8f32 - CPU; onednn: IP Batch All - f32 - CPU; npb: SP.B; build-linux-kernel: Time To Compile; java-scimark2: Composite; deepspeech: CPU; c-ray: Total Time - 4K, 16 Rays Per Pixel; build-php: Time To Compile; sqlite: 1; onednn: Recurrent Neural Network Training - f32 - CPU; onednn: Recurrent Neural Network Inference - f32 - CPU; compress-gzip: Linux Source Tree Archiving To .tar.gz; npb: FT.C; gnupg: 2GB File Encryption; compress-7zip: Compress Speed Test; build-ffmpeg: Time To Compile; openssl: RSA 4096-bit Performance; onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU; onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU; npb: EP.C; npb: MG.C; rays1bench: Large Scene; darktable: Boat - CPU-only; x265: H.265 1080p Video Encoding; fftw: Float + SSE - 1D FFT Size 4096; onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU; onednn: Convolution Batch Shapes Auto - f32 - CPU; onednn: Deconvolution Batch deconv_3d - f32 - CPU; darktable: Masskrug - CPU-only; ffmpeg: H.264 HD To NTSC DV; darktable: Server Room - CPU-only; fftw: Float + SSE - 2D FFT Size 32; fftw: Float + SSE - 1D FFT Size 32; onednn: Deconvolution Batch deconv_3d - u8s8f32 - CPU; darktable: Server Rack - CPU-only; java-scimark2: Jacobi Successive Over-Relaxation; java-scimark2: Dense LU Matrix Factorization; java-scimark2: Sparse Matrix Multiply; java-scimark2: Fast Fourier Transform; java-scimark2: Monte Carlo
Each row below lists values only for the tests that configuration ran, in column order.
stock-linux-kernel (no blender or compress-gzip results): 2449.769 253 21291 70.115 230.356 142.26 119368.92 5.68363 782.03 61924.050792619 32183.805464159 3025.46 30353.28 31334.76 31078.60 31350.57 31043.71 33564.28 28792.55 27448.34 27445.78 29306.84 26675.04 26941.81 24.6152 42.5231 13080.84 47.366 3039.19 56.36584 42.935 39.101 39.236 207.249 42.5089 12485.72 11.190 81231 32.487 3494.0 0.805316 1.93922 785.50 16862.70 88.05 8.401 68.73 57181 13.1453 12.0726 4.68846 4.453 5.289 3.035 45476 14673 3.61101 0.140 1972.21 6768.17 2720.27 2030.91 1704.38
xanmod-kernel-stock (sqlite: 1 only): 40.608
xanmod-kernel-stock-v2 (no blender or compress-gzip results): 2456.356 259 20269 67.550 191.279 144.04 123834.22 5.69740 787.40 63966.482233318 32004.956297381 3014.52 30132.56 31765.58 30294.84 31894.48 30172.84 33654.86 28894.30 26628.92 27024.57 28902.08 27440.96 27390.70 24.5144 42.4609 13058.58 46.484 3075.13 56.36727 42.747 38.528 40.313 205.187 41.5252 12492.07 11.243 81679 32.118 3506.5 0.803284 1.94085 788.37 16879.80 88.31 8.402 68.90 55567 13.1457 12.0672 4.66477 4.480 5.402 3.015 45621 14968 3.60538 0.138 1965.14 6940.61 2701.65 2067.65 1700.61
xanmod-kernel-optimized (no sqlite: 128, blender, or compress-gzip results): 275 19876 67.829 184.804 139.73 130220.58 5.54178 812.44 62834.856835413 32511.703295762 3050.10 29948.81 31510.40 31234.97 30949.80 30019.67 33335.37 28801.17 26780.83 26768.43 29355.41 26885.94 27120.62 23.8092 41.5866 12922.63 46.003 3069.36 56.25400 42.044 38.145 38.417 200.186 40.5261 12085.26 11.311 83170 31.827 3595.2 0.784978 1.88292 823.86 16422.81 90.07 8.545 70.97 56232 13.5271 12.5181 4.51250 4.441 5.171 2.998 45825 14923 3.47507 0.143 1975.61 6892.47 2700.25 2076.80 1701.68
xanmod-kernel-optimized-blender (blender tests only): 444.76 373.42 296.71 163.16 111.00
xanmod-kernel-optimized-5.8.1 (no sqlite: 128 result): 251 438.20 370.01 20066 293.04 60.073 272.875 145.34 104557.03 5.32718 815.59 160.30 62931.304859746 32095.995467714 3055.11 109.40 28987.88 28920.85 30110.54 29505.18 29725.39 32544.33 27473.60 26100.68 26056.41 27359.97 26185.05 26050.56 24.1808 43.0229 12493.97 46.130 3112.83 58.40887 42.141 38.484 39.334 207.774 41.8483 35.926 11733.26 11.264 83181 31.877 3586.2 0.807644 1.93985 822.27 15800.53 90.17 8.908 70.40 56545 13.9714 12.9836 4.75770 4.586 5.114 3.112 45844 15293 3.57059 0.147 1998.82 7009.79 2739.93 2099.41 1716.20
SQLite 3.30.1 - Threads / Copies: 128 (Seconds, Fewer Is Better)
  stock-linux-kernel: 2449.77 (SE +/- 2.38, N = 3)
  xanmod-kernel-stock-v2: 2456.36 (SE +/- 10.49, N = 3)
  1. (CC) gcc options: -O2 -lz -lm -ldl -lpthread
MariaDB 10.5.2 - Clients: 1 (Queries Per Second, More Is Better)
  xanmod-kernel-optimized: 275 (SE +/- 4.27, N = 9)
  xanmod-kernel-stock-v2: 259 (SE +/- 4.19, N = 9)
  stock-linux-kernel: 253 (SE +/- 5.99, N = 9)
  xanmod-kernel-optimized-5.8.1: 251 (SE +/- 3.04, N = 3)
  1. (CXX) g++ options: -pie -fPIC -fstack-protector -O2 -lpthread -llz4 -llzma -lbz2 -laio -lnuma -lpcre2-8 -lcrypt -lz -lm -lssl -lcrypto -ldl
Blender 2.82 - Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better)
  xanmod-kernel-optimized-5.8.1: 438.20 (SE +/- 0.27, N = 3)
  xanmod-kernel-optimized-blender: 444.76 (SE +/- 0.49, N = 3)
Blender 2.82 - Blend File: Pabellon Barcelona - Compute: CPU-Only (Seconds, Fewer Is Better)
  xanmod-kernel-optimized-5.8.1: 370.01 (SE +/- 0.55, N = 3)
  xanmod-kernel-optimized-blender: 373.42 (SE +/- 0.81, N = 3)
FFTW 3.3.6 - Build: Float + SSE - Size: 2D FFT Size 4096 (Mflops, More Is Better)
  stock-linux-kernel: 21291 (SE +/- 226.07, N = 3)
  xanmod-kernel-stock-v2: 20269 (SE +/- 158.09, N = 3)
  xanmod-kernel-optimized-5.8.1: 20066 (SE +/- 193.73, N = 3)
  xanmod-kernel-optimized: 19876 (SE +/- 112.65, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
Blender 2.82 - Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better)
  xanmod-kernel-optimized-5.8.1: 293.04 (SE +/- 0.27, N = 3)
  xanmod-kernel-optimized-blender: 296.71 (SE +/- 0.13, N = 3)
SQLite Speedtest 3.30 - Timed Time - Size 1,000 (Seconds, Fewer Is Better)
  xanmod-kernel-optimized-5.8.1: 60.07 (SE +/- 0.25, N = 3)
  xanmod-kernel-stock-v2: 67.55 (SE +/- 0.95, N = 15)
  xanmod-kernel-optimized: 67.83 (SE +/- 1.20, N = 15)
  stock-linux-kernel: 70.12 (SE +/- 1.15, N = 15)
  1. (CC) gcc options: -O2 -ldl -lz -lpthread
WireGuard + Linux Networking Stack Stress Test (Seconds, Fewer Is Better)
  xanmod-kernel-optimized: 184.80 (SE +/- 0.86, N = 3)
  xanmod-kernel-stock-v2: 191.28 (SE +/- 1.10, N = 3)
  stock-linux-kernel: 230.36 (SE +/- 0.93, N = 3)
  xanmod-kernel-optimized-5.8.1: 272.88 (SE +/- 3.79, N = 4)
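A "Fewer Is Better" result such as the WireGuard test above can be read as a percentage change against a baseline. The following is a minimal Python sketch, assuming stock-linux-kernel is taken as the baseline; the helper name percent_change is hypothetical and is not part of the Phoronix Test Suite.

```python
# Mean run times in seconds from the WireGuard stress test (fewer is better).
wireguard_seconds = {
    "xanmod-kernel-optimized": 184.80,
    "xanmod-kernel-stock-v2": 191.28,
    "stock-linux-kernel": 230.36,
    "xanmod-kernel-optimized-5.8.1": 272.88,
}


def percent_change(results, baseline):
    """Return each configuration's change relative to the baseline, in percent."""
    base = results[baseline]
    return {name: (value - base) / base * 100.0 for name, value in results.items()}


changes = percent_change(wireguard_seconds, "stock-linux-kernel")
for name, change in sorted(changes.items(), key=lambda item: item[1]):
    # Negative values mean less time than the baseline, i.e. faster for this test.
    print(f"{name}: {change:+.2f}%")
```

For "More Is Better" tests the same arithmetic applies, but a positive change is the improvement.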
Core-Latency - Average Latency Between CPU Cores (ns, Fewer Is Better)
  xanmod-kernel-optimized: 139.73 (MIN: 42.74 / MAX: 166.14)
  stock-linux-kernel: 142.26 (MIN: 42.6 / MAX: 169.15)
  xanmod-kernel-stock-v2: 144.04 (MIN: 42.79 / MAX: 172.06)
  xanmod-kernel-optimized-5.8.1: 145.34 (MIN: 42.59 / MAX: 173.79)
  1. (CXX) g++ options: -std=c++11 -pthread -O3
KeyDB 5.3.1 (Ops/sec, More Is Better)
  xanmod-kernel-optimized: 130220.58 (SE +/- 1668.86, N = 3)
  xanmod-kernel-stock-v2: 123834.22 (SE +/- 1088.54, N = 15)
  stock-linux-kernel: 119368.92 (SE +/- 1002.40, N = 15)
  xanmod-kernel-optimized-5.8.1: 104557.03 (SE +/- 1460.29, N = 3)
  1. (CXX) g++ options: -O2 -levent -lpthread -lz -lpcre
High Performance Conjugate Gradient 3.1 (GFLOP/s, More Is Better)
  xanmod-kernel-stock-v2: 5.69740 (SE +/- 0.00163, N = 3)
  stock-linux-kernel: 5.68363 (SE +/- 0.01502, N = 3)
  xanmod-kernel-optimized: 5.54178 (SE +/- 0.00226, N = 3)
  xanmod-kernel-optimized-5.8.1: 5.32718 (SE +/- 0.00344, N = 3)
  1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -pthread -lmpi_cxx -lmpi
NAS Parallel Benchmarks 3.4 - Test / Class: EP.D (Total Mop/s, More Is Better)
  xanmod-kernel-optimized-5.8.1: 815.59 (SE +/- 1.56, N = 3)
  xanmod-kernel-optimized: 812.44 (SE +/- 0.21, N = 3)
  xanmod-kernel-stock-v2: 787.40 (SE +/- 0.07, N = 3)
  stock-linux-kernel: 782.03 (SE +/- 0.12, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent_core -levent_pthreads -lutil -lm -lrt -lz
  2. Open MPI 4.0.4
Blender 2.82 - Blend File: Fishy Cat - Compute: CPU-Only (Seconds, Fewer Is Better)
  xanmod-kernel-optimized-5.8.1: 160.30 (SE +/- 0.04, N = 3)
  xanmod-kernel-optimized-blender: 163.16 (SE +/- 0.32, N = 3)
CacheBench - Test: Read / Modify / Write (MB/s, More Is Better)
  xanmod-kernel-stock-v2: 63966.48 (SE +/- 755.52, N = 3; MIN: 55607.63 / MAX: 68405.98)
  xanmod-kernel-optimized-5.8.1: 62931.30 (SE +/- 118.29, N = 3; MIN: 56272.34 / MAX: 66648.24)
  xanmod-kernel-optimized: 62834.86 (SE +/- 14.57, N = 3; MIN: 55279.71 / MAX: 66668.02)
  stock-linux-kernel: 61924.05 (SE +/- 273.27, N = 3; MIN: 54980.84 / MAX: 66137.05)
  1. (CC) gcc options: -lrt
CacheBench - Test: Write (MB/s, More Is Better)
  xanmod-kernel-optimized: 32511.70 (SE +/- 384.10, N = 3; MIN: 27980.81 / MAX: 34688.08)
  stock-linux-kernel: 32183.81 (SE +/- 426.99, N = 3; MIN: 27584.78 / MAX: 34495.84)
  xanmod-kernel-optimized-5.8.1: 32096.00 (SE +/- 83.37, N = 3; MIN: 27761.57 / MAX: 33587.88)
  xanmod-kernel-stock-v2: 32004.96 (SE +/- 348.21, N = 3; MIN: 27420.12 / MAX: 34639.58)
  1. (CC) gcc options: -lrt
CacheBench - Test: Read (MB/s, More Is Better)
  xanmod-kernel-optimized-5.8.1: 3055.11 (SE +/- 5.08, N = 3; MIN: 3024.36 / MAX: 3066.09)
  xanmod-kernel-optimized: 3050.10 (SE +/- 8.27, N = 3; MIN: 3027.65 / MAX: 3065.82)
  stock-linux-kernel: 3025.46 (SE +/- 8.65, N = 3; MIN: 3006.36 / MAX: 3036.42)
  xanmod-kernel-stock-v2: 3014.52 (SE +/- 7.16, N = 3; MIN: 2973.27 / MAX: 3043.12)
  1. (CC) gcc options: -lrt
Blender 2.82 - Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better)
  xanmod-kernel-optimized-5.8.1: 109.40 (SE +/- 0.28, N = 3)
  xanmod-kernel-optimized-blender: 111.00 (SE +/- 0.33, N = 3)
NAS Parallel Benchmarks 3.4 - Test / Class: BT.C (Total Mop/s, More Is Better)
  stock-linux-kernel: 30353.28 (SE +/- 172.89, N = 3)
  xanmod-kernel-stock-v2: 30132.56 (SE +/- 178.89, N = 3)
  xanmod-kernel-optimized: 29948.81 (SE +/- 55.35, N = 3)
  xanmod-kernel-optimized-5.8.1: 28987.88 (SE +/- 169.29, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent_core -levent_pthreads -lutil -lm -lrt -lz
  2. Open MPI 4.0.4
RAMspeed SMP 3.5.0 - Type: Triad - Benchmark: Integer (MB/s, More Is Better)
  xanmod-kernel-stock-v2: 31765.58 (SE +/- 382.19, N = 5)
  xanmod-kernel-optimized: 31510.40 (SE +/- 439.89, N = 4)
  stock-linux-kernel: 31334.76 (SE +/- 460.40, N = 4)
  xanmod-kernel-optimized-5.8.1: 28920.85 (SE +/- 47.96, N = 3)
  1. (CC) gcc options: -O3 -march=native
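Because several RAMspeed results carry standard errors of a few hundred MB/s, it can help to express the gap between two configurations in units of their combined standard error. The sketch below is a rough screen only, not a formal significance test: it assumes the runs are independent and that the reported SE values are standard errors of the mean, and with N between 3 and 5 the estimate is coarse. The function name gap_in_standard_errors is hypothetical.

```python
import math


def gap_in_standard_errors(mean_a, se_a, mean_b, se_b):
    """Difference of two means divided by the standard error of that difference.

    Assumes independent samples, so the standard errors combine in quadrature.
    """
    combined_se = math.hypot(se_a, se_b)
    return (mean_a - mean_b) / combined_se


# RAMspeed SMP Triad - Integer, MB/s (values taken from the result above).
v2_vs_581 = gap_in_standard_errors(31765.58, 382.19, 28920.85, 47.96)
v2_vs_stock = gap_in_standard_errors(31765.58, 382.19, 31334.76, 460.40)

print(f"xanmod-kernel-stock-v2 vs xanmod-kernel-optimized-5.8.1: {v2_vs_581:.2f} combined SE")
print(f"xanmod-kernel-stock-v2 vs stock-linux-kernel: {v2_vs_stock:.2f} combined SE")
```

A gap well above roughly two combined standard errors suggests the difference is unlikely to be run-to-run noise; a gap below one is within the spread of the measurements.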
RAMspeed SMP 3.5.0 - Type: Add - Benchmark: Integer (MB/s, More Is Better)
  xanmod-kernel-optimized: 31234.97 (SE +/- 468.24, N = 3)
  stock-linux-kernel: 31078.60 (SE +/- 414.95, N = 5)
  xanmod-kernel-stock-v2: 30294.84 (SE +/- 21.34, N = 3)
  xanmod-kernel-optimized-5.8.1: 30110.54 (SE +/- 90.12, N = 3)
  1. (CC) gcc options: -O3 -march=native
RAMspeed SMP 3.5.0 - Type: Add - Benchmark: Floating Point (MB/s, More Is Better)
  xanmod-kernel-stock-v2: 31894.48 (SE +/- 64.13, N = 3)
  stock-linux-kernel: 31350.57 (SE +/- 447.67, N = 4)
  xanmod-kernel-optimized: 30949.80 (SE +/- 453.79, N = 4)
  xanmod-kernel-optimized-5.8.1: 29505.18 (SE +/- 399.19, N = 3)
  1. (CC) gcc options: -O3 -march=native
RAMspeed SMP 3.5.0 - Type: Triad - Benchmark: Floating Point (MB/s, More Is Better)
  stock-linux-kernel: 31043.71 (SE +/- 461.90, N = 4)
  xanmod-kernel-stock-v2: 30172.84 (SE +/- 7.15, N = 3)
  xanmod-kernel-optimized: 30019.67 (SE +/- 2.05, N = 3)
  xanmod-kernel-optimized-5.8.1: 29725.39 (SE +/- 480.54, N = 3)
  1. (CC) gcc options: -O3 -march=native
NAS Parallel Benchmarks 3.4 - Test / Class: LU.C (Total Mop/s, More Is Better)
  xanmod-kernel-stock-v2: 33654.86 (SE +/- 19.64, N = 3)
  stock-linux-kernel: 33564.28 (SE +/- 5.41, N = 3)
  xanmod-kernel-optimized: 33335.37 (SE +/- 31.44, N = 3)
  xanmod-kernel-optimized-5.8.1: 32544.33 (SE +/- 8.69, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent_core -levent_pthreads -lutil -lm -lrt -lz
  2. Open MPI 4.0.4
RAMspeed SMP 3.5.0 - Type: Average - Benchmark: Integer (MB/s, More Is Better)
  xanmod-kernel-stock-v2: 28894.30 (SE +/- 465.21, N = 3)
  xanmod-kernel-optimized: 28801.17 (SE +/- 460.72, N = 3)
  stock-linux-kernel: 28792.55 (SE +/- 426.05, N = 3)
  xanmod-kernel-optimized-5.8.1: 27473.60 (SE +/- 262.81, N = 3)
  1. (CC) gcc options: -O3 -march=native
RAMspeed SMP 3.5.0 - Type: Scale - Benchmark: Integer (MB/s, More Is Better)
  stock-linux-kernel: 27448.34 (SE +/- 342.02, N = 3)
  xanmod-kernel-optimized: 26780.83 (SE +/- 182.76, N = 3)
  xanmod-kernel-stock-v2: 26628.92 (SE +/- 3.83, N = 3)
  xanmod-kernel-optimized-5.8.1: 26100.68 (SE +/- 319.94, N = 3)
  1. (CC) gcc options: -O3 -march=native
RAMspeed SMP 3.5.0 - Type: Copy - Benchmark: Floating Point (MB/s, More Is Better)
  stock-linux-kernel: 27445.78 (SE +/- 392.11, N = 3)
  xanmod-kernel-stock-v2: 27024.57 (SE +/- 405.20, N = 3)
  xanmod-kernel-optimized: 26768.43 (SE +/- 368.45, N = 3)
  xanmod-kernel-optimized-5.8.1: 26056.41 (SE +/- 304.41, N = 3)
  1. (CC) gcc options: -O3 -march=native
RAMspeed SMP 3.5.0 - Type: Average - Benchmark: Floating Point (MB/s, More Is Better)
  xanmod-kernel-optimized: 29355.41 (SE +/- 78.82, N = 3)
  stock-linux-kernel: 29306.84 (SE +/- 477.47, N = 3)
  xanmod-kernel-stock-v2: 28902.08 (SE +/- 493.83, N = 3)
  xanmod-kernel-optimized-5.8.1: 27359.97 (SE +/- 140.28, N = 3)
  1. (CC) gcc options: -O3 -march=native
RAMspeed SMP 3.5.0 - Type: Copy - Benchmark: Integer (MB/s, More Is Better)
  xanmod-kernel-stock-v2: 27440.96 (SE +/- 408.95, N = 3)
  xanmod-kernel-optimized: 26885.94 (SE +/- 337.97, N = 3)
  stock-linux-kernel: 26675.04 (SE +/- 8.98, N = 3)
  xanmod-kernel-optimized-5.8.1: 26185.05 (SE +/- 101.22, N = 3)
  1. (CC) gcc options: -O3 -march=native
RAMspeed SMP 3.5.0 - Type: Scale - Benchmark: Floating Point (MB/s, More Is Better)
  xanmod-kernel-stock-v2: 27390.70 (SE +/- 405.62, N = 3)
  xanmod-kernel-optimized: 27120.62 (SE +/- 351.33, N = 3)
  stock-linux-kernel: 26941.81 (SE +/- 27.11, N = 3)
  xanmod-kernel-optimized-5.8.1: 26050.56 (SE +/- 371.32, N = 3)
  1. (CC) gcc options: -O3 -march=native
oneDNN 1.5 - Harness: IP Batch All - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  xanmod-kernel-optimized: 23.81 (SE +/- 0.02, N = 3; MIN: 23.16)
  xanmod-kernel-optimized-5.8.1: 24.18 (SE +/- 0.04, N = 3; MIN: 23.35)
  xanmod-kernel-stock-v2: 24.51 (SE +/- 0.01, N = 3; MIN: 23.86)
  stock-linux-kernel: 24.62 (SE +/- 0.02, N = 3; MIN: 24.12)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 1.5 - Harness: IP Batch All - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  xanmod-kernel-optimized: 41.59 (SE +/- 0.02, N = 3; MIN: 40.37)
  xanmod-kernel-stock-v2: 42.46 (SE +/- 0.02, N = 3; MIN: 41.45)
  stock-linux-kernel: 42.52 (SE +/- 0.02, N = 3; MIN: 41.57)
  xanmod-kernel-optimized-5.8.1: 43.02 (SE +/- 0.07, N = 3; MIN: 41.41)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
NAS Parallel Benchmarks 3.4 - Test / Class: SP.B (Total Mop/s, More Is Better)
  stock-linux-kernel: 13080.84 (SE +/- 151.16, N = 3)
  xanmod-kernel-stock-v2: 13058.58 (SE +/- 102.39, N = 15)
  xanmod-kernel-optimized: 12922.63 (SE +/- 89.96, N = 3)
  xanmod-kernel-optimized-5.8.1: 12493.97 (SE +/- 113.85, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent_core -levent_pthreads -lutil -lm -lrt -lz
  2. Open MPI 4.0.4
Timed Linux Kernel Compilation 5.4 - Time To Compile (Seconds, Fewer Is Better)
  xanmod-kernel-optimized: 46.00
  xanmod-kernel-optimized-5.8.1: 46.13
  xanmod-kernel-stock-v2: 46.48
  stock-linux-kernel: 47.37
  Standard errors as exported (three entries for four results): SE +/- 0.50, N = 3; SE +/- 0.36, N = 3; SE +/- 0.10, N = 3
Java SciMark 2.0 - Computational Test: Composite (Mflops, More Is Better)
  xanmod-kernel-optimized-5.8.1: 3112.83 (SE +/- 22.84, N = 4)
  xanmod-kernel-stock-v2: 3075.13 (SE +/- 34.71, N = 4)
  xanmod-kernel-optimized: 3069.36 (SE +/- 7.44, N = 4)
  stock-linux-kernel: 3039.19 (SE +/- 29.80, N = 4)
DeepSpeech 0.6 - Acceleration: CPU (Seconds, Fewer Is Better)
  xanmod-kernel-optimized: 56.25 (SE +/- 0.16, N = 3)
  stock-linux-kernel: 56.37 (SE +/- 0.04, N = 3)
  xanmod-kernel-stock-v2: 56.37 (SE +/- 0.25, N = 3)
  xanmod-kernel-optimized-5.8.1: 58.41 (SE +/- 0.26, N = 3)
C-Ray 1.1 - Total Time - 4K, 16 Rays Per Pixel (Seconds, Fewer Is Better)
  xanmod-kernel-optimized: 42.04 (SE +/- 0.06, N = 3)
  xanmod-kernel-optimized-5.8.1: 42.14 (SE +/- 0.09, N = 3)
  xanmod-kernel-stock-v2: 42.75 (SE +/- 0.05, N = 3)
  stock-linux-kernel: 42.94 (SE +/- 0.06, N = 3)
  1. (CC) gcc options: -lm -lpthread -O3
Timed PHP Compilation 7.4.2 - Time To Compile (Seconds, Fewer Is Better)
  xanmod-kernel-optimized: 38.15 (SE +/- 0.11, N = 3)
  xanmod-kernel-optimized-5.8.1: 38.48 (SE +/- 0.13, N = 3)
  xanmod-kernel-stock-v2: 38.53 (SE +/- 0.04, N = 3)
  stock-linux-kernel: 39.10 (SE +/- 0.10, N = 3)
SQLite 3.30.1 - Threads / Copies: 1 (Seconds, Fewer Is Better)
  xanmod-kernel-optimized: 38.42 (SE +/- 0.27, N = 3)
  stock-linux-kernel: 39.24 (SE +/- 0.23, N = 3)
  xanmod-kernel-optimized-5.8.1: 39.33 (SE +/- 0.13, N = 3)
  xanmod-kernel-stock-v2: 40.31 (SE +/- 0.56, N = 3)
  xanmod-kernel-stock: 40.61 (SE +/- 0.57, N = 3)
  1. (CC) gcc options: -O2 -lz -lm -ldl -lpthread
oneDNN 1.5 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  xanmod-kernel-optimized: 200.19 (SE +/- 0.29, N = 3; MIN: 193.59)
  xanmod-kernel-stock-v2: 205.19 (SE +/- 0.44, N = 3; MIN: 198.4)
  stock-linux-kernel: 207.25 (SE +/- 0.29, N = 3; MIN: 201.06)
  xanmod-kernel-optimized-5.8.1: 207.77 (SE +/- 0.18, N = 3; MIN: 198.01)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 1.5 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  xanmod-kernel-optimized: 40.53 (SE +/- 0.30, N = 3; MIN: 38.63)
  xanmod-kernel-stock-v2: 41.53 (SE +/- 0.42, N = 3; MIN: 39.45)
  xanmod-kernel-optimized-5.8.1: 41.85 (SE +/- 0.18, N = 3; MIN: 39.51)
  stock-linux-kernel: 42.51 (SE +/- 0.23, N = 3; MIN: 40.78)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Gzip Compression - Linux Source Tree Archiving To .tar.gz (Seconds, Fewer Is Better)
  xanmod-kernel-optimized-5.8.1: 35.93 (SE +/- 0.09, N = 3)
NAS Parallel Benchmarks 3.4 - Test / Class: FT.C (Total Mop/s, More Is Better)
  xanmod-kernel-stock-v2: 12492.07 (SE +/- 24.79, N = 3)
  stock-linux-kernel: 12485.72 (SE +/- 41.50, N = 3)
  xanmod-kernel-optimized: 12085.26 (SE +/- 7.96, N = 3)
  xanmod-kernel-optimized-5.8.1: 11733.26 (SE +/- 17.38, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent_core -levent_pthreads -lutil -lm -lrt -lz
  2. Open MPI 4.0.4
GnuPG 1.4.22 - 2GB File Encryption (Seconds, Fewer Is Better)
  stock-linux-kernel: 11.19 (SE +/- 0.16, N = 3)
  xanmod-kernel-stock-v2: 11.24 (SE +/- 0.12, N = 15)
  xanmod-kernel-optimized-5.8.1: 11.26 (SE +/- 0.10, N = 3)
  xanmod-kernel-optimized: 11.31 (SE +/- 0.13, N = 15)
  1. (CC) gcc options: -O2 -MT -MD -MP -MF
7-Zip Compression 16.02 - Compress Speed Test (MIPS, More Is Better)
  xanmod-kernel-optimized-5.8.1: 83181 (SE +/- 124.77, N = 3)
  xanmod-kernel-optimized: 83170 (SE +/- 190.50, N = 2)
  xanmod-kernel-stock-v2: 81679 (SE +/- 584.03, N = 3)
  stock-linux-kernel: 81231 (SE +/- 703.40, N = 3)
  1. (CXX) g++ options: -pipe -lpthread
Timed FFmpeg Compilation 4.2.2 - Time To Compile (Seconds, Fewer Is Better)
  xanmod-kernel-optimized: 31.83 (SE +/- 0.17, N = 3)
  xanmod-kernel-optimized-5.8.1: 31.88 (SE +/- 0.16, N = 3)
  xanmod-kernel-stock-v2: 32.12 (SE +/- 0.12, N = 3)
  stock-linux-kernel: 32.49 (SE +/- 0.13, N = 3)
OpenSSL 1.1.1 - RSA 4096-bit Performance (Signs Per Second, More Is Better)
  xanmod-kernel-optimized: 3595.2 (SE +/- 14.89, N = 3)
  xanmod-kernel-optimized-5.8.1: 3586.2 (SE +/- 13.64, N = 3)
  xanmod-kernel-stock-v2: 3506.5 (SE +/- 3.23, N = 3)
  stock-linux-kernel: 3494.0 (SE +/- 6.31, N = 3)
  1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
oneDNN 1.5 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  xanmod-kernel-optimized: 0.784978 (SE +/- 0.002024, N = 3; MIN: 0.71)
  xanmod-kernel-stock-v2: 0.803284 (SE +/- 0.001743, N = 3; MIN: 0.75)
  stock-linux-kernel: 0.805316 (SE +/- 0.001986, N = 3; MIN: 0.76)
  xanmod-kernel-optimized-5.8.1: 0.807644 (SE +/- 0.006252, N = 3; MIN: 0.72)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 1.5 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  xanmod-kernel-optimized: 1.88292 (SE +/- 0.00095, N = 3; MIN: 1.79)
  stock-linux-kernel: 1.93922 (SE +/- 0.00229, N = 3; MIN: 1.87)
  xanmod-kernel-optimized-5.8.1: 1.93985 (SE +/- 0.01046, N = 3; MIN: 1.79)
  xanmod-kernel-stock-v2: 1.94085 (SE +/- 0.00175, N = 3; MIN: 1.87)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
NAS Parallel Benchmarks 3.4 - Test / Class: EP.C (Total Mop/s, More Is Better)
  xanmod-kernel-optimized: 823.86 (SE +/- 2.42, N = 3)
  xanmod-kernel-optimized-5.8.1: 822.27 (SE +/- 2.27, N = 3)
  xanmod-kernel-stock-v2: 788.37 (SE +/- 0.09, N = 3)
  stock-linux-kernel: 785.50 (SE +/- 0.25, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent_core -levent_pthreads -lutil -lm -lrt -lz
  2. Open MPI 4.0.4
NAS Parallel Benchmarks 3.4 - Test / Class: MG.C (Total Mop/s, More Is Better)
  xanmod-kernel-stock-v2: 16879.80 (SE +/- 3.96, N = 3)
  stock-linux-kernel: 16862.70 (SE +/- 3.42, N = 3)
  xanmod-kernel-optimized: 16422.81 (SE +/- 3.88, N = 3)
  xanmod-kernel-optimized-5.8.1: 15800.53 (SE +/- 2.85, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent_core -levent_pthreads -lutil -lm -lrt -lz
  2. Open MPI 4.0.4
rays1bench 2020-01-09 - Large Scene (mrays/s, More Is Better)
  xanmod-kernel-optimized-5.8.1: 90.17 (SE +/- 0.05, N = 3)
  xanmod-kernel-optimized: 90.07 (SE +/- 0.20, N = 3)
  xanmod-kernel-stock-v2: 88.31 (SE +/- 0.04, N = 3)
  stock-linux-kernel: 88.05 (SE +/- 0.05, N = 3)
Darktable 3.0.2 - Test: Boat - Acceleration: CPU-only (Seconds, Fewer Is Better)
  stock-linux-kernel: 8.401 (SE +/- 0.006, N = 3)
  xanmod-kernel-stock-v2: 8.402 (SE +/- 0.006, N = 3)
  xanmod-kernel-optimized: 8.545 (SE +/- 0.012, N = 3)
  xanmod-kernel-optimized-5.8.1: 8.908 (SE +/- 0.038, N = 3)
x265 3.1.2 - H.265 1080p Video Encoding (Frames Per Second, More Is Better)
  xanmod-kernel-optimized: 70.97 (SE +/- 0.23, N = 3)
  xanmod-kernel-optimized-5.8.1: 70.40 (SE +/- 0.24, N = 3)
  xanmod-kernel-stock-v2: 68.90 (SE +/- 0.08, N = 3)
  stock-linux-kernel: 68.73 (SE +/- 0.14, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFTW 3.3.6 - Build: Float + SSE - Size: 1D FFT Size 4096 (Mflops, More Is Better)
  stock-linux-kernel: 57181 (SE +/- 677.30, N = 5)
  xanmod-kernel-optimized-5.8.1: 56545 (SE +/- 192.28, N = 3)
  xanmod-kernel-optimized: 56232 (SE +/- 728.33, N = 3)
  xanmod-kernel-stock-v2: 55567 (SE +/- 917.95, N = 3)
  1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
oneDNN 1.5 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  stock-linux-kernel: 13.15 (SE +/- 0.02, N = 3; MIN: 12.78)
  xanmod-kernel-stock-v2: 13.15 (SE +/- 0.01, N = 3; MIN: 12.72)
  xanmod-kernel-optimized: 13.53 (SE +/- 0.04, N = 3; MIN: 13.12)
  xanmod-kernel-optimized-5.8.1: 13.97 (SE +/- 0.03, N = 3; MIN: 13.4)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 1.5 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  xanmod-kernel-stock-v2: 12.07 (SE +/- 0.00, N = 3; MIN: 11.48)
  stock-linux-kernel: 12.07 (SE +/- 0.01, N = 3; MIN: 11.66)
  xanmod-kernel-optimized: 12.52 (SE +/- 0.01, N = 3; MIN: 11.97)
  xanmod-kernel-optimized-5.8.1: 12.98 (SE +/- 0.03, N = 3; MIN: 12.12)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 1.5 - Harness: Deconvolution Batch deconv_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  xanmod-kernel-optimized: 4.51250 (SE +/- 0.00986, N = 3, MIN: 4.29)
  xanmod-kernel-stock-v2: 4.66477 (SE +/- 0.00522, N = 3, MIN: 4.52)
  stock-linux-kernel: 4.68846 (SE +/- 0.00217, N = 3, MIN: 4.55)
  xanmod-kernel-optimized-5.8.1: 4.75770 (SE +/- 0.09789, N = 15, MIN: 4.31)
  (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Darktable 3.0.2 - Test: Masskrug - Acceleration: CPU-only (Seconds, Fewer Is Better)
  xanmod-kernel-optimized: 4.441 (SE +/- 0.016, N = 3)
  stock-linux-kernel: 4.453 (SE +/- 0.020, N = 3)
  xanmod-kernel-stock-v2: 4.480 (SE +/- 0.017, N = 3)
  xanmod-kernel-optimized-5.8.1: 4.586 (SE +/- 0.048, N = 3)
FFmpeg 4.0.2 - H.264 HD To NTSC DV (Seconds, Fewer Is Better)
  xanmod-kernel-optimized-5.8.1: 5.114 (SE +/- 0.047, N = 3)
  xanmod-kernel-optimized: 5.171 (SE +/- 0.001, N = 3)
  stock-linux-kernel: 5.289 (SE +/- 0.042, N = 3)
  xanmod-kernel-stock-v2: 5.402 (SE +/- 0.034, N = 3)
  (CC) gcc options: -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -lm -lxcb -pthread -lbz2 -llzma -std=c11 -fomit-frame-pointer -fPIC -O3 -fno-math-errno -fno-signed-zeros -fno-tree-vectorize -MMD -MF -MT
Darktable 3.0.2 - Test: Server Room - Acceleration: CPU-only (Seconds, Fewer Is Better)
  xanmod-kernel-optimized: 2.998 (SE +/- 0.004, N = 3)
  xanmod-kernel-stock-v2: 3.015 (SE +/- 0.006, N = 3)
  stock-linux-kernel: 3.035 (SE +/- 0.008, N = 3)
  xanmod-kernel-optimized-5.8.1: 3.112 (SE +/- 0.020, N = 3)
FFTW 3.3.6 - Build: Float + SSE - Size: 2D FFT Size 32 (Mflops, More Is Better)
  xanmod-kernel-optimized-5.8.1: 45844 (SE +/- 87.29, N = 3)
  xanmod-kernel-optimized: 45825 (SE +/- 91.12, N = 3)
  xanmod-kernel-stock-v2: 45621 (SE +/- 78.67, N = 3)
  stock-linux-kernel: 45476 (SE +/- 77.66, N = 3)
  (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
FFTW 3.3.6 - Build: Float + SSE - Size: 1D FFT Size 32 (Mflops, More Is Better)
  xanmod-kernel-optimized-5.8.1: 15293 (SE +/- 227.93, N = 4)
  xanmod-kernel-stock-v2: 14968 (SE +/- 58.71, N = 3)
  xanmod-kernel-optimized: 14923 (SE +/- 216.56, N = 4)
  stock-linux-kernel: 14673 (SE +/- 150.35, N = 3)
  (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm
oneDNN 1.5 - Harness: Deconvolution Batch deconv_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  xanmod-kernel-optimized: 3.47507 (SE +/- 0.00233, N = 3, MIN: 3.35)
  xanmod-kernel-optimized-5.8.1: 3.57059 (SE +/- 0.03101, N = 3, MIN: 3.38)
  xanmod-kernel-stock-v2: 3.60538 (SE +/- 0.00029, N = 3, MIN: 3.52)
  stock-linux-kernel: 3.61101 (SE +/- 0.00224, N = 3, MIN: 3.52)
  (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Darktable 3.0.2 - Test: Server Rack - Acceleration: CPU-only (Seconds, Fewer Is Better)
  xanmod-kernel-stock-v2: 0.138 (SE +/- 0.000, N = 3)
  stock-linux-kernel: 0.140 (SE +/- 0.002, N = 4)
  xanmod-kernel-optimized: 0.143 (SE +/- 0.002, N = 3)
  xanmod-kernel-optimized-5.8.1: 0.147 (SE +/- 0.000, N = 3)
Java SciMark 2.0 - Computational Test: Jacobi Successive Over-Relaxation (Mflops, More Is Better)
  xanmod-kernel-optimized-5.8.1: 1998.82 (SE +/- 16.86, N = 4)
  xanmod-kernel-optimized: 1975.61 (SE +/- 2.63, N = 4)
  stock-linux-kernel: 1972.21 (SE +/- 26.34, N = 4)
  xanmod-kernel-stock-v2: 1965.14 (SE +/- 31.69, N = 4)
Java SciMark 2.0 - Computational Test: Dense LU Matrix Factorization (Mflops, More Is Better)
  xanmod-kernel-optimized-5.8.1: 7009.79 (SE +/- 75.78, N = 4)
  xanmod-kernel-stock-v2: 6940.61 (SE +/- 80.90, N = 4)
  xanmod-kernel-optimized: 6892.47 (SE +/- 15.70, N = 4)
  stock-linux-kernel: 6768.17 (SE +/- 72.44, N = 4)
Java SciMark 2.0 - Computational Test: Sparse Matrix Multiply (Mflops, More Is Better)
  xanmod-kernel-optimized-5.8.1: 2739.93 (SE +/- 21.92, N = 4)
  stock-linux-kernel: 2720.27 (SE +/- 26.42, N = 4)
  xanmod-kernel-stock-v2: 2701.65 (SE +/- 37.81, N = 4)
  xanmod-kernel-optimized: 2700.25 (SE +/- 12.34, N = 4)
Java SciMark 2.0 - Computational Test: Fast Fourier Transform (Mflops, More Is Better)
  xanmod-kernel-optimized-5.8.1: 2099.41 (SE +/- 14.11, N = 4)
  xanmod-kernel-optimized: 2076.80 (SE +/- 3.49, N = 4)
  xanmod-kernel-stock-v2: 2067.65 (SE +/- 22.09, N = 4)
  stock-linux-kernel: 2030.91 (SE +/- 6.68, N = 4)
Java SciMark 2.0 - Computational Test: Monte Carlo (Mflops, More Is Better)
  xanmod-kernel-optimized-5.8.1: 1716.20 (SE +/- 15.99, N = 4)
  stock-linux-kernel: 1704.38 (SE +/- 20.11, N = 4)
  xanmod-kernel-optimized: 1701.68 (SE +/- 3.71, N = 4)
  xanmod-kernel-stock-v2: 1700.61 (SE +/- 19.26, N = 4)
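Many of the gaps between kernels in these results are close to the reported standard errors, so it can help to put a difference and its noise on the same scale. The following is a minimal sketch and not part of the Phoronix Test Suite output: the helper names `percent_change` and `separation` are hypothetical, and the ratio it computes is only a rough indicator that assumes the two runs are independent. It uses the x265 figures for xanmod-kernel-optimized and stock-linux-kernel from this result file.

```python
import math


def percent_change(value, baseline):
    """Relative difference of value against baseline, in percent."""
    return (value - baseline) / baseline * 100.0


def separation(value_a, se_a, value_b, se_b):
    """Absolute difference divided by the combined standard error.

    Assumes the two results are independent, so the standard errors
    combine in quadrature. Values near or below 1 suggest the gap is
    within run-to-run noise.
    """
    return abs(value_a - value_b) / math.hypot(se_a, se_b)


# x265 1080p encoding, Frames Per Second (more is better), from this export.
optimized, optimized_se = 70.97, 0.23   # xanmod-kernel-optimized
stock, stock_se = 68.73, 0.14           # stock-linux-kernel

print(f"change vs stock: {percent_change(optimized, stock):+.2f}%")
print(f"difference / combined SE: {separation(optimized, optimized_se, stock, stock_se):.1f}")
```

With only N = 3 or N = 4 runs per result, the ratio is a coarse guide rather than a significance test.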
Phoronix Test Suite v10.8.5